a. General Organization of the SMS Standard. The first part of the SMS functional requirements (SMS Standard), included as Appendix 1 of this AC, follows the general organization of ISO 9000:2000 and ISO 14001. The first three clauses describe scope and applicability, references, and definitions. The following four clauses address each of the four pillars of SMS, as described previously in paragraph 7b(2).
b. Policy: Setting the Framework.
(1) Safety and Quality: Striking a Balance. As discussed above, the SMS standard uses quality management principles, but the requirements to be managed by the system are based on an objective assessment of safety risk, rather than customer satisfaction with products or other conventional commercial goals. However, management of process quality, with emphasis on the process characteristics that affect safety, is an important aspect of safety management. The standard specifies that the aviation service provider should prescribe both quality and safety policies. The coverage of quality policies is limited in scope to quality in support of safety, although operators are encouraged to integrate their management systems as much as feasible. However, safety objectives should receive primacy where conflicts are identified.
(2) Roles, Responsibilities, and Relationships: The “Three Rs” of Safety Management. Figures 1 and 2 show the relationship between the productive processes of the aviation service provider and the joint protective processes of the regulator (FAA), in the form of an oversight system (SMS-O), and the aviation service provider’s SMS (SMS-P). As before, it is important to recognize that the two aviation service provider systems shown (Protection and Production) are functional rather than departmental or organizational depictions. One of the principal roles of the oversight system (SMS-O) is to promulgate risk controls in the form of regulations, standards, and policies. It follows that regulatory compliance, in a manner that accomplishes the regulations’ safety objectives, is also part of the aviation service provider’s role in safety management.
(3) Importance of Executive Management Involvement. The standard specifies that top management is primarily responsible for safety management. Management must plan, organize, direct, and control employees’ activities and allocate resources to make safety controls effective. A key factor in both quality and safety management is top management’s personal, material involvement in quality and safety activities. The standard also specifies that top management must clearly delineate safety responsibilities throughout the organization. While it is true that top management must take overall responsibility for safe operations, it also is true that all members of the organization must know their responsibilities and be both empowered and involved with respect to safety.
(4) Procedures and Controls. Two key attributes of systems are procedures and controls. Policies must be translated into procedures in order for them to be applied, and organizational controls must be in place to ensure that critical steps are accomplished as designed. Organizations must develop, document, and maintain procedures to carry out their safety policies and objectives. The standard also requires organizations to ensure that employees understand their roles. Moreover, supervisory controls must be used to monitor the accomplishment of the procedures.
c. Safety Risk Management: Setting Requirements for Safety Management. The safety risk management process is used to examine the operational functions of the company and their operational environment to identify hazards and to analyze associated risk. The safety risk management process follows the same sequence of steps as the system safety process model that is used in the FAA’s System Safety training course at the FAA Academy. These are also the same general steps that are used in operational risk management programs within several of the military services.
(1) Systems and Task Analysis. Safety risk management begins with system design. This is true whether the system in question is a physical system, such as an aircraft, or an organizational system such as an operator, maintenance or training establishment. These systems consist of the organizational structures, processes, and procedures, as well as the people, equipment, and facilities used to accomplish the organization’s mission. The system or task descriptions should completely explain the interactions among the hardware, software, people, and environment that make up the system in sufficient detail to identify hazards and perform risk analyses. While systems should be documented, no particular format is required. System documentation would normally include the operator’s manual system, checklists, organizational charts, and personnel position descriptions. A suggested breakdown of operational and support processes for air operators includes:
(a) Flight operations;
(b) Dispatch/flight following;
(c) Maintenance and inspection;
(d) Cabin safety;
(e) Ground handling and servicing;
(f) Cargo handling; and
(g) Training.
NOTE: Long and excessively detailed system or task descriptions are not necessary as long as they are sufficiently detailed to perform hazard and risk analyses. While sophisticated process development tools and methods are available, simple brainstorming sessions with managers, supervisors, and other employees are often most effective.
(2) Hazard Identification. Hazards in the system and its operating environment must be identified, documented, and controlled. The standard also requires that the analysis process used to define hazards consider all components of the system, based on the system description described above. The key question to ask during analysis of the system and its operation is “what if?” As with system and task descriptions, judgment is required to determine the adequate level of detail. While identification of every conceivable hazard would be impractical, aviation service providers are expected to exercise due diligence in identifying significant and reasonably foreseeable hazards related to their operations.
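As a minimal sketch of how identified hazards might be documented against the system description, the following Python fragment keeps a simple hazard log organized by process area. The names HazardEntry and HazardLog, their fields, and the idea of flagging process areas with no recorded hazards are illustrative assumptions and are not part of the SMS standard. The process areas are those in the suggested breakdown for air operators above.

```python
from dataclasses import dataclass, field

# Process areas from the suggested breakdown for air operators.
PROCESS_AREAS = (
    "Flight operations",
    "Dispatch/flight following",
    "Maintenance and inspection",
    "Cabin safety",
    "Ground handling and servicing",
    "Cargo handling",
    "Training",
)


@dataclass
class HazardEntry:
    """One documented hazard (hypothetical record layout)."""
    process_area: str
    what_if: str            # the "what if?" question that exposed the hazard
    description: str
    controls: list = field(default_factory=list)


class HazardLog:
    """Hypothetical in-memory hazard log."""

    def __init__(self):
        self._entries = []

    def add(self, entry):
        # Reject entries that do not map to a described process area.
        if entry.process_area not in PROCESS_AREAS:
            raise ValueError(f"unknown process area: {entry.process_area}")
        self._entries.append(entry)

    def uncontrolled(self):
        """Hazards that have been identified but have no control recorded."""
        return [e for e in self._entries if not e.controls]

    def areas_without_hazards(self):
        """Process areas with no hazards logged; a prompt for further review."""
        covered = {e.process_area for e in self._entries}
        return [a for a in PROCESS_AREAS if a not in covered]
```

An empty result from areas_without_hazards() does not show that hazard identification is complete; it only indicates that every process area has been considered at least once.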
(3) Risk Analysis and Assessment. The standard’s risk analysis and risk assessment clauses use a conventional breakdown of risk by its two components: likelihood of occurrence of an injurious mishap and severity of the mishap related to an identified hazard, should it occur. A common tool for risk decision-making and acceptance is a risk matrix similar to those in the U.S. Military Standard (MIL STD 882) and the ICAO Safety Management Manual. Figure 4 shows an example of one such matrix. Operators should develop a matrix that best represents their operational environment. Separate matrices with different risk acceptance criteria may also be developed for long-term versus short-term operations.
(4) Severity and Likelihood Criteria. The definitions and final construction of the matrix are left to the aviation service provider’s organization to design. Each level of severity and likelihood should be defined in terms that are realistic for the operational environment. This ensures each organization’s decision tools are relevant to its operations and operational environment, recognizing the extensive diversity in this area. An example of severity and likelihood definitions is shown in Table 1 below. Each operator’s specific definitions for severity and likelihood may be qualitative, but quantitative measures are preferable where possible.
TABLE 1. SAMPLE SEVERITY AND LIKELIHOOD CRITERIA
[Table 1 is not reproduced in this copy.]
(c) Acceptable with Mitigation (Yellow). Where the risk assessment falls into the yellow area, the risk may be accepted under defined conditions of mitigation. An example of this situation would be an assessment of the impact of a non-operational aircraft component for inclusion on a Minimum Equipment List. Defining an Operational (“O”) or Maintenance (“M”) procedure in the MEL would constitute a mitigating action that could make an otherwise unacceptable risk acceptable, as long as the defined procedure was implemented. These situations may also require continued special emphasis in the safety assurance function.
FIGURE 4. SAFETY RISK MATRIX
[Figure 4 is not reproduced in this copy.]
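As a minimal sketch of how a risk matrix of this kind might be encoded for decision-making, the following Python fragment combines a severity level and a likelihood level into a colored region. Everything specific in it is an assumption made for illustration: the four level names on each axis, the use of a simple product as the cell score, and the thresholds for the red and yellow regions. The treatment of yellow as acceptable only with mitigation follows the text above; the treatment of green as acceptable and red as unacceptable follows the usual color convention. Each operator would substitute its own definitions and acceptance criteria.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical severity levels; an operator defines its own.
    MINOR = 1
    MAJOR = 2
    HAZARDOUS = 3
    CATASTROPHIC = 4


class Likelihood(IntEnum):
    # Hypothetical likelihood levels; an operator defines its own.
    IMPROBABLE = 1
    REMOTE = 2
    OCCASIONAL = 3
    FREQUENT = 4


def assess(severity, likelihood, red_at=9, yellow_at=4):
    """Return the matrix region for one hazard.

    The score (severity x likelihood) and both thresholds are
    illustrative assumptions, not values taken from the standard.
    """
    score = int(severity) * int(likelihood)
    if score >= red_at:
        return "red"
    if score >= yellow_at:
        return "yellow"
    return "green"


def acceptable(severity, likelihood, mitigation_in_place=False):
    """Green is accepted; yellow only with defined mitigation; red is not."""
    region = assess(severity, likelihood)
    if region == "green":
        return True
    if region == "yellow":
        return mitigation_in_place
    return False
```

An explicit cell-by-cell table could replace the product score where an operator wants individual cells colored independently of their neighbors.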
(6) Other Risk Assessment Tools for Flight and Operational Risk Management. Other tools can also be used for flight or operational risk assessment such as the Controlled Flight into Terrain (CFIT), Approach and Landing Accident Reduction (ALAR), operational control, and ground operations risk assessment tools available from the Flight Safety Foundation (http://www.flightsafety.org/technical_initiatives.html) or the Medallion Foundation (http://www.medallionfoundation.org).
(7) Causal Analysis. Risk analyses should concentrate not only on assigning levels of severity and likelihood but also on determining why these particular levels were selected. This is often called “root cause analysis,” and is the first step in developing effective controls to reduce risk to lower levels. Several structured software systems are available to perform root cause analysis. However, in many cases, simple brainstorming sessions among the company’s pilots, mechanics, dispatchers, or other experienced subject matter experts are the most effective and affordable method of finding ways to reduce risk. This also has the advantage of involving employees who will ultimately be required to implement the controls developed.
(8) Controlling Risk. After hazards and risk are fully understood through the preceding steps, risk controls must be designed and implemented. These may be additional or changed procedures, new supervisory controls, addition of organizational, hardware, or software aids, changes to training, additional or modified equipment, changes to staffing arrangements, or any of a number of other system changes.
(9) Hierarchy of Controls. The process of selecting or designing controls should be approached in a structured manner. System safety technology and practice have provided a hierarchy, or preferred order, of control actions that ranges from most to least effective. Depending on the hazard under scrutiny and its complexity, there may be more than one action or strategy that may be applied. Further, the controls may be applied at different times depending on the immediacy of the required action and the complexity of developing more effective controls. For example, it may be appropriate to post warnings while a more effective elimination of the hazard is developed. The hierarchy of controls is:
(a) Design the hazard out – modify the system (this includes hardware/software systems involving physical hazards as well as organizational systems).
(b) Physical guards or barriers – reduce exposure to the hazard or reduce the severity of consequences.
(c) Warnings, advisories, or signals of the hazard.
(d) Procedural changes to avoid the hazard or reduce likelihood or severity of associated risk.
(e) Training to avoid the hazard or reduce the likelihood of an associated risk.
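The preferred order above can be sketched as an ordered enumeration, so that candidate controls for a hazard can be ranked from most to least effective. This is an illustration only: the names ControlType and rank_controls are hypothetical, and the ranking reflects nothing beyond the order of items (a) through (e).

```python
from enum import IntEnum


class ControlType(IntEnum):
    # Lower value = higher preference, following items (a) through (e).
    DESIGN_OUT = 1          # (a) modify the system to remove the hazard
    GUARD_OR_BARRIER = 2    # (b) reduce exposure or severity
    WARNING = 3             # (c) warnings, advisories, or signals
    PROCEDURE = 4           # (d) procedural changes
    TRAINING = 5            # (e) training


def rank_controls(candidates):
    """Sort (description, ControlType) pairs from most to least preferred.

    sorted() is stable, so controls of the same type keep their input order.
    """
    return sorted(candidates, key=lambda pair: pair[1])
```

Because several controls may be applied at different times, a ranked list of this kind is a planning aid rather than a selection rule. A lower-ranked control, such as a posted warning, may be the appropriate interim action while a higher-ranked one is developed.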
(10) Residual and Substitute Risk. It is seldom possible to entirely eliminate risk, even when highly effective controls are used. After these controls are designed but before the system is placed back on line, an assessment must be made of whether the controls are likely to be effective and/or if they introduce new hazards to the system. The latter condition is referred to as “substitute risk,” a situation where “the cure is worse than the disease.” The loop seen in Figure 3 back to the top of the diagram depicts the use of the preceding systems analysis, hazard identification, risk analysis, and risk assessment processes to determine if the modified system is acceptable.
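The review described in this paragraph can be sketched as a simple decision: a control is released only if it is expected to be effective and any hazards it introduces have themselves been assessed as acceptable. The record layout and the two outcome strings below are hypothetical and are offered only to make the loop back to system analysis concrete.

```python
from dataclasses import dataclass, field


@dataclass
class ControlReview:
    """Hypothetical record of the review made before a control goes on line."""
    control: str
    expected_effective: bool
    # Hazards introduced by the control itself (substitute risk), mapped to
    # whether the risk of each has been assessed as acceptable.
    new_hazards: dict = field(default_factory=dict)


def review_decision(review):
    """Return (decision, unacceptable_new_hazards) for one control."""
    unacceptable = [h for h, ok in review.new_hazards.items() if not ok]
    if review.expected_effective and not unacceptable:
        return "place in operation", unacceptable
    # Otherwise repeat systems analysis, hazard identification,
    # risk analysis, and risk assessment on the modified system.
    return "return to system analysis", unacceptable
```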
(11) System Operation. When the controls are acceptable, the system is placed into operation. The next process, safety assurance, uses auditing, analysis, and review systems that are familiar from similar quality management systems. These processes are used to monitor the risk controls to ensure they continue to be implemented as designed and continue to be effective in a changing operational environment.
d. Safety Assurance: Managing the Requirements. The safety assurance function applies the processes of quality assurance and internal evaluation to the process of making sure that risk controls, once designed, continue to conform to their requirements and that they continue to be effective in maintaining risk within acceptable levels. These assurance and evaluation functions also provide a basis for continuous improvement.
(1) Relationship between Safety Risk Management, Safety Assurance, and Internal Evaluation. Quality assurance processes concentrate on proving, through collection and analysis of objective evidence, that process requirements have been met. In an SMS, the system’s requirements are based on assessment of risk in the organization’s operation or in the products that it produces, as discussed above. Quality assurance techniques, including internal auditing and evaluation, can be used to determine if risk controls that are designed into the operator’s processes are being practiced and that they perform as designed. The process is, therefore, appropriately termed “safety assurance.” If an operator already has an Internal Evaluation Program (IEP), it should be reviewed to ensure that it conforms to the SMS safety assurance standards.
NOTE: The safety assurance function does not need to be extensive or complex to be effective. Smaller organizations may find available tools such as the Internal Evaluation Program Audit tools produced by the Medallion Foundation (http://www.medallionfoundation.org) to be a good foundation for their organization’s safety assurance processes.
(2) Role of Other Management Systems. As discussed above, safety assurance uses many of the same practices as those used in quality management systems (QMS). In an SMS, however, the requirements being managed relate to ensuring risk controls, once designed and put into place, perform in a way that continues to meet their safety objectives. While operators may find it beneficial to integrate their management systems for other areas, such as quality, employee health and safety, or environmental protection, with the SMS, it is beyond the scope of the safety management standard to address these areas directly. Appendix 2 to this AC contains a table of cross-references between ISO standards and other recognized standards for quality (ISO 9000:2000), environmental protection (ISO 14001), and employee health and safety management (BSI OHSAS 18001). These are provided for convenience for organizations that desire to develop integrated management systems or that may already have existing systems in one or more of these areas.
(3) Information for Decisionmaking. Information for safety assurance comes from a variety of sources, including formal program auditing and evaluation, investigations of safety-related events, continuous process monitoring of day-to-day activities, and inputs from employees through employee reporting systems. While each of these types of information source exists to some degree in every organization, the standard formalizes requirements for each. Specifications for these and other related safety assurance processes are left at a functional level, allowing individual organizations to tailor them to the scope and scale appropriate for their size and type of organization.
(4) Internal Audits by Operating Departments. The primary responsibility for safety management rests with those who “own” the operator’s technical processes. It is here where hazards are most directly encountered, where deficiencies in processes contribute to risk, and where direct supervisory control and resource allocation can mitigate the risk to acceptable levels. The standard specifies a responsibility for internal auditing of the operator’s productive processes (the Production/Operation side of Figures 1 and 2). As with other requirements, the standard’s auditing requirements are left at a functional level, allowing for a broad range of complexity, commensurate with the complexity of the organization.
(a) Line Management Responsibilities. Line managers of operational departments have the direct responsibility for quality control and for ensuring that the processes in their areas of responsibility function as designed. Moreover, line organizations are the domain technical experts in any organization and thus the most knowledgeable about the technical processes involved. Line managers of the operational departments should be given the responsibility for monitoring these processes and periodically assessing the status of risk controls through an internal auditing and evaluation program.
(b) Audit Programs and Tools. In order to promote system integration and a minimum of duplication, operators may want to consider using available technical system audit tools such as those provided by the Air Transportation Oversight System (ATOS) or third party tools such as those in the IATA Operational Safety Audit (IOSA). This can be particularly advantageous if the operator is already involved with using these programs.
(5) Internal Evaluation. This function involves evaluation of the technical processes of the operator as well as the SMS-specific functions. Audits conducted for the purpose of this requirement must be conducted by persons or organizations that are functionally independent of the technical process being evaluated. A specialist safety or quality assurance department, or another sub-organization as directed by top management, may accomplish it. The internal evaluation function also requires auditing and evaluation of the safety management functions themselves: policymaking, safety risk management, safety assurance, and safety promotion. These audits provide the management officials designated as responsible for the SMS with a means to inventory the processes of the SMS itself.
NOTE: In very small organizations, the top management may elect to conduct the internal evaluation function themselves, in conjunction with the management review function.
(6) Integration of Regulatory and Voluntary Programs. The provisions of the SMS standard are not intended to duplicate the functions of Continuing Analysis and Surveillance Systems (CASS), which are required for operators under part 121 or part 135 of Title 14 of the Code of Federal Regulations (14 CFR), or of IEPs. In fact, the FAA encourages an integrated approach where these programs are all part of a comprehensive SMS.
(7) External Audits. External audits of the SMS may be conducted by the regulator (FAA), code-share partners, customer organizations, or other third parties selected by the operator. These audits not only provide a strong interface with the oversight system (SMS-O) but also a secondary assurance system. Organizations may elect to have third-party audits of their SMS from organizations such as the IATA or other consultant organizations.
ATOS data collection tools are available at: http://www.faa.gov/safety/programs_initiatives/oversight/atos/library/data_collection
(8) Analysis and Assessment. Audits and other information-gathering activities are useful to management only if the information is distilled into a meaningful form and conclusions are drawn to form a bottom line. Recall that the primary purpose of the safety assurance process is to assess the continued effectiveness of risk controls put into place by the safety risk management process. Where significant deviations from existing controls are discovered, the standard requires a structured, documented process for preventive and corrective action to place the controls back on track.
(9) Corrective Action and Followup. The safety assurance process should include procedures that ensure that corrective actions are developed in response to findings of audits and evaluations and to verify their timely and effective implementation. Organizational responsibility for the development and implementation of corrective actions should reside with the operational departments cited in audit and evaluation findings. If new hazards are discovered, the safety risk management process should be employed to determine if new risk controls should be developed.
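As a minimal sketch of the followup described above, the following Python fragment tracks corrective actions by responsible department and distinguishes implementation from verified effectiveness. The record layout, the field names, and the rule that an action closes only after its effectiveness is verified are assumptions chosen for illustration; the standard leaves these details to the organization.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CorrectiveAction:
    """Hypothetical corrective action record tied to an audit finding."""
    finding: str
    responsible_department: str      # the operational department cited
    due: date
    implemented_on: Optional[date] = None
    verified_effective: bool = False

    def is_overdue(self, today):
        """True if the action is not yet implemented and past its due date."""
        return self.implemented_on is None and today > self.due

    def is_closed(self):
        """Closed only when implemented and verified as effective."""
        return self.implemented_on is not None and self.verified_effective


def open_actions(actions, today):
    """Return (action, overdue_flag) pairs for every action not yet closed."""
    return [(a, a.is_overdue(today)) for a in actions if not a.is_closed()]
```

Keeping implementation and verification as separate fields reflects the two checks named in the paragraph: that the action was carried out on time, and that it worked.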
(10) Monitoring the Environment. As part of the safety assurance function, the analysis and assessment functions must alert the organization to significant changes in the operating environment, possibly indicating a need for system change to maintain effective risk control. When this occurs, the results of the assessment start the safety risk management process, as depicted in Figure 3.
e. Safety Promotion: Supporting the Culture. An organizational safety effort cannot succeed by mandate or strictly through a mechanistic implementation of policy. As in the case of attitudes where individual people are concerned, organizational cultures set the tone that predisposes the organization’s behavior. An organization’s culture consists of the values, beliefs, mission, goals, and sense of responsibility held by the organization’s members. The culture fills in the blank spaces in the organization’s policies, procedures, and processes and provides a sense of purpose to safety efforts.
(1) Safety Cultures. Cultures consist of psychological (how people think and feel), behavioral (how people and groups act and perform) and structural (the programs, procedures, and organization of the enterprise) elements. Many of the processes specified in the policy, risk management, and assurance components of the SMS provide the framework for the structural element. However, the organization must also set in place processes that allow for communication among employees and with the organization’s management. The aviation service provider must make every effort to communicate its goals and objectives, as well as the current status of the organization’s activities and significant events. Likewise, the aviation service provider must supply a means of upward communication in an environment of openness.
(2) Communication: A Two-Way Street. Dr. James Reason, among other current organizational system safety theorists, stresses the need for a “reporting culture” as an important aspect of safety culture. The organization must do what it can to cultivate the willingness of its members to contribute to the organization’s knowledge base. Dr. Reason further stresses the need for a “just culture,” where employees have the confidence that, while they will be held accountable for their actions, the organization will treat them fairly. The standard specifies that the aviation service provider must provide for a means of employee communication that allows for timely submission of reports on safety deficiencies without fear of reprisal. Many certificated operators already have invested in an Aviation Safety Action Program (ASAP). ASAP is a collaborative reporting, analysis, and problem-solving effort among the FAA, operators, and employee unions. This program is another example of a voluntary program that could be integrated into the SMS, having a strong potential to contribute to safety assurance and safety promotion.
(3) Organizational Learning. Another of Dr. Reason’s principles of organizational safety culture is that of a “learning culture.” The information in reports, audits, investigations, and other data sources does no good if the organization does not learn from it. The standard therefore requires a means of analyzing this information and a linkage to the safety assurance process. Specifically, it requires an analysis process, a preventive/corrective action process, and a path to the safety risk management process for the development of new safety controls as environments change and new hazards are identified. It further requires that the organization provide training and information about risk controls and lessons learned.
9. CONTACT. For additional information or suggestions, please contact AFS-800 at (202) 267-8212, or AFS-900 at (703) 661-0526.
ORIGINAL SIGNED BY
John M. Allen (for)
James J. Ballough
Director, Flight Standards Service
Reason. Managing the Risks of Organizational Accidents.
Ibid.