NDE Reliability - Human Factors - Basic Considerations
Henry M. Stephens, Jr.
EPRI NDE Center, 1300 W. Harris Blvd., Charlotte, NC 28262, 704-547-6128, hstephen@epri.com
As identified in the first two European-American Workshops on NDE Reliability, human factors is one of the principal elements affecting the reliability of nondestructive examinations. In the second workshop, September 1999, the term "Human Factors" was defined as, "the mental and physical make of the individual, the individual's training and experience, and the conditions under which the individual must operate that influence the ability of the NDE system to achieve its intended purpose."
This paper provides insight into the basic attributes of human factors as applied to nondestructive examination and proposes a methodology to measure these attributes. It follows that only when the human factor attributes can be determined with some degree of confidence can the overall reliability of nondestructive examinations be determined.
Failure to detect, correctly analyze or correctly size the through-wall dimension of steam generator (SG) tubing in pressurized water reactor (PWR) nuclear plants is another definitive example of the importance of effective NDE. There have been a number of examples over the years of operation of these commercial electric generating plants where tubes have failed or leaked. Again, significant measures have been implemented to help improve the NDE performance of SG eddy current data analysis including comprehensive performance demonstrations, correlating performance demonstration and field performance results, and establishing a methodology to estimate future performance.
As documented in the National Transportation Safety Board (NTSB) report NTSB/AAR-98/01 [1], two passengers died on Delta Flight 1288 as a result of an uncontained engine failure during take-off from Pensacola, Florida on July 6, 1996. The incident was attributed to a failure of the NDE system to detect a crack in the JT-8D engine hub. Previous reports addressing the issue of NDE reliability include the United Airlines crash at Sioux City, Iowa on July 19, 1989 (NTSB/AAR-90/06)[2], and a Canadian Transportation Safety Board (CTSB) report on a Canadian Airlines B-767 failure at Beijing, China on September 7, 1997. Lack of effective NDE during engine maintenance resulted in failures that took lives. In response to these situations the Federal Aviation Administration (FAA) has focused on increasing knowledge of forged titanium defects and quantifying Probability of Detection (POD) curves for the primary NDT techniques used, and has drafted Advisory Circulars on NDE methods, including visual.[3]
The collapse of the Silver Bridge on December 15, 1967 killed 46 people. The collapse led to the National Bridge Inspection Program (NBIP) and the establishment of standards for highway bridges on public roads in the United States. Efforts are currently in progress to study the effectiveness of these inspections.[4]
These examples emphasize how important effective NDE is to providing a safer world. Even though these examples address relatively rare occurrences when NDE is not as effective as desired, they have happened. The goal through participation in these workshops is to help identify ways to further increase NDE effectiveness in a cost-effective manner.
The first European-American Workshop Determination of Reliability and Validation Method on NDE, June 1997, resulted in a conceptual model that defined NDE Reliability (R) as:
R ≡ f(IC) - g(AP) - h(HF)
where:
IC, the Intrinsic Capability of the system (technique and combination of techniques), generally considered as an upper bound,
AP, the effect of Application Parameters, such as access restrictions, surface state generally reducing the capability of the NDE system, and
HF, the effect of Human Factors, generally reducing the capability or effectiveness.
The second workshop, September 1999, redefined the relationship for NDE reliability as:
R ≡ f(AP, HF)
With the following definitions:
Reliability (R) - NDE reliability is the degree that an NDT system is capable of achieving its purpose regarding detection, characterization and false calls.
NDE System - is the procedures, equipment and personnel that are used in performing NDE inspection.
Application Parameters (AP) - are the factors concerning material conditions, discontinuities, procedure and equipment that influence the ability of an NDE system to meet its intended purpose.
Human Factors - are the mental and physical make of the individual, the individual's training and experience, and the conditions under which the individual must operate that influence the ability of the NDE system to achieve its intended purpose.
Human factor function (HF) - describes the effect of the human interface in the inspection process.
It has been well recognized that equipment, procedure, and personnel are the NDE system elements that comprise NDE reliability. The quantitative extent that each of these elements, as well as their interrelationship, has on the overall NDE reliability is yet to be determined. Currently, one of the most effective methods to make these determinations appears to be through a structured performance demonstration process. From this process the critical elements of equipment and procedure can be demonstrated on samples to verify the capability to detect the required discontinuities. This information can then be used to determine the probability of detection (POD) of the equipment and procedure. With this established, the POD and the false call rate (FCR) of the examiner using the qualified procedure and equipment can be determined. The combined POD can be calculated for the procedure, equipment and examiner. Finally, a correlation of the performance demonstration results to the field results should be performed.
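The bookkeeping behind this process can be illustrated with a minimal sketch. The function names and the example counts are hypothetical, and the rule used for the combined POD (the product of the procedure/equipment POD and the examiner POD, which assumes the two are independent) is an illustrative assumption; the paper does not specify how the combined value is calculated.

```python
def pod(hits: int, flaws: int) -> float:
    """Point estimate of probability of detection: detected flaws / flaws present."""
    if flaws <= 0 or not 0 <= hits <= flaws:
        raise ValueError("need 0 <= hits <= flaws and flaws > 0")
    return hits / flaws


def false_call_rate(false_calls: int, blank_units: int) -> float:
    """Point estimate of FCR: false calls / unflawed grading units examined."""
    if blank_units <= 0 or not 0 <= false_calls <= blank_units:
        raise ValueError("need 0 <= false_calls <= blank_units and blank_units > 0")
    return false_calls / blank_units


def combined_pod(system_pod: float, examiner_pod: float) -> float:
    """Combined POD of procedure, equipment and examiner.

    Assumption (not from the paper): the examiner's detection performance is
    independent of the procedure/equipment capability, so the values multiply.
    """
    return system_pod * examiner_pod


if __name__ == "__main__":
    # Hypothetical performance demonstration counts, for illustration only.
    system = pod(hits=48, flaws=50)        # procedure + equipment on samples
    examiner = pod(hits=45, flaws=50)      # examiner using the qualified system
    fcr = false_call_rate(false_calls=3, blank_units=60)
    print(f"system POD   = {system:.3f}")
    print(f"examiner POD = {examiner:.3f}")
    print(f"combined POD = {combined_pod(system, examiner):.3f}")
    print(f"examiner FCR = {fcr:.3f}")
```

Point estimates of this kind say nothing about sample size; a correlation with field results, as the final step above recommends, would still be needed before relying on them.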
Using this performance demonstration approach where the equipment and procedure are qualified and determined to be technically adequate to detect the "required" discontinuities eliminates the variable elements. This assumes that the examiner correctly implements all the provisions of the procedure and thus isolates the human factor elements.
What are the human factor elements that affect NDE reliability? To what extent does each of the elements affect NDE reliability? Are the elements the same for each method, technique, and application? If they are, to what extent do they affect reliability? Finally, a proposed methodology is presented as to how to effectively measure these human factor elements.
"A "human error" can be characterized as a divergence between an action performed and an action that should have been performed, which has an effect or consequence that is outside specific (safety) tolerances required by the particular system with which the human is interacting."[7] Rasmussen's tripartite (skill-based; rule-based; and knowledge-based errors) has effectively become a market standard within the systems reliability community.
From a psychological perspective, the focus is on the underlying causes of the error, while in the probabilistic risk assessment (PRA) community "human error" usually refers to human-caused failures of a system or function, with the focus on the consequence of the error. To improve NDE reliability the approach must be to address both the causes and the consequences of the error(s). Also, to some people, the term "error" connotes the placing of blame on those taking the action. As stated by James Reason (1990), "Far from being rooted in irrational or maladaptive tendencies, these recurrent error forms have their origins in fundamentally useful psychological processes. Ernst Mach (1905) put it well: 'Knowledge and error flow from the same mental sources, only success can tell the one from the other.'" [8] Relatively simple NDE tasks offer many opportunities for human performance errors, including omission errors, substitution errors, reversal errors, and procedural errors.
As defined in the second European-American Workshop on NDE Reliability, September 1999, the term "Human Factors" is, "the mental and physical make of the individual, the individual's training and experience, and the conditions under which the individual must operate that influence the ability of the NDE system to achieve its intended purpose."[6] First, to apply this definition, the specific human attributes that contribute must be addressed. What are the elements of the mental and physical make of the individual that affect the NDE system?
Physical and Mental Human Factor Elements
Some of the physical elements to consider include motor skills (e.g., eye-hand coordination, dexterity, and flexibility); vision capabilities, including color discrimination, near- and far-field visual acuity, and field of vision; and general physical condition and stamina to work for the required periods in a given environment and to climb, kneel, bend, etc. The effect of each of these elements varies greatly depending on the specific NDE method, technique, and application.
Fig 1: Components of human information processing.
As noted by Drury, "The functions of search and decision are the most error-prone in general, although for much of NDI, setup can cause its own unique errors." (Drury)[3] Search refers to seeking out the signal of a discontinuity.
In some NDE methods, such as visual, liquid penetrant, magnetic particle, and radiography, these are direct visual cues. The examiner must move his eyes around the items to be examined to ensure that any discontinuities or indications of discontinuities fall within the line of sight so that they can be detected. Research shows that examiners can be trained to perform systematic searches; however, it has proven to be quite difficult. Ineffective manual scanning with ultrasonic and eddy current probes parallels the search failures of the visually dependent methods. We must determine the factors that affect search performance (e.g., speed/accuracy tradeoff, signal-to-noise ratio, and implementing risk-informed inspection programs to exclude areas that do not need to be examined) and identify and implement interventions that increase the detection of signals (sensations) to which the decision-making process is applied.
The mental or cognitive process in a typical NDE task includes sensation, perception, short-term and long-term memory, decision making and a resulting action. A general model of this process is shown in Figure 1 (Harris, 1992).
Some NDE examination methods and techniques, such as ultrasonic examination for detection and through-wall sizing of intergranular stress corrosion cracking (IGSCC) or eddy current data analysis of steam generator tubing, place a heavy burden on human information processing. Information is acquired and assessed; knowledge gained through training and experience is applied in the interpretation of the information; interpretations are supported or refuted through the acquisition of additional information; and findings are combined and weighted to arrive at a conclusion about the condition of the component examined. Although not all NDE tasks are as difficult as the examples above, the information processing operation is fundamentally the same.
Information is brought into the human brain through various receptors capable of transforming external energy (light, sound, heat, etc.) into internal neural activity. Most typically, this is light energy from a visual signal, e.g., penetrant indication, ultrasonic signal, etc., transformed by the rods and cones of the retina into neural signals transferred to the brain.
Perception, the recognition and interpretation of the neural signals, requires input from information stored in long-term memory such as experience, rules, standards, and relationships. Perception plays an important part in functions such as detection, location, discrimination, comparison and categorization. An example of perception is the determination that signals from two different reflectors are present on an ultrasonic display. Knowledge of the component geometry, potential reflector types, signal characteristics, etc., is required for the interpretation.
Information stored in long-term memory can be durable, extensive, and recalled at will. Additionally, long-term memory information is little affected by new information. However, getting and keeping information in long-term memory typically requires deliberate effort such as forming associations, rehearsal, practice, gaining understanding, and getting feedback. For this reason, a decision-making strategy which must be committed to long-term memory should be as simple and streamlined as possible. It should also be applicable to most situations without the need to learn, recall, and apply a variety of rules that differ from situation to situation.
In contrast, information in short-term memory is very limited in terms of the amount that can be stored and the duration of time that it can be stored. Typically, a maximum of seven items of information can be handled at one time; new information pushes out old information; and if information is not used it is lost within about 15 to 30 seconds. In spite of these limitations, short-term memory is critical to at least the more complex NDE tasks, if not to all of them. This is where mental manipulations such as decision making are performed. For this reason, a decision-making strategy should place a minimum load on short-term memory. Several studies have shown significant improvements in performance using this approach (Harris, 1992).
Most NDE methods and techniques involve a sequential decision process since information is acquired through a serial application of procedural or technique steps. In reaching a decision, an initial hypothesis is formulated and then further evidence is sought to confirm or refute it. Unfortunately, because of human processing limitations, the process of gathering additional information is beset by potential biases that reduce the effectiveness of decision making, notably confirmation bias and negative-information bias.
Rasmussen identified a similar decision making process as the one discussed above; however, his model was not linear. He charted the shortcuts that human decision-makers take in real-life situations. His model is analogous to a stepladder, with the skill-based and execution stages at the base, rule-based stages intermediate and the knowledge-based interpretation and evaluation stages at the top.[8] This model more closely replicates the NDE decision-making process used for most applications.
Training and Experience Elements
The prevailing strategy to eliminate errors has been to adapt people to technology. The benefits of effective training and quality experience have been extensively documented through the years. Studies continue to show, however, that even well-trained, experienced operators can make mistakes. One such study conducted at the Sandia National Laboratories used twelve experienced visual examiners from major airlines to inspect a simulated fuselage panel containing cracks used for earlier eddy-current studies. The results are shown in Table 1 below.[3]
| Inspector | Probability of Search Failure | Probability of Decision Failure (miss) | Probability of Decision Failure (false alarm) |
|---|---|---|---|
| A | 0.31 | 0.27 | 0.14 |
| B | 0.51 | 0.66 | 0.11 |
| C | 0.47 | 0.31 | 0.26 |
| D | 0.44 | 0.07 | 0.42 |
| E | 0.52 | 0.00 | 0.00 |
| F | 0.40 | 0.00 | 1.00 |
| G | 0.47 | 0.00 | 0.00 |
| H | 0.66 | 0.03 | 0.84 |
| I | 0.64 | 0.23 | 0.80 |
| J | 0.64 | 0.07 | 0.17 |
| K | 0.64 | 0.17 | 0.22 |

Table 1: Search and decision probabilities on simulated fuselage panel inspection (derived from Spencer, Drury, and Schurman, 1996)
Note the relatively consistent, although poor, search performance of the inspectors on these relatively small cracks. In contrast, note the wide variability in decision performance shown in the final two columns. Some inspectors (e.g. B) made many misses and few false alarms. Others (e.g. F) made few or no misses but many or even all false alarms. Two inspectors (E and G) made perfect decisions. These results suggest that the search skills of all inspectors need improvement, whereas specific individual inspectors need specific training to improve the two decision measures.
Based on several studies conducted by EPRI on the relationship of education, training and experience to ultrasonic examination results, the correlation was low at best, and in some instances a negative correlation was documented (experience vs. IGSCC detection results). For example, a group of twelve ultrasonic examiners with approximately one year of ultrasonic examination experience but with three weeks of quality training had a pass rate of 92.7% on the IGSCC detection practical examination, compared to an average success rate of 37.6% for examiners whose experience averaged in excess of 7.7 years.
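To see how the search and decision columns of Table 1 combine for each inspector, the following sketch multiplies the two success probabilities. The two-stage rule used here (a crack is detected only if it is first found in search and then correctly called in decision) is an assumption made for illustration in the spirit of the search/decision split described by Drury; the function and variable names are hypothetical, and the values are those of Table 1.

```python
# Table 1 values: inspector -> (search failure, decision miss, decision false alarm)
TABLE_1 = {
    "A": (0.31, 0.27, 0.14),
    "B": (0.51, 0.66, 0.11),
    "C": (0.47, 0.31, 0.26),
    "D": (0.44, 0.07, 0.42),
    "E": (0.52, 0.00, 0.00),
    "F": (0.40, 0.00, 1.00),
    "G": (0.47, 0.00, 0.00),
    "H": (0.66, 0.03, 0.84),
    "I": (0.64, 0.23, 0.80),
    "J": (0.64, 0.07, 0.17),
    "K": (0.64, 0.17, 0.22),
}


def detection_probability(p_search_fail: float, p_decision_miss: float) -> float:
    """Overall probability of detecting a crack.

    Assumption (illustrative): detection requires a successful search
    followed by a correct decision, so the success probabilities multiply.
    """
    return (1.0 - p_search_fail) * (1.0 - p_decision_miss)


def summarize(table: dict) -> dict:
    """Return inspector -> overall detection probability, rounded to 3 places."""
    return {
        name: round(detection_probability(search, miss), 3)
        for name, (search, miss, _false_alarm) in table.items()
    }


if __name__ == "__main__":
    for name, p_detect in summarize(TABLE_1).items():
        false_alarm = TABLE_1[name][2]
        print(f"{name}: P(detect) = {p_detect:.3f}, P(false alarm) = {false_alarm:.2f}")
```

Under this assumption, inspector B's overall detection probability is about 0.17, (1 - 0.51) × (1 - 0.66), while inspector E, who made perfect decisions, still reaches only 0.48 because of search failures.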
As documented by Harris (1993), a strategy-based approach to training may be more beneficial than the conventional training approach, particularly for the more demanding NDE tasks. Using the information gained in the development of a decision-aid for detecting IGSCC in piping welds, a strategy-based approach to training was also evaluated. The IGSCC detection course was revised to incorporate the concepts of strategy-based training into the instructional sequence, content, and practice sessions. The qualification rate increased from 34.4% to 54.9%. In addition, operators who received strategy-based training rated the training effectiveness higher.
One of the major keys to effective training is to perform a detailed task and skills analysis to determine the NDE parameters that impact detection performance. A number of these are addressed in the conventional training course outlines and include such items as illumination levels, calibration procedures, etc. However, most do not address the more subtle parameters, such as visual search procedures, ultrasonic manual scanning techniques to assure coverage and effective beam orientation, and evaluation of subtle ultrasonic signal characteristics such as signal rise and decay time, pulse duration, etc. As appropriate, these must be identified and included in the training provided to examiners. Computer-based training, through the use of animations, simulation, and actual data, is evolving as an effective way to transfer this information.
Operating Conditions
The last element of our definition of human factors addresses the operating conditions that influence the ability of the NDE system to achieve its intended purpose. This element includes such items as working conditions, environment, organization, etc. that can impact the physical and mental state of examiners.
It is agreed that performance shaping factors, e.g., heat, humidity, etc., as well as environmental factors can impact NDE reliability; however, until the NDE application parameters (AP) and other aspects of the human factors (HF) are better understood this aspect should take a lower priority but not be ignored. The author presents this opinion since it is well known that ergonomic, organizational, and an array of related factors, such as fatigue, and circadian rhythm can have a negative impact on human performance. These factors should be acknowledged and included in the data collected when conducting NDE performance studies.
One model to consider is that developed by the EPRI Steam Generator Management Program (SGMP). Appendix H of the Guidelines provides the protocol for qualification of techniques. Results of technique qualifications are detailed in Examination Technique Specification Sheets (ETSS). In addition to essential variables, ETSS contains: flaw set on which the technique was qualified; detection POD; and sizing RMSE, if available. Detailed ETSSs, which include the raw eddy current signals, are available on EPRI Web. Appendix G of the Guidelines provides the protocol for qualification of personnel. It includes field data from recent and past inspection of various plants. Each analyst is given a separate examination on each degradation form and must pass the grading criteria. The results of 286 Qualified Data Analysts (QDA) candidates are analyzed and available representing a "global" view of analyst capabilities for detection and sizing of various degradation mechanisms. Where these results are not adequate for the specific situation, the Guideline provides for and requires the conduct of a Site-Specific Performance Demonstration, which is very similar to Appendix G.[10]
Performance measures (detection, sizing, and orientation of indications attributed to different damage mechanisms) are summarized in 13 tables, one for each combination of eddy-current technique and damage mechanism for which data are available from the QDA program. Performance measures are presented at the 50%, 90% and 95% confidence levels. For measures of detection and orientation performance, lower-bound confidence levels specify the degree of confidence one can have that the true analyst capability is at or above the given value. For example, if the 95% lower-bound confidence level were at a detection percentage of 93.75%, we would expect analysts to perform at or above this level 95% of the time. For measures of sizing performance, upper-bound confidence levels are provided because we are interested in the level of confidence at which performance is at or below a given root mean square error (RMSE). For example, if the 90% upper-bound confidence level were at an RMSE of 6.39% through-wall, we would expect analysts to have an RMSE at or below this value 90% of the time.
The specific model described above is not practical, and may not be needed, for all NDE applications. However, the concepts of this model can be applied to much less challenging NDE applications. The combination of detailed task analysis and human factors engineering of each NDE method and technique can be used to collect data based on "qualified" personnel using "qualified" procedures to perform various examinations. These data can then be used to establish confidence levels for many other applications. The American Society for Nondestructive Testing (ASNT) Central Certification Program (ACCP) provides an opportunity to collect data on an array of NDE methods and techniques performed to "standardized" procedures. The goal of this performance-based approach to NDE personnel certification is to increase NDE reliability. This program also provides a path for "industry specific" qualification processes, such as the QDA program, where more challenging NDE applications are required. Other countries have implemented, or are in the process of implementing, central certification programs where these data can be acquired more cost effectively.
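As an illustration of how a lower-bound confidence level on detection and an RMSE for sizing might be computed from examination results, the sketch below uses the exact one-sided binomial (Clopper-Pearson) bound. This is a common choice for hit/miss data, but it is an assumption here: the paper does not state which statistical method the QDA program uses, and the function names and sample values are hypothetical.

```python
import math


def binom_tail_ge(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def pod_lower_bound(hits: int, n: int, confidence: float = 0.95) -> float:
    """One-sided Clopper-Pearson lower confidence bound on detection probability.

    Finds, by bisection, the p at which observing `hits` or more detections
    out of `n` has probability 1 - confidence.
    """
    if not 0 <= hits <= n or n <= 0:
        raise ValueError("need 0 <= hits <= n and n > 0")
    if hits == 0:
        return 0.0
    alpha = 1.0 - confidence
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        # The tail probability increases with p; too small a tail means p is too low.
        if binom_tail_ge(hits, n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def rmse(measured: list, truth: list) -> float:
    """Root mean square error between measured and true flaw sizes."""
    if len(measured) != len(truth) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((m - t) ** 2 for m, t in zip(measured, truth)) / len(measured))


if __name__ == "__main__":
    # Hypothetical results: 29 detections out of 29 flawed grading units.
    print(f"95% lower bound on POD: {pod_lower_bound(29, 29):.4f}")
    # Hypothetical sizing data in percent through-wall.
    print(f"sizing RMSE: {rmse([42.0, 61.0, 35.0], [40.0, 65.0, 30.0]):.2f}")
```

When every flaw is detected, the bound reduces to (1 - confidence)^(1/n), so 29 detections out of 29 give a 95% lower bound of about 0.90. An upper confidence bound on RMSE, as described for sizing, would need an additional distributional assumption and is not attempted here.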
We are just beginning to have some understanding of the cognitive bases of human error in NDE applications. We still know very little about how these individual tendencies interact within each method or technique.
We should train operators carefully and help them to make use of knowledge and experience. Further, since people are different we should choose for critical tasks those who have good perceptiveness, generally high capacity, maturity and judgement as measured by performance-based demonstration.
As we continue to pursue our understanding of these AP and HF interactions, I would like to emphasize as Ernst Mach so eloquently stated, "Knowledge and error flow from the same mental sources, only success can tell the one from the other." We will never achieve 100% NDE reliability but our continuous striving will benefit mankind.