| NDT.net - April 2003, Vol. 8 No.4 |
Reliability investigations are meant to shed light on the performance of an NDE system with respect to its required purpose. This is of particular interest when the objects sought are life-threatening, as in mine detection, and when it is difficult to distinguish between the actual capabilities of the methods offered by different suppliers.
In accordance with the concept of EFNDT WG5 and CEN BT 126 CW 07, the authors aim to transfer the knowledge gained in measuring the reliability of NDE systems to the reliability measurement of mine detection systems. To this end, the three basic ways of investigating the reliability of NDE signals are described. The first, performance demonstration, is preferred e.g. in the US nuclear power industry. It is an integral consideration of the nondestructive test as a system: the whole NDE system is treated as a black box, and only the input, in terms of the real existing flaws in the component, is compared to the output, in terms of the indications of the human inspector or of the automated system. The second, the European tradition, relies on a standardized description of the physical and technical parameters of the NDE system which are preconditions for successful system performance. The third approach, the modular concept, is a marriage of both: the signal chain is cut into main modules.
Each module is assessed in the most appropriate individual way, e.g. via modeling calculations. The individual results are joined together according to the reliability theory of systems, where the reliability of the total system is composed of the reliabilities of the subsystems. Separating criteria for the system were proposed through a reliability formula developed during a series of European-American workshops on NDE reliability. Further essential parameters, especially for demining systems, are repeatability and reproducibility. Their effect on the reliability and its scatter will be investigated. Examples of the ROC/POD approach, in terms of the investigation of the reliability of manual ultrasonic testing, and of first attempts at the reliability investigation of demining data will be presented.
It is strongly recommended to apply the ROC/POD and modular concepts to the assessment and optimization of mine detection systems.
Fig 1: a) Mine detection initiated safely by a mechanical clearance device. b) Death lurks beneath the ground: even the step of a small child will unconditionally trigger this anti-personnel mine. c) This young boy not only lost his leg through an exploding mine, but all hopes of living a normal life. He will depend on the mercy of his fellow men for the rest of his life.
At the second European-American workshop on NDE reliability, held in 1999 in Boulder, USA, the following definition of NDE reliability was elaborated [1]: NDE reliability is the degree that an NDT system is capable of achieving its purpose regarding detection, characterization and false calls, where the NDE system consists of the procedures, equipment and personnel that are used in performing NDE inspection.
In the CEN BT 126 CW07 group a transformation to mine detection was proposed:
The mine detection system is the detector, the procedure and the human being that are used in performing mine searching under specified operational and environmental conditions.
The mine detection reliability is the degree that a mine detection system is capable of achieving its purpose regarding detection and false calls.
The procedure should be shown to be repeatable and reproducible.
Figure 1b shows the subject of concern in the case of mine detection, and in figure 1a we see a safe deactivation using mechanical clearance equipment. Figure 1c shows the sad result when the mine clearance was not reliable enough.
According to this definition we consider the situation in figure 2, where we have on the left hand side the “truth” of the component, in our case a weld with defects or a volume of soil with a mine, and at the end of the right hand side the corresponding inspection report. We would have 100% reliability if there were a 1:1 correspondence between both sides. Since this is not the case in almost all practical examples, we have to set up tools to measure and maintain reliability, to raise the acceptance of NDE among the neighboring engineering sciences and to make global reliability concepts like risk based inspection or risk based lifetime management feasible.
Fig 2: The Signal Transfer Chain of a Radiographic System
In the middle of figure 2 we take a closer look at an NDE signal transfer chain: the signal starts as an energy beam or wave from a source and interacts with matter, i.e. the component and its possible defects, creating an output signal which is governed by the physics of the method. This output signal is further influenced by the physical and technical properties of the more or less complex receiver and converter, in our case the imaging device for X-rays and the digitization and processing unit. Now we consider how to measure and ensure the reliability. We start with the European approach, which is by far the most widely used and is in use in most, or perhaps all, industries. One or more reference objects (such as DIN image quality indicators for radiography, or ASTM E127 reference blocks for ultrasonic inspection) are used, in combination with written procedures controlling details of the inspection method, to reach and demonstrate consistency. The intent is to achieve essentially the same inspection conditions, independent of where, when, or by whom an inspection is conducted.
This type of European approach is often used to provide process control information of a qualitative nature: loss of control at some earlier manufacturing stage is indicated, for example, by the unexpected occurrence of numerous or large indications. Capability for detection of “real” (naturally occurring) defects is not given for all applications, but is sometimes inferred from the size of the simulated defects in the reference objects, or from the size of defects that have been detected in past inspections using the same conditions. This inferential process can often lead to false conclusions about the detectability of real defects, but it is assumed that the validation occurred during the development of the standard and its use over many years. Considering the example of a signal transfer chain of an NDE system in figure 2, the application of standards means that the performance of the NDE system is assured by defining the values of parameters in each module.
The “American Approach” is the Performance Demonstration. We begin with the Performance Demonstration for empirical applications. In its simplest form this approach involves the use of material samples containing known defects as a basis for studying the effects on detectability of factors such as calibration, changes in inspection equipment, or inspector training programs. For the other inspection parameters that are not deliberately changed, consistency is still pursued through the use of reference objects and control procedures. Test programs of this kind are often used in conjunction with “round-robin” or other interlaboratory data acquisition procedures.
This type of test can be applied equally well to NDE methods producing qualitative (i.e. pass/fail) or quantitative (i.e. signal amplitude) outputs, but in practice it appears to have been used most widely with qualitative methods in terms of blind trials as indicated in figure 3, where input (the true defect situation in the component) and output (defect indication in the inspection report) are compared and the signal transfer chain from figure 2 is treated as a black box. To date, the major efforts in this type of Performance Demonstration have been the PISC program (focusing on characterization of all types of ultrasonic testing of nuclear power plant components), and the PDI program at the Electric Power Research Institute (EPRI) NDE center in Charlotte, NC (under whose auspices some hundreds of testing companies have already passed examinations in manual ultrasonic testing, and automated testing of pressure vessel nozzles, according to the ASME code, section XI, appendix VIII) [2].
The reliability measurement tools in terms of ROC (Receiver Operating Characteristic) and POD (Probability of Detection) – see also [3] will be described in section 3. The third approach, the modular approach which is described in detail in [5], might be considered also as the scientific basis for the well known ENIQ methodology [6].
The objective of the modular approach for measuring the reliability of NDE is to provide a validated testing system that fulfills the requirements of the client in the most efficient and cost effective manner. This capability is especially important where expensive statistical tests are not possible. In developing this concept we divide a system into appropriate sub modules as indicated in figure 4, and evaluate the discrete reliability of each. The knowledge gained within each of the modules allows an optimization of the total system. The reliability of the total system is then determined by combining the single reliabilities of the modules, including their possible correlation according to e.g. fault tree analysis [7].
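The combination step can be sketched in a few lines. This is a minimal illustration only, assuming the simplest case of statistically independent modules acting in series, so that the total reliability is the product of the module reliabilities; correlated modules would need a fuller treatment such as the fault tree analysis mentioned above. The module names and values are hypothetical and are not taken from figure 4.

```python
from math import prod


def series_reliability(module_reliabilities):
    """Total reliability of independent modules in series (product of the parts)."""
    for name, r in module_reliabilities.items():
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"reliability of module {name!r} must lie in [0, 1]")
    return prod(module_reliabilities.values())


# Hypothetical module reliabilities, for illustration only.
modules = {"source": 0.99, "interaction": 0.95, "receiver": 0.98, "human": 0.90}
total = series_reliability(modules)
print(f"total reliability: {total:.3f}")
```

Under these assumptions the total can never exceed the weakest module, which is why knowledge about each individual module points directly to where optimization pays off.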
Fig 3: Principle of a „Performance Demonstration“.
The modular approach facilitates direct integration of the 1st American-European Workshop Reliability Formula [8] as illustrated in figure 5. The expression defines a total reliability R, which consists of: an intrinsic capability IC describing the physics and basic capability of the devices, factors of industrial application such as restricted access in the field, AP, and finally the human factors HF.
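The expression itself is carried by figure 5. In the form usually quoted from [8], given here as an assumption because the figure is not reproduced in the text, it reads as follows, where f, g and h are functions that are not specified further:

```latex
R = f(IC) - g(AP) - h(HF)
```

Read this way, the intrinsic capability sets an upper bound on the reliability, which the application factors and the human factors can only reduce.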
Fig 4: Modular Validation - Application of the Reliability Theory of Systems - Set up Scientific Basis & RULES for the Technical Justification.
Fig 5: Combination of the individual Factors within a modular Approach.
The Receiver Operating Characteristic (ROC) [10, 11] is derived from the general theory of signal detection and has been widely used for over 40 years in the evaluation of diagnostic systems such as radar techniques, tests of human perception and medical diagnosis, and since the eighties also in NDE. The four possible situations in NDT (nondestructive testing) diagnosis are presented in figure 6.
Fig 6: The Principles of ROC (Receiver Operating Characteristic): The Possible Diagnosis Results in NDT.
Fig 7: Characteristic of one NDT-System by an ROC curve.
For both “true situations”, defect present or no defect present, we can either recognize the truth (TP, TN) or miss it with a false indication (FN, FP). The idea of the ROC method is to characterise the accuracy of an inspection system by evaluating the true positive detection rate versus the false positive detection rate for a set of possible decision criteria (recording levels, in the language of NDT), which represent a varying sensitivity. Following the ROC curve in figure 7 from the lower left corner to the upper right, the sensitivity of the system rises. In the lower part of the curve the highest signals (correct indications) are included together with only a small amount of noise (false calls). In the higher part more and more of the defects are taken into account, but a greater number of false calls has to be paid as the price. The underlying mathematical model, in terms of the two Gaussian signal distribution curves for the defect signals and the noise respectively, is shown on the right hand side. In practice, especially in the case of manual testing with hit/miss results, it is not feasible to apply continuously growing signal thresholds and to count correct and false call rates for each, because the effort is too high.
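The two-Gaussian model described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation procedure: noise and defect signals are taken as normal distributions with a common standard deviation, and the means, the standard deviation and the thresholds are hypothetical values chosen only to show the shape of the curve.

```python
from math import erf, sqrt


def gaussian_tail(threshold, mean, sigma):
    """Probability that a Normal(mean, sigma) signal exceeds the threshold."""
    return 0.5 * (1.0 - erf((threshold - mean) / (sigma * sqrt(2.0))))


def roc_points(noise_mean, defect_mean, sigma, thresholds):
    """Return one (p_FP, p_TP) pair per recording threshold."""
    return [
        (gaussian_tail(t, noise_mean, sigma), gaussian_tail(t, defect_mean, sigma))
        for t in thresholds
    ]


# Assumed values: noise centred at 0, defect signals at 2, common sigma of 1.
thresholds = [3.0, 2.0, 1.0, 0.0, -1.0]  # lowering the threshold raises sensitivity
points = roc_points(0.0, 2.0, 1.0, thresholds)
for t, (p_fp, p_tp) in zip(thresholds, points):
    print(f"threshold {t:+.1f}: p(FP) = {p_fp:.3f}, p(TP) = {p_tp:.3f}")
```

Each threshold gives one operating point; moving down the list corresponds to travelling along the ROC curve from the lower left to the upper right.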
Fig 8: Comparison of Different NDT-Systems.
Fig 9: ROC - POD Connection (POD - Probability of Detection).
For the hypothetical systems shown in figure 8 the performance of the system increases from curve 1 to curve 7. Figure 9 illustrates the relation between ROC and POD (Probability of Detection) curves: both types of curves have the same statistical background, only the results are arranged with respect to different variables. For illustration we consider one point on the ROC curve with fixed false call probability and take the corresponding POD value in terms of p(TP). This value then has to be spread over the different defect sizes as indicated in the lower part of figure 12. When there is a very strong dependence on the defect size, the ROC curve has to be recorded for each defect size separately.
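Spreading one p(TP) value over the defect sizes can be sketched with hit/miss data. The sketch assumes that every record belongs to the same recording level, so that all records share one false call probability; the defect sizes and outcomes are hypothetical and serve only to show the bookkeeping.

```python
from collections import defaultdict


def pod_by_size(records):
    """records: iterable of (defect_size_mm, detected). Returns {size: POD}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for size, detected in records:
        totals[size] += 1
        hits[size] += int(detected)
    return {size: hits[size] / totals[size] for size in sorted(totals)}


# Hypothetical hit/miss results at one fixed recording level.
records = [
    (1, False), (1, False), (1, True), (1, False),
    (2, True), (2, False), (2, True), (2, True),
    (4, True), (4, True), (4, True), (4, True),
]
pod = pod_by_size(records)
overall_p_tp = sum(detected for _, detected in records) / len(records)
print(pod)
print(f"overall p(TP) = {overall_p_tp:.3f}")
```

The overall p(TP) is the single value that appears on the ROC curve, while the per-size values are points of the POD curve for the same recording level.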
The subject of the example is a manual ultrasonic testing system as shown in figure 10.
Fig 10: Example Testing System for Ultrasonic Manual Testing.
As test samples we used a set of steel test plates containing several types of artificial defects which are commonly used for training in the railway field. An overview of the defects and the experimental outline is given in figure 11. The number of possible flaw indications was 348 and the number of empty sections 1107. Five inexperienced inspectors and five inspectors with at least ten years of practical experience tested the plates for an ROC input. In figure 12 we see the results, where we pooled the indication rates of the five inspectors in each group and fitted the curves. The ROC curves behave as expected: the experienced inspectors are more reliable and have a much smaller variance. Figure 12 also shows all the individual operating points of the inspectors of both groups: it is again plausible that the inexperienced inspectors have a higher variance than their experienced colleagues. Among the inexperienced inspectors we can select the most talented candidate, the one with the highest rate of correct indications (yellow points).
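The bookkeeping behind an operating point can be sketched as follows. Only the totals of 348 possible flaw indications and 1107 empty sections come from the text; the per-inspector counts are hypothetical, and pooling by summing the counts of a group is one plausible reading of the procedure, not necessarily the one used for figure 12.

```python
N_FLAWS = 348   # possible flaw indications (from the text)
N_EMPTY = 1107  # empty sections (from the text)


def operating_point(true_positives, false_positives):
    """Return (p_FP, p_TP) for one inspector."""
    return false_positives / N_EMPTY, true_positives / N_FLAWS


def pooled_point(counts):
    """Pool a group by summing counts; counts is a list of (TP, FP) pairs."""
    n = len(counts)
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    return fp / (n * N_EMPTY), tp / (n * N_FLAWS)


# Hypothetical (TP, FP) counts for five inspectors per group.
experienced = [(310, 40), (305, 50), (300, 45), (315, 55), (310, 60)]
inexperienced = [(200, 150), (280, 90), (150, 200), (260, 300), (230, 120)]

for label, group in (("experienced", experienced), ("inexperienced", inexperienced)):
    p_fp, p_tp = pooled_point(group)
    print(f"{label}: p(FP) = {p_fp:.3f}, p(TP) = {p_tp:.3f}")
```

Plotting `operating_point` for each inspector gives the individual points, and the spread of those points within a group reflects the variance discussed above.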
Fig 11: Outline of Experiments.
Fig 12: ROC Results: Curves.
From these results and diagrams we can conclude that a combined ROC/POD test is well suited to distinguish between testing systems and human operators that are suitable for testing in the field and those that are not. We conclude further that this capability is of high value for performance demonstration and suitability tests for mine detection systems.
Fig 13: IPPTC result in a ROC diagram (total number of actually present mines: 20 mines with IPPTC configuration, 8 mines with CMAC configuration).
Fig 14: a) Test results of realistic trials in a highly mine-affected country (Afghanistan 1999/2000), data provided by JRC Ispra. b) ASME Code (Sect. XI, App. VIII) in a ROC diagram (increasing number of grading units: 15 → 60).
Fig 15: Preparation of a test trial under real conditions in Afghanistan 2002
Our first attempt at reliability in mine detection was simply to put existing data into an ROC diagram for a better overview. The data are taken from the well-known worldwide trial documented in the IPPTC report [21]. In figure 13 each point stands for the actual operating point of one metal detector device. Information on the sensitivity setting was not available here. Nevertheless we can learn a lot from the diagrams, especially the drastic decrease of detection performance from a normal clay soil to an “uncooperative” laterite soil, which causes many noise signals and makes detection more difficult. Results of another trial under very realistic field conditions are shown in figure 14a. These are very good ROC points, but the device with the highest operating point was not stable over time. The conclusion is that an R&R (repeatability and reproducibility) study has to be included in reliability investigations, especially of metal detectors. For comparison we show in figure 14b the pass/fail scheme for inspections applied in ASME section XI, app. VIII [5] (within the upper left corner: passed; failed otherwise). This gave rise to the idea of setting up a similar test scheme for metal detectors. Some illustrations of practical test preparations in Afghanistan in 2002 are shown in figure 15. Here is the very first draft of this performance demonstration idea:
A “blind” performance demonstration test under realistic field conditions is proposed. “Blind” means that the operators do not know where the targets are. Realistic conditions should be selected according to the requirements of the client and the foreseen purpose, including mine types and depths, soil types, humidity, temperature, surface status and a representative human operator. The subject under test is the system composed of equipment, procedure and personnel under the given operational and environmental conditions. For this reason test lanes should be set up (according to 7.3) containing grading units of about 1 m x 1 m (adjustable to the mine size under consideration). The minimum number of test mines is 7 of each type at one depth for which the detector is intended to be used (the total number should be at least 28). Each of these should be located in one grading unit (with the exception of special separability tests). The number of empty grading units shall be at least double the number of occupied grading units. The test mines should be randomly distributed among the grading units. The number of correct detections and false calls shall be counted for each recorded sensitivity. A supervisor needs to be present at all times. At least two examples of each detector type, two different operators for each type and two repetitions are recommended. A rough guideline for pass/fail patterns is given in table VIII-S2-1 [5], but it must be based on the requirements of the client. The trials should be conducted at accredited test facilities or by an accredited person in the field (IMAS 3.40).
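The counting rules of this draft can be turned into a small layout generator. This is only a sketch of the rules as stated above (7 mines per type, at least 28 in total, at least twice as many empty grading units as occupied ones, random placement); the function name, the mine type labels and the one-dimensional lane representation are hypothetical.

```python
import random


def build_test_lane(mine_types, mines_per_type=7, empty_factor=2, seed=None):
    """Return a shuffled list of grading units: a mine type label or None (empty)."""
    if mines_per_type < 7:
        raise ValueError("the draft asks for at least 7 mines of each type")
    if empty_factor < 2:
        raise ValueError("empty units must be at least double the occupied ones")
    occupied = [t for t in mine_types for _ in range(mines_per_type)]
    if len(occupied) < 28:
        raise ValueError("the draft asks for at least 28 test mines in total")
    lane = occupied + [None] * (empty_factor * len(occupied))
    random.Random(seed).shuffle(lane)  # random distribution among grading units
    return lane


# Hypothetical type labels; a fixed seed keeps the layout reproducible for the supervisor.
mine_types = ["type_A", "type_B", "type_C", "type_D"]
lane = build_test_lane(mine_types, seed=1)
print(len(lane), "grading units,", lane.count(None), "of them empty")
```

Keeping the seed with the supervisor only, and not with the operators, is one way to preserve the blind character of the trial while making the layout repeatable.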
Fig 16: Table VIII-S2-1.
Fig 17: Example for test lane layout.
The situation on site in real mine fields in Croatia.
Fig 18: Map of mine fields in Croatia (www.hcr.hr); red - mine field.
| year of planning | surface area of contaminated land, total km2 | total km2 planned to be achieved | area planned for level I survey – area reduction, km2 | area planned for level II survey – reduction, km2 | area planned for survey and mine clearance using classic methods (manually), km2 | area planned for survey and mine clearance using machines and subsequent verification by another method, km2 | fencing of mine contaminated and marking of mine suspected areas, km |
| 2000.* | 4.500 | 100 | 60 | 20 | 15 | 5 | 400 |
| 2001. | 4.400 | 500 | 325 | 25 | 30 | 20 | 500 |
| 2002. | 3.900 | 500 | 300 | 25 | 30 | 30 | 500 |
| 2003. | 3.400 | 500 | 300 | 25 | 30 | 30 | 500 |
| 2004. | 2.900 | 500 | 300 | 25 | 30 | 30 | 500 |
| 2005. | 2.400 | 500 | 280 | 30 | 30 | 40 | 500 |
| 2006. | 1.900 | 500 | 280 | 30 | 30 | 40 | 200 |
| 2007. | 1.400 | 500 | 320 | 25 | 20 | 35 | 200 |
| 2008. | 900 | 300 | 150 | 25 | 20 | 35 | 100 |
| 2009. | 600 | 300 | 150 | 25 | 20 | 30 | 50 |
| 2010. | 300 | 300 | 150 | 25 | 20 | 30 | 0 |
| total | | 4500 | 2615 | 280 | 275 | 325 | 3450 |
Table 1: Table X. The plan for overall activities in the process of survey and mine clearance in Croatia for the period 2000-2010. [The national mine action programme in the Republic of Croatia, brochure published by CroMAC (Croatian Mine Action Center), April 2001.]
Fig 19: Also taken from www.hcr.hr; number of victims: red (total number), blue (accidents), white (deadly).
Fig 20: Also taken from www.hcr.hr; number of victims among pyrotechnicians: red (total number), blue (deadly).
Fig 21: Minefield.
| Number of mines | type of mine | method | comment |
| 6 | TMRP-6 | manually | antenna was visible from a distance (yellow on the map); positions are marked with dots |
| 2 | PROM-1 | manually | (green on the map) |
| 6 | PROM-1 | manually after clearance by mechanical device | 1 PROM-1 was activated by the mechanical device; this was the indication to check this track manually after mechanical clearance (yellow on the map) |
| 3 | PMR-2A | manually | (green on the map) |
| 5 | PMR-2A | manually after clearance by mechanical device | (yellow and orange on the map) |
Table 2: Table with results (our example)
We illustrate here the real on-site situation in Croatia and our way of preparing the data treatment for future performance demonstrations. Figure 18 shows the current situation of mine suspected areas in the whole of Croatia. Table 1 shows the plan for mine clearance. Figures 19 and 20 show the demand for high reliability, both for protecting the lives of the deminers and, of course, for the civil population. We now present some figures from a realistic mine clearance activity.
Fig. 21 shows a formerly mine-affected area where different clearing methods were applied, and table 2 shows what was found with the different methods. During the work of the ScanJack (a mechanical clearing device) the crew noticed the explosion of a mine, and the ScanJack was stopped. Pyrotechnicians went to the site and identified the type of mine (PROM-1). They decided to continue with the ScanJack, first because that mine type is very dangerous for deminers to detect, second because it does not harm the integrity of the robust ScanJack, and finally because the vegetation was such that work by deminers was practically impossible. After the ScanJack had finished its "cleaning" job, the area was treated with "men after machine" and not with dogs, because there had been "positive indications" during the machine work. That is why the 6 PROM-1 mines and the PMR-2A mines listed in table 2 were found by "men after machine".
Fig 22: Examples for the mine types in table 2.
The other two PROM-1 mines were detected at the beginning. This can hardly even be called prospection, because a visual inspection is naturally performed before each prospection, and this type of mine is mounted in such a way that it can easily be noticed even from a certain distance, provided of course that the vegetation is not high. A PROM-1 is usually connected by a wire (14-16 m) to a stake that is placed vertically in the soil, and it is activated by pulling the wire. The two PROM-1 mines were situated near the road, and the team had planned to go there intentionally ("prospection").
Some example shapes of the mines under consideration in this clearing activity are shown in figure 22.
The determination of the reliability of diagnostic systems is a complex and strenuous task, but in any case a valuable tool for obtaining efficient and reliable testing procedures, which is of especially high importance for mine detection systems. The first step of a performance demonstration is to define the essential technical parameters of the system. The ROC and POD methods are appropriate tools to provide a clear measure of the integral performance of the system, though this has to be paid for with high effort in test series with realistic test samples. With POD the user can learn about the detection capability, whereas the ROC gives more information about the system’s capability to distinguish between signal and noise. The modular approaches open the door to a promising technique that is more efficient and also able to optimize the system. A third European-American Workshop will be held to discover the appropriate reliability formula and example cases [20]. The first attempt to transfer the NDE reliability tools to the demining problem shows promising results.
The authors would like to thank Mrs. Gisela Malitte for valuable editing work, the colleagues from the railway institution in St. Petersburg for carrying out the ultrasonic experiments, and the colleagues from CroMAC and JRC Ispra for providing demining data.