A deep-learning-based approach towards identifying combined faults in structural health monitoring

Fault diagnosis (FD), encompassing fault detection, isolation, identification, and accommodation, is critical for reliable structural health monitoring (SHM) systems, ensuring correction of sensor faults that may corrupt or invalidate monitoring data. While sensor fault identification has received scarce attention within FD of SHM systems, recent methods have been proposed for identifying single sensor faults. Nonetheless, real-world SHM systems are prone to combined faults, i.e. different faults that affect individual sensors simultaneously. Identifying combined sensor faults is essential for improving the quality of FD and for gaining insight into the causes of sensor faults. This paper presents an approach for identifying combined sensor faults, referred to as the ICSF approach, which aims to identify sensor faults occurring simultaneously in individual sensors using time-series data, thereby improving the quality of FD in SHM systems. Leveraging a recurrent neural network, specifically a long short-term memory network, a classification algorithm is implemented for mapping time-series data to combined sensor faults. The ICSF approach is validated using acceleration measurements collected by a faulty sensor from a real-world SHM system installed on a pedestrian bridge. The results demonstrate the effectiveness of the ICSF approach in identifying combined sensor faults, enhancing sensor FD in real-world SHM systems.


Introduction
Structural health monitoring (SHM) utilizes data collected by sensors ("sensor data") for nondestructive evaluation of structures, ensuring continuous updates on structural conditions, to enhance user safety and enable cost-efficient maintenance [1]. Long-term sensor operation in SHM systems may lead to sensor faults, due to aging and harsh environmental conditions, which need to be eradicated via fault diagnosis [2].
Fault diagnosis in SHM involves fault detection to capture faults, fault isolation to localize the faults, fault identification to determine fault types, and fault accommodation to compensate for the effects of faults [3]. Commonly known sensor fault types encompass various deviations from actual sensor data, including (i) bias, characterized by consistent divergence from actual values, (ii) drift, involving gradual deviations of sensor data over time, (iii) gain, expressed as constant scaling of sensor data, (iv) precision degradation, observed when white noise contaminates sensor data, (v) complete failure, manifesting as a constant value ("constant complete failure") or noise ("noisy complete failure") that replaces sensor data over time regardless of actual structural changes, and (vi) outliers, representing discontinuous observations deviating from sensor data at individual time points [4,5].
Fault diagnosis (FD) approaches in SHM systems, reported in the literature for decades, have centered around fault detection, isolation, and accommodation. In [6], Kullaa has proposed a detection, isolation, and accommodation approach for sensor faults in SHM systems, while Steiner et al. have proposed support vector regression for decentralized sensor fault detection and isolation [7]. Deng et al. have conducted a review of sensor fault detection approaches in SHM, detailing the advantages, disadvantages, and scope of each method [8]. Nevertheless, sensor fault identification has received limited attention so far, although knowledge of fault types may be crucial for enhancing the quality of fault diagnosis and understanding the root causes of sensor faults in SHM systems. Fault identification is often neglected because mapping sensor data to fault types complicates the modeling and increases the computational burden. However, recent studies have started exploring sensor fault identification using various approaches, including rough set theory [9], convolutional neural networks [10], and support vector machines (SVM) [11]. The fault identification studies found in the literature assume the existence of single fault types, i.e. individual faults happening one at a time. However, combined sensor faults, i.e. multiple sensor faults occurring simultaneously in individual sensors, may occur in real-world SHM systems [12] and have received scarce attention. Sporadic approaches to identifying combined sensor faults have been reported in other disciplines. For instance, Cheng et al. [13] have used adaptive particle swarm optimization and SVM for identifying combined sensor faults, and Abboush et al. [14] have employed ensemble long short-term memory (LSTM) networks and random forests to identify combined sensor faults. Nonetheless, not all commonly known sensor faults have been considered in the literature for identifying combined sensor faults.
This study proposes an approach for identification of combined sensor faults (ICSF) in SHM systems. The ICSF approach employs an LSTM network for classifying sensor data into types of combined sensor faults. Validation tests are conducted using real-world SHM data (acceleration measurements) recorded from a pedestrian bridge. The results show that the ICSF approach can accurately and reliably identify combined sensor faults, enabling better understanding of the causes of sensor faults in SHM systems, which in turn improves the quality of SHM systems. In the remainder of the paper, an overview of the ICSF approach is given. Then, the implementation and validation using real-world sensor data are presented. Finally, a discussion of the results and conclusions with future research directions are provided.

Overview of the ICSF approach
This section presents an overview of the ICSF approach, comprising preparation of input data and development of the LSTM network, which, upon finishing training, results in a "classification model" capable of identifying combined sensor faults. The ICSF approach builds upon previous work of the authors on fault diagnosis, specifically on the AFDAR approach for fault detection, isolation, and accommodation, presented in [15]. Although the fault identification part may appear to be an add-on to the AFDAR approach, the ICSF approach is designed as a standalone approach, implicitly integrating the fault detection part, as will be shown in the paper. Figure 1 depicts the ICSF approach through an activity diagram, consisting of two main activities, "preparing the input dataset" and "developing the classification model". The former activity involves four actions, while the latter activity comprises three actions, as further described.

Preparing the input dataset
The first activity of the ICSF approach is preparing the input dataset, in which sensor data is first collected over a "data collection period", with each sensor gathering p data points. Next, a correlation analysis determines the number of "correlated sensors" k, which corresponds to the number of inputs for the LSTM input layer. Correlation is essential for distinguishing trends in sensor data attributed to structural behavior from trends indicative of faults; this distinction is necessary for fault detection, as shown in [15]. The correlated sensors are labelled as i (i = 1, …, k), and the sensor data f1→k is stored column-wise in matrix A of size p × k, as shown in Equation 1.
$$ A_{p \times k} = \begin{bmatrix} f_1 & f_2 & \cdots & f_k \end{bmatrix} \tag{1} $$
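The correlation analysis and the assembly of matrix A can be illustrated with a minimal sketch. It assumes a Pearson coefficient, a threshold of 0.90 applied to every sensor pair, and toy sinusoidal signals; the function names, the threshold rule, and the data are illustrative assumptions, not the authors' implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_table(signals):
    """Pearson coefficient for every unordered pair of sensors."""
    names = list(signals)
    return {(a, b): pearson(signals[a], signals[b])
            for i, a in enumerate(names) for b in names[i + 1:]}

def build_matrix(signals, names):
    """Stack the selected signals column-wise into a p x k matrix (list of rows)."""
    return [list(row) for row in zip(*(signals[n] for n in names))]

# Toy data (hypothetical, not the bridge measurements): four similar signals
t = [i / 10 for i in range(200)]
signals = {
    "S1": [math.sin(v) for v in t],
    "S2": [1.1 * math.sin(v) + 0.01 for v in t],
    "S3": [0.9 * math.sin(v + 0.05) for v in t],
    "S4": [1.2 * math.sin(v - 0.10) for v in t],
}

table = correlation_table(signals)
min_r = min(table.values())                 # weakest pairwise correlation
# Assumed rule: all sensors count as correlated if every pair exceeds 0.90
names = list(signals) if min_r > 0.90 else []
k = len(names)                              # number of LSTM inputs
A = build_matrix(signals, names)            # p rows, k columns
```

With real data, sensors that fail the threshold would have to be excluded by some selection rule; the text does not specify that rule, so the sketch only covers the case in which all sensors are mutually correlated.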
Sensor faults are artificially injected into the sensor data to train the LSTM networks for identifying combined sensor faults in real-world SHM systems. The data stored in matrix A is replicated into the input dataset Gi (i = 1, …, k). Thereafter, both single and combined sensor faults are artificially injected into the vector fi, representing the data collected from sensor i. The types of the sensor faults injected are stored in the classification output dataset Oi, which also includes a "non-faulty" class, essentially covering fault detection and rendering the proposed approach self-sufficient. As a result, the input dataset Gi includes non-faulty sensor data from sensors (1, …, i-1, i+1, …, k), in addition to sensor data from sensor i containing single and combined faults.
The total number of injected sensor faults C depends on the number of single sensor faults N, the number of single sensor faults n included in the combinations, and the number of single sensor faults m in each combination, as shown in Equation 2.
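A minimal sketch of the fault injection step is given below, assuming simple textbook fault models that follow the fault descriptions in the introduction. All magnitudes (offset, slope, scaling factor, noise level, outlier rate) and function names are hypothetical placeholders; the fault parameters actually used are not stated in the text.

```python
import random

def bias(x, b=0.5):
    """Constant offset added to every data point."""
    return [v + b for v in x]

def drift(x, slope=0.001):
    """Deviation growing linearly with the sample index."""
    return [v + slope * i for i, v in enumerate(x)]

def gain(x, g=1.5):
    """Constant scaling of the sensor data."""
    return [g * v for v in x]

def precision_degradation(x, sigma=0.1, rng=None):
    """White (Gaussian) noise contaminating the sensor data."""
    rng = rng or random.Random(0)
    return [v + rng.gauss(0.0, sigma) for v in x]

def outliers(x, rate=0.01, magnitude=5.0, rng=None):
    """Isolated spikes at randomly chosen time points."""
    rng = rng or random.Random(0)
    return [v + magnitude if rng.random() < rate else v for v in x]

def constant_failure(x, value=0.0):
    """Constant complete failure: data replaced by a constant value."""
    return [value for _ in x]

def noisy_failure(x, sigma=0.1, rng=None):
    """Noisy complete failure: data replaced by noise."""
    rng = rng or random.Random(0)
    return [rng.gauss(0.0, sigma) for _ in x]

def inject(A, i, faults):
    """Copy matrix A (list of rows) and apply `faults` in order to column i only."""
    column = [row[i] for row in A]
    for fault in faults:
        column = fault(column)
    return [row[:i] + [c] + row[i + 1:] for row, c in zip(A, column)]

# Toy matrix with three sensors (hypothetical values)
A = [[0.01 * r, 0.02 * r, 0.03 * r] for r in range(100)]

# Combined fault "bias + drift" injected into the second sensor (index 1)
G2 = inject(A, 1, [bias, drift])
O2 = ["bias + drift"] * len(G2)   # class label stored per data point
```

Composing two single-fault functions is one simple way to represent a combined fault; the order of composition matters for some pairs (for example gain followed by bias differs from bias followed by gain), which is a modeling choice the sketch leaves open.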

$$ C = N + \binom{n}{m} = N + \frac{n!}{m!\,(n-m)!} \tag{2} $$
Upon injecting the sensor faults, data normalization is applied to the input dataset Gi using minimum-maximum normalization, shown in Equation 3, to prevent overfitting in the classification models, caused by extreme values in the sensor data.
$$ x_{\text{normalized}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{3} $$
In Equation 3, xnormalized represents the normalized sensor data, x denotes a data point, and xmin and xmax are the minimum and maximum data points, respectively. The normalized dataset G̃i is then split into training (Gt,i), validation (Gv,i), and testing (Gs,i) datasets for training the LSTM networks and developing the classification models.
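The normalization and splitting steps can be sketched as follows. The sketch assumes that the minimum and maximum are computed per sensor (column-wise) and that the split is chronological without shuffling; neither detail is stated in the text, and the function names are illustrative.

```python
def fit_min_max(rows):
    """Column-wise minimum and maximum of a dataset given as a list of rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, x_min, x_max):
    """Minimum-maximum normalization: (x - x_min) / (x_max - x_min) per column."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0   # guard against constant columns
         for v, lo, hi in zip(row, x_min, x_max)]
        for row in rows
    ]

def split(rows, train_pct=70, val_pct=15):
    """Chronological split into training, validation, and testing datasets."""
    n = len(rows)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return (rows[:n_train],
            rows[n_train:n_train + n_val],
            rows[n_train + n_val:])

# Toy dataset with two sensors (hypothetical values)
G = [[float(i), 10.0 - i] for i in range(20)]

x_min, x_max = fit_min_max(G)            # parameters to keep for later use
G_norm = apply_min_max(G, x_min, x_max)
G_train, G_val, G_test = split(G_norm)
```

Keeping `x_min` and `x_max` separate from the normalization call makes it possible to apply the same scaling to newly collected data before it is fed into a trained model.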

Developing the classification model
The second activity of the ICSF approach involves developing the classification model. The training dataset Gt,i is used for training the i-th LSTM network, for each correlated sensor i (i = 1, …, k), which, upon completing training, results in the classification model Mi.
During training, the training dataset Gt,i is sequentially fed into the LSTM network in batches, and the classification accuracy is computed against the corresponding classes of the output dataset Ot,i. Training iterations adjust the weighted connections between neurons until the classes predicted from the Softmax function probabilities achieve a predefined classification accuracy level. At predefined intervals, the validation dataset Gv,i is used to confirm classification accuracy trends and to fine-tune hyperparameters to prevent overfitting. After training, model accuracy is assessed using the testing dataset Gs,i, with the classification accuracy calculated as the ratio of correct predictions to total predictions. If the accuracy is satisfactory, the classification model of sensor i is saved. Otherwise, the LSTM architecture is adjusted, and a new LSTM network is trained with new hyperparameters. The resulting model Mi recognizes features in the input dataset G̃i and classifies sensor data into combined fault types. The same process is repeated for all sensors. The ICSF approach is implemented and validated in an SHM system on a pedestrian bridge, as described in the following section.
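The accuracy assessment described above can be sketched in a few lines, assuming the network outputs one vector of raw scores (logits) per sample. The threshold value, the toy scores, and the function names are placeholders chosen for illustration.

```python
import math

def softmax(logits):
    """Numerically stable Softmax: converts raw scores into class probabilities."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits):
    """Return the most probable class index and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]

def accuracy(all_logits, labels):
    """Ratio of correct predictions to total predictions."""
    correct = sum(1 for z, y in zip(all_logits, labels) if predict(z)[0] == y)
    return correct / len(labels)

# Toy testing dataset with three classes (hypothetical scores and labels)
test_logits = [
    [2.0, 0.1, 0.1],
    [0.1, 3.0, 0.2],
    [0.3, 0.2, 0.1],
    [0.0, 0.1, 2.5],
]
test_labels = [0, 1, 2, 2]

ACCURACY_THRESHOLD = 0.85                  # placeholder acceptance level
acc = accuracy(test_logits, test_labels)
model_is_saved = acc >= ACCURACY_THRESHOLD # otherwise: adjust architecture, retrain
```

In this toy case three of four samples are classified correctly, so the model would not be saved and a new network would be trained.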

Implementation and validation of the ICSF approach
This section shows the implementation and validation of the ICSF approach. The implementation description is structured in alignment with the activities of the ICSF approach, presented in the previous section, and a validation test is conducted on a pedestrian bridge to showcase the effectiveness of the ICSF approach in identifying combined sensor faults in real-world SHM systems.

Overview of the bridge and the SHM system
The validation test utilizes sensor data, i.e. acceleration measurements, collected by an SHM system installed on a pedestrian overpass bridge in Evosmos, Thessaloniki, Greece, as depicted in Figure 2 [16]. The pedestrian bridge was constructed in 2016 and features a composite structure with a steel framework supporting a reinforced-concrete deck. The deck measures 35 m in length and 4.60 m in width and is supported by cylindrical reinforced-concrete columns connected to two steel girders at each end using elastomeric bearings. The composite structure includes inwardly inclined steel arches, connected by steel cables, and reinforced by steel beams for lateral stability.
The SHM system comprises four accelerometers, namely S1, S2, S3, and S4, previously validated as non-faulty sensors. The accelerometers are evenly distributed along the central axis, spaced 7 m apart. In addition, a "faulty" accelerometer, labeled FS2, i.e. identified as faulty in previous experiments, is placed in close proximity to the non-faulty accelerometer S2. Figure 3 provides a top view of the pedestrian bridge, indicating the positions of both faulty and non-faulty accelerometers.

Implementation of the ICSF approach on the bridge
The implementation involves collecting sensor data from the SHM system of the pedestrian bridge and developing the classification models using MATLAB [17]. The fault identification capabilities of the classification models are verified by artificially generating and injecting combined sensor faults into the sensor data, which is then normalized and split for training, validation, and testing. The LSTM networks are first created and trained until satisfactory accuracy is achieved, and the corresponding classification models are saved for identifying combined sensor faults using newly collected data.
The acceleration measurements are obtained from the non-faulty accelerometers S1, S2, S3, and S4, over a 90-minute data collection period at a sampling rate of 128 Hz. The total number of data points is p = 692,628, recorded by each accelerometer. Next, Pearson correlation analysis is conducted on the acceleration measurements to identify the correlated sensors. The analysis reveals a strong correlation (> 0.90) among all four accelerometers, i.e. k = 4. The lowest correlation coefficient, 0.937, is observed between sensors S1 and S4. Then, the acceleration measurements of the correlated accelerometers f1→k (k = 4) are stored in matrix A of size 692,628 × 4. Seven types of single sensor faults (N = 7) are considered, comprising bias, drift, gain, precision degradation, complete failure (constant and noisy), and outliers. For combined sensor faults, combinations of two single faults within the same sensor (m = 2) are explored. Complete failure (constant and noisy) inherently cannot be combined with other sensor faults and is therefore excluded from the combinations (n = 5), resulting in a total of C = 17 single and combined sensor faults (7 single faults plus 10 pairwise combinations).
The acceleration measurements, initially stored in matrix A, are used to generate four input datasets, each divided into 17 subsets corresponding to the sensor fault types. The 17 fault types are artificially injected into the i-th input dataset Gi (i = 1, …, 4), with the corresponding classes stored in the i-th output dataset Oi (i = 1, …, 4). Next, the input dataset Gi is normalized according to Equation 3, and the normalized dataset G̃i is split into a 70% training dataset (484,838 data points), a 15% validation dataset (103,894 data points), and a 15% testing dataset (103,894 data points). The normalization parameters, xmin and xmax, used during training, are saved for future application to new sensor data fed into the classification model Mi. The initial architecture for the LSTM network is defined, featuring a sequence input layer with a length equal to the number of correlated sensors (k = 4), and an output layer comprising a fully-connected layer, the Softmax function, and a classification layer with 17 classes, representing one sensor fault per class. After a trial-and-error process, three hidden layers are defined, each consisting of an LSTM layer followed by a dropout layer with the dropout probability set to 20%. Four classification models (M1, M2, M3, and M4) are developed, each dedicated to classifying combined sensor faults in sensors S1, S2, S3, and S4, respectively. The LSTM network architecture is illustrated in Figure 4. Upon training, the model accuracy is assessed using the testing dataset, with an accuracy threshold set to 85%, determined based on previous experience [15]. All models exhibit accuracy values higher than 90%, with the lowest accuracy value of 90.3% observed in model M2. The capability of the ICSF approach to identify real-world sensor faults is showcased in the following subsection.
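The reported total of 17 fault types can be checked by enumeration. The sketch below assumes that the total is the N single faults plus all m-element combinations of the n combinable faults, which reproduces the reported number; the class names are illustrative labels.

```python
from itertools import combinations
from math import comb

# N = 7 single sensor faults
single_faults = [
    "bias",
    "drift",
    "gain",
    "precision degradation",
    "outliers",
    "constant complete failure",
    "noisy complete failure",
]

# n = 5 faults that can be combined (complete failures are excluded)
combinable = single_faults[:5]
m = 2  # number of single faults per combination

# All unordered pairs of combinable faults, e.g. "bias + drift"
combined_faults = [" + ".join(pair) for pair in combinations(combinable, m)]

classes = single_faults + combined_faults
C = len(single_faults) + comb(len(combinable), m)   # 7 + 10

for index, name in enumerate(classes, start=1):
    print(f"{index:2d}  {name}")
```

Enumerating the classes explicitly also fixes a stable ordering of the class labels, which is convenient when the labels are mapped to the outputs of a classification layer.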

Validation of the ICSF approach on the bridge and results
The validation test uses acceleration measurements from the SHM system of the pedestrian bridge, specifically from the faulty accelerometer FS2. The classification model M2, trained to identify faults in sensor S2, is repurposed to identify faults in sensor FS2. Data recorded by the sensors S1, FS2, S3, and S4 is used as input into the classification model M2. The acceleration measurements have a duration of approximately 7 minutes at a sampling rate of 128 Hz, totaling p = 53,688 data points. The classification model identifies combined sensor faults in four time windows, indicated in Figure 5 and listed in Table 1. The effectiveness of fault identification by M2 is validated by visually comparing the acceleration measurements from the non-faulty sensor S2 and the faulty sensor FS2. The acceleration measurements from the faulty accelerometer FS2 significantly deviate from those of the non-faulty accelerometer S2 due to the faults occurring in FS2. As shown in Figure 5, window 1 exhibits a linear trend and scaling of the acceleration measurements, indicative of drift and gain. In window 2, a downward shift of the measurements occurs, followed by linear deviation, which is identified as bias and drift. A similar trend is observed in window 3, although the combined sensor fault starts with drift and is followed by bias. Finally, window 4 exhibits a constant offset combined with isolated peaks in the measurements, which is characteristic of bias combined with outliers. While the classification model identifies faults in windows 1, 2, and 3 with high confidence, expressed as classification probability, the identification of "bias + outliers" in window 4 is somewhat challenging, due to the intermittent nature of outliers within the data, leading to imbalanced training data for the LSTM network, i.e. fewer occurrences of outliers in the training dataset compared to the other sensor faults. Nevertheless, identifying outliers in fault diagnosis can be covered by signal pre-processing techniques. The results of the validation tests show the capability of the ICSF approach in identifying both single and combined sensor faults, indicating the effectiveness of the approach in enhancing the quality of fault diagnosis by providing insights into the causes leading to sensor faults in real-world SHM systems.
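How per-sample classifications may be turned into time windows can be sketched as follows. The sketch assumes that the model returns one class label and one classification probability per data point and that consecutive identical labels form a window; this grouping rule, the function name, and the toy predictions are assumptions, since the text does not describe how the windows are delimited.

```python
from itertools import groupby

def fault_windows(labels, probabilities, sampling_rate_hz=128,
                  ignore=("non-faulty",)):
    """Group consecutive identical labels into windows with start and end times."""
    windows = []
    start = 0
    pairs = zip(labels, probabilities)
    for label, group in groupby(pairs, key=lambda pair: pair[0]):
        items = list(group)
        end = start + len(items)
        if label not in ignore:
            windows.append({
                "fault": label,
                "start_s": start / sampling_rate_hz,
                "end_s": end / sampling_rate_hz,
                "mean_probability": sum(p for _, p in items) / len(items),
            })
        start = end
    return windows

# Toy per-sample predictions (hypothetical, not the results in Table 1)
labels = (["non-faulty"] * 256 + ["drift + gain"] * 128
          + ["non-faulty"] * 128 + ["bias + outliers"] * 64)
probabilities = [0.99] * 256 + [0.95] * 128 + [0.98] * 128 + [0.60] * 64

windows = fault_windows(labels, probabilities)
for w in windows:
    print(w["fault"], w["start_s"], w["end_s"], round(w["mean_probability"], 2))
```

Reporting the mean probability per window gives a simple confidence figure for each identified fault, in the spirit of the classification probability discussed above.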

Conclusions and future work
Fault diagnosis in SHM systems involves fault detection, isolation, identification, and accommodation. However, fault identification, particularly for combined sensor faults, has received limited attention. Current FD methods often focus on single sensor faults, overlooking real-world scenarios where combined sensor faults occur. To enhance the quality of fault diagnosis and to gain insight into the causes of sensor faults, this paper has introduced the ICSF approach, which is able to identify combined sensor faults occurring simultaneously within individual sensors using classification models based on LSTM networks. To validate the ICSF approach, a validation test has been conducted using acceleration measurements from a real-world SHM system installed on a pedestrian bridge in Greece. The results demonstrate the effectiveness of the classification models in identifying combined sensor faults involving bias, drift, gain, precision degradation, and outliers. However, the classification models have faced challenges in identifying outliers due to the imbalanced training data caused by the intermittent patterns of outliers in continuous signals. Since outliers can be identified and removed through signal preprocessing, the significance of identifying outliers within the FD process is deemed low. Overall, the proposed ICSF approach represents an advancement in improving fault diagnosis in SHM systems. Future research directions include decentralizing the approach, exploring imbalanced-data-handling methods, and enhancing transparency through explainable artificial intelligence interfaces.

Fig. 1. Illustration of the ICSF approach: Activities and actions

Fig. 2. The pedestrian bridge in Evosmos, Greece

Fig. 3. Top view of the pedestrian bridge with the accelerometers

Fig. 4. Network architecture of the classification models

Fig. 5. Acceleration measurements collected by the non-faulty sensor S2 and the faulty sensor FS2

Table 1. Results of the classification model M2.