## Abstract

Three-point bending fatigue compliance datasets of multi-layer fiberglass-weave/epoxy test specimens, including specimens with 5 mil and 10 mil interlayers, were analyzed using artificial intelligence (AI) methods along with statistical analysis, revealing three different compliance-based damage modes. Anomaly detection algorithms helped discover damage indicators observable in short intervals (of 50 cycles) in the compliance data, whose patterns vary with the material and the number of load cycles to which the material is subjected. Machine learning algorithms were applied to the compliance features to assess the likelihood that material failure may occur within a certain number of future loading cycles. High accuracy, precision, and recall rates were achieved in the classification task, for which we evaluated several algorithms, including several variants of neural networks and support vector machines. Thus, our work demonstrates the utility of AI algorithms for discovering a diversity of damage mechanisms and failures.

## Introduction

Mechanical components undergo wear and tear during their usage. Replacing a part too early in its operation wastes the useful work that the component could still perform. On the other hand, using a failed mechanical part might compromise the integrity of the whole system. Hence, it is essential to identify whether a particular mechanical part is safe for future usage. In recent years, researchers have explored the applications of machine learning algorithms for various tasks such as damage classification and failure prediction. Characteristics and examples of some of these approaches are detailed below.

*Shallow-feedforward neural networks* such as multi-layer perceptrons (MLPs) are among the earliest neural network models to be explored [1]. Such models often contain only one or two hidden layers in which each node is fully connected to every node in the next layer. These models are usually trained using a variant of gradient descent such as error backpropagation. Feng et al. [2] proposed a backpropagation neural network model to predict the fatigue life of structures using a fatigue dataset generated by finite element analysis. Kim et al. [3] used a neural network to predict the remaining lifetime of an underlying system. Mortazavi and Ince [4] proposed a radial basis function-based artificial neural network (ANN) model to predict the fatigue crack growth in titanium and aluminum alloys. Tan et al. [5] also presented an ANN model to detect multiple damage locations and severities for steel beams using a vibration dataset.

*Recurrent neural networks* (RNNs) contain cycles of connections between neurons such that the information concerning past data can influence later computations. This property makes RNNs useful for analyzing time-series data. A special case of RNN uses long short-term memory (LSTM) units, where each unit contains recurrent connections utilizing gates and memory units [6]. Such an arrangement facilitates remembering old information across different time windows. LSTM networks have been very successful in learning unusual patterns in time-series datasets. Choe et al. [7], Zhou et al. [8], and Bao et al. [9] suggested using LSTM-based networks for the detection of structural damage. Yan et al. [10] proposed remaining useful lifetime prediction of a gear using *ordered neurons* LSTM networks.

*Deep neural networks* contain more than two hidden layers. The initial layers in such models typically perform convolution operations to extract the most relevant features from the given dataset; networks of this type are known as convolutional neural networks (CNNs) [11]. They rely on spatial or temporal contiguity to achieve excellent prediction performance in many applications, particularly in image and speech processing. Viola et al. [12] proposed an automated failure detection system for ball-bearing joints using deep learning techniques. Wang et al. [13] and Guo et al. [14] implemented a deep neural network for structural damage identification. Cha et al. [15] proposed a CNN-based approach for detecting cracks on concrete structure images. Ref. [16] presented an autoencoder-based anomaly detection method in which the structural time-series acceleration data are transformed into grayscale image vectors to detect multi-pattern anomalies. Khatir et al. [17], Tran-Ngoc et al. [18], and Zenzen et al. [19] used artificial neural networks for damage detection in structures. Lin et al. [20] implemented a one-dimensional convolutional neural network to detect damage locations in a beam. Abdeljaber et al. [21] also proposed a one-dimensional convolutional neural network for real-time damage detection and localization using accelerometers. Bayesian network-based approaches were also proposed by Cantero-Chinchilla et al. [22], O’Dowd et al. [23], and Yan et al. [24] for damage identification. Several other applications of machine learning were proposed by Ritto and Rochinha [25], Rautela et al. [26], Farhan Khan et al. [27], Dhiraj et al. [28], and Zhan et al. [29] for detecting damage in structures.

*Hybrid methods* use a combination of various models, which may include both statistical and artificial intelligence approaches. Agrawal and Choudhary [30] developed an ensemble of different machine learning and regression models to predict the fatigue strength of a given steel alloy. Wang et al. [31] proposed a combination of several neural network models to predict the healing efficiency of hybrid materials. Hou et al. [32] implemented a combination of neural networks and similarity-based models to predict the remaining structural lifetime. Xiang et al. [33] and Rautela and Gopalakrishnan [34] proposed a hybrid approach for structural damage classification using CNNs and LSTMs. Aria et al. [35] presented a hybrid approach for the estimation of structural damage size and remaining lifetime using LSTM and image segmentation with CNN. An et al. [36] developed a machine learning framework to predict the remaining valid lifetime of a milling tool using a combination of CNN, uni- and bi-directional LSTMs. Schwarzer et al. [37] also proposed a hybrid CNN and LSTM model to predict the fracture propagation and failure for brittle materials. Dackermann et al. [38] proposed an ensemble of artificial neural networks to identify defects and their locations for beam structures.

The work discussed in this paper differs from the research and applications mentioned earlier. We focus on compliance data obtained from a three-point bending test with cyclic loading. Compliance, or flexibility, is the inverse of the stiffness of a material. Mathematically, the compliance (*C*) is defined as the change in deflection (*v*) with respect to the change in external applied force (*P*), as *C* = *dv*/*dP*. Increased compliance indicates that the material will deform more under the applied external force. Using the compliance data and machine learning techniques, we identify potential damage indicators, as well as likely future failures [39]. We treat the identification of damage indicators in the compliance as an anomaly detection problem and characterize the structural fatigue behavior with three potential compliance-based damage indicators: steady-state increase, singular jumps, and clusters of jumps, all of which contribute toward the measured compliance. The discovered damage indicators support hypotheses about what may be occurring in the material due to the successive application of repetitive stress patterns. The failure prediction problem is treated as a classification problem and addressed using different machine learning approaches. The contributions of this work are the following:

- Identifying three compliance-based damage indicators using a novel anomaly detection approach, involving *jump detection* and *clustering* to analyze sudden increases in compliance over multiple cycles during which the material is subjected to stress.
- Solving the failure prediction problem as a multi-class classification problem using information extracted from the compliance changes over time by applying different machine learning algorithms.

## Dataset

The dataset we consider in our work is obtained from a three-point bend test of fiberglass-weave/epoxy specimens. The test specimens are manufactured with the following techniques: needling, un-needling, and punching. Needling and un-needling methods result in interlayer thicknesses of 0 mil, 5 mil, and 10 mil, whereas the punching method does not produce any interlayer. The data for 222 specimens are obtained from the laboratory experiments performed by Dr. Robert Haynes, where the specimens are tested with loads of 60, 70, 80, and 90% of the bend strength [40]. A schematic of the three-point bending fatigue experimental setup is shown in Fig. 1.

The dataset includes measurements of the mid-point deflection (in millimeters) of the specimen during the loading and unloading phases of the cyclic bend test. The dataset also includes the maximum and minimum load applied in each cycle (in newtons), from which the compliance can be computed. The compliance (mm/N), a critical fatigue damage indicator [39], is computed from the differences between the maximum and minimum load conditions applied in each cycle.
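As an illustration of this computation, the following minimal sketch evaluates a finite-difference form of *C* = *dv*/*dP* per cycle. It assumes that the deflection at the maximum and minimum load is available for each cycle; the function name `cycle_compliance` and the sample values are hypothetical and are not taken from the dataset.

```python
import numpy as np

def cycle_compliance(v_max, v_min, p_max, p_min):
    """Per-cycle compliance (mm/N) as a finite-difference form of C = dv/dP.

    v_max, v_min: deflection (mm) at maximum and minimum load in each cycle.
    p_max, p_min: maximum and minimum load (N) in each cycle.
    """
    v_max, v_min = np.asarray(v_max, float), np.asarray(v_min, float)
    p_max, p_min = np.asarray(p_max, float), np.asarray(p_min, float)
    load_range = p_max - p_min
    if np.any(load_range <= 0):
        raise ValueError("maximum load must exceed minimum load in every cycle")
    return (v_max - v_min) / load_range

# Three hypothetical cycles: deflection grows while the load range stays fixed,
# so the compliance increases from cycle to cycle.
compliance = cycle_compliance(
    v_max=[1.00, 1.02, 1.10],
    v_min=[0.10, 0.10, 0.10],
    p_max=[500.0, 500.0, 500.0],
    p_min=[50.0, 50.0, 50.0],
)
```

In this form, a growing deflection range under a constant load range shows up directly as increasing compliance.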

We consider specimens manufactured with the un-needling technique in this study, where we analyze the data for 222 specimens tested with cycles ranging from 250 to 1,200,000. We manually select 30, 31, and 16 specimens for 0 mil, 5 mil, and 10 mil interlayers, respectively, from the available set of 222 specimens. We ignore the specimens that fail under very few loading and unloading cycles. Table 1 shows the mean failure cycles for the selected specimens with different interlayers and applied loads, along with the standard deviation (SD) values. Mean failure cycles are fewer for higher loads compared to lower loads. In each case, the SD values are large; hence, it is not feasible to presume a fixed number of cycles before failure for any class of samples, necessitating evaluation of samples using nondestructive testing.

Table 1. Mean failure cycles and standard deviation (SD) for the selected specimens

| Interlayer (mil) | Load (%) | Mean failure cycles | SD |
|---|---|---|---|
| 0 | 60 | 570,554 | 334,992 |
| 0 | 70 | 19,761 | 14,109 |
| 0 | 80 | 2993 | 1352 |
| 0 | 90 | 691 | 292 |
| 5 | 60 | 306,837 | 159,910 |
| 5 | 70 | 21,886 | 16,815 |
| 5 | 80 | 2799 | 2542 |
| 5 | 90 | 835 | 377 |
| 10 | 80 | 1739 | 377 |
| 10 | 90 | 818 | 209 |

In our experiments, we sample a fixed number of cycles (*N* = 50) from the compliance of each selected specimen. We evaluate each of the three interlayer thicknesses separately, splitting the set of 77 (30 + 31 + 16) specimens into three distinct interlayer datasets. The resulting datasets for interlayer thicknesses of 0 mil, 5 mil, and 10 mil (where 1 mil = 0.001 in.) are each further split into training, validation, and testing sets. The numbers of training, validation, and testing samples for each dataset are shown in Table 2. From each sampled slice of fixed cycles, we extract four features: minimum displacement, maximum displacement, displacement start range, and displacement end range. Together they form the feature vector of the slice for the machine learning algorithms.
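The slicing and feature extraction can be sketched as below. This is only an illustration under stated assumptions: the "start range" and "end range" features are interpreted here as the first and last values of each slice, which the text does not define precisely, and the function name `slice_features` and the synthetic curve are hypothetical.

```python
import numpy as np

def slice_features(signal, n=50):
    """Split a per-cycle signal into consecutive n-cycle slices and return one
    row of [minimum, maximum, first value, last value] per complete slice.

    Assumption: "start range" and "end range" are taken to be the first and
    last values in the slice.
    """
    s = np.asarray(signal, float)
    n_slices = len(s) // n                # a trailing partial slice is dropped
    slices = s[: n_slices * n].reshape(n_slices, n)
    return np.column_stack(
        [slices.min(axis=1), slices.max(axis=1), slices[:, 0], slices[:, -1]]
    )

# Hypothetical, monotonically rising curve of 120 cycles -> two complete slices.
curve = np.linspace(1.0, 2.0, 120)
features = slice_features(curve)
```

Each row of `features` is then one four-element feature vector for the learning algorithms.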

## Anomaly Detection

We now discuss the approach for identifying compliance-based damage indicators. We define a *jump* as a sudden change in compliance. Our approach is based on the plausible hypothesis that such jumps characterize the damage behavior in the specimens. The compliance value fluctuates during the loading and unloading cycles, which makes jump detection in compliance a challenging task. We formulate two different anomaly detection approaches to identify these jumps.

The first approach uses a threshold, where we begin by identifying the middle phase of the compliance. We estimate the boundaries of the middle phase by piecewise linear regression. The locations of local gradient maxima at the beginning and end of the compliance curve emerge as robust stage identifiers. The minimum compliance increase rate is calculated via the compliance ratio at the beginning and end of the middle stage. We identify the anomalous jumps by calculating the average compliance for a pre-specified window of past compliance values and their variation. Jumps are identified when a compliance value exceeds a threshold, calculated using a sliding window of *n* previous compliance values and a multiple *m* of the standard deviation of these values. We set the size of the sliding window *n* = 100 and the multiple of the standard deviation *m* = 3, which allow us to detect all jumps without picking up small peaks that can be considered as noise.

A signal processing technique inspires the second approach. The compliance curve for a given specimen is divided into fixed-size slices consisting of 50 cycles each. For each slice, the mean compliance of the slice is subtracted. A convolution operation is carried out by a Gaussian step filter, which gives segment slopes. A sign change in slope is considered to indicate the location of a jump. The cycle index where the sign change occurs gives the location of the jump in the slice. This process is repeated for all the compliance curve slices, and the jump locations are saved. The main steps of the jump detection algorithm are summarized in Fig. 2.
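The threshold rule of the first approach can be sketched as follows. This is a minimal illustration, assuming the threshold is the window mean plus *m* times the window standard deviation; the identification of the middle phase by piecewise linear regression is omitted, and the function name `detect_jumps` and the synthetic curve are hypothetical.

```python
import numpy as np

def detect_jumps(compliance, n=100, m=3.0):
    """Return the cycle indices whose compliance exceeds the mean of the
    n preceding values by more than m standard deviations of those values."""
    c = np.asarray(compliance, float)
    jumps = []
    for i in range(n, len(c)):
        window = c[i - n : i]             # the n values preceding cycle i
        threshold = window.mean() + m * window.std()
        if c[i] > threshold:
            jumps.append(i)
    return jumps

# Hypothetical curve: small bounded fluctuation plus a step increase at cycle 200.
t = np.arange(300)
curve = 1.0 + 0.001 * np.sin(t)
curve[200:] += 0.05
jumps = detect_jumps(curve)
```

In this sketch, the cycles immediately after a step may also be flagged until the window statistics absorb the new level, so consecutive flagged indices would typically be merged into a single jump.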

Now that we can identify jumps, we define a new type of damage indicator termed a *jump cluster*, which is a collection of jumps in neighboring slices. To find such jump clusters, once the locations of all the jumps and their compliance values are known, a clustering algorithm is executed to group the jumps into several clusters. Each identified cluster has a centroid, a label, and a group of jumps. Our implementation uses a *k*-means clustering algorithm with the number of clusters *k* = 5, which is found to work best on the available data, based on empirical evaluation with various possible values of *k*. Other clustering algorithms may be used instead of *k*-means clustering, particularly when the number of clusters in the data could be larger or is unknown.
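A minimal sketch of this clustering step using scikit-learn's `KMeans` is given below. The jump locations and compliance values are hypothetical, the features are left unscaled (so the cycle index dominates the distance), and the separate treatment of singular jumps is not covered here.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_jumps(jump_cycles, jump_values, k=5, seed=0):
    """Group detected jumps into k clusters from their (cycle, compliance) pairs.

    Returns one cluster label per jump and the k cluster centroids.
    The features are not rescaled, so the grouping is driven mainly by the
    cycle index, that is, by where along the curve the jumps occur.
    """
    points = np.column_stack([jump_cycles, jump_values])
    model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(points)
    return model.labels_, model.cluster_centers_

# Hypothetical jump locations (cycles) and compliance values (mm/N).
cycles = [210, 215, 220, 400, 410, 600, 605, 610, 800, 950, 960]
values = [0.0020, 0.0021, 0.0021, 0.0023, 0.0023,
          0.0025, 0.0025, 0.0026, 0.0028, 0.0030, 0.0031]
labels, centroids = cluster_jumps(cycles, values)
```

The cluster numbering returned by `KMeans` is arbitrary, so labels would need to be reordered by centroid location to match a left-to-right numbering along the compliance curve.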

Figure 3 shows a sample of the three types of compliance-based damage indicators: steady-state increase, anomalous jump, and cluster of jumps. All three phenomena contribute to the increase of compliance values. The steady-state increase can be seen in the middle phase of the compliance curve, ranging from 200 to 1000 cycles. Label 4 marks a single jump, which is not assigned to any cluster by the clustering algorithm. Labels 2 and 3 represent jump clusters that contain the jumps occurring in neighboring slices, and a “+” sign represents the centroid of each detected cluster. Since we are interested in anomaly detection in the middle phase only, we ignore the clusters labeled 1 and 5, which lie toward the beginning and end of the compliance curve.

We also quantify the contribution of the detected singular jumps and jump clusters to the middle-phase compliance. Figure 4 shows a magnified view of the compliance curve where a jump is detected. The difference in compliance across each detected jump is computed, and the differences are summed to obtain the total compliance contribution. The percentage contribution in the middle phase is found to account for up to 40% of the total compliance.
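The summation can be illustrated with the short sketch below. It rests on two assumptions that the text does not spell out: the size of a jump is taken as the one-cycle increase at the jump location, and the percentage is taken relative to the total compliance increase over the middle phase. The function name and the sample curve are hypothetical.

```python
def jump_contribution_percent(compliance, jump_indices, start, end):
    """Share (%) of the middle-phase compliance increase attributed to jumps.

    Assumptions: each jump's size is the one-cycle increase c[i] - c[i - 1],
    and the reference is the total increase between cycles start and end.
    """
    total_increase = compliance[end] - compliance[start]
    if total_increase <= 0:
        raise ValueError("compliance must increase over the middle phase")
    jump_increase = sum(
        compliance[i] - compliance[i - 1]
        for i in jump_indices
        if start < i <= end
    )
    return 100.0 * jump_increase / total_increase

# Hypothetical middle phase: steady growth with one jump between cycles 2 and 3.
curve = [1.0, 1.1, 1.2, 1.6, 1.7, 1.8, 2.0]
share = jump_contribution_percent(curve, jump_indices=[3], start=0, end=6)
```

For this toy curve the single jump of about 0.4 accounts for roughly 40% of the total increase of 1.0.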

## Failure Prediction

We consider a scenario in which a decision is required for a given component to either replace it or allow continued usage. In such a scenario, we use the AI models and machine learning algorithms mentioned earlier (such as CNN, LSTM, and MLP) to answer questions such as “If we have no history of a specimen, other than the most recent 50 stress cycles, can we predict whether it is likely to fail in the next 50 cycles?” based on compliance data. Accurate answers to such questions are essential. For example, withdrawing a component that is in no danger of failing might increase maintenance costs. On the other hand, hazardous conditions can result from using a component that may fail during its operation.

To answer such questions, we formulate a multi-class failure classification problem with three failure classes. We assume that the last 15% of the observed life cycles lie in the failure region for all compliance curves. Therefore, the compliance curve slices in the first 85% of the life cycles are labeled Class 0 (no failure). For the last 15% of the life cycles, we create two additional classes: Class 1 (“may fail”) and Class 2 (“failure in the next 50 cycles”); intervals other than 50 cycles may also be used. Class 2 indicates that failure is expected within the next 50 cycles, which makes it a useful class for deciding on the future usage of a particular component. Class 1 is the intermediate stage between Class 0 and Class 2.
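The labeling scheme can be sketched as follows. This is an illustrative reading of the rules above, assuming that a slice belongs to the failure region when its last cycle falls in the final 15% of the observed life, and that Class 2 is assigned when failure occurs fewer than 50 cycles after the end of the slice. The function name and the failure cycle are hypothetical.

```python
import numpy as np

def label_slices(failure_cycle, n=50, failure_fraction=0.15):
    """Assign a class label to each complete n-cycle slice of one specimen.

    0: no failure (slice ends within the first 85% of the observed life)
    1: may fail (slice ends within the last 15% of the observed life)
    2: failure occurs fewer than n cycles after the end of the slice
    """
    n_slices = failure_cycle // n
    slice_end = (np.arange(n_slices) + 1) * n   # last cycle covered by each slice
    labels = np.zeros(n_slices, dtype=int)
    labels[slice_end > (1.0 - failure_fraction) * failure_cycle] = 1
    labels[failure_cycle - slice_end < n] = 2
    return labels

# Hypothetical specimen that fails after 1020 cycles -> 20 complete slices.
labels = label_slices(1020)
```

For this specimen the first 17 slices are labeled 0, the next two are labeled 1, and the final slice is labeled 2, which illustrates how few samples fall into the two failure classes.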

The assumption of a 15% failure region leads to a comparatively small number of labels for Classes 1 and 2, since 85% of the created labels belong to Class 0. This gives rise to a class imbalance, which is mitigated by oversampling the Class 1 and Class 2 samples so that the failure classes have an equal number of labels. A summary of the available training, validation, and testing datasets for different interlayers is listed in Table 2. These samples are used to train the following four types of machine learning models for failure prediction.
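One simple way to carry out such oversampling is random duplication of minority-class samples, sketched below. The text does not state which oversampling method was used, so this is an assumption, as is the choice to bring every class up to the size of the largest one; the function name is hypothetical.

```python
import numpy as np

def oversample_to_balance(X, y, seed=0):
    """Randomly duplicate minority-class rows until every class has as many
    samples as the largest class. Intended for the training set only."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X), np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    selected = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Draw the missing samples with replacement from the same class.
        extra = rng.choice(idx, size=target - count, replace=True)
        selected.append(np.concatenate([idx, extra]))
    selected = np.concatenate(selected)
    return X[selected], y[selected]

# Hypothetical imbalanced labels: 17 slices of class 0, two of class 1, one of class 2.
X = np.arange(20).reshape(-1, 1)
y = np.array([0] * 17 + [1, 1, 2])
X_bal, y_bal = oversample_to_balance(X, y)
```

Oversampling would normally be applied after the split into training, validation, and testing sets, so that duplicated samples do not leak across the sets.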

The first step in applying learning algorithms is obtaining high-level hyperparameters, which must be determined before a learning algorithm can be applied. These parameters are different from the learning parameters such as connection weights in a neural network model or numerical coefficients in a regression model. In order to find out the architectures of LSTM, CNN, and MLP models for our datasets, we perform hyperparameter tuning to determine the number of hidden layers, number of neurons, activation functions, number of LSTM units, batch size, dropout ratio, and learning rate. The hyperparameter tuning is carried out using a grid search method where the candidate models are trained for a fixed number of 100 epochs. We select the architectures with the smallest number of trainable parameters that succeed in obtaining good quality results.

### Support Vector Regression Model

A support vector machine (SVM) constructs a set of hyperplanes in a high-dimensional space, which can be used for classification and regression tasks. Support vector regression (SVR) is a version of SVM that works with continuous target values. One of the main advantages of SVR is the tunable threshold parameter (*ɛ*), which can be adjusted so that errors within a specified tolerance are not penalized. A good SVR model achieves a good separation of the given data points by the hyperplanes. SVR maps the data into the desired high-dimensional space by applying a user-selected kernel function. Popular kernels include the linear, polynomial, radial basis function (RBF), and sigmoid kernels.

The *ɛ-support vector regression* model is implemented using the scikit-learn Python package. A Gaussian RBF kernel is selected since it measures the similarity among the data points, which helps differentiate among failure classes. A grid search is performed over the regularization parameter (C) in the range [10, 250] and the kernel coefficient *γ* in the range [0.1, 2.5]; the values found to work best for the Gaussian kernel are shown in Table 3.
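A grid search of this kind can be sketched with scikit-learn's `SVR` and `GridSearchCV` as below. The grid points, the value of `epsilon`, the cross-validation setup, and the synthetic four-feature data are all assumptions for illustration; only the RBF kernel and the search ranges come from the text.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

# Hypothetical stand-in for the slice features: 90 samples, 4 features,
# with class labels 0, 1, 2 used as continuous regression targets.
rng = np.random.default_rng(0)
y = rng.integers(0, 3, size=90).astype(float)
X = y[:, None] + 0.3 * rng.normal(size=(90, 4))

# Illustrative grid points inside the ranges [10, 250] and [0.1, 2.5].
param_grid = {"C": [10, 50, 250], "gamma": [0.1, 0.5, 2.5]}
search = GridSearchCV(
    SVR(kernel="rbf", epsilon=0.1),
    param_grid,
    cv=KFold(n_splits=3, shuffle=True, random_state=0),
)
search.fit(X, y)

best_params = search.best_params_      # selected C and gamma
predictions = search.predict(X)        # continuous outputs of the refitted model
```

Because SVR produces continuous outputs, a rule for converting them into class labels is still needed before classification metrics can be computed.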

### Convolutional Neural Network Model

Convolutional neural networks (CNNs) provide near-human performances on tasks such as classification and object detection. High-dimensional features are extracted from the input dataset using convolution operations, and low-dimensional features are built using the extracted features. In a typical CNN layer, the convolution operations are carried out using different convolution kernels in horizontal and vertical strides on the input data represented in matrices. This study utilizes a specialized one-dimensional version of CNN where a one-dimensional convolution operation is performed on the compliance data in vectors instead of matrices. This type of network selection is due to the nature of compliance features computed for fixed-size compliance slices.

CNN architecture details are as follows. The input layer is connected to a 1D convolution layer with 128 filters, kernel size of 3, and ReLU activation nodes. The output of the convolution layer is flattened and passed to a dense layer with 16 units having ReLU activation. The final layer is the output layer with 4 units and a softmax activation function. There are a total of 4692 trainable parameters in the 1D CNN model. Adam optimizer is used with a categorical cross-entropy loss function. The parameters used for Adam optimizer are shown in Table 4. The model is trained with a batch size of 1024 for 2000 epochs. Figures 5(g), 5(h), and 5(i) show the training and validation losses for the 1D CNN architecture trained with different interlayer datasets. The output from the 1D CNN model is failure class probability, and the class with maximum probability is selected as the predicted class.
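The reported parameter count can be reproduced with a short calculation. The sketch below assumes an input of four features with a single channel, a stride of 1, and no padding; these choices are not stated in the text but are consistent with the total of 4692 trainable parameters.

```python
def conv1d_params(in_channels, filters, kernel_size):
    """Weights plus one bias per filter of a 1D convolution layer."""
    return kernel_size * in_channels * filters + filters

def dense_params(n_in, n_out):
    """Weights plus one bias per unit of a fully connected layer."""
    return n_in * n_out + n_out

steps, channels = 4, 1               # assumed input shape: 4 features, 1 channel
filters, kernel_size = 128, 3

conv_steps = steps - kernel_size + 1     # no padding, stride 1 -> 2 positions
flattened = conv_steps * filters         # 2 * 128 = 256 values after flattening

total = (
    conv1d_params(channels, filters, kernel_size)   # 3 * 1 * 128 + 128 = 512
    + dense_params(flattened, 16)                   # 256 * 16 + 16 = 4112
    + dense_params(16, 4)                           # 16 * 4 + 4 = 68
)
```

Here `total` evaluates to 4692, matching the count given above for an output layer with 4 units.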

Table 4. Parameters of the Adam optimizer

| Parameter | Description | Value |
|---|---|---|
| $L_r$ | Learning rate | 0.001 |
| $\beta_1$ | Decay rate for the first moment | 0.9 |
| $\beta_2$ | Decay rate for the second moment | 0.999 |
| $\varepsilon$ | Numerical stability constant | $10^{-7}$ |

### Long Short-Term Memory Model

Convolutional neural networks do not retain information from previous inputs, which is problematic for sequential data since each output in a sequence depends on the preceding ones. LSTM models overcome this problem and can efficiently handle sequential or time-series data. A typical LSTM network consists of cells that contain memory blocks and gates. When the LSTM network is trained with error backpropagation using a gradient-based optimizer, the memory blocks mitigate the vanishing gradient problem by retaining information over time. Gates are similar to logical switches, which are activated by the selected activation function. Multiple cells containing memory blocks and gates are stacked in series by connecting the output of the previous cell to the input of the next cell. Finally, the stacked cells are trained end-to-end with the given sequential dataset. Interested readers can refer to Ref. [6] for additional details on the working of LSTM models.

The following architecture of the LSTM network is used in the current work. The input layer is connected to an LSTM layer with 32 LSTM units. The output of the LSTM layer is fed to a dense layer, which has eight neurons with the ReLU activation function. The dense layer is fed to the output layer, which has a linear activation function. The kernels in the dense and output layers are initialized uniformly. We use the Adam optimizer, a popular gradient-based optimizer for problems with large datasets or many model parameters [41]. The parameters used for the Adam optimizer are shown in Table 4.

A mean-squared error loss function is used to quantify the error in prediction against ground truth labels. There are 4625 trainable parameters in the specified LSTM model. Figures 5(a), 5(b), and 5(c) show the training and validation losses for the LSTM architecture trained with different interlayer datasets. The prediction output from the LSTM model is a real number between 0.0 and 2.0. If the model prediction ranges from 0.0 to 0.6 (exclusive), the class label 0 is assigned to the prediction. Similarly, the ranges from 0.6 to 1.2 (exclusive) and 1.2 to 2.0 (inclusive) are assigned class labels 1 and 2, respectively.
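The parameter count and the mapping from the real-valued output to a class label can be sketched as below. The count assumes that the four features are fed as four time steps with one value each and that the output layer has a single unit, which is consistent with the 4625 parameters reported; the clipping of out-of-range outputs to [0, 2] is an additional assumption, and the function names are hypothetical.

```python
import numpy as np

def lstm_params(units, input_dim):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(n_in, n_out):
    """Weights plus one bias per unit of a fully connected layer."""
    return n_in * n_out + n_out

# 4 * (32 * 33 + 32) = 4352, plus 32 * 8 + 8 = 264, plus 8 * 1 + 1 = 9.
total = lstm_params(32, 1) + dense_params(32, 8) + dense_params(8, 1)

def to_class(predictions):
    """Map real-valued outputs to class labels:
    [0.0, 0.6) -> 0, [0.6, 1.2) -> 1, [1.2, 2.0] -> 2."""
    clipped = np.clip(np.asarray(predictions, float), 0.0, 2.0)
    return np.digitize(clipped, bins=[0.6, 1.2])

classes = to_class([0.0, 0.59, 0.6, 1.19, 1.2, 2.0])
```

With these bin edges, a prediction exactly equal to 0.6 or 1.2 is assigned to the higher class, in line with the exclusive upper bounds stated above.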

### Multi-Layer Perceptron Model

Multi-layer perceptrons are used to learn regression and classify datasets that are not linearly separable. They are very flexible and can be used to learn the mapping from the input datasets to the output labels. A typical MLP consists of an input layer, hidden layers, and a fully connected output layer (also called the visible layer). The outputs from the hidden layers and visible layer depend on the selected activation function. The end-to-end training of the model is carried out using a gradient descent-based optimizer which has been successful in many neural network training applications.

The architecture of the implemented MLP model is as follows. We use two hidden layers with ReLU activation. A total of 64 and 32 hidden units are used for the first and second hidden layers, respectively. The output layer has three classes with a softmax activation function. There are a total of 2532 trainable parameters in the MLP model, which are optimized using the previously mentioned Adam optimizer with the parameters shown in Table 4. The model is trained with a batch size of 1024 for 2000 epochs, with categorical cross-entropy as the loss function. Figures 5(d), 5(e), and 5(f) show the training and validation losses for the MLP architecture trained on different interlayer datasets. With the softmax activation function, the prediction output from the MLP model is the failure class probability, and the class with maximum probability is selected as the predicted class.

## Results and Discussion

The anomaly detection method can also quantify the compliance contribution due to detected damage in the middle compliance region. Table 5 shows the percentage of compliance contribution in the middle phase for various loading conditions and interlayers. The higher the number of jumps and clusters, the higher is the contribution due to their increased compliance. We notice that a 0 mil interlayer sample has only two types of damage indicators, whereas 5 mil and 10 mil interlayer samples have three indicators. In Table 5, it can be seen that 5 and 10 mil interlayer samples have more jumps and clusters; therefore, the contribution in compliance by jumps is also higher as compared to the 0 mil interlayer samples. The data for 10 mil interlayer samples with 70% loading are ignored due to very high irregularities in the compliance.

Table 5. Compliance contribution (%) due to jumps in the middle phase

| Interlayer (mil) | 60% Load | 70% Load | 80% Load | 90% Load |
|---|---|---|---|---|
| 0 | 22.54 | 14.81 | 19.41 | 34.18 |
| 5 | 22.27 | 22.98 | 27.01 | 44.75 |
| 10 | 40.70 | – | 36.62 | 44.15 |

The performances of the trained models are compared with the testing samples. We use the following performance metrics: accuracy, precision, recall, and F1 score. Tables 6, 7, and 8 show the performance of the different machine learning approaches for 0 mil, 5 mil, and 10 mil interlayer testing datasets, respectively.
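These metrics can be computed with scikit-learn as sketched below. The labels are hypothetical, and the use of support-weighted averaging over the three classes is an assumption; it is suggested by the fact that the reported recall equals the reported accuracy for every model, which is a property of weighted averaging.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical ground-truth and predicted class labels for eight test slices.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]

accuracy = accuracy_score(y_true, y_pred)

# Per-class scores are averaged with weights proportional to class support.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
```

With weighted averaging, the recall reduces to the fraction of correctly classified samples, so it coincides with the accuracy.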

Table 6. Performance (%) of the models on the 0 mil interlayer testing dataset

| Model | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| SVR | 94.98 | 94.95 | 94.98 | 94.94 |
| LSTM | 96.42 | 96.44 | 96.42 | 96.39 |
| MLP | 95.22 | 95.24 | 95.22 | 95.22 |
| 1D CNN | 95.17 | 95.20 | 95.17 | 95.15 |

Table 7. Performance (%) of the models on the 5 mil interlayer testing dataset

| Model | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| SVR | 90.84 | 91.61 | 90.84 | 90.76 |
| LSTM | 94.77 | 94.80 | 94.77 | 94.71 |
| MLP | 89.50 | 90.81 | 89.50 | 89.55 |
| 1D CNN | 89.02 | 90.89 | 89.02 | 89.06 |

Table 8. Performance (%) of the models on the 10 mil interlayer testing dataset

| Model | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| SVR | 95.00 | 95.11 | 95.00 | 94.99 |
| LSTM | 81.65 | 82.99 | 81.65 | 80.93 |
| MLP | 87.56 | 87.94 | 87.56 | 87.62 |
| 1D CNN | 81.47 | 81.56 | 81.47 | 81.42 |

For 0 mil interlayer models, the highest accuracy of 96.4% is achieved by the LSTM model. With LSTM, the advantage of retaining the learned information over the short-term results in slightly better accuracy, as shown in Table 6. In the 0 mil interlayer dataset, the middle region’s compliance is flat and has a steady compliance increase. LSTM can capture this behavior slightly better as compared to the other models. The testing performances of MLP and 1D CNN models are approximately equal. Corresponding to Table 6, the confusion matrices are shown in Fig. 6.

Comparing the testing accuracy of 5 mil interlayer models, the LSTM model provides the highest accuracy of 94.8%. The second-best accuracy of 90.8% is provided by the SVR model, whereas MLP and 1D CNN models give similar accuracy. The compliance data for 5 mil interlayer has a relatively higher number of irregular jumps and clusters than 0 mil interlayer compliance data. The LSTM model best captures the repeated occurrences of those jumps and clusters in slices due to the inherent nature of information retention with LSTM units. The obtained performance metrics for various models are shown in Table 7, and the corresponding confusion matrices are shown in Fig. 7.

For the 10 mil interlayer models, the linear SVR model achieves the highest accuracy of 95.0%. The compliance data for the majority of the 10 mil interlayer specimens are linear in the middle region. This behavior is best captured by the linear SVR model and gives the best accuracy compared to the other models, as shown in Table 8. The corresponding confusion matrices are shown in Fig. 8. The neural network model performance is worse than SVR due to the relatively small amount of data for 10 mil interlayer thickness; in other cases, the amount of data is sufficient for the neural network model to perform well. Figure 5 shows that neither overfitting nor underfitting takes place during the training process, and the corresponding losses decrease gradually with increasing epochs.

## Conclusion and Future Work

We have shown that anomaly detection algorithms can help discover damage indicators for bending fatigue. We propose three compliance-based damage indicators, i.e., steady-state increase, singular jumps, and jump clusters, all of which contribute to the compliance change. The lifetime of extremely stress-sensitive materials may be affected by slight variations in applied loads. Theoretical understanding and practical applications concerning material lifetimes can be assisted by further study of how damage indicators contribute to compliance changes.

For future work, different qualitative measures can be explored to study the compliance curve behavior. Further research can take advantage of more detailed datasets and long-term monitoring of the specimens under testing to facilitate more sophisticated answers to fatigue questions. Incorporating prior domain knowledge of materials science via Bayesian learning algorithms can further address the classification task.

## Acknowledgment

We are thankful to Robert Haynes for conducting the experiments at the Army Research Laboratory and providing us with the experimental data. Further details about the experiments can be found in Ref. [40].

## Conflict of Interest

There are no conflicts of interest.

## Data Availability Statement

The data and information that support the findings of this article are freely available online.^{2} The data were provided by a third party listed in the Acknowledgment.