Data-driven remaining useful life prediction via multiple sensor signals and deep long short-term memory neural network
J. Wu, K. Hu, Y. Cheng et al.
ISA Transactions
journal homepage: www.elsevier.com/locate/isatrans
Research article
Highlights
• A new deep long short-term memory (DLSTM) model is constructed for accurate remaining useful life (RUL) prediction.
• The proposed DLSTM model fuses multi-sensor signals to enhance RUL prediction performance.
• The proposed method is well suited to multi-sensor scenarios, as validated on two multi-sensor turbofan engine experiments.
Article info

Article history: Received 4 April 2019; Received in revised form 26 June 2019; Accepted 2 July 2019; Available online xxxx

Keywords: Remaining useful life; Deep long short-term memory (DLSTM) neural networks; Deep learning; Sensor data fusion

Abstract

Remaining useful life (RUL) prediction is very important for improving the availability of a system and reducing its life cycle cost. This paper proposes a deep long short-term memory (DLSTM) network-based RUL prediction method using multiple sensor time series signals. The DLSTM model fuses multi-sensor monitoring signals for accurate RUL prediction and is able to discover the hidden long-term dependencies among sensor time series signals through its deep structure. Using a grid search strategy, the network structure and parameters of the DLSTM are efficiently tuned with an adaptive moment estimation algorithm so as to realize accurate and robust prediction. Two different turbofan engine datasets are adopted to verify the performance of the DLSTM model. The experimental results demonstrate that the DLSTM model is competitive with the state of the art reported in the literature and with other neural network models.

© 2019 ISA. Published by Elsevier Ltd. All rights reserved.
https://doi.org/10.1016/j.isatra.2019.07.004
but also shares with other neurons. For the DLSTM model, the number of LSTM layers and the number of neurons in each LSTM layer are vital to model performance. Hence, these two important parameters are optimized by a grid search strategy in this paper, which is detailed in Section 2.3.
A fully connected dense layer is adopted as the output layer; the outputs of the last LSTM layer are fed into it, and the multi-sensor data are eventually fused into RUL values. The mean squared error function, a commonly used loss function in machine learning, is adopted to minimize the error between the predicted RULs and the RUL labels. During the testing stage, online sensor data are sequentially fed into the trained DLSTM and the predicted RULs are obtained.
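No implementation is given in the paper, but the structure described above (stacked LSTM layers feeding a fully connected output layer, mean squared error loss, and the adaptive moment estimation optimizer) can be sketched in a Keras-style way as follows. The function name, its default layer and neuron counts, and the dropout rate are illustrative placeholders, not the authors' final settings, which are tuned by the grid search of Section 2.3.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

def build_dlstm(n_timesteps, n_features, n_layers=3, n_neurons=100, dropout_rate=0.2):
    """Stacked (deep) LSTM regressor mapping a multi-sensor window to one RUL value."""
    model = Sequential()
    # First LSTM layer defines the input shape; it returns sequences only if more LSTM layers follow.
    model.add(LSTM(n_neurons, return_sequences=(n_layers > 1),
                   input_shape=(n_timesteps, n_features)))
    model.add(Dropout(dropout_rate))  # dropout is active only during training
    for i in range(1, n_layers):
        model.add(LSTM(n_neurons, return_sequences=(i < n_layers - 1)))
        model.add(Dropout(dropout_rate))
    model.add(Dense(1))  # fully connected output layer fusing the features into an RUL value
    model.compile(loss="mse", optimizer="adam")  # MSE loss, adaptive moment estimation (Adam)
    return model
```

During testing, dropout is automatically disabled by the framework, and sliding windows of online sensor data can be passed to model.predict to obtain the RUL estimates.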
model is shown in Fig. 3. In this schematic, red circles are the neurons in the hidden layer that are temporarily discarded from the network with a certain probability during the training process of the DLSTM. Since the discards occur randomly, a different network is trained in each mini-batch. Hence, dropout can effectively alleviate the overfitting problem of the DLSTM. It should be pointed out that dropout only works during the training process and is disabled during the testing process, which means that all hidden neurons are active during testing. In this paper, all candidate dropout values are tested, and the optimal value is finally applied to the DLSTM.

3. Experiment analysis

3.1. Dataset description

The datasets used for evaluating the proposed method are the NASA turbofan engine datasets, which are produced by the C-MAPSS platform [35]. Different operational settings, including fuel velocity and pressure, are varied as inputs to simulate different faults and degradation processes in the turbofan engines. During the experiments, the turbofan engines begin to run in good condition, and faults develop that generate degradation until a failure happens.

The C-MAPSS platform offers four datasets, FD001–FD004, each of which includes a training set and a testing set. The training sets hold the signals of the whole lifetime, while the testing sets only contain multiple sensor data terminated at some time before engine failure, for which the RUL needs to be predicted. Both the training sets and the testing sets consist of a series of cycles, and each cycle contains 26 columns which respectively indicate the engine ID, the cycle index, three operational settings and 21 sensor measurements.

Two of the datasets, FD001 and FD003, are adopted in this paper since their engines have an obvious and clear health degradation process. FD001 has only one failure mode while FD003 has two failure modes. In addition, both FD001 and FD003 contain 100 training engines and 100 testing engines.

Both the Score and R are used to assess the difference between the forecasted RULs and the actual RULs, and a small Score or R value represents a good predictive effect. However, there is a subtle difference between the two indicators: as shown in Fig. 5, the Score penalizes late predictions more than early predictions. The RUL error range indicator represents the margin of error over all RUL prediction values; a smaller RUL error range reflects a higher effectiveness and stability of the prediction method.

3.3. Multiple sensor data preprocessing

3.3.1. Sensor data smoothing

The multiple sensor data obtained from the turbofan engines present large random fluctuations and noise jamming, which might affect the performance of RUL prediction. The exponential smoothing algorithm is adopted to remove the noise and weaken the random fluctuation in the sensor data, which is expressed as

\[
x'_t =
\begin{cases}
\alpha x_t + (1-\alpha)\, x'_{t-1}, & \text{if } t \ge 2 \\
x_1, & \text{if } t = 1
\end{cases}
\tag{16}
\]

where $x_t$ is the actual sensor measurement at time $t$, $x'_t$ represents the smoothed value at time $t$, $x'_{t-1}$ holds the smoothed value at time $t-1$, and $\alpha$ indicates the smoothing coefficient.

The value of α directly decides the smoothing effect of the engine sensor data, which indirectly affects the accuracy of RUL prediction. Fig. 6 shows the preprocessed data of sensor 2 in FD001 with different α compared with the raw sensor data. Three different α values are used for the sensor data smoothing: 0.25, 0.5 and 0.75. As illustrated in Fig. 6, the fluctuations of the smoothed sensor data are reduced compared with the raw data, while the smoothed data still reflect the trend of the raw data. In addition, it is found through a series of comparative experiments that the sensor data preprocessed with α = 0.25 have smaller fluctuations, which means the smoothing effect is better. So, α is set to 0.25 in this experiment.
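As a concrete illustration of Eq. (16), a minimal sketch of the smoothing step applied to one sensor channel is given below; the function name and the NumPy-based implementation are assumptions for illustration, not the authors' code.

```python
import numpy as np

def exponential_smoothing(x, alpha=0.25):
    """Exponentially smooth a 1-D sensor series per Eq. (16):
    x'_1 = x_1 and x'_t = alpha * x_t + (1 - alpha) * x'_{t-1} for t >= 2."""
    x = np.asarray(x, dtype=float)
    smoothed = np.empty_like(x)
    smoothed[0] = x[0]
    for t in range(1, len(x)):
        smoothed[t] = alpha * x[t] + (1.0 - alpha) * smoothed[t - 1]
    return smoothed

# Example: smooth sensor 2 of one engine with the alpha = 0.25 chosen above
# smoothed_s2 = exponential_smoothing(engine_df["sensor_2"].values, alpha=0.25)
```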
Fig. 7. (a) Sorted CSC values of sensors in FD001; (b) Sorted CSC values of sensors in FD003.
3.3.3. RUL label

Note that the label value has a significant impact on the prediction performance. Several papers have shown that piecewise linear labeling of C-MAPSS engine degradation data is effective and beneficial [20,38]. In other words, the RUL label is assumed to be constant in the initial period and to degrade linearly afterwards. Following the literature [20,38], sample points in the early stage are labeled with a constant RUL value, which is set to 125 in this paper.

3.4. Case one

Dataset FD001 is analyzed first to validate the proposed method. After the preprocessing of the multiple sensor data is completed, the selected sensor data of the 100 engines are used to build the training datasets, in which 90 engines are randomly chosen for training the DLSTM model and the remaining 10 engines are used to verify the model's effectiveness.

Next, considering the characteristics of the monitoring signals in the datasets, the proposed DLSTM model for RUL prediction is constructed and its parameters are determined optimally. The model structure is explored by the grid search method. A two-dimensional grid is formed by the LSTM layer number and the neuron number in each layer, and each node in the grid is verified as a candidate parameter combination. Considering the time constraints and computational complexity, the LSTM layer number is set from 1 to 6 and the neuron number in each LSTM layer is set from 50 to 300. In this grid, each two-parameter combination is used to construct a new DLSTM. The 10 validation engines are adopted to verify each model structure, and RMSE is utilized for comparing the training results. Fig. 8 shows the training results of the DLSTM with different layer numbers and neuron numbers.
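To make the piecewise linear labeling of Section 3.3.3 and the 90/10 engine split described above concrete, a small sketch follows; the DataFrame column names (engine_id, cycle) and the helper functions are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd

MAX_RUL = 125  # constant RUL label used for early-stage samples in this paper

def piecewise_rul_labels(df):
    """Add a piecewise linear RUL label per engine: a linear count-down to failure,
    clipped at MAX_RUL so the early stage keeps a constant label."""
    def label_one(group):
        rul = group["cycle"].max() - group["cycle"]  # cycles remaining until failure
        return group.assign(RUL=rul.clip(upper=MAX_RUL))
    return df.groupby("engine_id", group_keys=False).apply(label_one)

def split_engines(df, n_val=10, seed=0):
    """Randomly hold out n_val engines for validation; the rest are used for training."""
    rng = np.random.default_rng(seed)
    val_ids = rng.choice(df["engine_id"].unique(), size=n_val, replace=False)
    return df[~df["engine_id"].isin(val_ids)], df[df["engine_id"].isin(val_ids)]
```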
Fig. 8. Training of DLSTM with different layer numbers and neuron numbers.
Table 1
Training results of the DLSTM for a subset of parameter combinations.

No.  LSTM layer number  Neurons per layer  RMSE   Training time (s)
1    2                  150                20.40  34804.87
2    3                  200                20.35  69720.27
3    3                  250                18.56  98679.69
4    4                  250                20.52  153776.33
5    5                  100                18.43  56994.93
6    5                  300                18.70  260307.16
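The grid search summarized in Fig. 8 and Table 1 might be organized as a simple double loop over candidate structures, reusing the build_dlstm sketch above. The neuron step size, epoch count and batch size below are assumptions, since they are not stated in this excerpt.

```python
import numpy as np

def grid_search_dlstm(x_train, y_train, x_val, y_val, epochs=50, batch_size=64):
    """Evaluate every (LSTM layer number, neurons per layer) pair on the grid and
    keep the structure with the lowest validation RMSE (the indicator in Table 1)."""
    best = {"rmse": np.inf, "layers": None, "neurons": None}
    n_timesteps, n_features = x_train.shape[1], x_train.shape[2]
    for n_layers in range(1, 7):                 # 1 to 6 LSTM layers
        for n_neurons in range(50, 301, 50):     # 50 to 300 neurons per layer (assumed step of 50)
            model = build_dlstm(n_timesteps, n_features, n_layers, n_neurons)
            model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size, verbose=0)
            pred = model.predict(x_val, verbose=0).ravel()
            rmse = float(np.sqrt(np.mean((pred - np.ravel(y_val)) ** 2)))
            if rmse < best["rmse"]:
                best = {"rmse": rmse, "layers": n_layers, "neurons": n_neurons}
    return best
```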
Fig. 12. Real RUL versus predicted RUL for 100 engines.
Fig. 13. The boxplots of predicted RUL error for five models.
Table 5
Performance comparisons of five models on the dataset FD003.

Methods  Score  R      RUL error range  Training time (s)  Online average calculation time (s)
DRNN     1358   26.12  [−73, 44]        81503.25           0.11
DGRU     1105   20.86  [−48, 45]        296997.23          0.15
BDLSTM   980    19.48  [−54, 67]        234484.16          0.28
BDGRU    967    19.94  [−52, 67]        449436.09          0.36
DLSTM    828    19.78  [−44, 38]        346454.71          0.18
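The Score and R definitions themselves are not reproduced in this excerpt. The sketch below therefore assumes the standard C-MAPSS asymmetric exponential scoring function (which penalizes late predictions more heavily, consistent with the description of Fig. 5) and that R denotes the RMSE of the prediction error; both should be read as assumptions rather than the authors' exact formulas.

```python
import numpy as np

def evaluation_indicators(rul_true, rul_pred):
    """Return (Score, R, RUL error range) for one set of test engines."""
    d = np.asarray(rul_pred, dtype=float) - np.asarray(rul_true, dtype=float)  # d > 0: late prediction
    # Asymmetric exponential score (standard C-MAPSS form, assumed here):
    score = float(np.sum(np.where(d < 0, np.exp(-d / 13.0) - 1.0, np.exp(d / 10.0) - 1.0)))
    r = float(np.sqrt(np.mean(d ** 2)))             # assumed to be RMSE
    error_range = (float(d.min()), float(d.max()))  # e.g. [−44, 38] for DLSTM on FD003 in Table 5
    return score, r, error_range
```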
on dataset FD001. Meanwhile, a variety of RNN models are compared with the DLSTM, and the prediction results on dataset FD003 show that the DLSTM model is superior to the other RNN models.

In addition, the proposed method has the ability to fuse multi-sensor data and can be widely applied to different types of equipment in the industrial field. Hence, the application of the proposed method to other multi-sensor monitored equipment will be carried out in the future.

Acknowledgments

This research is funded in part by the National Natural Science Foundation of China under Grant Nos. 51875225 and 51605095, in part by the National Key Research and Development Program of China under Grant No. 2018YFB1702302, and in part by the Key Research and Development Program of Guangdong Province, China under Grant No. 2019B090916001.

Declaration of competing interest

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

[1] Lei Y, Li N, Guo L, Li N, Yan T, Lin J. Machinery health prognostics: a systematic review from data acquisition to RUL prediction. Mech Syst Signal Process 2018;104:799–834.
[2] Wu J, Wu CY, Cao S, Or SW, Deng C, Shao XY. Degradation data-driven time-to-failure prognostics approach for rolling element bearings in electrical machines. IEEE Trans Ind Electron 2019;66(1):529–39.
[3] Wu J, Wu C, Lv Y, Deng C, Shao X. Design a degradation condition monitoring system scheme for rolling bearing using EMD and PCA. Ind Manage Data Syst 2017;117:713–28.
[4] Li L-L, Zhang X-B, Tseng M-L, Zhou Y-T. Optimal scale gaussian process regression model in insulated gate bipolar transistor remaining life prediction. Appl Soft Comput 2019;78:261–73.
[5] Wang YH, Deng C, Wu J, Wang YC, Xiong Y. A corrective maintenance scheme for engineering equipment. Eng Fail Anal 2014;36:269–83.
[6] Liao CL, Köttig F. A hybrid framework combining data-driven and model-based methods for system remaining useful life prediction. Appl Soft Comput 2016;44:191–9.
[7] Ramasso E, Rombaut M, Zerhouni N. Joint prediction of continuous and discrete states in time-series based on belief functions. IEEE Trans Cybern 2013;43:37–50.
[8] Liu J, Zio E. Prediction of peak values in time series data for prognostics of critical components in nuclear power plants. IFAC-PapersOnLine 2016;49(28):174–8.
[9] Ompusunggu AP, Papy JM, Vandenplas S. Kalman-filtering-based prognostics for automatic transmission clutches. IEEE/ASME Trans Mech 2016;21:419–30.
[10] Wu JR, Xu JX, Huang XL. An indirect prediction method of remaining life based on glowworm swarm optimization and extreme learning machine for lithium battery. In: Proceedings of the 2017 36th Chinese control conference (CCC). Dalian (China); 2017. p. 7259–64.
[11] Khelif R, Chebel-Morello B, Malinowski S, Laajili E. Direct remaining useful life estimation based on support vector regression. IEEE Trans Ind Electron 2017;64(3):2276–84.
[12] Morando S, Jemei S, Gouriveau R, Zerhouni N, Hissel D. Fuel cells remaining useful lifetime forecasting using echo state network. In: Proceedings of the 2014 IEEE vehicle power and propulsion conference. Coimbra (Portugal); 2014. p. 1–6.
[13] Li X, Ding Q, Sun JQ. Remaining useful life estimation in prognostics using deep convolution neural networks. Reliab Eng Syst Saf 2018;172:1–11.
[14] Liu H, Zhou J, Zheng Y, Jiang W, Zhang Y. Fault diagnosis of rolling bearings with recurrent neural network-based autoencoders. ISA Trans 2018;77:167–78.
[15] Malhi A, Gao RX. Recurrent neural networks for long-term prediction in machine condition monitoring. In: Proceedings of the 21st IEEE instrumentation and measurement technology conference. Como (Italy); 2004. p. 2048–53.
[16] Chandra R. Competition and collaboration in cooperative coevolution of Elman recurrent neural networks for time-series prediction. IEEE Trans Neural Netw Learn Syst 2015;26(12):3123–36.
[17] Lukoševičius M, Jaeger H. Reservoir computing approaches to recurrent neural network training. Comput Sci Rev 2009;3(3):127–49.
[18] Liu B, Cheng J, Cai K, Shi P, Tang X. Singular point probability improve LSTM network performance for long-term traffic flow prediction. In: National conference of theoretical computer science (NCTCS 2017): Theoretical Computer Science. Wuhan (China); 2017. p. 328–40.
[19] Zhao R, Wang DZ, Yan RQ, Mao KZ, Shen F, Wang JJ. Machine health monitoring using local feature-based gated recurrent unit networks. IEEE Trans Ind Electron 2018;65(2):1539–48.
[20] Wu YT, Yuan M, Dong SP, Lin L, Liu YQ. Remaining useful life estimation of engineered systems using vanilla LSTM neural networks. Neurocomputing 2018;275:167–79.
[21] Guo L, Li N, Jia F, Lei Y, Lin J. A recurrent neural network based health indicator for remaining useful life prediction of bearings. Neurocomputing 2017;240:98–109.
[22] Cheng YW, Zhu HP, Wu J, Shao XY. Machine health monitoring using adaptive kernel spectral clustering and deep long short-term memory recurrent neural networks. IEEE Trans Ind Inform 2019;15(2):987–97.
[23] Elsheikh A, Yacout S, Ouali MS. Bidirectional handshaking LSTM for remaining useful life prediction. Neurocomputing 2019;323:148–56.
[24] Xia M, Li T, Xu L, Liu LZ, Silva CW. Fault diagnosis for rotating machinery using multiple sensors and convolutional neural networks. IEEE/ASME Trans Mech 2018;23:101–10.
[25] Al-Sharman MK, Emran BJ, Jaradat MA, Najjaran H, Al-Husari R, Zweiri Y. Precision landing using an adaptive fuzzy multi-sensor data fusion architecture. Appl Soft Comput 2018;69:149–64.
[26] Wang W, Hong G, Wong Y, Zhu K. Sensor fusion for online tool condition monitoring in milling. Int J Prod Res 2007;45(21):5095–116.
[27] Wu J, Su Y, Cheng Y, Shao X, Deng C, Liu C. Multi-sensor information fusion for remaining useful life prediction of machining tools by adaptive network based fuzzy inference system. Appl Soft Comput 2018;68:12–23.
[28] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997;9:1735–80.
[29] Qiu X, Ren Y, Suganthan PN, Amaratunga GAJ. Empirical mode decomposition based ensemble deep learning for load demand time series forecasting. Appl Soft Comput 2017;54:246–55.
[30] Wang JL, Zhang J, Wang XX. Bilateral LSTM: a two-dimensional long short-term memory model with multiply memory units for short-term cycle time forecasting in re-entrant manufacturing systems. IEEE Trans Ind Inf 2018;14(2):748–58.
[31] Liu B, Wang L, Liu M, Xu C. Lifelong federated reinforcement learning: a learning architecture for navigation in cloud robotic systems. arXiv preprint arXiv:1901.06455.
[32] Almalaq A, Zhang JJ. Evolutionary deep learning-based energy consumption prediction for buildings. IEEE Access 2019;7:1520–31.
[33] Hu YL, Chen L. A nonlinear hybrid wind speed forecasting model using LSTM network, hysteretic ELM and differential evolution algorithm. Energ Convers Manage 2018;173:123–42.
[34] Wielgosz M, Skoczeń A, Mertik M. Using LSTM recurrent neural networks for monitoring the LHC superconducting magnets. Nucl Instrum Methods A 2017;867:40–50.
[35] Frederick DK, DeCastro JA, Litt JS. User's guide for the commercial modular aero-propulsion system simulation (C-MAPSS). Technical Manual TM2007-215026, Cleveland (USA): NASA/ARL; 2007.
[36] Saxena A, Goebel K, Simon D, Eklund N. Damage propagation modeling for aircraft engine run-to-failure simulation. In: Proceedings of the 2008 international conference on prognostics and health management. Denver (USA); 2008. p. 1–9.
[37] Elbouchikhi E, Choqueuse V, Amirat Y, Benbouzid MEH, Turri S. An efficient Hilbert-Huang transform-based bearing faults detection in induction machines. IEEE Trans Energy Convers 2018;32(2):401–13.
[38] Li X, Ding Q, Sun JQ. Remaining useful life estimation in prognostics using deep convolution neural networks. Reliab Eng Syst Saf 2018;172:1–11.
[39] Babu GS, Zhao P, Li XL. Deep convolutional neural network based regression approach for estimation of remaining useful life. In: International conference on database systems for advanced applications. Dallas (USA); 2016. p. 214–28.
[40] Javed K, Gouriveau R, Zerhouni N. A new multivariate approach for prognostics based on extreme learning machine and fuzzy clustering. IEEE Trans Cybern 2015;45(12):2626–39.
[41] Louen C, Ding SX, Kandler C. A new framework for remaining useful life estimation using support vector machine classifier. In: Proceedings of the 2013 conference on control and fault-tolerant systems (SysTol). Nice (France); 2013. p. 228–33.
[42] Peng Y, Wang H, Wang JM, Liu DT, Peng XY. A modified echo state network based remaining useful life estimation approach. In: Proceedings of the 2012 IEEE conference on prognostics and health management. Denver (USA); 2012. p. 1–7.
[43] Wang T, Yu J, Siegel D, Lee J. A similarity-based prognostics approach for remaining useful life estimation of engineered systems. In: Proceedings of the 2008 IEEE conference on prognostics and health management. 2008. p. 1–6.
[44] Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw 2015;61:85–117.
[45] Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y. Learning phrase representations using RNN encoder–decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
[46] Zhao R, Wang D, Yan R, Mao K, Shen F, Wang J. Machine health monitoring using local feature-based gated recurrent unit networks. IEEE Trans Ind Electron 2018;65(2):1539–48.
[47] Zhang Z, Pinto J, Plahl C, Schuller B, Willett D. Channel mapping using bidirectional long short-term memory for dereverberation in hands-free voice controlled devices. IEEE Trans Consum Electr 2014;60(3):525–33.