Rolling Bearing Fault Diagnosis Based On Convolutional Neural Network and Support Vector Machine
August 6, 2020.
10.1109/ACCESS.2020.3012053
ABSTRACT Rolling bearings are one of the essential components in rotating machinery. Efficient bearing
fault diagnosis is necessary to ensure the regular operation of the mechanical system. Traditional fault
diagnosis methods usually rely on a complex artificial feature extraction process, which requires a lot of
human expertise. Emerging deep learning methods can effectively reduce the dependence of the feature
extraction process on manual intervention. However, their training requires a large number of fault signals,
which are difficult to obtain in actual engineering. In this paper, a rolling bearing fault diagnosis method based
on Convolutional Neural Network and Support Vector Machine is proposed to solve the above problems.
Firstly, the Continuous Wavelet Transform is used to convert one-dimensional original vibration signals
into two-dimensional time-frequency images. Secondly, the obtained time-frequency images are input for
training the constructed model. Finally, the diagnosis of the fault location and severity is completed. The
method is verified on the CWRU data set and the MFPT data set. The results demonstrate that the proposed
method achieves higher diagnostic accuracy and stability than other advanced techniques.
INDEX TERMS Convolutional neural network, continuous wavelet transform, fault diagnosis, rolling
bearing, support vector machine.
I. INTRODUCTION
Since some industrial machines need to work continuously in harsh environments, failures of critical components such as bearings often occur. As rolling bearings are one of the basic elements of many industrial machines, their working state has a great influence on the operation of the entire equipment [1]. Therefore, research on fault diagnosis technology for rolling bearings is significant for the safety of the production process and the reduction of economic losses. With the development of machine learning, many typical intelligent methods have been successfully used in fault diagnosis research, mainly comprising two stages: signal feature extraction and fault classification [2].
The vibration signals of bearings usually contain sufficient fault information, but most of them are nonlinear and non-stationary. Therefore, signal feature extraction is a crucial step [3]. Time-frequency analysis is a powerful tool in signal processing, which can analyze the time domain and frequency domain as a whole. Commonly used time-frequency analysis methods include Empirical Mode Decomposition (EMD) [4], Short-Time Fourier Transform (STFT) [5], and Wavelet Transform (WT) [6]. EMD adaptively decomposes the signal into intrinsic mode function components with different scales. However, this process suffers from mode mixing. Although STFT can realize the time-frequency analysis of the signal, it cannot adequately reflect sudden changes of the vibration signal because its time resolution is fixed. WT is a time-frequency analysis technique whose time window shrinks as the frequency of the signal increases, and vice versa. WT extends STFT and effectively makes up for its shortcomings, so it is widely used. Yan et al. have summarized the application of Continuous Wavelet Transform (CWT), Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), and Second-Generation Wavelet Trans-
The associate editor coordinating the review of this manuscript and
approving it for publication was Yan-Jun Liu.
other hand, fault identification also plays a vital role in
L. Yuan et al.: Rolling Bearing Fault Diagnosis Based on Convolutional Neural Network and Support Vector Machine
fault diagnosis because it cannot meet the requirements of data batch processing only by fault extraction. Traditional fault recognition tools include Bayesian classifier [8], Artificial Neural Network (ANN) [9]–[10], and Support Vector Machine (SVM). Under the condition that there are enough training samples, the first two can distinguish the fault types effectively. However, a large number of available failure samples are hard to obtain in actual work. SVM can achieve effective classification with a small number of samples due to its strong generalization ability, good generality, and high classification accuracy. Therefore, SVM has been widely used in mechanical fault diagnosis research [11]–[14]. However, SVM performs poorly on redundant data because its shallow structure has some difficulties in learning in-depth features [15].
Recently, deep learning has become a vital research direction and has gradually been applied to various fields [16]–[19]. The deep learning model is composed of multi-layer neural networks, which can extract and learn the in-depth features of the input signal. Deep learning methods can handle the complex and high-dimensional problems in massive data that cannot be solved by shallow learning [20]. Due to their high efficiency, plasticity and universality, scholars have applied many deep learning models to the research of fault diagnosis, such as Long Short-Term Memory (LSTM) [21], Deep Belief Network (DBN) [22], Deep Auto-encoder (DAE) [23], Gated Recurrent Unit Network (GRUN) [24], and Convolutional Neural Network (CNN). Among them, CNN has more sophisticated applications in image processing, including image classification [25], target positioning [26], and face recognition [27]. It has received more attention in the study of rolling bearing fault diagnosis. In [28], a deep CNN structural model that can automatically classify rolling bearing faults is established. In [29], a method for rolling bearing fault diagnosis based on Cyclic Spectral Coherence (CSCoh) and CNN is proposed, which improves the fault recognition performance. To simplify the network architecture, Hierarchical Symbolic Analysis (HSA) and CNN are combined for bearing fault diagnosis in [30]. However, the above methods rely on large quantities of training samples, which are difficult to obtain in actual engineering. Moreover, the construction time of the deep network is relatively long.
Considering the advantages and disadvantages of all the above work, CNN and SVM are combined in this paper to build a deep neural network framework, CNN-SVM, to diagnose bearing faults. First, CWT is used for preliminary feature extraction. Then, the CNN-SVM network model is constructed using transfer learning. Finally, the preliminarily extracted features are used to train the network model to achieve bearing fault classification. This method makes full use of the excellent feature extraction capability of CNN and the exceptional classification performance of SVM, which addresses both the problem of in-depth signal feature extraction and the difficulty of obtaining massive samples in practical work.
This paper is organized as follows: Section 2 is dedicated to the theory of CWT, CNN, and SVM. In Section 3, the proposed CNN-SVM model is presented. In Section 4, the complete experimental procedure and analysis of the results are introduced. Finally, the conclusion is given in Section 5.

II. THEORETICAL BACKGROUND
In this paper, an intelligent diagnosis method for rolling bearing faults is proposed. Firstly, we convert the original vibration signals into time-frequency images using CWT. Then, CNN is applied to extract the in-depth features of the time-frequency images. Finally, the SVM classifier is trained using the extracted features. The fundamental theories of CWT, CNN, and SVM are introduced as follows.

A. CWT FOR TIME-FREQUENCY ANALYSIS
The CWT time-frequency analysis method performs multi-scale refinement on the signal through scaling and translation operations. Therefore, CWT can automatically adapt to the requirements of time-frequency signal analysis, clearly describing the change of signal frequency with time [7]. Here, CWT is used for preliminary feature extraction, converting the original 1-D time-domain signals into 2-D time-frequency images. The conversion process is shown in Fig. 1 [31].

1) CONTINUOUS WAVELET TRANSFORM
CWT is a method to obtain characteristic signal information, which can be used for the processing and analysis of nonlinear signals. Its algorithm is relatively mature, and the basic definition can be expressed as [32]:

$$W_\phi(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} x(t)\, \phi^{*}\!\left(\frac{t - b}{a}\right) dt, \quad a > 0 \tag{1}$$

where $a$ represents the scale parameter, $b$ represents the time or translation parameter, $x(t)$ represents the original one-dimensional data signal, $\phi$ represents the wavelet function with scale $a$ and position offset $b$, and $\phi^{*}$ is the complex conjugate of $\phi$.

An optimal wavelet basis function (WBF) is essential in any signal processing using CWT. For the same signal, analysis with different WBFs gives different results. The WBF should be selected based on two aspects: the general properties of the wavelet and the characteristics of the analyzed object. The general principles include orthogonality, tight support, symmetry, and smoothness. When CWT is used for pulse signal processing, the closer the shape of the WBF is to the signal waveform, the more features can be obtained. The similarity between the two is given by [33]:

$$\delta = \sum_{i=1}^{k} \alpha_i \frac{m_i^{2}}{s_i} \tag{2}$$

where $\delta$ represents the similarity coefficient, $m_i$ represents the maximum value of each peak after WBF is taken as the absolute value, $s_i$ represents the area covered by each peak after WBF is taken as the absolute value, $\alpha_i$ represents the weighted coefficient of each peak after WBF is taken as the absolute value, $\alpha_i = m_i / \max(m_i)$, and $k$ is the number of peaks after WBF is taken as the absolute value.

B. CONVOLUTIONAL NEURAL NETWORK
As a common method for extracting data features in deep learning models, CNN has made remarkable achievements in image recognition research [34]. The overall structure of CNN is shown in Fig. 2. Its internal hidden layer structure is mainly composed of convolution layers, pooling layers, and fully-connected layers.

where $x_j^{l}$ is the output of layer $l$, $x_i^{l-1}$ represents the output of layer $l-1$, that is, the input of layer $l$, $M_j$ is the feature set of layer $l-1$, $k_{ij}^{l}$ represents the weight matrix, $b_j^{l}$ represents the network bias, and $f(\cdot)$ represents the activation function.

In the pooling layer, the data is down-sampled by calculating the local average or maximum value, which reduces the network calculation complexity and retains the most critical features, thereby improving the efficiency of feature extraction. The calculation method can be expressed as:

$$x_j^{l} = f\!\left(\beta_j^{l}\, \mathrm{down}\!\left(x_j^{l-1}\right) + b_j^{l}\right) \tag{4}$$
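As a minimal sketch of the convolution and pooling layers described above, the NumPy code below computes one output feature map and then down-samples it as in Eq. (4). Several things here are assumptions for illustration and are not taken from the paper: the convolution layer is taken in its usual form $x_j^{l} = f\big(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\big)$, $f(\cdot)$ is ReLU in the convolution layer and the identity in the pooling layer, $\mathrm{down}(\cdot)$ uses non-overlapping 2 × 2 windows, and the names `conv_layer` and `pool_layer` are hypothetical.

```python
import numpy as np

def relu(z):
    """Assumed activation function f(.)."""
    return np.maximum(z, 0.0)

def conv_layer(inputs, kernels, bias, f=relu):
    """One output map x_j^l from input maps x_i^{l-1}.

    inputs:  array (C, H, W), the C input feature maps in M_j
    kernels: array (C, kh, kw), one kernel k_ij^l per input map
    bias:    scalar b_j^l
    Uses 'valid' cross-correlation, as most deep learning frameworks do.
    """
    _, h, w = inputs.shape
    _, kh, kw = kernels.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for r in range(out.shape[0]):
        for s in range(out.shape[1]):
            patch = inputs[:, r:r + kh, s:s + kw]
            out[r, s] = np.sum(patch * kernels)  # sum over i in M_j and window
    return f(out + bias)

def pool_layer(x, size=2, mode="max", beta=1.0, bias=0.0, f=lambda z: z):
    """Eq. (4): f(beta * down(x) + b) with non-overlapping size x size windows."""
    h2, w2 = x.shape[0] // size, x.shape[1] // size
    blocks = x[:h2 * size, :w2 * size].reshape(h2, size, w2, size)
    if mode == "max":
        down = blocks.max(axis=(1, 3))
    else:
        down = blocks.mean(axis=(1, 3))
    return f(beta * down + bias)

if __name__ == "__main__":
    x = np.arange(16.0).reshape(4, 4)
    print(pool_layer(x, mode="max"))   # [[ 5.  7.] [13. 15.]]
    print(pool_layer(x, mode="mean"))  # [[ 2.5  4.5] [10.5 12.5]]
```

Max pooling keeps the strongest response in each window, while average pooling smooths it; both halve each spatial dimension here, which is where the reduction in computation comes from.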
In this study, the signals collected at a sampling frequency of 12 kHz under 12 health conditions of the drive end bearing
are selected. In this way, different fault locations and damage severities of bearings can be better simulated.
FIGURE 9. Time-frequency images of twelve health conditions on CWRU dataset: (a) Class 1; (b) Class 2; (c) Class 3; (d) Class 4; (e) Class 5; (f) Class 6;
(g) Class 7; (h) Class 8; (i) Class 9; (j) Class 10; (k) Class 11; (l) Class 12.
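Time-frequency images of this kind are obtained by evaluating the CWT of Eq. (1) over a grid of scales and translations and plotting the coefficient magnitude. The following is a minimal sketch of that step, not the authors' implementation: it discretizes the integral in Eq. (1) as a Riemann sum, and it assumes a complex Morlet wavelet with centre frequency parameter `w0 = 6` as the wavelet basis function. The function names `morlet`, `cwt`, and `scalogram` are hypothetical.

```python
import numpy as np

def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet (an assumed choice of WBF)."""
    return np.pi ** -0.25 * np.exp(1j * w0 * t) * np.exp(-0.5 * t ** 2)

def cwt(x, scales, fs, wavelet=morlet):
    """Riemann-sum approximation of Eq. (1).

    W(a, b) ~ (dt / sqrt(a)) * sum_t x(t) * conj(phi((t - b) / a)),
    with the translations b taken on the sampling grid.
    Returns a complex array of shape (len(scales), len(x)).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    dt = 1.0 / fs
    t = np.arange(n) * dt
    coeffs = np.empty((len(scales), n), dtype=complex)
    for i, a in enumerate(scales):
        # arg[k, m] = (t_m - b_k) / a
        arg = (t[None, :] - t[:, None]) / a
        coeffs[i] = (np.conj(wavelet(arg)) @ x) * dt / np.sqrt(a)
    return coeffs

def scalogram(x, freqs, fs, w0=6.0):
    """Magnitude image |W(a, b)|, rows indexed by approximate frequency.

    Uses the approximate Morlet scale-frequency relation a = w0 / (2*pi*f).
    """
    scales = w0 / (2.0 * np.pi * np.asarray(freqs, dtype=float))
    return np.abs(cwt(x, scales, fs, lambda t: morlet(t, w0)))

if __name__ == "__main__":
    fs = 1000.0                      # illustrative sampling rate
    t = np.arange(512) / fs
    x = np.sin(2 * np.pi * 50.0 * t)  # synthetic 50 Hz test tone
    freqs = np.arange(10, 201, 10)
    image = scalogram(x, freqs, fs)
    print(image.shape)               # (20, 512)
```

The direct summation costs O(N²) per scale, so it is only practical for short segments; an FFT-based convolution or a dedicated wavelet library would be the usual choice for full-length vibration records.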
In this experiment, all the data points in the three fault data sets are used, and 48828 data points are taken as the sample
length. To obtain more experimental samples, adjacent samples overlap by 24414 data points. The baseline
set is down-sampled to 48828 Hz to match the other fault sets.
The three files in the baseline set are each divided into 22 segments, the seven inner fault files are divided into five
FIGURE 10. Classification results on CWRU dataset.
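The overlapping segmentation described above (a sample length of 48828 points with 24414 points shared between adjacent samples) can be sketched as follows. This is an illustrative sketch only: the function name `segment_signal` is hypothetical, and the record length used in the example (146484 points, i.e. 3 s at 48828 Hz) is an assumption chosen for illustration rather than a value stated in the text.

```python
import numpy as np

def segment_signal(signal, length=48828, overlap=24414):
    """Split a 1-D record into fixed-length segments that overlap.

    Adjacent segments share `overlap` points, so the hop between
    segment starts is `length - overlap`. Trailing points that do not
    fill a whole segment are discarded.
    """
    signal = np.asarray(signal)
    hop = length - overlap
    if hop <= 0:
        raise ValueError("overlap must be smaller than length")
    if signal.size < length:
        return np.empty((0, length), dtype=signal.dtype)
    n_segments = (signal.size - length) // hop + 1
    return np.stack([signal[k * hop:k * hop + length] for k in range(n_segments)])

if __name__ == "__main__":
    record = np.arange(146484, dtype=float)  # assumed record length
    segments = segment_signal(record)
    print(segments.shape)  # (5, 48828)
```

With a 50% overlap, a record of N points yields (N − 48828) // 24414 + 1 samples, roughly twice as many as non-overlapping windows would give.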
FIGURE 11. Time-frequency images of three health conditions on MFPT dataset: (a) Class 1; (b) Class 2; (c) Class 3.
TABLE 7. Accuracy and average training time of the three models.
TABLE 8. Average accuracy of the proposed method and several other
As can be seen, the diagnosis accuracy of the proposed method in 5 trials is higher than that of the other two standard models, and the training time is considerably shortened compared to the standard CNN model. The results show that SVM is better than the default classifier, i.e., Softmax. This indicates that the proposed model can effectively address both the difficulty SVM has in extracting in-depth features from miscellaneous data and the need of CNN for massive training samples, and that it has excellent stability.
In recent years, there have been many works on deep learning-based fault diagnosis methods. Thus, to further illustrate the innovation of the proposed method in the field of bearing fault diagnosis, we compare it with existing advanced techniques such as PNN-SFAM [9], BPNN [39], CNN-HMM [39], DAFD [40], and DGNN [41]. The average diagnosis accuracy of each method is listed in Table 8.
The details of the different methods are listed below. In [9], the PNN is introduced to classify rolling bearing states into healthy and not healthy. If the PNN decision shows that the processed state is not healthy, then the SFAM is used to classify 7 types of faults. In [39], for the BPNN method, the hidden layer has 50 units, the learning rate is 0.2, the momentum is 0.05, and the iteration number is 800. 12 types of health conditions are classified by using the learned features of the experimental data as inputs. For the CNN-HMM method, a 50 × 50 matrix of each sample is constructed, the learning rate is 1, and the maximum iteration is 100. 12 health conditions are classified by using
under a motor load of 0hp, and then complete classification for the 10 health conditions under the working situation with a motor load of 1, 2, and 3hp.
The architecture and parameters of ResNet-18, which is selected by the proposed method for feature extraction, are given in Table 1. The other hyperparameters are selected as follows. The mini-batch size is 256. The learning rate starts from 0.1 and is divided by 10 when the error reaches a plateau, the models are trained for up to $60 \times 10^{4}$ iterations, the weight decay is 0.0001, and the momentum equals 0.9 [36]. The accuracies of the 5 contrastive methods are 97.82%, 77.86%, 98.125%, 94.73%, and 97.81%, respectively.
By comparing the results with other methods, it can be easily seen that the proposed method achieves a higher diagnosis accuracy, which further shows the effectiveness of the proposed method.

In this study, a new deep neural network model CNN-SVM is built for fault diagnosis of rolling bearings. First, we use the CWT to construct the time-frequency images of vibration signals. Then, the obtained images are input to the proposed model for training. Finally, the diagnosis of fault location and severity of the rolling bearing is completed. The experiments indicate that the diagnostic accuracy of this method can reach 98.75% for the CWRU dataset and 98.89% for the MFPT dataset, which verifies the flexibility and practicability of the constructed model. Comparison with the standard CNN and standard SVM shows that the proposed model is able to resolve the difficulty of deep feature extraction in the traditional method and the small sample problem in actual engineering, and that it has excellent stability. Comparison with advanced methods such as PNN-SFAM, BPNN, CNN-HMM, DAFD, and DGNN further verifies the effectiveness of the proposed method.
However, for relatively noisy data sources, the accuracy of the proposed method still needs to be enhanced. Therefore, the structure of the proposed model needs further improvement in the future. Moreover, in this paper, only single-fault bearing vibration signals are used for model training, and no compound fault samples are created to simulate the actual situation, which poses a challenge for the application of the proposed model in practical engineering. This has also become a direction for our future research.
[19] C. Zhang, H. Zhang, J. Qiao, D. Yuan, and M. Zhang, ‘‘Deep transfer
