An Intrusion Detection Model Based On A Convolutio
Abstract
Machine-learning (ML) techniques have been actively applied to information security in recent years. Traditional rule-based security solutions are vulnerable to advanced attacks because of unpredictable behaviors and unknown vulnerabilities. By employing ML techniques, we are able to develop intrusion detection systems (IDS) based on anomaly detection instead of misuse detection. Moreover, threshold issues in anomaly detection can also be resolved through machine learning. There are very few datasets for network intrusion detection compared to datasets for malicious code. KDD CUP 99 (KDD) is the most widely used dataset for the evaluation of IDS, and numerous studies on ML-based IDS have used KDD or its upgraded versions. In this work, we develop an IDS model using CSE-CIC-IDS 2018, a dataset containing the most up-to-date common network attacks. We employ deep-learning techniques and develop a convolutional neural network (CNN) model for CSE-CIC-IDS 2018. We then evaluate its performance by comparing it with a recurrent neural network (RNN) model. Our experimental results show that our CNN model outperforms the RNN model when applied to the CSE-CIC-IDS 2018 dataset. Furthermore, we suggest a way of improving the performance of our model.
Key Words: Intrusion detection, Deep learning, Convolutional neural network, Recurrent neural network.
An Intrusion Detection Model based on a Convolutional Neural Network
datasets. We first convert the CIC-2018 numerical data into images. We then develop a CNN-based intrusion detection model by organizing convolutional layers and max-pooling layers. Furthermore, we train the proposed model on the images and evaluate its performance by comparing the experimental results with those of a recurrent neural network (RNN) model. Lastly, we discuss a way of improving the performance. CNN and RNN are fundamental deep learning models for image data and time-series data, respectively. Inception [25] as well as ResNet [26] are based on CNN. Long Short-Term Memory (LSTM) [27] is an advanced model of RNN. By employing these fundamental models, we are able to identify the optimal analysis model for the characteristics of CIC-2018. Furthermore, we could improve the performance using those advanced models in the future.

The remainder of this paper is organized as follows. Section 2 briefly describes existing ML-based studies on intrusion detection as well as the DL algorithms we use in this work. In Section 3, we design our CNN-based intrusion detection model along with its features. We evaluate the proposed model and discuss a preprocessing issue for better performance in Section 4. Finally, the conclusion is in Section 5.

II. RELATED WORKS

KDD CUP 99 (KDD) was generated for IDS evaluation and includes four types of attacks: DoS, R2L, U2R, and probing. KDD consists of 41 features, including traffic features as well as basic and content features of each TCP connection. KDD has been widely used for data mining and ML studies on intrusion detection. Table 1 shows existing ML/DL-based studies on intrusion detection using KDD.

Table 1. ML/DL-based intrusion detection studies using KDD CUP 99 and NSL-KDD.
Dataset | KDD CUP 99, NSL-KDD
ML | SVM | O O
ML | ANN | O
ML | Decision Tree | O
DL | DNN | O
DL | CNN | O
DL | RNN | O O O
DL | LSTM | O O O
Preprocessing | | O O O O
Reference | | [17] [6] [8] [9] [18] [7] [19] [20] [15] [21]

Some studies employ ML techniques such as SVM, Decision Tree, and Artificial Neural Network (ANN) [6, 17]. Most of the DL-based studies use CNN, RNN, LSTM and Deep Neural Network (DNN) algorithms [7-9], [17-18]. Moreover, some studies focus on preprocessing techniques for KDD [19-20]. NSL-KDD was generated to resolve some issues in KDD, especially duplicated records and the lack of patterns for several attacks. Chuanlong [21] studies an intrusion detection model based on a Recurrent Neural Network (RNN) using NSL-KDD.

The Canadian Institute for Cybersecurity (CIC) generated IDS datasets in 2012, 2017 and 2018. In 2012, ISCX IDS 2012 (CIC-2012) [22] was generated by injecting four types of attacks: infiltration attacks from inside, HTTP DoS attacks, DDoS (distributed denial of service) attacks and brute force attacks. Tamim [23] detects attacks in CIC-2012 based on CNN. He generates input images by converting destination payloads and classifies the images into normal and attack, while we classify two or more attacks in CIC-2018 based on multi-class classification.

CIC-2017 [10] and CIC-2018 [11] are the most up-to-date datasets for IDS evaluation. CIC-2017 contains network traffic with the most common attack families, including brute force attacks, Heartbleed attacks, botnets, DDoS attacks and web attacks. Faker [13] studies intrusion detection using the CIC-2017 and UNSW-NB15 datasets. This study removes socket information to prevent model overfitting. To reduce the data size, they remove null values and unimportant traffic information. They also convert string values into numerical values and normalize the values. To handle missing or infinite data, they make two versions of the dataset: in the first, all missing and infinite data are replaced with the average value; in the second, all missing and infinite data are removed. They evaluate their model with both versions. DNN (Deep Neural Network), Random Forest, and Gradient Boosting Tree classifiers are used as training algorithms. X. Zhang [14] focuses on intrusion detection using Deep Forest. They preprocess the datasets based on the P-ZigZag encoding method and apply an inverse discrete cosine transform (IDCT) to the preprocessed datasets.

CIC-2018 contains more recent network traffic with and without attacks. CIC-2018 was generated by collecting network traffic and system logs for about 80 features. Qianru [15] analyzes the CIC-2018 dataset employing ML techniques. This study preprocesses the dataset by eliminating normal data and noise data, and then removes unnecessary values after the decimal point. With these preprocessing methods, the size of CIC-2018 decreased by 4MB.
Journal of Multimedia Information System VOL. 6, NO. 4, December 2019 (pp. 165-172): ISSN 2383-7632 (Online)
http://doi.org/10.33851/JMIS.2019.6.4.165
As ML techniques, they use Random Forest, Decision Tree, Gaussian Naïve Bayes classifier, Multi-Layer Perceptron (MLP), K-nearest neighbors classifier, and Quadratic discriminant analysis classifier. Table 2 and Table 3 show IDS studies using CIC-2017 and CIC-2018. We can hardly find DL-based IDS studies using CIC-2018. In this work, we suggest an IDS model employing DL techniques.

Table 2. Analysis of Intrusion Detection Studies using CIC-2017.
Dataset: CIC-IDS 2017 (CIC-2017)
Category | Item | [13] | [14]
ML | Random Forest | O | -
ML | GNB | - | -
ML | Decision Tree | - | -
ML | MLP | - | -
DL | DNN | O | -
DL | GBT | O | -
DL | XGBoost | - | O
DL | CNN | - | -
DL | RNN | - | -
Data pre-processing | | Remove socket data; Remove white space; Encode label; Normalize data; Replace or remove missing/infinite data; Remove normal traffic data | Convert the dataset into images; Padding; P-ZigZag Encoding
Evaluation | | Binary Classification - DNN; Binary Classification - GBT; Multiclass Classification - DNN; Multiclass Classification - GBT | P-ZigZag; OHE

Table 3. Analysis of Intrusion Detection Studies using CIC-2018.
Dataset: CSE-CIC-IDS 2018 (CIC-2018)
Category | Item | [15] | Our approach
ML | Random Forest | O | -
ML | GNB | O | -
ML | Decision Tree | O | -
ML | MLP | O | -
DL | DNN | - | -
DL | GBT | - | -
DL | XGBoost | - | -
DL | CNN | - | O
DL | RNN | - | O
Pre-processing | | Remove normal/noise data; Eliminate unnecessary value after decimal; Replace untreatable value | Remove null values and infinite values; Convert numerical data into images
Evaluation | | Classify each Zero-Day attack & benign data; Classify mixed Zero-Day attack & benign data | Multiclass Classification - CNN; Multiclass Classification - RNN

III. METHODS

3.1. Datasets and features

CSE-CIC-IDS2018 (CIC-2018) is a dataset containing network traffic and system logs. CIC-2018 consists of 10 sub-datasets collected on different days by injecting 16 types of attacks. This dataset was generated using CICFlowMeter-V3 [24] and contains about 80 types of features. These features describe the forward and backward directions of network flows and packets. The size of CIC-2018 is more than 400GB, which is larger than that of CIC-2017. We can develop a DL-based IDS model and evaluate its performance using CIC-2018.

Table 4. Type of injected attacks and amounts of sub-datasets.
Sub-dataset | Type of attack | Amount of samples | Total samples
SD-1 | Benign | 446,772 | 1,048,574
SD-1 | DoS-Hulk | 461,912 |
SD-1 | DoS-SlowHTTPTest | 139,890 |
SD-2 | Benign | 663,808 | 1,044,751
SD-2 | FTP-BruteForce | 193,354 |
SD-2 | SSH-Bruteforce | 187,589 |
SD-3 | Benign | 988,050 | 1,040,548
SD-3 | DoS-GoldenEye | 41,508 |
SD-3 | DoS-Slowloris | 10,990 |
SD-4 | Benign | 7,313,104 | 7,889,295
SD-4 | DDoS-LOIC-HTTP | 576,191 |
SD-5 | Benign | 360,833 | 1,048,575
SD-5 | DDOS-HOIC | 686,012 |
SD-5 | DDOS-LOIC-UDP | 1,730 |
SD-6 | Benign | 1,042,603 | 1,042,965
SD-6 | Brute Force-Web | 249 |
SD-6 | Brute Force-XSS | 79 |
SD-6 | SQL Injection | 34 |
SD-7 | Benign | 1,042,301 | 1,042,867
SD-7 | Brute Force-Web | 362 |
SD-7 | Brute Force-XSS | 151 |
SD-7 | SQL Injection | 53 |
SD-8 | Benign | 538,666 | 606,902
SD-8 | Infilteration | 68,236 |
SD-9 | Benign | 235,778 | 328,181
SD-9 | Infilteration | 92,403 |
SD-10 | Benign | 758,334 | 1,044,525
SD-10 | Bot | 286,191 |
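As a quick arithmetic check on Table 4, the following minimal sketch sums the per-class sample counts of each sub-dataset and computes how many benign samples there are for each sample of the rarest attack class. The counts are copied from Table 4; the names `TABLE_4`, `benign_to_smallest_attack_ratio`, and `summarize` are hypothetical, and the ratio is computed here only for illustration and is not a quantity reported in the table.

```python
# Sample counts copied from Table 4 (sub-dataset -> {label: number of samples}).
TABLE_4 = {
    "SD-1": {"Benign": 446_772, "DoS-Hulk": 461_912, "DoS-SlowHTTPTest": 139_890},
    "SD-2": {"Benign": 663_808, "FTP-BruteForce": 193_354, "SSH-Bruteforce": 187_589},
    "SD-3": {"Benign": 988_050, "DoS-GoldenEye": 41_508, "DoS-Slowloris": 10_990},
    "SD-4": {"Benign": 7_313_104, "DDoS-LOIC-HTTP": 576_191},
    "SD-5": {"Benign": 360_833, "DDOS-HOIC": 686_012, "DDOS-LOIC-UDP": 1_730},
    "SD-6": {"Benign": 1_042_603, "Brute Force-Web": 249,
             "Brute Force-XSS": 79, "SQL Injection": 34},
    "SD-7": {"Benign": 1_042_301, "Brute Force-Web": 362,
             "Brute Force-XSS": 151, "SQL Injection": 53},
    "SD-8": {"Benign": 538_666, "Infilteration": 68_236},
    "SD-9": {"Benign": 235_778, "Infilteration": 92_403},
    "SD-10": {"Benign": 758_334, "Bot": 286_191},
}


def benign_to_smallest_attack_ratio(counts):
    """Benign sample count divided by the count of the rarest attack class."""
    attack_counts = [n for label, n in counts.items() if label != "Benign"]
    return counts["Benign"] / min(attack_counts)


def summarize(table):
    """Map each sub-dataset name to (total samples, benign/rarest-attack ratio)."""
    return {
        name: (sum(counts.values()), benign_to_smallest_attack_ratio(counts))
        for name, counts in table.items()
    }


if __name__ == "__main__":
    for name, (total, ratio) in summarize(TABLE_4).items():
        print(f"{name}: total={total:,}  benign per rarest-attack sample={ratio:,.1f}")
```

The computed totals can be compared against the "Total samples" column of Table 4, and the ratio makes the difference in class balance between sub-datasets easy to see.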
3.2. Design of our CNN model

CNN is the most commonly used deep learning algorithm for image training. In order to develop a CNN-based intrusion detection model, the CIC-2018 dataset must be converted into images. We convert each labeled record into a 13x6 image because each record contains 78 features excluding the ‘Label’ feature. The ‘Label’ is used for image classification. A CNN model consists of convolutional layers, max-pooling layers, and a fully connected layer. We can find the optimal CNN model by organizing those layers along with modeling parameters such as the kernel size, the number of kernels, and the dropout ratio. Figure 1 shows our CNN model for CIC-2018.

In order to evaluate the performance of our model, we also train an RNN model on the dataset and compare the experimental results. We design the RNN model based on a ‘vanilla RNN’ with 10 units. Figure 2 shows the experimental results of the CNN and RNN models. In most sub-datasets, our CNN model has a higher accuracy than the RNN model. The accuracy is measured as follows:

$$\mathrm{Accuracy} = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}, \tag{1}$$

where $\mathrm{precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$ and $\mathrm{recall} = \frac{\text{true positives}}{\text{false negatives} + \text{true positives}}$. This measure is the harmonic mean of precision and recall, which is commonly called the F1 score.
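Equation (1) combines precision and recall through their harmonic mean, a quantity often called the F1 score. As a minimal sketch, the snippet below applies it to each class of a confusion matrix, which is one way per-attack scores could be derived. The function name `per_class_scores` and the matrix `EXAMPLE` are hypothetical; the numbers are made up for illustration and are not taken from the experiments.

```python
def per_class_scores(confusion):
    """Return a (precision, recall, score) tuple for every class.

    confusion[i][j] is the number of samples whose true class is i and whose
    predicted class is j. The score follows Equation (1):
    2 * precision * recall / (precision + recall).
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        # Samples of other classes predicted as k.
        fp = sum(confusion[i][k] for i in range(n) if i != k)
        # Samples of class k predicted as something else.
        fn = sum(confusion[k][j] for j in range(n) if j != k)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        score = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, score))
    return scores


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
EXAMPLE = [
    [990, 5, 5],
    [30, 60, 10],
    [40, 10, 50],
]

if __name__ == "__main__":
    labels = ["benign", "attack A", "attack B"]
    for label, (p, r, s) in zip(labels, per_class_scores(EXAMPLE)):
        print(f"{label}: precision={p:.3f} recall={r:.3f} score={s:.3f}")
```

In this made-up matrix the majority class scores far higher than the two minority classes, which illustrates why a per-class breakdown can be more informative than a single aggregate value.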
Especially, the accuracies of SD-2, SD-3, SD-5, and SD-9 with CNN are about 10% to 60% higher than those with RNN.

Fig. 2. Comparison of the accuracy of CNN and RNN model.

Although our experimental results show that our CNN model detects attacks in CIC-2018 with high accuracy, we still need to find a way of improving the accuracy of each attack. For instance, the accuracy of SD-3 is 0.9677 in Figure 2. According to the confusion matrix of SD-3, however, the accuracies of ‘DoS-GoldenEye’ and ‘DoS-Slowloris’ are 0.47 and 0.66 while the accuracy of ‘benign’ is 0.99, as shown in Table 7.

Table 7. Accuracy of each attack with a CNN model.
Sub-dataset | Type of attack | Accuracy
SD-1 | Benign | 1
SD-1 | DoS attacks-Hulk | 1
SD-1 | DoS attacks-SlowHTTPTest | 1
SD-2 | Benign | 0.93
SD-2 | FTP-BruteForce | 0.98
SD-2 | SSH-Bruteforce | 0.96
SD-3 | Benign | 0.99
SD-3 | DoS attacks-GoldenEye | 0.47
SD-3 | DoS attacks-Slowloris | 0.66
SD-4 | Benign | 1
SD-4 | DDoS attacks-LOIC-HTTP | 1
SD-5 | Benign | 1
SD-5 | DDOS attack-HOIC | 1
SD-5 | DDOS attack-LOIC-UDP | 1
SD-6 | Benign | 1
SD-6 | Brute Force-Web | 0.3
SD-6 | Brute Force-XSS | 0.65
SD-6 | SQL Injection | 0.08
SD-7 | Benign | 1
SD-7 | Brute Force-Web | 0
SD-7 | Brute Force-XSS | 0
SD-7 | SQL Injection | 0
SD-8 | Benign | 0.94
SD-8 | Infilteration | 0
SD-9 | Benign | 0.85
SD-9 | Infilteration | 0.35
SD-10 | Benign | 1
SD-10 | Bot | 1

It means our model has the best performance in classifying benign data because the CIC-2018 dataset provides much more ‘benign’ data than attack-labeled data.

Here we adjust the ratio of labeled data for better DL performance, although the original dataset better represents the real-world network environment, where distinguishing anomalous traffic from massive benign traffic is challenging. In ML and DL, data preprocessing is an important strategy for high performance. We preprocessed the sub-datasets with low accuracy in attack-labeled data (SD-3, SD-6, SD-7, SD-8, and SD-9) so that the amount of benign data is no more than five times the smallest amount of attack-labeled data. We then train our CNN model on the preprocessed datasets. Figure 3 compares the accuracy of each attack before and after the preprocessing considering the data ratio. The experimental results show that the accuracies of most attacks dramatically increase through the preprocessing. We can find the optimal ratio of benign and attack-labeled data through repetitive preprocessing and training.

Fig. 3. Comparison of accuracy before and after preprocessing.

V. CONCLUSION

We have employed DL techniques for intrusion detection. CIC-2018 has been used as an IDS dataset in this work. We have designed a CNN model consisting of two convolutional layers and two max-pooling layers and converted the dataset into images. The proposed CNN model has been trained on these images, and the experimental results showed that our model detects benign and attack data in CIC-2018 with high accuracy. In order to evaluate the performance of our model, we have also trained an RNN model on the dataset. In the multi-class classification, our CNN model is more accurate than the RNN model when applied to CIC-2018, the latest CIC dataset, using the image-based deep learning method introduced in Tamim’s work [23]. Furthermore,
we have suggested a way of improving the performance by preprocessing the dataset considering the ratio of benign and attack-labeled data. The experimental results showed that the accuracy of attack-labeled data increased through the preprocessing method. In the future, we will train our CNN model on another IDS dataset and find the optimal model by reorganizing the convolutional layers along with the CNN parameters.

Acknowledgement

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07050543) and also partially supported by a special research grant from Seoul Women's University (2019).

REFERENCES

[1] Jiyeon Kim, Yulim Ahn, and Eunjung Choi, “Network Intrusion Detection using Machine Learning Techniques,” in Proceeding of International Conference on Culture Technology 2019, August 2019.
[2] Hasan, Md. Al & Nasser, Mohammed & Pal, Biprodip & Ahmad, Shamim, “Support Vector Machine and Random Forest Modeling for Intrusion Detection System (IDS),” Journal of Intelligent Learning Systems and Applications, vol. 06, pp. 45-52, 2014.
[3] Mulay, Snehal & Devale, P.R. & Garje, Goraksh, “Intrusion Detection System Using Support Vector Machine and Decision Tree,” International Journal of Computer Applications, vol. 3, 10.5120/758-993, 2010.
[4] Beghad, Rachid, “Training all the KDD data set to classify and detect attacks,” Neural Network World, vol. 17, pp. 81-91, 2017.
[5] Jia, F. & Kong, L.-Z., “Intrusion Detection Algorithm Based on Convolutional Neural Network,” Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, vol. 37, pp. 1271-1275, 2017.
[6] Yuchen Liu, Shengli Liu and Xing Zhao, “Intrusion Detection Algorithm Based on Convolutional Neural Network,” in Proceeding of the 4th International Conference on Engineering Technology and Application, 2017.
[7] Jihyun Kim, Howon Kim, “An Effective Intrusion Detection Classifier Using Long Short-Term Memory with Gradient Descent Optimization,” in Proceeding of the 2017 IEEE International Conference on Platform Technology and Service (PlatCon), pp. 1-6, 2017.
[8] R. C. Staudemeyer and C. W. Omlin, “Evaluating performance of long short-term memory recurrent neural networks on intrusion detection data,” in Proceedings of the South African Institute for Computer Scientists and Information Technologists Conference, pp. 218-224, 2013.
[9] G. Kim, H. Yi, J. Lee, Y. Paek, and S. Yoon, “LSTM-Based System-Call Language Modeling and Robust Ensemble Method for Designing Host-Based Intrusion Detection Systems,” arXiv preprint arXiv:1611.01726, 2016.
[10] Intrusion Detection Evaluation Dataset (CICIDS2017), https://www.unb.ca/cic/datasets/ids-2017.html
[11] CSE-CIC-IDS2018 on AWS, https://www.unb.ca/cic/datasets/ids-2018.html
[12] Sharafaldin I., Gharib A., Habibi Lashkari A., and Ghorbani A. A., “Towards a reliable intrusion detection benchmark dataset,” Software Networking, vol. 2017, no. 1, pp. 177–200, 2017.
[13] Faker, Osama & Dogdu, Erdogan, “Intrusion Detection Using Big Data and Deep Learning Techniques,” in Proceedings of the 2019 ACM Southeast Conference, pp. 86-93, 2019.
[14] Zhang Xueqin, Chen Jiahao, Zhou Yue, Han Liangxiu, Lin Jiajun, “A Multiple-layer Representation Learning Model for Network-Based Attack Detection,” IEEE Access, pp. 1-1, 10.1109/ACCESS.2019.2927465, 2019.
[15] Zhou Qianru, Pezaros Dimitrios, “Evaluation of Machine Learning Classifiers for Zero-Day Intrusion Detection -- An Analysis on CIC-AWS-2018 dataset,” 2018.
[16] Jackins, V., and D. Shalini Punithavathani, “An anomaly-based network intrusion detection system using ensemble clustering,” International Journal of Enterprise Network Management, vol. 9.3-4, pp. 251-260, 2018.
[17] Y. X. Meng, “The practice on using machine learning for network anomaly intrusion detection,” in Proceeding of Machine Learning and Cybernetics (ICMLC), 2011 International Conference, vol. 2, pp. 576-581, IEEE, 2011.