Deep Learning Approaches For Crack Detection in Bridge Concrete Structures


Daniel Einarson
Department of Computer Science
Kristianstad University
Kristianstad, Sweden
[email protected]

Dawit Mengistu
Department of Computer Science
Kristianstad University
Kristianstad, Sweden
[email protected]

Abstract—Convolutional Neural Networks are among the most effective algorithms for image analysis applications. However, the accuracy of the algorithms depends on the availability of powerful computational resources and the quality of the images used to train the models. This paper investigates ways to build robust models to detect cracks in concrete structures using low-resolution images and third-party datasets. Our experiments show that reducing image sizes by a factor of 4 does not significantly impact accuracy. This helps shorten execution time and hence lower cloud service costs. It is also observed that applying a model trained on one image dataset to detect cracks in images from a different source is not a trivial task.

Keywords—Concrete Crack Detection; Convolutional Neural Networks; Deep Learning; Performance Optimization

I. INTRODUCTION

Convolutional Neural Networks (CNN) have been successfully used for image analysis in different application domains. The use of CNN models to detect cracks in concrete structures has been investigated and promising results have been achieved [1]. These models can be effectively employed to detect existing cracks and predict potential damage in roads, railways, buildings, bridges, etc. [2], [3]. Machine learning models have a major advantage over the traditional manual approach used for crack detection. Manual inspection compromises safety as it exposes people to work in dangerous and inaccessible parts of the concrete structures. Automating the process by employing robots to capture images in hazardous and inaccessible areas and feeding these images to a machine learning pipeline is recommended to address this problem.

This paper presents a study that investigates ways to improve the accuracy of concrete crack detection on the Öresundsbron (the Öresund Bridge). The bridge management is tasked with performing inspections of the bridge for preventive maintenance. The inspection involves early detection of cracks in the bridge's structures. It is intended to automate this task by using drones to take images of the concrete structure and by applying machine learning techniques on these images for crack detection.

The objective of this study is to investigate how machine learning models can be implemented and incorporated into the bridge's preventive maintenance system. From a research perspective, the main issues investigated in this work are the classification accuracy of the model and improvements of computational performance, mainly in the cloud. The rest of this paper is organized as follows. A brief background on the Öresund Bridge and a review of CNN models is presented first. This is followed by a description of the experimental methodology applied in this study. The achieved results are then presented along with discussions of their significance. The report concludes by summarizing the important findings and citing directions for future work.

II. BACKGROUND

A. Brief Overview of the Öresund Bridge

The Öresund Bridge is a combined railway and motorway bridge that links Sweden and Denmark. The bridge runs nearly 8 kilometers (5 miles) from the Swedish coast to the man-made island Peberholm. The island is connected to the Danish coast via the 4-kilometre (2.5 mi) Drogden Tunnel on its other end. Øresundsbro Konsortiet¹ is the management body responsible for maintaining the bridge and related assets.

At present, the bridge management has a large repository of footage that is not systematically organized. Because the images were taken from afar, they lack the clarity and sharpness desired for an accurate analysis. As a result, it has been difficult to draw meaningful conclusions using the repository in its present form.

A pilot study to investigate the possibilities of using CNNs on this dataset was performed as a thesis work at Kristianstad University [5]. A dataset of 6,639 images (4,633 crack-free and 2,006 with cracks) was available for this study. The original images in the dataset have a size of 6013x3376 pixels. However, they were cropped down to 256x256 pixels for ease of analysis, inspired by the work of [1]. Fig. 1 shows a few examples of images containing cracks vs. crack-free images.

The study showed that there are two major challenges to achieving the intended objective. First, because the study was performed on a laptop with moderate resources (CPU, memory), the model training phase took a long time. Second, it was not possible to achieve the desired level of accuracy owing to the limited size of the training dataset. The quality of the images and the balance between the image classes (the number of with-crack vs. no-crack images) also weighed on the accuracy of the models.

¹ Øresundsbro Konsortiet, https://www.oresundsbron.com/en/info/company

Fig. 1. Images from the Öresund dataset: (a) with cracks; (b) crack-free.

As is the case in image classification projects, tackling these limitations is of significant interest. Deploying the model on the cloud has become the preferred approach, as the cloud has proved to be an efficient high-performance computing environment. To address the accuracy aspect, we build the CNN model using a third-party dataset and test the model's performance on the images from our Bridge dataset. Accordingly, we use the Mendeley dataset², which is available in a public repository and has been used for similar studies [4]. Fig. 2 illustrates representative images from the Mendeley dataset to show the difference between the with-crack and no-crack image classes.

Fig. 2. Images from the Mendeley dataset: (a) with cracks; (b) crack-free.

B. Review of Convolutional Networks

While CNNs can be applied in several application domains, they have shown superior performance in image recognition due to their built-in structures [6]. A CNN basically consists of convolutions, pooling, activation functions, and dense layers. These operations are performed in a number of layers through which the original input data is matched against possible outputs, as illustrated by Fig. 3. Such layers are often supplemented with a final Softmax operation before mapping to the output. For more information, see [1] and [7].

Fig. 3. A CNN to classify handwritten digits.

Fig. 3 relates to the classic MNIST example [8], which examines neural networks that recognize hand-written digits of size 28x28 pixels. A study in [8] tackles the problem on the basis of variations of the LeNet CNN structure (Fig. 3). Although a LeNet-based CNN may reach an accuracy above 99%, the MNIST problem may actually be solved by a significantly simpler and faster, fully connected neural network. A simple feed-forward network with only one hidden layer achieves an accuracy level of about 98% for this problem [9]. Techniques relying on simple networks to detect cracks were proposed by [11]. However, this is not the case for our crack detection dataset, because the images from the Öresund Bridge are significantly larger and too blurry to be handled by simple networks. Therefore, the need for advanced algorithms such as convolutional neural networks is evident.

Convolutional networks have become increasingly complex and have evolved with the growing challenges in image classification problems [12]. Earlier versions of CNNs such as LeNet-5 (1998), AlexNet (2012) and VGG16 (2014) consist of a sequence of relatively few layers. Later variants of CNNs such as Inception-v3 (2015) and ResNeXt-50 (2017) have complex structures consisting of combinations of layers organized in parallel.

Previous studies on crack detection show that CNNs of the VGG-16 type can produce accurate models [1]. It can be observed in the results in [13] that VGG-16 achieved good accuracy on a dataset of 2,500 images with 256x256 resolution. The CNN in that study has 13 convolutional and 3 fully connected layers, and an accuracy of 92.27% is achieved in 50 epochs. An AlexNet network consisting of 5 convolutional and 3 fully connected layers proposed in [14] achieves an accuracy level of about 98%. Surprisingly, the more advanced VGG16 models actually show lower accuracy than the less advanced AlexNet.

The study in [1] gives interesting insights into the accuracy and execution performance of CNNs. The model developed in that study has 8 layers: 4 convolutional, 2 pooling, 1 ReLU, and 1 Softmax layer. The concrete images were taken at a close distance (1–1.5 meters from the concrete). In total, 32K images were used, achieving an accuracy of about 98% around the 50th epoch. The experiment completed in 90 minutes on two GPUs, while it took 1–2 days on a standard CPU. This shows the need for an accelerated computing environment to achieve speed and performance.

A study in [4] compares several types of CNNs, such as AlexNet, VGG16, and ResNet50, based on the Mendeley image dataset, and shows impressive results, with an accuracy above 99%. The dataset contains 40K high-quality images with 227x227 pixel resolution. The time required to train a 28K dataset per epoch is, however, unacceptably high. For instance, the time required for AlexNet is shown to be 133 seconds, and for VGG16 it is 2,827 seconds.

From the above studies, it can be seen that a simpler CNN such as AlexNet can achieve acceptable accuracy and performance. However, early experiments with the Öresund Bridge images of size 256x256 pixels showed unsatisfactory results. Worse, the time needed to train the system to obtain such dismal results was about 24 hours (on one machine), pointing to the clear need for alternative approaches to address the problem.

² Mendeley Data Set for Classification of Concrete Crack Images, https://data.mendeley.com/datasets/5y9wdsg2zt/2

III. METHODOLOGY

We conducted experiments in two different computational environments to investigate execution performance issues. The first experiment is performed on a laptop computer (we call it the local version) having an 11th Generation Intel Core i5-11400H 2.30 GHz processor, 12 MB cache and 16 GB RAM.

The second experiment is performed in the Amazon cloud on a custom-built, accelerated computing instance of the ml.g4dn.4xlarge type. This powerful machine has 16 virtual CPUs, 64 GB RAM and a T4 GPU (2560 cores). This cloud instance is provisioned with an AWS SageMaker Notebook preconfigured for deep learning in the Python language. In both cases, our training models are based on an AlexNet-type CNN consisting of six layers, as follows (Fig. 4):

Fig. 4. A six-layer AlexNet-type CNN.

1. A Convolution layer of size 16 (C-16), an Activation function (A) of type ReLU, a MaxPooling (P) of size 5x5, and a DropOut (D) of 20%
2. C-32, A-ReLU, P-3x3, D-20%
3. C-64, A-ReLU, P-2x2, D-20%
4. A Flatten layer
5. A Dense layer of size 32, and A-ReLU
6. Dense-1, and a Sigmoid activation function finally providing the output.
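The layer structure above maps directly onto a Keras model. The following is a minimal sketch of such a six-block network; the 3x3 kernel size and the single-channel 64x64 input shape are assumptions for illustration, as they are not fully specified here.

```python
# Minimal sketch of the six-layer AlexNet-type CNN described above (Keras).
# Kernel size (3x3) and the grayscale 64x64 input shape are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

def build_crack_cnn(input_shape=(64, 64, 1)):
    return keras.Sequential([
        keras.Input(shape=input_shape),
        # Block 1: C-16, ReLU, 5x5 max pooling, 20% dropout
        layers.Conv2D(16, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(5, 5)),
        layers.Dropout(0.2),
        # Block 2: C-32, ReLU, 3x3 max pooling, 20% dropout
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(3, 3)),
        layers.Dropout(0.2),
        # Block 3: C-64, ReLU, 2x2 max pooling, 20% dropout
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Dropout(0.2),
        # Blocks 4-6: flatten, Dense-32 with ReLU, Dense-1 with sigmoid
        layers.Flatten(),
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])

model = build_crack_cnn()
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```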
The experiments were run for 20 epochs, with the learning rate set to 0.001 for the first 10 epochs and decreased to 0.0005 for the rest³. The batch size chosen is 64.
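A two-stage schedule like this can be reproduced, for example, with a Keras LearningRateScheduler callback. The sketch below is one possible wiring, not the authors' code; the names model, x_train and y_train are assumed to come from the surrounding pipeline.

```python
# Sketch of the two-stage learning-rate schedule: 0.001 for epochs 0-9,
# then 0.0005 for the remaining epochs, with a batch size of 64.
import tensorflow as tf

def two_stage_lr(epoch, lr):
    return 1e-3 if epoch < 10 else 5e-4

lr_callback = tf.keras.callbacks.LearningRateScheduler(two_stage_lr, verbose=1)

# 'model', 'x_train' and 'y_train' are assumed to exist (e.g. the CNN sketched
# above and the resized Mendeley images).
history = model.fit(
    x_train, y_train,
    validation_split=0.2,   # 20% of the training set used for validation
    epochs=20,
    batch_size=64,
    callbacks=[lr_callback],
)
```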
The original Mendeley dataset⁴ contains 40,000 images split into 20,000 positive (with cracks) and 20,000 negative (without cracks) images. We selected some 10K images from this dataset to conduct our experiment. The following two-step approach was then employed:

1. To investigate the effect of image size on classification accuracy and execution performance, the original images were resized to different sizes and used to train the CNN (procedure A).
2. For each category of resized images, the experiment was repeated by varying the proportion of positive images (containing cracks) used to train the model. This step was used to investigate the effect of unbalanced data on classification accuracy (procedure B).

Both procedures were implemented and evaluated in both the local and the cloud-based experiments.

A. Resizing the images

As explained earlier, we chose 5,100 positive (with crack) images as well as 5,100 negative (crack-free) images from the Mendeley dataset. Three test cases are prepared, with 100 images removed from each group and set aside for use as test images. Therefore, we have three different sets of 5,000 positive plus 5,000 negative images for training, and an additional 100 images of each class for testing. From the training set, 20% of the images were used for validation; the remaining 80% were used for building the CNN model.

Next, we created new datasets of lower resolution from the original 227x227 images. The new image sets have sizes of 128x128, 64x64, and 32x32 pixels. Accordingly, a test suite of twelve different cases is created, that is, four image sizes times three test cases each. The reference scale on which the images are resized is the original size of the Öresund Bridge images, which is 256x256 pixels. Halving the sizes of the images is considered significant enough to observe clear effects. The resizing is done with the INTER_AREA method, as described in [16]; a code sketch is given below.

The outcome of image resizing is illustrated in Fig. 5. A crack is visible in both the original and the resized images. The assumption here is that, if cracks are clearly visible to the eye, they should certainly be detectable by a CNN model as well.

Fig. 5. Images with cracks at different sizes.
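As an illustration of the resizing step, the snippet below downscales one 227x227 Mendeley image to the three lower resolutions using OpenCV's INTER_AREA interpolation [16]. The file paths and the grayscale read are assumptions made only to keep the example self-contained.

```python
# Downscale a crack image to the sizes used in the test suite with
# OpenCV's INTER_AREA interpolation (well suited for shrinking images).
import cv2

# Hypothetical input path; the Mendeley originals are 227x227 pixels.
src = cv2.imread("mendeley/positive/00001.jpg", cv2.IMREAD_GRAYSCALE)

for size in (128, 64, 32):
    resized = cv2.resize(src, (size, size), interpolation=cv2.INTER_AREA)
    cv2.imwrite(f"resized/{size}x{size}/00001.jpg", resized)
    print(size, resized.shape)
```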
B. Unbalancing the datasets

In the real world, cracks in concrete are a rare occurrence. This means that the number of images with cracks is certainly much smaller than the number without cracks if the dataset is assembled naturally. This imbalance poses clear difficulties in training the CNN model in the usual way, and the use of unbalanced classes of data (positive vs. negative) has an obvious impact on accuracy.

In this part of the experiment, we start with completely balanced sets of images (5,000 positive plus 5,000 negative) and proceed to lower the number of positive images in the subsequent runs. This is achieved by successively removing positive images so that the number of positive images is 40%, 20%, 10%, 5%, 1% and 0.1% of the negative images, and then rerunning the test. For instance, an imbalance factor of 40% corresponds to 5,000 negative and 2,000 positive images.

The accuracy figures should be regarded with special care. For instance, in a case where the number of positive images in the dataset is less than 10% of the negative images, an accuracy of 90% may not be satisfactory. Such an accuracy result could lead to reporting many false negative or false positive cases. Therefore, the result must be complemented with the confusion matrix⁵.
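One simple way to generate these progressively unbalanced training sets is to subsample the positive class, as sketched below. This is a NumPy-based illustration under stated assumptions (array names and shapes are hypothetical), not the authors' preprocessing code.

```python
# Build unbalanced training sets by keeping all negatives and only a fraction
# of the positives (imbalance factor = positives kept / negatives).
import numpy as np

rng = np.random.default_rng(seed=42)

def unbalance(x_pos, x_neg, factor):
    """Return shuffled images and labels with len(pos) ~= factor * len(neg)."""
    n_keep = int(round(factor * len(x_neg)))
    keep = rng.choice(len(x_pos), size=n_keep, replace=False)
    x = np.concatenate([x_pos[keep], x_neg])
    y = np.concatenate([np.ones(n_keep), np.zeros(len(x_neg))])
    order = rng.permutation(len(x))
    return x[order], y[order]

# x_train_pos, x_train_neg are assumed arrays, e.g. of shape (5000, 64, 64, 1).
# A factor of 0.4 gives 5000 negatives and about 2000 positives.
for factor in (0.4, 0.2, 0.1, 0.05, 0.01, 0.001):
    x_u, y_u = unbalance(x_train_pos, x_train_neg, factor)
    print(factor, x_u.shape, int(y_u.sum()))
```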

³ A learning rate of 0.001 is a standard setting; tuning the value is a common way to further improve accuracy.
⁴ The Bridge dataset has 4,633 images considered without cracks and 2,006 images considered with cracks.
⁵ See, e.g., [15] for more information on the concepts of the confusion matrix.

IV. RESULTS

Our results are reported below according to the structure outlined in the methodology in Section III. The tables show the size of the images, the time to train and validate a CNN in seconds, the training accuracy (Accuracy, %), and the validation accuracy, for all test cases 1–3. The details from the confusion matrix (TP, TN, FP, FN) are presented for both procedures (experiments with resized images and with unbalanced datasets).

A. Local (laptop-based) experiments

TABLE I shows the experiments run for the three test cases and for the different image sizes. It can be observed that the values of both the training and the validation accuracy are surprisingly high. That is, due to the sharpness of the images of the Mendeley dataset, even downscaled images give high accuracy.

As can be seen, the execution time and the image size are not linearly related. Doubling the image side length quadruples the number of input nodes, and hence the total number of edges is correspondingly higher. This seems to result in a difference in execution time by a factor of about 3 between successive image sizes. This is clearly demonstrated in TABLE I.

TABLE I. RESIZING IMAGES THROUGH DOWNSCALING

Case  Size  Exec. Time (sec)  Accuracy (%)  Val. Accuracy (%)
1      32      72.3            98.4          98.7
1      64     225.9            99.2          98.8
1     128     819.3            99.7          99.0
1     227    2556.2            99.8          99.3
2      32      77.9            98.5          99.0
2      64     257.7            99.3          98.9
2     128     844.0            99.4          98.3
2     227    2551.9            99.6          99.0
3      32      79.7            98.6          99.1
3      64     261.9            99.2          98.9
3     128     844.0            99.5          98.9
3     227    2567.7            99.7          98.9

The accuracy results need to be interpreted to evaluate their significance. Therefore, we generated the data in TABLE II so that conclusions can be drawn as to whether the obtained results are fair enough for all image sizes.

TABLE II. CONFUSION MATRIX FOR RESIZED IMAGES

Case  Size  TP   TN   FP  FN
1      32    99   96   4   1
1      64   100   98   2   0
1     128   100   99   1   0
1     227   100   98   2   0
2      32    91  100   0   9
2      64    97  100   0   3
2     128   100   99   1   0
2     227    99   99   1   1
3      32   100   97   3   0
3      64   100   97   3   0
3     128   100   98   2   0
3     227    99  100   0   1

TP = True Positive, TN = True Negative, FP = False Positive, FN = False Negative

For procedure B, images of 64x64 pixels are used to run the experiments with unbalanced image sets. The results of these experiments for different levels of imbalance are shown in TABLE III. It is interesting to note the high accuracy values and the low false positive and false negative rates achieved in these experiments.

From TABLE III, it can be observed that the images of the Mendeley dataset are sharp enough to give high accuracy with unbalanced datasets. The false negative rate is low if the imbalance factor remains above 10%. It is important to note that the accuracy figure alone can be deceptive. As can be seen in the table, a test with an imbalance factor of 0.1% gives 99.9% accuracy, while its false negative count is 100 (that is, all crack images are predicted wrong).

TABLE III. UNBALANCING SETS OF IMAGES OF SIZE 64x64 PIXELS

Imbalance factor (%)  Accuracy (%)  TP   TN   FP   FN  Exec. Time (sec)
40                    99.5          100   96   4    1  154.5
20                    99.5           99   99   1    1  134.2
10                    99.4           98  100   0    2  115.5
5                     99.7           96   99   1    5  114.1
1                     99.6           31  100   0   69  114.2
0.1                   99.9            0  100   0  100  110.5

Furthermore, the values TP, TN, FP, and FN may be used to derive further measures, such as Sensitivity (Sv) and Specificity (Sp). These are defined as follows [15]:

Sv = TP / (TP + FN)    (1)

Sp = TN / (FP + TN)    (2)

In the context of the results shown in TABLE III, Sv expresses how accurately images showing cracks can be identified by our model. Conversely, Sp refers to the model's ability to correctly identify crack-free images. TABLE IV shows these values computed from the results shown in TABLE III.

TABLE IV. EXTENSION TO TABLE III: SENSITIVITY AND SPECIFICITY

Imbalance factor (%)  Sensitivity  Specificity
40                    99%          96%
20                    99%          99%
10                    98%          100%
5                     95%          99%
1                     31%          100%
0.1                    0%          100%

It can be seen in TABLE IV that the sensitivity of the model declines drastically as the proportion of positive images (the imbalance factor) in the training set is reduced. This is expected, because the network is exposed mostly to negative images during the training phase. On the other hand, the model is robust for crack-free images.
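The confusion-matrix counts and the two measures in (1) and (2) can be computed directly from the model's thresholded predictions. The helper below is a small scikit-learn-based illustration, not the authors' evaluation code; the 0.5 decision threshold on the sigmoid output is an assumption.

```python
# Compute TP/TN/FP/FN, sensitivity (1) and specificity (2) for a trained
# binary crack classifier. A 0.5 threshold on the sigmoid output is assumed.
import numpy as np
from sklearn.metrics import confusion_matrix

def evaluate_crack_model(model, x_test, y_test, threshold=0.5):
    y_pred = (model.predict(x_test).ravel() >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # Eq. (1)
    specificity = tn / (fp + tn) if (fp + tn) else 0.0   # Eq. (2)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn,
            "Sv": sensitivity, "Sp": specificity}

# Worked check against TABLE III/IV: with TP = 31 and FN = 69 (imbalance
# factor 1%), sensitivity = 31 / (31 + 69) = 0.31, i.e. the 31% in TABLE IV.
```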
B. Cloud-based experiments

In this part of our work, the laptop-based experiment is repeated on the cloud, albeit with different sizes of data. The aim of this task is to obtain a more accurate model by using a much larger dataset. Additionally, it is interesting to observe whether a model trained on images from one concrete structure can be used for prediction on a different structure. This possibility would simplify the job of practitioners who have a limited dataset with which to train a new model of their own. It is also beneficial for users whose image dataset is not yet labelled, as manual labelling of a large dataset requires opening and visually inspecting each image file. Accordingly, we used images from two different sources for the test set. The first test case used the Mendeley dataset, similar to the laptop-based experiment. The second test case used images from the Bridge dataset itself. The experiment was repeated for different image sizes. The results of both test cases are shown in Figs. 6 and 7.

Fig. 6. Experiment with Mendeley images: classification accuracy (%) vs. training data size (K) for image sizes 32, 64, 128, and 227.

Fig. 7. Experiment with Bridge images: classification accuracy (%) vs. training data size (K) for image sizes 32, 64, 128, and 227.

The trained model gives good prediction accuracy (close to 99%) when tested with Mendeley images. However, the accuracy of the model on the actual Bridge images is not satisfactory. A closer examination of the images shows that the Bridge images were blurry, and therefore the training dataset needs further processing to accommodate this fact.

Another important observation we made is that the model has an accuracy above 90% when tested with the crack-free (negative) images from the Bridge dataset itself. However, the accuracy is very low for the crack-containing (positive) images of the Bridge. This is an indication of a high rate of false negatives, and thus our model cannot reliably detect cracks. We explored other options, such as increasing the proportion of positive images, to train the model with more images showing cracks. However, the accuracy improvement is not significant.

Fig. 8. Execution performance on the cloud: training time in seconds vs. training data size (K) for image sizes 32, 64, 128, and 227.

An important observation we made on the cloud execution environment relates to performance. As expected, the cloud-based implementation is much faster than the laptop version. It could train 24K images in less than 12 minutes on the 227x227 set. This is in fact a manyfold improvement over the local version. It is also possible to see that the experiment with 32x32 images executes 10 times faster than the 227x227 set, while there is no significant difference in their accuracies. Achieving comparable accuracies with low-resolution images has a significant impact, as there will be a substantial reduction in cloud service costs.

Another important observation we made is that transferring large datasets from the repository to the cloud execution node could take longer than the actual training time. This incurs significant costs, especially for repeated and long-running experiments. Furthermore, due to the inherent inefficiency of Python's memory management, vectorizing the images to prepare them for training consumes a substantial part of the memory. We were able to tackle these challenges and managed to train large datasets by implementing improved data caching and memory optimization techniques.
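The caching code itself is not shown in the paper; the sketch below illustrates one common way to get a similar effect with a tf.data input pipeline (decoding images once, caching to local disk, and prefetching batches). The file pattern, image size and cache location are assumptions.

```python
# Sketch: stream and cache images instead of vectorizing the whole dataset
# in memory at once. File paths, image size and batch size are assumptions.
import tensorflow as tf

IMG_SIZE, BATCH = 64, 64

def load_image(path, label):
    img = tf.io.decode_jpeg(tf.io.read_file(path), channels=1)
    # Area interpolation is the closest tf.image equivalent to INTER_AREA.
    img = tf.image.resize(img, (IMG_SIZE, IMG_SIZE), method="area") / 255.0
    return img, label

def make_dataset(paths, labels, cache_file="/tmp/crack_cache"):
    ds = tf.data.Dataset.from_tensor_slices((paths, labels))
    ds = ds.map(load_image, num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.cache(cache_file)          # decode once, reuse across epochs
    ds = ds.shuffle(10_000).batch(BATCH).prefetch(tf.data.AUTOTUNE)
    return ds

# train_ds = make_dataset(train_paths, train_labels)
# model.fit(train_ds, epochs=20, callbacks=[lr_callback])
```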
V. CONCLUSIONS AND FUTURE WORK

This study addressed two major issues in machine learning, accuracy and performance, for a CNN-based concrete crack detection use case. Because the dataset provided by the user was not in a readily usable form, the Mendeley crack dataset was used in the study. The experiment was tested on a laptop for proof of concept, and a larger version was deployed and evaluated on the cloud. Resizing the images did not show a significant loss of accuracy: the 64x64 images give an accuracy level of 99.2%, while their execution time is less than 10% of that of the 227x227 images.

It has been shown that the original images of 227x227 pixels can be scaled down to 64x64 pixels without any significant loss in accuracy, and with extraordinary results (model accuracy about 99.2%) compared with previously published studies. Furthermore, the effect of unbalanced datasets has been studied by progressively decreasing the proportion of positive images in the training set. It was found that the prediction accuracy is robust even when the model is trained on a dataset with only 10% positive images. Preliminary studies on the Öresund Bridge dataset also show promising results, with about 98% accuracy, on images scaled down to 64x64 pixels and an imbalance factor in the training dataset of about 40%.

While this work can be considered a preliminary study, it yielded important results, and the experimental findings opened additional research inquiries for future work. Relevant questions to ask in this regard include: How far can one simplify the CNN model without compromising accuracy? How can the relation between image sizes and CNN structure on one side, and execution time on the other, be balanced? How can we build an accurate prediction model on one dataset and use it on images from sources other than the one it was trained on? How can we reduce the effect of dataset imbalance for this particular use case?

ACKNOWLEDGMENT

The authors would like to acknowledge the students involved in the initial experiments of this project. Special thanks also to the managers of the Öresund Bridge for interesting and valuable discussions and support.

REFERENCES

[1] Cha Y, Choi W, Büyüköztürk O., "Deep Learning-Based Crack Damage Detection Using Convolutional Neural Networks," Computer-Aided Civil and Infrastructure Engineering, 2017;32(5):361-378.
[2] Zhang L, Yang F, Zhang Y D, Zhu Y J, "Road crack detection using deep convolutional neural network," 2016 IEEE International Conference on Image Processing (ICIP), 2016, pp. 3708-3712, http://doi.org/10.1109/ICIP.2016.7533052.
[3] Soukup D, Huber-Mörk R, "Convolutional neural networks for steel surface defect detection from photometric stereo images," in Proceedings of the 10th International Symposium on Visual Computing, Las Vegas, NV, 2014, pp. 668-677.
[4] Özgenel Ç F, Gönenç Sorguç A, "Performance Comparison of Pretrained Convolutional Neural Networks on Crack Detection in Buildings," ISARC 2018, Berlin.
[5] Martijn de Redelijkheid, Kristian Kokoneshi, "A Machine Learning Analysis of Photographs of the Öresund Bridge," Bachelor Thesis, Dept. of Computer Science, Kristianstad University, 2020, available at http://hkr.diva-portal.org/smash/record.jsf?pid=diva2%3A1451429&dswid=-6276.
[6] LeCun Y, Bengio Y, "Convolutional Networks for Images, Speech, and Time Series," The Handbook of Brain Theory and Neural Networks, vol. 3361, 1995.
[7] Saha S, "A Comprehensive Guide to Convolutional Neural Networks — the ELI5 way," Towards Data Science, Dec 15, 2018, available at https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53.
[8] LeCun Y, Bottou L, Bengio Y, Haffner P, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, 1998;86(11):2278-2324.
[9] Rashid T, Make Your Own Neural Network, CreateSpace Independent Publishing Platform, ISBN 9781530826605, 2016.
[10] "Neural Network," Databricks, no date, available at https://databricks.com/glossary/neural-network.
[11] Hoang N-D, "Detection of Surface Crack in Building Structures Using Image Processing Technique with an Improved Otsu Method for Image Thresholding," Advances in Civil Engineering, vol. 2018, Article ID 3924120, https://doi.org/10.1155/2018/3924120, 2018.
[12] Karim R, "Illustrated: 10 CNN Architectures — A compiled visualization of the common convolutional neural networks," Towards Data Science, Jul 29, 2019, available at https://towardsdatascience.com/illustrated-10-cnn-architectures-95d78ace614d.
[13] Silva W, Lucena D, "Concrete Cracks Detection Based on Deep Learning Image Classification," Proceedings, 2018;2(8):489.
[14] Kim H, Ahn E, Shin M, Sim S, "Crack and Noncrack Classification from Concrete Surface Images Using Machine Learning," Structural Health Monitoring, 2018;18(3):725-738.
[15] Géron A, Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow, O'Reilly, ISBN 978-1-492-03264-9, 2019.
[16] Dong W, "What is OpenCV's INTER_AREA Actually Doing?," Medium, 2018, available at https://medium.com/@wenrudong/what-is-opencvs-inter-area-actually-doing-282a626a09b3.
