Dense-RefineDet for Traffic Sign Detection and Classification
Abstract
1. Introduction
1. We proposed an anchor-design method for detecting small traffic signs that uses k-means clustering to obtain anchor shapes and places the anchor centers of the shallowest feature layer at four points of each cell, namely (0.25, 0.25), (0.25, 0.75), (0.75, 0.25), and (0.75, 0.75).
2. We built a feature-transformation module based on dense connections to deliver the semantic information contained in high-level layers to low-level layers, providing additional information for detecting small traffic signs.
3. Experiments on the Tsinghua-Tencent 100K and Caltech pedestrian datasets demonstrated that Dense-RefineDet enhanced the detection accuracy of the original RefineDet and achieved performance competitive with other state-of-the-art methods for detecting real-world traffic signs and pedestrians.
2. Related Work
2.1. Context-Related CNN-Based Object-Detection Methods
2.2. CNN-Based Traffic Sign-Detection and -Classification Methods
3. The Proposed Method
3.1. RefineDet Revisited
3.2. Framework Overview
3.3. Anchor Design
1. Apply k-means clustering to obtain the anchor shapes. All ground-truth boxes (GTBs) in the training set were clustered using k-means, with k set to four, yielding four anchor shapes.
2. Determine the anchor coordinates. Our model outputs four scaled feature maps, and the anchor shapes obtained in step 1 correspond to different scales of the target objects. Two of the anchor shapes were applied to the shallowest output feature map, with the anchor centers within each cell set to (0.25, 0.25), (0.25, 0.75), (0.75, 0.25), and (0.75, 0.75), giving eight anchors per cell of the shallowest feature map. All four anchor shapes were applied to the remaining three output feature layers with center coordinates of (0.5, 0.5), giving four anchors per feature-map cell (a minimal sketch of this procedure follows the list).
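To make these two steps concrete, the sketch below clusters ground-truth box sizes into four anchor shapes and enumerates per-cell anchor centers. It is a minimal illustration, not the authors' implementation: the use of scikit-learn's Euclidean k-means, the normalized coordinate convention, the example feature-map sizes, and the helper names `cluster_anchor_shapes` and `anchor_centers` are our assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans


def cluster_anchor_shapes(gt_wh, k=4, seed=0):
    """Cluster ground-truth box (width, height) pairs into k anchor shapes.

    gt_wh: array of shape (N, 2) holding box widths and heights.
    Returns the k cluster centers sorted by area (smallest first).
    """
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(gt_wh)
    shapes = km.cluster_centers_
    return shapes[np.argsort(shapes[:, 0] * shapes[:, 1])]


def anchor_centers(feature_map_size, shallowest):
    """Enumerate anchor centers (normalized to [0, 1]) for one feature map.

    The shallowest map uses four off-center points per cell, (0.25, 0.25),
    (0.25, 0.75), (0.75, 0.25), and (0.75, 0.75); deeper maps use the cell
    center (0.5, 0.5), matching the anchor design described above.
    """
    if shallowest:
        offsets = [(0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75)]
    else:
        offsets = [(0.5, 0.5)]
    h, w = feature_map_size
    return np.array([((x + ox) / w, (y + oy) / h)
                     for y in range(h) for x in range(w)
                     for ox, oy in offsets])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt_wh = rng.uniform(8, 128, size=(1000, 2))   # stand-in for real GTB sizes
    shapes = cluster_anchor_shapes(gt_wh, k=4)
    shallow = anchor_centers((40, 40), shallowest=True)    # 4 centers per cell
    deep = anchor_centers((20, 20), shallowest=False)      # 1 center per cell
    print(shapes.shape, shallow.shape, deep.shape)
```

On the shallowest map, pairing the four per-cell centers with two anchor shapes yields 4 × 2 = 8 anchors per cell; the deeper maps pair a single center with all four shapes, yielding 4 anchors per cell, as described in step 2.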
3.4. Building the Dense-TCB
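The Dense-TCB extends RefineDet's transfer connection block (TCB) with dense connections so that each low-level layer receives semantic information from all higher-level layers, as stated in the contributions. The PyTorch sketch below shows one plausible form of such a block; the channel widths, bilinear upsampling, element-wise summation as the fusion operator, and the class name `DenseTCB` are our assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenseTCB(nn.Module):
    """Illustrative dense transfer connection block.

    For output level i, every deeper (more semantic) feature map is projected
    to a common channel width, upsampled to level i's resolution, and fused
    with level i before a 3x3 convolution -- i.e., each low-level map receives
    dense connections from all higher-level maps.
    """

    def __init__(self, in_channels, out_channels=256):
        super().__init__()
        # 1x1 projections bring every input level to the same channel width.
        self.project = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels)
        self.fuse = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
                nn.ReLU(inplace=True))
            for _ in in_channels)

    def forward(self, feats):  # feats: list of feature maps, shallow -> deep
        projected = [p(f) for p, f in zip(self.project, feats)]
        outputs = []
        for i, cur in enumerate(projected):
            fused = cur
            # Dense connections: add every deeper level, upsampled to the
            # current spatial resolution.
            for deeper in projected[i + 1:]:
                fused = fused + F.interpolate(
                    deeper, size=cur.shape[-2:], mode="bilinear",
                    align_corners=False)
            outputs.append(self.fuse[i](fused))
        return outputs  # one refined map per level, fed to the detection head


if __name__ == "__main__":
    # Four pyramid levels with placeholder channel counts and sizes.
    feats = [torch.randn(1, c, s, s)
             for c, s in [(512, 40), (512, 20), (1024, 10), (512, 5)]]
    outs = DenseTCB([512, 512, 1024, 512])(feats)
    print([tuple(o.shape) for o in outs])
```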
4. Experiments and Results
4.1. Datasets and Experimental Setup
4.2. Detection Performance
4.2.1. Performance on the Tsinghua-Tencent 100K Dataset
4.2.2. Performance on the Caltech Pedestrian Dataset
4.2.3. Ablation Study
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
1. Zhu, Z.; Liang, D.; Zhang, S.; Huang, X.; Li, B.; Hu, S. Traffic-sign detection and classification in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2110–2118.
2. Yang, Y.; Luo, H.; Xu, H.; Wu, F. Towards real-time traffic sign detection and classification. IEEE Trans. Intell. Transp. Syst. 2015, 17, 2022–2031.
3. Stallkamp, J.; Schlipsing, M.; Salmen, J.; Igel, C. Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. Neural Netw. 2012, 32, 323–332.
4. Houben, S.; Stallkamp, J.; Salmen, J.; Schlipsing, M.; Igel, C. Detection of Traffic Signs in Real-World Images: The German Traffic Sign Detection Benchmark. In Proceedings of the International Joint Conference on Neural Networks, Dallas, TX, USA, 4–9 August 2013; Number 1288.
5. Liu, Z.; Du, J.; Tian, F.; Wen, J. MR-CNN: A multi-scale region-based convolutional neural network for small traffic sign recognition. IEEE Access 2019, 7, 57120–57128.
6. Liu, Z.; Li, D.; Ge, S.S.; Tian, F. Small traffic sign detection from large image. Appl. Intell. 2020, 50, 1–13.
7. Meng, Z.; Fan, X.; Chen, X.; Chen, M.; Tong, Y. Detecting small signs from large images. In Proceedings of the 2017 IEEE International Conference on Information Reuse and Integration (IRI), San Diego, CA, USA, 4–6 August 2017; pp. 217–224.
8. Li, J.; Liang, X.; Wei, Y.; Xu, T.; Feng, J.; Yan, S. Perceptual generative adversarial networks for small object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1222–1230.
9. Luo, H.; Yang, Y.; Tong, B.; Wu, F.; Fan, B. Traffic sign recognition using a multi-task convolutional neural network. IEEE Trans. Intell. Transp. Syst. 2017, 19, 1100–1111.
10. Zhu, Y.; Zhang, C.; Zhou, D.; Wang, X.; Bai, X.; Liu, W. Traffic sign detection and recognition using fully convolutional network guided proposals. Neurocomputing 2016, 214, 758–766.
11. Aghdam, H.H.; Heravi, E.J.; Puig, D. A practical approach for detection and classification of traffic signs using convolutional neural networks. Robot. Auton. Syst. 2016, 84, 97–112.
12. Dewi, C.; Chen, R.C.; Yu, H. Weight analysis for various prohibitory sign detection and recognition using deep learning. Multimed. Tools Appl. 2020, 79, 32897–32915.
13. Cao, G.; Xie, X.; Yang, W.; Liao, Q.; Shi, G.; Wu, J. Feature-fused SSD: Fast detection for small objects. In Proceedings of the Ninth International Conference on Graphic and Image Processing (ICGIP 2017), Qingdao, China, 14–16 October 2017; International Society for Optics and Photonics: San Diego, CA, USA, 2018; Volume 10615, p. 106151E.
14. Chu, W.; Cai, D. Deep feature based contextual model for object detection. Neurocomputing 2018, 275, 1035–1042.
15. Bell, S.; Lawrence Zitnick, C.; Bala, K.; Girshick, R. Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2874–2883.
16. Zhu, Y.; Zhao, C.; Wang, J.; Zhao, X.; Wu, Y.; Lu, H. CoupleNet: Coupling global structure with local parts for object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4126–4134.
17. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
18. Fu, C.Y.; Liu, W.; Ranga, A.; Tyagi, A.; Berg, A.C. DSSD: Deconvolutional single shot detector. arXiv 2017, arXiv:1701.06659.
19. Xie, H.; Chen, Y.; Shin, H. Context-aware pedestrian detection especially for small-sized instances with Deconvolution Integrated Faster RCNN (DIF R-CNN). Appl. Intell. 2019, 49, 1200–1211.
20. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In European Conference on Computer Vision; Springer: Amsterdam, The Netherlands, 2016; pp. 21–37.
21. Zhang, S.; Wen, L.; Bian, X.; Lei, Z.; Li, S.Z. Single-shot refinement neural network for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4203–4212.
22. Tong, K.; Wu, Y.; Zhou, F. Recent advances in small object detection based on deep learning: A review. Image Vis. Comput. 2020, 97, 103910.
23. Li, B.; Wu, T.; Zhang, L.; Chu, R. Auto-context R-CNN. arXiv 2018, arXiv:1807.02842.
24. Sommer, L.; Schumann, A.; Schuchert, T.; Beyerer, J. Multi feature deconvolutional faster R-CNN for precise vehicle detection in aerial imagery. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018; pp. 635–642.
25. Lisha, C.; Lv, P.; Xiaoheng, J.; Zhimin, G.; Bing, Z.; Mingliang, X. MDSSD: Multi-scale Deconvolutional Single Shot Detector for Small Objects. Sci. China Inf. Sci. 2020, 63, 120113.
26. Lim, J.S.; Astrid, M.; Yoon, H.J.; Lee, S.I. Small Object Detection using Context and Attention. arXiv 2019, arXiv:1912.06319.
27. Noh, J.; Bae, W.; Lee, W.; Seo, J.; Kim, G. Better to Follow, Follow to Be Better: Towards Precise Supervision of Feature Super-Resolution for Small Object Detection. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 9725–9734.
28. Song, S.; Que, Z.; Hou, J.; Du, S.; Song, Y. An efficient convolutional neural network for small traffic sign detection. J. Syst. Archit. 2019, 97, 269–277.
29. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
30. Lu, J.; Tang, S.; Wang, J.; Zhu, H.; Wang, Y. A review on object detection based on deep convolutional neural networks for autonomous driving. In Proceedings of the 2019 Chinese Control and Decision Conference (CCDC), Nanchang, China, 3–5 June 2019; pp. 5301–5308.
31. Zheng, L.; Fu, C.; Zhao, Y. Extend the shallow part of single shot multibox detector via convolutional neural network. In Proceedings of the Tenth International Conference on Digital Image Processing (ICDIP 2018), Shanghai, China, 11–14 May 2018; Volume 10806, p. 1080613.
32. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
33. Dollar, P.; Wojek, C.; Schiele, B.; Perona, P. Pedestrian detection: An evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 34, 743–761.
34. Zhang, S.; Benenson, R.; Omran, M.; Hosang, J.; Schiele, B. How far are we from solving pedestrian detection? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1259–1267.
35. Mao, J.; Xiao, T.; Jiang, Y.; Cao, Z. What can help pedestrian detection? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3127–3136.
36. Liu, W.; Hasan, I.; Liao, S. Center and Scale Prediction: A Box-free Approach for Pedestrian and Face Detection. arXiv 2019, arXiv:1904.02948.
37. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 2015, 115, 211–252.
38. Zhang, S.; Benenson, R.; Schiele, B. CityPersons: A diverse dataset for pedestrian detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3213–3221.
39. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; pp. 91–99.
40. Liu, S.; Huang, D. Receptive field block net for accurate and fast object detection. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 385–400.
41. Pon, A.; Adrienko, O.; Harakeh, A.; Waslander, S.L. A hierarchical deep architecture and mini-batch selection method for joint traffic sign and light detection. In Proceedings of the 2018 15th Conference on Computer and Robot Vision (CRV), Toronto, ON, Canada, 8–10 May 2018; pp. 102–109.
42. Song, S.; Zhu, Y.; Hou, J.; Zheng, Y.; Huang, T.; Du, S. Improved Convolutional Neutral Network Based Model for Small Visual Object Detection in Autonomous Driving. In Proceedings of the 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), Hsinchu, Taiwan, 18–20 March 2019; pp. 179–183.
43. Cai, Z.; Fan, Q.; Feris, R.S.; Vasconcelos, N. A unified multi-scale deep convolutional neural network for fast object detection. In European Conference on Computer Vision; Springer: Amsterdam, The Netherlands, 2016; pp. 354–370.
44. Li, J.; Liang, X.; Shen, S.; Xu, T.; Feng, J.; Yan, S. Scale-aware fast R-CNN for pedestrian detection. IEEE Trans. Multimed. 2017, 20, 985–996.
45. Zhang, S.; Yang, J.; Schiele, B. Occluded pedestrian detection through guided attention in CNNs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 6995–7003.
46. Cai, Z.; Saberian, M.; Vasconcelos, N. Learning complexity-aware cascades for deep pedestrian detection. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 3361–3369.
47. Tian, Y.; Luo, P.; Wang, X.; Tang, X. Deep learning strong parts for pedestrian detection. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1904–1912.
48. Brazil, G.; Liu, X. Pedestrian detection with autoregressive network phases. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–21 June 2019; pp. 7231–7240.
49. Yun, I.; Jung, C.; Wang, X.; Hero, A.O.; Kim, J.K. Part-level convolutional neural networks for pedestrian detection using saliency and boundary box alignment. IEEE Access 2019, 7, 23027–23037.
50. Lin, C.; Lu, J.; Wang, G.; Zhou, J. Graininess-aware deep feature learning for pedestrian detection. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 732–747.
Table: Detection results on the Tsinghua-Tencent 100K dataset (recall and precision, %, for small, medium, and large traffic signs).

| Methods | Testing Time (s/Frame) | Metric | Small | Medium | Large |
|---|---|---|---|---|---|
| Faster R-CNN [39] | 0.23 | recall | 49.8 | 83.7 | 91.2 |
| | | precision | 24.1 | 65.6 | 80.8 |
| SSD [20] | - | recall | 43.4 | 77.5 | 86.9 |
| | | precision | 25.3 | 67.8 | 81.5 |
| Pon et al. [41] | - | recall | 24.0 | 54.0 | 70.0 |
| | | precision | 65.0 | 67.0 | 75.0 |
| RFB [40] | 0.14 | recall | 73.5 | 84.3 | 85.1 |
| | | precision | 76.2 | 79.5 | 91.5 |
| Zhu et al. [1] | 0.77 | recall | 87.4 | 93.6 | 87.7 |
| | | precision | 81.7 | 90.8 | 90.6 |
| Song et al. [42] | - | recall | 88.0 | 94.0 | 87.0 |
| | | precision | 83.0 | 91.0 | 91.0 |
| MR-CNN [5] | - | recall | 89.3 | 94.4 | 88.2 |
| | | precision | 82.9 | 92.6 | 92.0 |
| DR-CNN [6] | 0.26 | recall | 89.3 | 94.8 | 89.6 |
| | | precision | 83.1 | 91.7 | 92.4 |
| Dense-RefineDet | 0.13 | recall | 84.3 | 95.2 | 92.6 |
| | | precision | 83.9 | 95.6 | 94.0 |
Table: Comparison on the Caltech pedestrian dataset (MR: miss rate, %; lower is better).

| Methods | Input Size | MR (New) | MR (Original) | Runtime (s/Frame) |
|---|---|---|---|---|
| DeepParts [47] | - | 60.61 | 64.78 | 1.00 |
| SA-FastRCNN [44] | | 57.02 | 62.59 | 0.59 |
| MS-CNN [43] | | 55.69 | 60.95 | 0.40 |
| AR-Ped [48] | | 55.24 | 58.83 | 0.09 |
| Dense-RefineDet | | 47.12 | 54.03 | 0.06 |
Table: Ablation study of the designed anchors and the Dense-TCB (mAP, recall, and precision, %).

| Metrics | | RefineDet Only | RefineDet + Designed Anchors | RefineDet + Designed Anchors + Dense-TCB |
|---|---|---|---|---|
| mAP | | 80.76 | 82.06 | 83.25 |
| Small | recall | 61.38 | 64.63 | 64.98 |
| | precision | 62.72 | 66.71 | 67.05 |
| Medium | recall | 84.26 | 89.66 | 90.69 |
| | precision | 86.70 | 90.70 | 91.92 |
| Large | recall | 87.48 | 93.39 | 93.81 |
| | precision | 90.13 | 93.89 | 94.68 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).