Use of multimodal neural network techniques to assess quality of roadways
M.G. Gorodnichev, K.A. Polyantseva, I.D. Razumovsky
Abstract: The article addresses the automatic detection of pavement defects using multimodal neural network methods.
Aim. To develop and experimentally evaluate a multimodal neural network method for automatically detecting pavement defects using combined analysis of visual and three-dimensional data.
Methods. The Faster R-CNN model is used to detect damage areas, the Swin Transformer Small model to classify visual fragments, and the PointNet model to analyze surface geometry from lidar data. The predictions of the three models are combined by weighted summation (weights 0.1, 0.6, and 0.4, respectively). Training and testing are conducted on the RSRD multimodal dataset, which includes RGB images and point clouds captured under various road and weather conditions.
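The weighted summation described in the Methods can be illustrated with a minimal sketch. It assumes that each model yields one score per defect class for the same road fragment, which the abstract does not specify; the model keys, the three-class layout, and the example scores are hypothetical. Only the weights 0.1, 0.6, and 0.4 come from the text. As stated they sum to 1.1, so the fused values are scores rather than probabilities, and the selected class would not change if the weights were rescaled.

```python
from typing import Dict, List

# Weights from the abstract, in the order the models are listed there.
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS: Dict[str, float] = {
    "faster_rcnn": 0.1,
    "swin_small": 0.6,
    "pointnet": 0.4,
}


def fuse(scores: Dict[str, List[float]],
         weights: Dict[str, float] = WEIGHTS) -> List[float]:
    """Return the weighted sum of per-class scores across models."""
    n_classes = len(next(iter(scores.values())))
    fused = [0.0] * n_classes
    for model_name, per_class in scores.items():
        if len(per_class) != n_classes:
            raise ValueError(f"{model_name} has a different number of classes")
        for i, score in enumerate(per_class):
            fused[i] += weights[model_name] * score
    return fused


def predict(scores: Dict[str, List[float]],
            weights: Dict[str, float] = WEIGHTS) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse(scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical per-class scores for one fragment (three assumed classes).
example_scores = {
    "faster_rcnn": [0.2, 0.5, 0.3],
    "swin_small": [0.1, 0.7, 0.2],
    "pointnet": [0.6, 0.3, 0.1],
}
fused_scores = fuse(example_scores)
best_class = predict(example_scores)
```

In this example the lidar model alone would favor the first class, while the fused score favors the second, because the visual classifier carries the largest weight.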
Results. Experiments show that the multimodal approach raises classification accuracy to as high as 95.57% and substantially improves defect detection metrics. For the pothole class, recall increased by 27% and F1-score by 20% compared with the individual models.
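Recall (sometimes rendered as completeness) and F1-score for a single class such as potholes follow the standard definitions based on true positives, false positives, and false negatives. The sketch below shows only how these metrics are computed; the counts are hypothetical and are not taken from the study, and the function name is chosen for illustration.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for one class from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for one class: 80 correct detections,
# 20 false alarms, 20 missed defects.
precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=20)
```

Because F1 depends on both precision and recall, a gain in recall translates into a smaller F1 gain when precision stays roughly the same.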
Conclusions. The developed architecture demonstrates high stability and accuracy in roadway analysis tasks. The results confirm that integrating visual and spatial data is effective and that multimodal methods are well suited to building intelligent monitoring systems for road infrastructure.
Keywords: machine learning, neural networks, pavement quality, defect detection, computer vision, lidar, point clouds, convolutional neural networks, transformers, intelligent transport systems
For citation. Gorodnichev M.G., Polyantseva K.A., Razumovsky I.D. Use of multimodal neural network techniques to assess quality of roadways. News of the Kabardino-Balkarian Scientific Center of RAS. 2025. Vol. 27. No. 6. Pp. 89–108. DOI: 10.35330/1991-6639-2025-27-6-89-108
Information about the authors
Mikhail G. Gorodnichev, Candidate of Engineering Sciences, Associate Professor, Dean of the Faculty of Information Technology, Moscow Technical University of Communications and Informatics;
8A, Aviamotornaya street, Moscow, 111024, Russia;
m.g.gorodnichev@mtuci.ru, ORCID: https://orcid.org/0000-0003-1739-9831, SPIN-code: 4576-9642
Ksenia A. Polyantseva, Candidate of Technical Sciences, Associate Professor of the Department of Data Mining, Moscow Technical University of Communications and Informatics;
8A, Aviamotornaya street, Moscow, 111024, Russia;
k.a.poliantseva@mtuci.ru, ORCID: https://orcid.org/0000-0002-7102-4208, SPIN-code: 8112-8560
Igor D. Razumovsky, Student, Moscow Technical University of Communications and Informatics;
8A, Aviamotornaya street, Moscow, 111024, Russia;
igor.raz@list.ru











