Application of machine learning method to analyse incomplete data
L.A. Lyutikova
Abstract. This paper presents an integrated approach to analyzing incomplete and inaccurate data, illustrated by mudflow forecasting. The aim of the study is to demonstrate how combining different methods makes it possible not only to obtain adequate forecasts but also to understand the logic of the model's decision-making and to identify the key factors that influence the forecast. A key element of the work is the categorization of numerical data, which makes the models more robust to outliers and noise and allows nonlinear dependencies to be taken into account. The integrated approach combines associative data analysis with the construction of a logical classifier, which acts as an interpreter of the resulting decisions. This combination made it possible to identify critical input features, to understand how the model uses information to form a forecast, to determine which factors have the greatest impact on the forecast result, and to ensure accurate and stable forecasts given the specificity and complexity of mudflow data. The rules obtained during the study express key principles of the studied domain and contribute to a deeper understanding of the nature of mudflows.
Keywords: machine learning, neural networks, cluster analysis, associative rules
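The two ideas named in the abstract, categorization of numerical data and associative analysis, can be illustrated with a minimal Python sketch. It is not the author's implementation: the records, the feature names (precipitation, slope), the bin edges, and the helper functions categorize and rule_stats are all hypothetical and chosen only for illustration. The sketch bins numeric values into categories, so that an extreme outlier falls into the same "high" bin as ordinary large values, and then computes the support and confidence of a candidate rule.

```python
from bisect import bisect_right

def categorize(value, edges, labels):
    """Map a numeric value to a category label using sorted bin edges."""
    return labels[bisect_right(edges, value)]

# Hypothetical toy records: (precipitation_mm, slope_deg, mudflow_occurred).
# These values are invented for illustration, not taken from the paper.
records = [
    (5.0, 10.0, False),
    (12.0, 35.0, False),
    (48.0, 32.0, True),
    (55.0, 40.0, True),
    (900.0, 38.0, True),  # outlier: still lands in the "high" bin
    (8.0, 33.0, False),
]

# Assumed bin edges and labels (hypothetical thresholds).
PRECIP_EDGES, PRECIP_LABELS = [10.0, 30.0], ["low", "medium", "high"]
SLOPE_EDGES, SLOPE_LABELS = [15.0, 30.0], ["gentle", "moderate", "steep"]

def to_items(record):
    """Turn one numeric record into a set of categorical items."""
    precip, slope, mudflow = record
    return {
        "precip=" + categorize(precip, PRECIP_EDGES, PRECIP_LABELS),
        "slope=" + categorize(slope, SLOPE_EDGES, SLOPE_LABELS),
        "mudflow=" + ("yes" if mudflow else "no"),
    }

transactions = [to_items(r) for r in records]

def rule_stats(antecedent, consequent, transactions):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n_total = len(transactions)
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n_total
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

support, confidence = rule_stats(
    {"precip=high", "slope=steep"}, {"mudflow=yes"}, transactions
)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

On this toy data, the rule "high precipitation and steep slope implies mudflow" is more confident than "steep slope implies mudflow" alone, which is the kind of readable statement a rule-based interpreter can surface. The logical classifier described in the abstract is not reproduced here.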
For citation. Lyutikova L.A. Application of machine learning method to analyse incomplete data. News of the Kabardino-Balkarian Scientific Center of RAS. 2024. Vol. 26. No. 6. Pp. 139–145. DOI: 10.35330/1991-6639-2024-26-6-139-145
Information about the author
Larisa A. Lyutikova, Candidate of Physical and Mathematical Sciences, Head of the Department of
Neural Networks and Machine Learning, Institute of Applied Mathematics and Automation – branch of
Kabardino-Balkarian Scientific Center of the Russian Academy of Sciences;
360000, Russia, Nalchik, 89 A Shortanov street;
lylarisa@yandex.ru, ORCID: https://orcid.org/0000-0002-5819-9396, SPIN-code: 1679-7460











