<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article>
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ali="http://www.niso.org/schemas/ali/1.0/" article-type="research-article" dtd-version="1.2" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">News of the Kabardino-Balkarian Scientific Center of the Russian Academy of Sciences</journal-id><journal-title-group><journal-title xml:lang="en">News of the Kabardino-Balkarian Scientific Center of the Russian Academy of Sciences</journal-title><trans-title-group xml:lang="ru"><trans-title>Известия Кабардино-Балкарского научного центра РАН</trans-title></trans-title-group></journal-title-group><issn publication-format="print">1991-6639</issn><issn publication-format="electronic">2949-1940</issn></journal-meta><article-meta><article-id pub-id-type="publisher-id">294372</article-id><article-id pub-id-type="doi">10.35330/1991-6639-2025-27-2-11-22</article-id><article-id pub-id-type="edn">EWHPZV</article-id><article-categories><subj-group subj-group-type="toc-heading" xml:lang="en"><subject>System analysis, management and information processing</subject></subj-group><subj-group subj-group-type="toc-heading" xml:lang="ru"><subject>Системный анализ, управление и обработка информации</subject></subj-group><subj-group subj-group-type="article-type"><subject>Research Article</subject></subj-group></article-categories><title-group><article-title xml:lang="en">Building a machine learning model for predicting fraudulent transactions</article-title><trans-title-group xml:lang="ru"><trans-title>Построение модели машинного обучения для прогнозирования мошеннических транзакций</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author"><contrib-id contrib-id-type="orcid">https://orcid.org/0009-0000-9591-3301</contrib-id><contrib-id contrib-id-type="spin">3088-3121</contrib-id><name-alternatives><name 
xml:lang="en"><surname>Konstantinov</surname><given-names>Alexey F.</given-names></name><name xml:lang="ru"><surname>Константинов</surname><given-names>Алексей Федорович</given-names></name></name-alternatives><address><country country="RU">Russian Federation</country></address><bio xml:lang="en"><p>Postgraduate Student at the Department of Informatics</p></bio><bio xml:lang="ru"><p>аспирант кафедры информатики</p></bio><email>konstantinovaf@gmail.com</email><xref ref-type="aff" rid="aff1"/></contrib><contrib contrib-type="author"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-5229-8070</contrib-id><contrib-id contrib-id-type="spin">2513-8831</contrib-id><name-alternatives><name xml:lang="ru"><surname>Дьяконова</surname><given-names>Людмила Павловна</given-names></name><name xml:lang="en"><surname>Dyakonova</surname><given-names>Lyudmila P.</given-names></name></name-alternatives><address><country country="RU">Russian Federation</country></address><bio xml:lang="en"><p>Candidate of Physical and Mathematical Sciences, Associate Professor at the Department of Informatics</p></bio><bio xml:lang="ru"><p>канд. физ.-мат. наук, доцент кафедры информатики</p></bio><email>Dyakonova.LP@rea.ru</email><xref ref-type="aff" rid="aff1"/></contrib></contrib-group><aff-alternatives id="aff1"><aff><institution xml:lang="en">Plekhanov Russian University of Economics</institution></aff><aff><institution xml:lang="ru">Российский экономический университет имени Г. В. 
Плеханова</institution></aff></aff-alternatives><content-language>ru</content-language><pub-date date-type="pub" iso-8601-date="2025-06-11" publication-format="electronic"><day>11</day><month>06</month><year>2025</year></pub-date><pub-date date-type="collection"><year>2025</year></pub-date><volume>27</volume><issue>2</issue><issue-title xml:lang="en"/><issue-title xml:lang="ru"/><fpage>11</fpage><lpage>22</lpage><history><date date-type="received" iso-8601-date="2025-05-30"><day>30</day><month>05</month><year>2025</year></date><date date-type="accepted" iso-8601-date="2025-05-30"><day>30</day><month>05</month><year>2025</year></date></history><permissions><copyright-statement xml:lang="en">Copyright © 2025, Konstantinov A.F., Dyakonova L.P.</copyright-statement><copyright-statement xml:lang="ru">Copyright © 2025, Константинов А.Ф., Дьяконова Л.П.</copyright-statement><copyright-year>2025</copyright-year><copyright-holder xml:lang="en">Konstantinov A.F., Dyakonova L.P.</copyright-holder><copyright-holder xml:lang="ru">Константинов А.Ф., Дьяконова Л.П.</copyright-holder><ali:free_to_read xmlns:ali="http://www.niso.org/schemas/ali/1.0/"/><license><ali:license_ref xmlns:ali="http://www.niso.org/schemas/ali/1.0/">https://creativecommons.org/licenses/by/4.0</ali:license_ref></license></permissions><self-uri xlink:href="https://journals.rcsi.science/1991-6639/article/view/294372">https://journals.rcsi.science/1991-6639/article/view/294372</self-uri><abstract xml:lang="en"><p>The article presents the development of a machine learning model for predicting fraudulent transactions, using a bank's transactional data as an example. It discusses the specifics of encoding categorical variables that arise from the time dimension of transactional data, with the aim of avoiding information leakage. Experiments were also conducted on applying bagging (bootstrap aggregating) and on creating additional variables based on feature contributions to the final prediction, measured with Shapley values. 
The quality metrics of the machine learning model are examined and analyzed.</p></abstract><trans-abstract xml:lang="ru"><p>В статье представлена разработка модели машинного обучения для прогнозирования мошеннических транзакций на примере транзакционных данных банка. Рассмотрены особенности кодирования категориальных переменных, связанные с наличием времени в транзакционных данных, чтобы избежать утечек информации. Проведены эксперименты по применению баггинга (bootstrap aggregating) и созданию дополнительных переменных на основе их вклада в итоговый прогноз с применением Shapley values. Рассмотрены показатели качества модели машинного обучения и проведен их анализ.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>мошеннические транзакции</kwd><kwd>catboost</kwd><kwd>кодирование категориальных переменных</kwd><kwd>catboost_encoder</kwd><kwd>target_encoder</kwd><kwd>bagging</kwd><kwd>создание переменных</kwd><kwd>Shapley values</kwd></kwd-group><kwd-group xml:lang="en"><kwd>fraudulent transactions</kwd><kwd>catboost</kwd><kwd>encoding categorical variables</kwd><kwd>catboost_encoder</kwd><kwd>target_encoder</kwd><kwd>bagging</kwd><kwd>variables creation</kwd><kwd>Shapley values</kwd></kwd-group><funding-group/></article-meta></front><body></body><back><ref-list><ref id="B1"><label>1.</label><mixed-citation>Mashrur A., Luo W., Zaidi N.A., Robles-Kelly A. Machine Learning for Financial Risk Management: A Survey. IEEE Access. 2020. Vol. 8. Pp. 203203–203223. DOI: 10.1109/ACCESS.2020.3036322</mixed-citation></ref><ref id="B2"><label>2.</label><mixed-citation>Awosika T., Shukla R.M., Pranggono B. Transparency and Privacy: The Role of Explainable AI and Federated Learning in Financial Fraud Detection. IEEE Access. 2024. Vol. 12. Pp. 64551–64560. DOI: 10.1109/ACCESS.2024.3394528</mixed-citation></ref><ref id="B3"><label>3.</label><mixed-citation>McMahan B., Moore E., Ramage D. et al. Communication-efficient learning of deep networks from decentralized data. 
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. 2017. Vol. 54. Pp. 1273–1282. DOI: 10.48550/arXiv.1602.05629</mixed-citation></ref><ref id="B4"><label>4.</label><mixed-citation>Ali A.A., Khedr A.M., El-Bannany M., Kanakkayil S. A Powerful Predicting Model for Financial Statement Fraud Based on Optimized XGBoost Ensemble Learning Technique. Applied Sciences. 2023. Vol. 13. No. 4. P. 2272. DOI: 10.3390/app13042272</mixed-citation></ref><ref id="B5"><label>5.</label><mixed-citation>He K., Yang Q., Ji L. et al. Financial Time Series Forecasting with the Deep Learning Ensemble Model. Mathematics. 2023. Vol. 11. No. 4. P. 1054. DOI: 10.3390/math11041054</mixed-citation></ref><ref id="B6"><label>6.</label><mixed-citation>Prokhorenkova L., Gusev G., Vorobev A. et al. CatBoost: unbiased boosting with categorical features. NIPS'18: Proceedings of the 32nd International Conference on Neural Information Processing Systems. 2018. Pp. 6639–6649. DOI: 10.48550/arXiv.1706.09516</mixed-citation></ref><ref id="B7"><label>7.</label><mixed-citation>Micci-Barreca D. A Preprocessing Scheme for High-Cardinality Categorical Attributes in Classification and Prediction Problems. ACM SIGKDD Explorations Newsletter. 2001. Vol. 3. No. 1. Pp. 27–32. DOI: 10.1145/507533.507538</mixed-citation></ref><ref id="B8"><label>8.</label><mixed-citation>Dorogush A.V., Ershov V., Gulin A. CatBoost: gradient boosting with categorical features support. Workshop on ML Systems at NIPS. 2017. DOI: 10.48550/arXiv.1810.11363</mixed-citation></ref><ref id="B9"><label>9.</label><mixed-citation>Breiman L. Bagging predictors. Machine Learning. 1996. Vol. 24. No. 2. Pp. 123–140. DOI: 10.1007/BF00058655</mixed-citation></ref><ref id="B10"><label>10.</label><mixed-citation>Official CatBoost website. Common parameters. 
Available at: https://catboost.ai/en/docs/references/training-parameters/common#bagging_temperature (accessed: January 10, 2025)</mixed-citation></ref><ref id="B11"><label>11.</label><mixed-citation>Shapley L. Notes on the n-person game, II: the value of an n-person game. 1951.</mixed-citation></ref><ref id="B12"><label>12.</label><mixed-citation>Official SHAP library website. Available at: https://shap.readthedocs.io/en/latest/example_notebooks/tabular_examples/tree_based_models/Catboost%20tutorial.html (accessed: January 10, 2025)</mixed-citation></ref><ref id="B13"><label>13.</label><mixed-citation>Brier G.W. Verification of forecasts expressed in terms of probability. Monthly Weather Review. 1950. Vol. 78. No. 1. Pp. 1–3. Bibcode: 1950MWRv...78....1B. DOI: 10.1175/1520-0493(1950)078&lt;0001:VOFEIT&gt;2.0.CO;2</mixed-citation></ref><ref id="B14"><label>14.</label><mixed-citation>Akiba T., Sano S., Yanase T. et al. Optuna: A Next-generation Hyperparameter Optimization Framework. KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining. 2019. Pp. 2623–2631. DOI: 10.1145/3292500.3330701</mixed-citation></ref></ref-list></back></article>
