{"id":2514,"date":"2025-06-23T11:09:18","date_gmt":"2025-06-23T10:09:18","guid":{"rendered":"http:\/\/newskbncran.ru\/?page_id=2514"},"modified":"2026-04-13T09:36:57","modified_gmt":"2026-04-13T08:36:57","slug":"27-2-6-en","status":"publish","type":"page","link":"https:\/\/izvestiyakbncran.ru\/index.php\/en\/27-2-6-en\/","title":{"rendered":"27.2.6 en"},"content":{"rendered":"\n<h1 class=\"wp-block-heading has-lora-font-family\" style=\"font-size:24px\"><strong>On the application of reinforcement learning in the task of choosing the optimal trajectory<\/strong><\/h1>\n\n\n\n<p class=\"has-foreground-color has-text-color has-link-color has-lora-font-family has-extra-small-font-size wp-elements-09b206ebae8709aba647d2e440eaa2d7\" style=\"margin-top:0;margin-bottom:0;padding-top:0;padding-bottom:0\"><strong><strong><strong>M.G. Gorodnichev<\/strong><\/strong><\/strong><\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" style=\"margin-top:var(--wp--preset--spacing--20);margin-bottom:var(--wp--preset--spacing--20)\"\/>\n\n\n\n<div class=\"wp-block-group is-nowrap is-layout-flex wp-container-core-group-is-layout-24a27e19 wp-block-group-is-layout-flex\" style=\"margin-top:0;margin-bottom:0;padding-top:0;padding-bottom:0\">\n<p class=\"has-text-color has-link-color has-lora-font-family has-extra-small-font-size wp-elements-fa99f84d8051eb763ab85c3007cdb1c2\" style=\"color:#5b1919;text-decoration:underline\"><strong><strong>Upload the full text<\/strong><\/strong><\/p>\n\n\n\n<div class=\"wp-block-group is-vertical is-layout-flex wp-container-core-group-is-layout-9151b400 wp-block-group-is-layout-flex\" style=\"min-height:0px;margin-top:0;margin-bottom:0;padding-top:0;padding-bottom:0\">\n<div class=\"wp-block-buttons is-content-justification-left is-layout-flex wp-container-core-buttons-is-layout-15bf754d wp-block-buttons-is-layout-flex\" 
style=\"margin-top:0;margin-bottom:0;padding-top:0;padding-right:0;padding-bottom:0;padding-left:0\">\n<div class=\"wp-block-button has-custom-width wp-block-button__width-100 is-style-outline is-style-outline--1\"><a class=\"wp-block-button__link has-background-background-color has-text-color has-background has-link-color has-border-color has-small-font-size has-custom-font-size wp-element-button\" href=\"http:\/\/izvestiyakbncran.ru\/wp-content\/uploads\/2025\/06\/gorodnichev-6.pdf\" style=\"border-color:#5b1919;border-style:solid;border-width:2px;border-radius:8px;color:#5b1919;padding-top:0.4rem;padding-right:var(--wp--preset--spacing--40);padding-bottom:0.4rem;padding-left:var(--wp--preset--spacing--40)\">PDF<\/a><\/div>\n<\/div>\n\n\n\n<div style=\"height:0px;width:0px\" aria-hidden=\"true\" class=\"wp-block-spacer wp-container-content-273e683f\"><\/div>\n<\/div>\n<\/div>\n\n\n\n<p class=\"has-foreground-color has-text-color has-link-color has-lora-font-family has-extra-small-font-size wp-elements-471a2e057f9443609c6cc8e6d1239b82\" style=\"line-height:1.4\"><strong style=\"font-weight: bold;\"><em>Abstract<\/em><\/strong>. This paper reviews state-of-the-art reinforcement learning methods, with a focus on their application in dynamic and complex environments. The study begins by analysing the main approaches to reinforcement learning such as dynamic programming, Monte Carlo methods, time-difference methods and policy gradients. Special attention is given to the Generalised Adversarial Imitation Learning (GAIL) methodology and its impact on the optimisation of agents&#8217; strategies. A study of model-free learning is presented and criteria for selecting agents capable of operating in continuous action and state spaces are highlighted. The experimental part is devoted to analysing the learning of agents using different types of sensors, including visual sensors, and demonstrates their ability to adapt to the environment despite resolution constraints. 
A comparison of results based on cumulative reward and episode length is presented, revealing improved agent performance in the later stages of training. The study confirms that the use of imitation learning significantly improves agent performance by reducing time costs and improving decision-making strategies. The present work holds promise for further exploration of mechanisms for improving sensor resolution and fine-tuning hyperparameters.<\/p>\n\n\n\n<p class=\"has-foreground-color has-text-color has-link-color has-lora-font-family has-extra-small-font-size wp-elements-6a43e281d07b5d4e61f93100c9ada92c\" style=\"line-height:1.4\"><strong><strong><em>Keywords<\/em><\/strong>:<\/strong> reinforcement learning, intelligent agents, optimal trajectory, highly automated vehicles, policy-based learning, actor-critic architectures, imitation learning, sensors, continuous states, discrete states, PPO, SAC<\/p>\n\n\n\n<p class=\"has-foreground-color has-text-color has-link-color has-lora-font-family wp-elements-0d66fa00f0cbdf9dfffd3bda4d7eb791\" style=\"font-size:12px;line-height:1.4\"><strong><strong>For citation<\/strong>.<\/strong> Gorodnichev M.G. On the application of reinforcement learning in the task of choosing the optimal trajectory. <em>News of the Kabardino-Balkarian Scientific Center of RAS.<\/em> 2025. Vol. 27. No. 2. Pp. 86\u2013102. 
DOI: 10.35330\/1991-6639-2025-27-2-86-102<\/p>\n\n\n\n<details class=\"wp-block-details has-foreground-color has-text-color has-link-color has-lora-font-family has-extra-small-font-size wp-elements-7319ba6c4ce476de9ef2b7f040c65b6c is-layout-flow wp-container-core-details-is-layout-0ab540ad wp-block-details-is-layout-flow\" style=\"font-style:normal;font-weight:700;line-height:1.5\"><summary><strong>References<\/strong><\/summary>\n<ol style=\"margin-top:0;margin-bottom:0\" class=\"wp-block-list\">\n<li style=\"font-style:normal;font-weight:400\">Zhang S., Xia Q., Chen M., Cheng S. Multi-Objective Optimal Trajectory Planning for Robotic Arms Using Deep Reinforcement Learning. <em>Sensors.<\/em> 2023. Vol. 23. P. 5974. DOI: 10.3390\/s23135974<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Tamizi M.G., Yaghoubi M., Najjaran H. A review of recent trend in motion planning of industrial robots. <em>International Journal of Intelligent Robotics and Applications.<\/em> 2023. Vol. 7. Pp. 253\u2013274. DOI: 10.1007\/s41315-023-00274-2<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Kollar T., Roy N. Trajectory Optimization using Reinforcement Learning for Map Exploration. <em>International Journal of Robotics Research.<\/em> 2008. Vol. 27. No. 2. Pp. 175\u2013196. DOI: 10.1177\/0278364907087426<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Acar E.U., Choset H., Zhang Y., Schervish M. Path planning for robotic demining: robust sensor-based coverage of unstructured environments and probabilistic methods. <em>International Journal of Robotics Research.<\/em> 2003. Vol. 22. No. 7\u20138. Pp. 441\u2013466.<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Cohn D.A., Ghahramani Z., Jordan M.I. Active learning with statistical models. <em>Journal of Artificial Intelligence Research.<\/em> 1996. No. 4. Pp. 705\u2013712.<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Axhausen K. et al. Introducing MATSim. 
In: Horni A. et al. (eds.). <em>Multi-Agent Transport Simulation MATSim.<\/em> London: Ubiquity Press. 2016. Pp. 3\u20138. DOI: 10.5334\/baw.1<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Wu G., Zhang D., Miao Z., Bao W., Cao J. How to Design Reinforcement Learning Methods for the Edge: An Integrated Approach toward Intelligent Decision Making. <em>Electronics.<\/em> 2024. Vol. 13. P. 1281. DOI: 10.3390\/electronics13071281<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Zhou T., Lin M. Deadline-aware deep-recurrent-q-network governor for smart energy saving. <em>IEEE Transactions on Network Science and Engineering.<\/em> 2021. Vol. 9. Pp. 3886\u20133895. DOI: 10.1109\/TNSE.2021.3123280<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Yang Y., Wang J. An overview of multi-agent reinforcement learning from game theoretical perspective. arXiv. 2020. arXiv:2011.00583. DOI: 10.48550\/arXiv.2011.00583<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Mazyavkina N., Sviridov S., Ivanov S., Burnaev E. Reinforcement learning for combinatorial optimization: A survey. <em>Computers &amp; Operations Research.<\/em> 2021. Vol. 134. P. 105400. DOI: 10.1016\/j.cor.2021.105400<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Zhang J., Zhang Z., Han S., L\u00fc S. Proximal policy optimization via enhanced exploration efficiency. <em>Information Sciences.<\/em> 2022. Vol. 609. Pp. 750\u2013765. ISSN 0020-0255. DOI: 10.1016\/j.ins.2022.07.111<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Hessel M., Modayil J., van Hasselt H., Schaul T. et al. Rainbow: Combining improvements in deep reinforcement learning. <em>In AAAI Conference on Artificial Intelligence<\/em>. 2018. Pp. 3215\u20133222. DOI: 10.1609\/aaai.v32i1.11796<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Haarnoja T., Zhou A., Abbeel P., Levine S. 
Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. <em>In International Conference on Machine Learning<\/em>. 2018. Pp. 1856\u20131865. DOI: 10.48550\/arXiv.1801.01290<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Lillicrap T.P., Hunt J.J., Pritzel A. et al. Continuous control with deep reinforcement learning. arXiv:1509.02971v1. 2015.<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Chen Y., Lam C.T., Pau G., Ke W. From Virtual to Reality: A Deep Reinforcement Learning Solution to Implement Autonomous Driving with 3D-LiDAR. <em>Applied Sciences.<\/em> 2025. Vol. 15. No. 3. P. 1423. DOI: 10.3390\/app15031423<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Zuo G., Chen K., Lu J., Huang X. Deterministic generative adversarial imitation learning. <em>Neurocomputing.<\/em> 2020. Vol. 388. Pp. 60\u201369. ISSN 0925-2312. DOI: 10.1016\/j.neucom.2020.01.016<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Sawada R. Automatic Collision Avoidance Using Deep Reinforcement Learning with Grid Sensor. In: Sato H., Iwanaga S., Ishii A. (eds.). <em>Proceedings of the 23rd Asia Pacific Symposium on Intelligent and Evolutionary Systems.<\/em> IES 2019. Proceedings in Adaptation, Learning and Optimization. Springer, Cham. 2020. Vol. 12. Pp. 17\u201332. DOI: 10.1007\/978-3-030-37442-6_3<\/li>\n\n\n\n<li style=\"font-style:normal;font-weight:400\">Hachaj T., Piekarczyk M. On Explainability of Reinforcement Learning-Based Machine Learning Agents Trained with Proximal Policy Optimization That Utilizes Visual Sensor Data. <em>Applied Sciences.<\/em> 2025. Vol. 15. No. 2. P. 538. 
DOI: 10.3390\/app15020538<\/li>\n<\/ol>\n<\/details>\n\n\n\n<details class=\"wp-block-details has-foreground-color has-text-color has-link-color has-lora-font-family has-extra-small-font-size wp-elements-25ac351882d6c1b161389507e869321b is-layout-flow wp-container-core-details-is-layout-5dafc681 wp-block-details-is-layout-flow\" style=\"font-style:normal;font-weight:700;line-height:1.5\"><summary><strong>Information about the author<\/strong><\/summary>\n<div class=\"wp-block-group is-vertical is-layout-flex wp-container-core-group-is-layout-b291ae12 wp-block-group-is-layout-flex\" style=\"min-height:0px;margin-top:0;margin-bottom:0;padding-top:var(--wp--preset--spacing--20);padding-right:var(--wp--preset--spacing--40);padding-bottom:var(--wp--preset--spacing--20);padding-left:var(--wp--preset--spacing--40)\">\n<p style=\"font-style:normal;font-weight:400\"><strong>Mikhail G. Gorodnichev, <\/strong>Candidate of Engineering Sciences, Associate Professor, Dean of the Faculty of Information Technology, Moscow Technical University of Communications and Informatics;<\/p>\n\n\n\n<p style=\"font-style:normal;font-weight:400\">111024, Russia, Moscow, 8A Aviamotornaya street;<\/p>\n\n\n\n<p style=\"font-style:normal;font-weight:400\">m.g.gorodnichev@mtuci.ru, ORCID: https:\/\/orcid.org\/0000-0003-1739-9831, SPIN-code: 4576-9642<\/p>\n<\/div>\n<\/details>\n","protected":false},"excerpt":{"rendered":"<p>On the application of reinforcement learning in the task of choosing the optimal trajectory M.G. Gorodnichev Download the full text Abstract. This paper reviews state-of-the-art reinforcement learning methods, with a focus on their application in dynamic and complex environments. 
The study begins by analysing the main approaches to reinforcement learning such as dynamic programming, Monte [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"wp-custom-template-home","meta":{"footnotes":""},"class_list":["post-2514","page","type-page","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.5 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>27.2.6 en - \u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/izvestiyakbncran.ru\/index.php\/en\/27-2-6-en\/\" \/>\n<meta property=\"og:locale\" content=\"ru_RU\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"27.2.6 en - \u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb\" \/>\n<meta property=\"og:description\" content=\"On the application of reinforcement learning in the task of choosing the optimal trajectory M.G. Gorodnichev Upload the full text Abstract. This paper reviews state-of-the-art reinforcement learning methods, with a focus on their application in dynamic and complex environments. 
The study begins by analysing the main approaches to reinforcement learning such as dynamic programming, Monte [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/izvestiyakbncran.ru\/index.php\/en\/27-2-6-en\/\" \/>\n<meta property=\"og:site_name\" content=\"\u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb\" \/>\n<meta property=\"article:modified_time\" content=\"2026-04-13T08:36:57+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"\u041f\u0440\u0438\u043c\u0435\u0440\u043d\u043e\u0435 \u0432\u0440\u0435\u043c\u044f \u0434\u043b\u044f \u0447\u0442\u0435\u043d\u0438\u044f\" \/>\n\t<meta name=\"twitter:data1\" content=\"5 \u043c\u0438\u043d\u0443\u0442\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/index.php\\\/en\\\/27-2-6-en\\\/\",\"url\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/index.php\\\/en\\\/27-2-6-en\\\/\",\"name\":\"27.2.6 en - \u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 
\u0420\u0410\u041d\u00bb\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/#website\"},\"datePublished\":\"2025-06-23T10:09:18+00:00\",\"dateModified\":\"2026-04-13T08:36:57+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/index.php\\\/en\\\/27-2-6-en\\\/#breadcrumb\"},\"inLanguage\":\"ru-RU\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/izvestiyakbncran.ru\\\/index.php\\\/en\\\/27-2-6-en\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/index.php\\\/en\\\/27-2-6-en\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"\u0413\u043b\u0430\u0432\u043d\u0430\u044f \u0441\u0442\u0440\u0430\u043d\u0438\u0446\u0430\",\"item\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"27.2.6 en\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/#website\",\"url\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/\",\"name\":\"\u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb\",\"description\":\"\u041d\u0430\u0443\u0447\u043d\u044b\u0439 \u0436\u0443\u0440\u043d\u0430\u043b\",\"publisher\":{\"@id\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ru-RU\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/#organization\",\"name\":\"\u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f 
\u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb\",\"url\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ru-RU\",\"@id\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/wp-content\\\/uploads\\\/2025\\\/07\\\/oblozhka-zhurnala-na-angl-scaled.jpg\",\"contentUrl\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/wp-content\\\/uploads\\\/2025\\\/07\\\/oblozhka-zhurnala-na-angl-scaled.jpg\",\"width\":1828,\"height\":2560,\"caption\":\"\u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb\"},\"image\":{\"@id\":\"https:\\\/\\\/izvestiyakbncran.ru\\\/#\\\/schema\\\/logo\\\/image\\\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"27.2.6 en - \u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/izvestiyakbncran.ru\/index.php\/en\/27-2-6-en\/","og_locale":"ru_RU","og_type":"article","og_title":"27.2.6 en - \u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb","og_description":"On the application of reinforcement learning in the task of choosing the optimal trajectory M.G. Gorodnichev Upload the full text Abstract. This paper reviews state-of-the-art reinforcement learning methods, with a focus on their application in dynamic and complex environments. 
The study begins by analysing the main approaches to reinforcement learning such as dynamic programming, Monte [&hellip;]","og_url":"https:\/\/izvestiyakbncran.ru\/index.php\/en\/27-2-6-en\/","og_site_name":"\u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb","article_modified_time":"2026-04-13T08:36:57+00:00","twitter_card":"summary_large_image","twitter_misc":{"\u041f\u0440\u0438\u043c\u0435\u0440\u043d\u043e\u0435 \u0432\u0440\u0435\u043c\u044f \u0434\u043b\u044f \u0447\u0442\u0435\u043d\u0438\u044f":"5 \u043c\u0438\u043d\u0443\u0442"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/izvestiyakbncran.ru\/index.php\/en\/27-2-6-en\/","url":"https:\/\/izvestiyakbncran.ru\/index.php\/en\/27-2-6-en\/","name":"27.2.6 en - \u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb","isPartOf":{"@id":"https:\/\/izvestiyakbncran.ru\/#website"},"datePublished":"2025-06-23T10:09:18+00:00","dateModified":"2026-04-13T08:36:57+00:00","breadcrumb":{"@id":"https:\/\/izvestiyakbncran.ru\/index.php\/en\/27-2-6-en\/#breadcrumb"},"inLanguage":"ru-RU","potentialAction":[{"@type":"ReadAction","target":["https:\/\/izvestiyakbncran.ru\/index.php\/en\/27-2-6-en\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/izvestiyakbncran.ru\/index.php\/en\/27-2-6-en\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"\u0413\u043b\u0430\u0432\u043d\u0430\u044f 
\u0441\u0442\u0440\u0430\u043d\u0438\u0446\u0430","item":"https:\/\/izvestiyakbncran.ru\/"},{"@type":"ListItem","position":2,"name":"27.2.6 en"}]},{"@type":"WebSite","@id":"https:\/\/izvestiyakbncran.ru\/#website","url":"https:\/\/izvestiyakbncran.ru\/","name":"\u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb","description":"\u041d\u0430\u0443\u0447\u043d\u044b\u0439 \u0436\u0443\u0440\u043d\u0430\u043b","publisher":{"@id":"https:\/\/izvestiyakbncran.ru\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/izvestiyakbncran.ru\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ru-RU"},{"@type":"Organization","@id":"https:\/\/izvestiyakbncran.ru\/#organization","name":"\u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e \u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb","url":"https:\/\/izvestiyakbncran.ru\/","logo":{"@type":"ImageObject","inLanguage":"ru-RU","@id":"https:\/\/izvestiyakbncran.ru\/#\/schema\/logo\/image\/","url":"https:\/\/izvestiyakbncran.ru\/wp-content\/uploads\/2025\/07\/oblozhka-zhurnala-na-angl-scaled.jpg","contentUrl":"https:\/\/izvestiyakbncran.ru\/wp-content\/uploads\/2025\/07\/oblozhka-zhurnala-na-angl-scaled.jpg","width":1828,"height":2560,"caption":"\u0418\u0417\u0412\u0415\u0421\u0422\u0418\u042f \u041a\u0410\u0411\u0410\u0420\u0414\u0418\u041d\u041e-\u0411\u0410\u041b\u041a\u0410\u0420\u0421\u041a\u041e\u0413\u041e \u041d\u0410\u0423\u0427\u041d\u041e\u0413\u041e 
\u0426\u0415\u041d\u0422\u0420\u0410 \u0420\u0410\u041d\u00bb"},"image":{"@id":"https:\/\/izvestiyakbncran.ru\/#\/schema\/logo\/image\/"}}]}},"_links":{"self":[{"href":"https:\/\/izvestiyakbncran.ru\/index.php\/wp-json\/wp\/v2\/pages\/2514","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/izvestiyakbncran.ru\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/izvestiyakbncran.ru\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/izvestiyakbncran.ru\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/izvestiyakbncran.ru\/index.php\/wp-json\/wp\/v2\/comments?post=2514"}],"version-history":[{"count":3,"href":"https:\/\/izvestiyakbncran.ru\/index.php\/wp-json\/wp\/v2\/pages\/2514\/revisions"}],"predecessor-version":[{"id":11374,"href":"https:\/\/izvestiyakbncran.ru\/index.php\/wp-json\/wp\/v2\/pages\/2514\/revisions\/11374"}],"wp:attachment":[{"href":"https:\/\/izvestiyakbncran.ru\/index.php\/wp-json\/wp\/v2\/media?parent=2514"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}