On the application of reinforcement learning to the problem of choosing an optimal trajectory