Train timetabling with the general learning environment and multi-agent deep reinforcement learning

Document Type

Journal Article

Publication Date


Subject Area

mode - rail, operations - coordination, operations - scheduling


Keywords

Train timetabling, Railway system, Multi-agent actor–critic algorithm, Deep reinforcement learning


Abstract

This paper proposes a multi-agent deep reinforcement learning approach to the train timetabling problem for different railway systems. A general train timetabling learning environment is constructed to model the problem as a Markov decision process, in which the objectives and complex constraints of the problem can be distributed naturally and elegantly. With subtle changes, the environment can be switched flexibly between the widely used double-track railway system and the more complex single-track railway system. To address the curse of dimensionality, a multi-agent actor–critic algorithm framework is proposed that decomposes the large combinatorial decision space into multiple independent ones, each parameterized by a deep neural network. The proposed approach was tested on a real-world instance and several test instances. Experimental results show that cooperative policies for the single-track train timetabling problem can be obtained within a reasonable computing time and outperform several prevailing methods in terms of solution optimality, and that the method generalizes easily to the double-track train timetabling problem with only slight changes to the environment.
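The paper's environment is not reproduced here; as a rough illustration of the Markov-decision-process framing the abstract describes, the toy sketch below (all class and parameter names are hypothetical, not the authors') models one agent per train on a line of stations, with a per-segment track capacity whose value switches between a single-track and a double-track system.

```python
class SingleTrackTimetablingEnv:
    """Toy multi-agent MDP sketch (illustrative only, not the paper's
    environment): one agent per train on a line of stations. Each step,
    every train either HOLDs at its station or tries to MOVE onto the
    next track segment; a segment admits at most `tracks_per_segment`
    trains, so 1 models a single-track line and 2 a double-track line,
    switching the railway system with a small change to the environment."""

    HOLD, MOVE = 0, 1

    def __init__(self, n_trains=3, n_stations=5, tracks_per_segment=1):
        self.n_trains = n_trains
        self.n_stations = n_stations
        self.capacity = tracks_per_segment
        self.reset()

    def reset(self):
        self.pos = [0] * self.n_trains  # all trains start at station 0
        return tuple(self.pos)

    def step(self, actions):
        claimed = {}                     # segment index -> trains entering it
        rewards = [0.0] * self.n_trains
        for i, a in enumerate(actions):
            seg = self.pos[i]            # segment from station pos to pos+1
            at_terminus = self.pos[i] == self.n_stations - 1
            if (a == self.MOVE and not at_terminus
                    and claimed.get(seg, 0) < self.capacity):
                claimed[seg] = claimed.get(seg, 0) + 1
                self.pos[i] += 1
            if self.pos[i] < self.n_stations - 1:
                rewards[i] -= 1.0        # delay penalty until the terminus
        done = all(p == self.n_stations - 1 for p in self.pos)
        return tuple(self.pos), rewards, done
```

Under a naive joint policy in which every agent always chooses MOVE, trains queue behind the capacity-one segments of the single-track line; this conflict is the kind of constraint the decomposed actor–critic agents would have to resolve by learning cooperative hold/move policies.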


Permission to publish the abstract has been given by Elsevier, copyright remains with them.


Transportation Research Part B Home Page: