Multi-stage deep learning approaches to predict boarding behaviour of bus passengers

Document Type

Journal Article

Publication Date


Subject Area

mode - bus, ridership - behaviour, ridership - forecasting, technology - passenger information, technology - ticketing systems


Keywords

Deep learning, Smart public transport, Boarding behaviour, Smart card data, Neural network


Abstract

Smart card data has emerged in recent years and provides a comprehensive and cheap source of information for planning and managing public transport systems. This paper presents a multi-stage machine learning framework to predict passengers’ boarding stops using smart card data. The framework addresses the challenges arising from the imbalanced nature of the data (e.g. a large share of non-travelling data) and the ‘many-class’ issue (e.g. many possible boarding stops) by decomposing the prediction of hourly ridership into three stages: whether to travel or not in a given one-hour time slot, which bus line to use, and at which stop to board. A simple neural network architecture, fully connected networks (FCN), and two deep learning architectures, recurrent neural networks (RNN) and long short-term memory networks (LSTM), are implemented. The proposed approach is applied to a real-life bus network. We show that the data imbalance has a profound impact on the accuracy of prediction at the individual level. At the aggregated level, FCN is able to accurately predict the ridership at individual stops, but it is poor at capturing the temporal distribution of ridership. RNN and LSTM are able to capture the temporal distribution but lack the ability to capture the spatial distribution through bus lines.
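The three-stage decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `predict_boarding`, the 0.5 decision threshold, the feature dictionary and the toy stage models are all assumptions made for illustration. In the paper, each stage would be served by a trained FCN, RNN or LSTM rather than the hard-coded stand-ins used here.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical stage-model signatures (assumptions, not taken from the paper).
TravelModel = Callable[[dict], float]                # P(travel in this hour)
LineModel = Callable[[dict], Dict[str, float]]       # P(line | travel)
StopModel = Callable[[dict, str], Dict[str, float]]  # P(stop | line, travel)


def predict_boarding(
    features: dict,
    travel_model: TravelModel,
    line_model: LineModel,
    stop_model: StopModel,
    threshold: float = 0.5,  # assumed cut-off for the travel/no-travel decision
) -> Optional[Tuple[str, str]]:
    """Return (line, stop) for one passenger-hour, or None if no trip is predicted."""
    # Stage 1: travel or not. Most passenger-hours contain no trip, so this
    # binary stage absorbs the class imbalance before stops are considered.
    if travel_model(features) < threshold:
        return None

    # Stage 2: choose the most probable bus line.
    line_probs = line_model(features)
    line = max(line_probs, key=line_probs.get)

    # Stage 3: choose the most probable boarding stop on that line only,
    # which shrinks the 'many-class' problem to the stops of a single line.
    stop_probs = stop_model(features, line)
    stop = max(stop_probs, key=stop_probs.get)
    return line, stop


# Toy stand-ins for trained networks (illustrative values only).
def toy_travel(features: dict) -> float:
    return 0.9 if features["hour"] in (8, 17) else 0.05


def toy_line(features: dict) -> Dict[str, float]:
    return {"L1": 0.7, "L2": 0.3}


def toy_stop(features: dict, line: str) -> Dict[str, float]:
    stops_by_line = {"L1": {"A": 0.2, "B": 0.8}, "L2": {"C": 1.0}}
    return stops_by_line[line]


peak = predict_boarding({"hour": 8}, toy_travel, toy_line, toy_stop)
night = predict_boarding({"hour": 3}, toy_travel, toy_line, toy_stop)
```

With these toy models, a peak-hour query passes stage 1 and yields a line and stop, while an off-peak query stops at stage 1 and returns `None`. Aggregated ridership at a stop could then be obtained by counting predictions across passengers and hours.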


Permission to publish the abstract has been given by Elsevier, copyright remains with them.


Sustainable Cities and Society