Imputing qualitative attributes for trip chains extracted from smart card data using a conditional generative adversarial network

Document Type

Journal Article

Publication Date


Subject Area

technology - passenger information, technology - ticketing systems, ridership - modelling, ridership - behaviour

Keywords

Smart cards, Sociodemographic attributes, Trip purpose, Activity-based model, Conditional generative adversarial networks, Trip chain

Abstract

A travel diary survey (TDS) collects comprehensive attributes, such as sociodemographic attributes, trip purpose, and trip chain attributes, for the trips taken by a small portion of the general population every few years. Meanwhile, smart cards (SC) record the time-space movements of all transit passengers on a daily basis, and these movements represent trip chain attributes. However, qualitative attributes such as sociodemographic attributes and trip purpose remain unknown in SC data. This study proposes a novel method to estimate the trip purpose and sociodemographic attributes of SC data by using a conditional generative adversarial network (CGAN). With an efficient network structure that considers the spatial and sequential dependencies of a trip chain, the proposed CGAN imputes the sociodemographic attributes and trip purpose of SC data by mimicking the TDS data through adversarial training. Unlike other generative models, the proposed CGAN can also fully leverage the large-scale SC data in training, which helps it avoid overfitting to the TDS data. Evaluation results show that the proposed CGAN outperforms other generative models, namely Bayesian networks, a Gibbs sampler, a conventional CGAN, and a conditional variational autoencoder. Both the network structure, which accounts for spatial and sequential dependencies, and the modeling framework, which utilizes the large-scale SC data, contributed considerable performance improvements. Several metrics were employed to evaluate fidelity, diversity, and creativity, and these verified that the existing generative models tend to overfit the marginal or bivariate distributions of the training data. The proposed method addresses the incompleteness of passively collected mobile data. The imputed SC data, which represent comprehensive travel attributes, are valuable for activity-based models and behavior analysis because they provide travel information that is more dynamic, continuous, and extensive.
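
The abstract refers to comparing generated data against the marginal distributions of the training data but does not name the specific metrics. As a minimal sketch of one common fidelity check, and not necessarily the measure used in the article, the total variation distance between the marginal distribution of a categorical attribute in the survey records and in the imputed records could be computed as follows. The function names and the sample trip-purpose labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def marginal_distribution(values):
    """Return the empirical probability of each category in a non-empty list."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def total_variation_distance(reference, generated):
    """Half the L1 distance between two empirical categorical distributions.

    0.0 means the marginals match exactly; 1.0 means they share no category.
    """
    p = marginal_distribution(reference)
    q = marginal_distribution(generated)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Hypothetical trip-purpose labels (assumed data, not from the article):
# survey records versus imputed smart card records.
survey_purposes = ["work", "work", "school", "shopping"]
imputed_purposes = ["work", "school", "school", "shopping"]

tvd = total_variation_distance(survey_purposes, imputed_purposes)
print(f"Total variation distance: {tvd:.2f}")
```

A low distance on marginals alone does not show that a model generalizes. As the abstract notes, a model can match marginal or bivariate distributions by overfitting the training data, which is why diversity and creativity are evaluated alongside fidelity.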


Permission to publish the abstract has been given by Elsevier, copyright remains with them.


Transportation Research Part C Home Page: