Effects of buffer size on associations between the built environment and metro ridership: A machine learning-based sensitive analysis

Document Type

Journal Article

Publication Date


Subject Area

mode - subway/metro, land use - impacts, land use - planning, ridership - demand, ridership - forecasting, infrastructure - station


Metro, station catchment, ridership


Uncertainty about the relevant buffer size of metro station catchment areas may drive inconsistencies in findings on the built environment and metro ridership. Although previous studies have estimated the effect of this uncertainty, the results are far from definitive. Using finer-grained big data and non-parametric machine learning approaches, this study conducted a sensitivity analysis of associations with metro ridership, defining built environment factors within four radial buffer sizes: 300 m, 600 m, 800 m, and 1000 m. The results suggest that: (1) buffer size has little influence on the predictive power of the ordinary least-squares model but a significant influence on that of the machine learning model; (2) a 600 m buffer around the transit station yields the best model fit and explained variation among the four sizes; (3) findings on the relative importance, ranks, and nonlinear associations with metro ridership can change as the chosen buffer size deviates from the true relevant geographic context of the built environment variables. The results assist planners in setting a benchmark for metro catchment areas for station-area planning and demand forecasting. More importantly, the findings highlight the importance of meticulously selecting the analytical spatial unit for area-based variables, especially when using non-parametric machine learning approaches in research.
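As a rough illustration of what defining a built environment factor within several radial buffers can look like, the sketch below counts points of interest inside each of the four radii named in the abstract. It is not the authors' method or data: the function name count_within_buffers, the station and point coordinates, and the use of straight-line distance in a projected, metre-based coordinate system are all assumptions made for the example.

```python
import math

# The four radial buffer sizes named in the abstract, in metres.
BUFFER_RADII_M = (300, 600, 800, 1000)


def count_within_buffers(station, pois, radii=BUFFER_RADII_M):
    """Count points of interest inside each radial buffer around a station.

    Assumption: coordinates are in a projected system with metre units,
    so straight-line distance is a plain Euclidean norm.
    """
    sx, sy = station
    distances = [math.hypot(px - sx, py - sy) for px, py in pois]
    # Buffers are nested, so a point inside 300 m is also inside 600 m, etc.
    return {r: sum(1 for d in distances if d <= r) for r in radii}


# Made-up coordinates, for illustration only.
station = (0.0, 0.0)
pois = [
    (100.0, 50.0),
    (400.0, 300.0),
    (0.0, 700.0),
    (900.0, 300.0),
    (1200.0, 0.0),
]

counts = count_within_buffers(station, pois)
print(counts)
```

The same station yields a different value of the variable at each radius, which is why a model fitted on the 300 m counts may rank or weight a factor differently from one fitted on the 1000 m counts.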


Permission to publish the abstract has been given by Elsevier, copyright remains with them.


Journal of Transport Geography Home Page: