Title: Towards a Better Understanding of Public Transportation Traffic: A Case Study of the Washington, DC Metro
The problem of traffic prediction is paramount in a plethora of applications, ranging from individual trip planning to urban planning. Existing work mainly focuses on traffic prediction on road networks. Yet, public transportation contributes a significant portion to overall human mobility and passenger volume. For example, the Washington, DC metro has on average 600,000 passengers on a weekday. In this work, we address the problem of modeling, classifying and predicting such passenger volume in public transportation systems. We study the case of the Washington, DC metro, exploring fare card data, and specifically passenger in- and outflow at stations. To reduce the dimensionality of the data, we apply principal component analysis to extract latent features for different stations and for different calendar days. Our unsupervised clustering results demonstrate that these latent features are highly discriminative. They allow us to derive different station types (residential, commercial, and mixed) and to effectively classify and identify the passenger flow of “unknown” stations. Finally, we also show that this classification can be applied to predict the passenger volume at stations. By learning latent features of stations for some time, we are able to predict the flow for the following hours. Extensive experimentation using a baseline neural network and two naïve periodicity approaches shows a considerable accuracy improvement when using the latent feature based approach.
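The pipeline described in the abstract (station flow profiles, principal component analysis, unsupervised clustering into station types) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the function names `make_synthetic_flows`, `pca_features`, and `kmeans` are hypothetical, and k-means is assumed here only as a stand-in because the abstract does not name the clustering method.

```python
import numpy as np


def make_synthetic_flows(rng, n_per_type=10, hours=24):
    """Build hypothetical hourly inflow profiles for two station types."""
    t = np.arange(hours)
    morning = np.exp(-0.5 * ((t - 8) / 1.5) ** 2)   # peak near 08:00
    evening = np.exp(-0.5 * ((t - 17) / 1.5) ** 2)  # peak near 17:00
    residential = 1.0 * morning + 0.3 * evening     # most entries in the morning
    commercial = 0.3 * morning + 1.0 * evening      # most entries in the evening
    profiles = np.vstack([
        np.tile(residential, (n_per_type, 1)),
        np.tile(commercial, (n_per_type, 1)),
    ])
    noise = rng.normal(scale=0.05, size=profiles.shape)
    labels = np.repeat([0, 1], n_per_type)
    return profiles + noise, labels


def pca_features(X, k):
    """Project rows of X (stations x hours) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                          # center each hour column
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * S[:k]                          # latent features per station


def kmeans(Z, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        new_centers = np.array([
            Z[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign, centers


X, true_types = make_synthetic_flows(np.random.default_rng(0))
Z = pca_features(X, k=2)          # two latent features per station
clusters, _ = kmeans(Z, k=2)      # unsupervised station types
print(clusters)
```

In this toy setting the two synthetic station types differ strongly in their morning and evening peaks, so the first latent feature alone is expected to separate them. Real fare card data would likely need more components and a third "mixed" type, as the abstract indicates.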
Award ID(s):
1637541
PAR ID:
10110163
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Urban Science
Volume:
2
Issue:
3
ISSN:
2413-8851
Page Range / eLocation ID:
65
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Urban public transit planning is crucial in reducing traffic congestion and enabling green transportation. However, there is no systematic way to integrate passengers' personal preferences in planning public transit routes and schedules so as to achieve high occupancy rates and efficiency gain of ride-sharing. In this paper, we take the first step to extract passengers' preferences in planning from historical public transit data. We propose a data-driven method to construct a Markov decision process model that characterizes the process of passengers making sequential public transit choices, in bus routes, subway lines, and transfer stops/stations. Using the model, we integrate softmax policy iteration into maximum entropy inverse reinforcement learning to infer the passenger's reward function from observed trajectory data. The inferred reward function will enable an urban planner to predict passengers' route planning decisions given some proposed transit plans, for example, opening a new bus route or subway line. Finally, we demonstrate the correctness and accuracy of our modeling and inference methods on large-scale (three months) passenger-level public transit trajectory data from Shenzhen, China. Our method contributes to smart transportation design and human-centric urban planning.
  2. Effective road traffic assessment and estimation is crucial not only for traffic management applications, but also for long-term transportation and, more generally, urban planning. Traditionally, this task has been achieved by using a network of stationary traffic count sensors. These costly and unreliable sensors have been replaced with so-called Probe Vehicle Data (PVD), which relies on sampling individual vehicles in traffic using, for example, smartphones to assess the overall traffic condition. While PVD provides uniform road network coverage, it does not capture the actual traffic flow. On the other hand, stationary sensors capture the absolute traffic flow only at discrete locations. Furthermore, these sensors are often unreliable; temporary malfunctions create gaps in their time-series of measurements. This work bridges the gap between these two data sources by learning the time-dependent fraction of vehicles captured by GPS-based probe data at discrete stationary sensor locations. We can then account for the gaps of the traffic-loop measurements by using the PVD data to estimate the actual total flow. In this work, we show that the PVD flow capture changes significantly over time in the Washington DC area. Exploiting this information, we are able to derive tight confidence intervals of the traffic volume for areas with no stationary sensor coverage.
  3. Individual passenger travel patterns have significant value in understanding passenger behavior, such as learning the hidden clusters of locations, time, and passengers. The learned clusters further enable commercially beneficial actions such as customized services, promotions, data-driven urban-use planning, peak hour discovery, and so on. However, individualized passenger modeling is very challenging for the following reasons: 1) individual passenger travel data are multi-dimensional spatiotemporal big data, including at least the origin, destination, and time dimensions; 2) individualized passenger travel patterns usually depend on the external environment, such as the distances and functions of locations, which are ignored in most current works. This work proposes a multi-clustering model to learn the latent clusters along the multiple dimensions of Origin, Destination, Time, and eventually, Passenger (ODT-P). We develop a graph-regularized tensor Latent Dirichlet Allocation (LDA) model by first extending the traditional LDA model into a tensor version and then applying it to individual travel data. Then, the external information of stations is formulated as semantic graphs and incorporated as Laplacian regularizations. Furthermore, to improve the model scalability when dealing with massive data, an online stochastic learning method based on a tensorized variational Expectation-Maximization algorithm is developed. Finally, a case study based on passengers in the Hong Kong metro system is conducted and demonstrates that a better clustering performance is achieved compared to state-of-the-art methods, with an improvement in point-wise mutual information index and algorithm convergence speed by a factor of two.
  4. Given historical traffic distributions and associated urban conditions observed in a city, the conditional urban traffic estimation problem aims at estimating realistic future projections of the traffic under a set of new urban conditions, e.g., new bus routes, rainfall intensity, and travel demands. The problem is important in reducing traffic congestion, improving public transportation efficiency, and facilitating urban planning. However, solving this problem is challenging due to the strong spatial dependencies of traffic patterns and the complex relations between the traffic and urban conditions. Recently, we proposed a Complex-Condition-Controlled Generative Adversarial Network, C3-GAN, which tackles both of the challenges and solves the urban traffic estimation problem under various complex conditions by adding a fixed embedding network and an inference network on top of the standard conditional GAN model. The randomly chosen embedding network transforms the complex conditions to latent vectors, and the inference network enhances the connections between the embedded vectors and the traffic data. However, a randomly chosen embedding network cannot always successfully extract features of complex urban conditions, which indicates C3-GAN is unable to uniquely map different urban conditions to proper latent distributions. Thus, C3-GAN would fail in certain traffic estimation tasks. Besides, C3-GAN is hard to train due to vanishing gradients and mode collapse problems. To address these issues, in this article, we extend our prior work by introducing a new deep generative model, namely, C3-GAN+, which significantly improves the estimation performance and model stability. C3-GAN+ has a new objective, architecture, and training algorithm. The new objective applies Wasserstein loss to the conditional generation case to encourage stable training. Shared convolutional layers between the discriminator and the inference network help to capture spatial dependencies of traffic more efficiently; part of the shared convolutional layers is used to update the embedding network periodically, aiming to encourage good representation and avoid model divergence. Extensive experiments on real-world datasets demonstrate that our C3-GAN+ produces high-quality traffic estimations and outperforms state-of-the-art baseline methods.
  5. This paper undertakes a detailed empirical study of traffic dynamics on a freeway. The results show the traffic dynamics that systematically determine the shape of the fundamental diagram, FD, can also violate the stationarity assumptions of both shockwave analysis and Lighthill, Whitham and Richards' models, thereby inhibiting the applicability of these classical macroscopic traffic flow theories. The outcome is challenging because there is no way to identify the problem using only the macroscopic detector data. The research examines conditions local to vehicle detector stations to establish the FD while the single vehicle passage method is used to analyze the composition of vehicles underlying the aggregate samples. Then, traffic states are correlated between successive stations to measure the actual signal velocities and show they are inconsistent with the classical theories. This analysis also revealed that conditions in one lane can induce signals in another lane. Rather than exhibiting a single signal passing a given point in time and space, the induced and intrinsic signals are superimposed on one another in the given lane. We suspect the subtle dynamics revealed in this research have gone unnoticed because they are far below the resolution of conventional traffic monitoring. The findings could have implications for other traffic flow models that rely on the FD, so care should be taken to assess if a given model is potentially sensitive to the non-stationary dynamics presented herein. The results have a direct impact on practice. Traffic flow theory is a critical input to many aspects of surface transportation, e.g., traffic management, traffic control, network design, vehicle routing, traveler information, and transportation planning all depend on models or simulation software that are based upon traffic flow theory. If the underlying traffic flow theory is flawed, it puts the higher level applications at risk. So, the findings in this paper should lead to caution in accepting the predictions from traffic flow models and simulation software when the traffic exhibits a concave FD.