Title: A Method for Granular Traffic Data Imputation Based on PARATUCK2
Imputing missing data is a critical task in data-driven intelligent transportation systems. In recent decades there has been considerable investment in developing various types of sensors and smart systems, including stationary devices (e.g., loop detectors) and floating vehicles equipped with global positioning system (GPS) trackers, to collect large-scale traffic data. However, collected data may not include observations from all road segments in a traffic network for several reasons, including sensor failure, transmission error, and the fact that GPS-equipped vehicles may not always travel through all road segments. The first step toward developing real-time traffic monitoring and disruption prediction models is to estimate missing values through a systematic data imputation process. Many of the existing data imputation methods are based on matrix completion techniques that utilize the inherent spatiotemporal characteristics of traffic data. However, these methods may not fully capture the clustered structure of the data. This paper addresses this issue by developing a novel data imputation method using PARATUCK2 decomposition. The proposed method captures both spatial and temporal information of traffic data and constructs a low-dimensional and clustered representation of traffic patterns. The identified spatiotemporal clusters are used to recover network traffic profiles and estimate missing values. The proposed method is implemented using traffic data from the road network of Manhattan in New York City. The performance of the proposed method is evaluated in comparison with two state-of-the-art benchmark methods. The outcomes indicate that the proposed method outperforms the existing state-of-the-art imputation methods in complex and large-scale traffic networks.
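As a rough illustration of the model structure named in the abstract, the sketch below rebuilds a three-way array from PARATUCK2 factors using the standard slice-wise form X_k = A diag(CA[k]) H diag(CB[k]) B^T, then fills missing entries from that reconstruction. It is not the authors' implementation: the function names, the factor shapes, and the use of randomly generated factors in place of fitted ones are all assumptions, and the fitting step that would estimate the factors from observed entries is omitted.

```python
import numpy as np


def paratuck2_reconstruct(A, H, B, CA, CB):
    """Rebuild all slices X_k = A @ diag(CA[k]) @ H @ diag(CB[k]) @ B.T.

    A: (I, P), H: (P, Q), B: (J, Q), CA: (K, P), CB: (K, Q).
    Returns an array of shape (K, I, J).
    """
    # The einsum sums over the two latent (cluster) indices p and q.
    return np.einsum("ip,kp,pq,kq,jq->kij", A, CA, H, CB, B)


def impute(X_obs, mask, X_hat):
    """Keep observed entries (mask is True) and fill the rest from X_hat."""
    return np.where(mask, X_obs, X_hat)


# Hypothetical sizes: I and J index the two modes of each slice,
# K indexes the slices, and P and Q are the numbers of latent clusters.
rng = np.random.default_rng(0)
I, J, K, P, Q = 6, 5, 4, 2, 3
A = rng.random((I, P))
H = rng.random((P, Q))
B = rng.random((J, Q))
CA = rng.random((K, P))
CB = rng.random((K, Q))

# Stand-in for fitted factors: here the "model" is the ground truth itself.
X_full = paratuck2_reconstruct(A, H, B, CA, CB)

# Hide roughly 30% of the entries to mimic missing observations.
mask = rng.random(X_full.shape) > 0.3
X_obs = np.where(mask, X_full, np.nan)

X_imputed = impute(X_obs, mask, X_full)
```

In practice the factors would be estimated from the observed entries only, and how the modes map to road segments and time periods depends on the paper's own formulation, which the abstract does not specify.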
Award ID(s):
2026795 2027024
PAR ID:
10378126
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
SAGE Publications
Date Published:
Journal Name:
Transportation Research Record: Journal of the Transportation Research Board
Volume:
2676
Issue:
10
ISSN:
0361-1981
Format(s):
Medium: X Size: p. 220-230
Sponsoring Org:
National Science Foundation
More Like this
  1. The prevalent issue in urban trajectory data usage, notably in low-sample rate datasets, revolves around the accuracy of travel time estimations, traffic flow predictions, and trajectory similarity measurements. Conventional methods, often relying on simplistic mixes of static road networks and raw GPS data, fail to adequately integrate both network and trajectory dimensions. Addressing this, the innovative GRFTrajRec framework offers a graph-based solution for trajectory recovery. Its key feature is a trajectory-aware graph representation, enhancing the understanding of trajectory-road network interactions and facilitating the extraction of detailed embedding features for road segments. Additionally, GRFTrajRec's trajectory representation acutely captures spatiotemporal attributes of trajectory points. Central to this framework is a novel spatiotemporal interval-informed seq2seq model, integrating an attention-enhanced transformer and a feature differences-aware decoder. This model specifically excels in handling spatiotemporal intervals, crucial for restoring missing GPS points in low-sample datasets. Validated through extensive experiments on two large real-life trajectory datasets, GRFTrajRec has proven its efficacy in significantly boosting prediction accuracy and spatial consistency. 
  2. As vehicles become increasingly connected and potentially autonomous, they are being equipped with rich sensing and communication devices. Various vehicular services based on real-time sensor data shared by the vehicles in a fleet have been proposed to improve urban efficiency, e.g., HD-live maps and traffic accident recovery. However, due to the high cost of data uploading (e.g., monthly fees for a cellular network), it would be impractical to have all well-equipped vehicles upload real-time sensor data constantly. To better utilize these limited uploading resources and achieve optimal road segment sensing coverage, we present a real-time sensing task scheduling framework, i.e., RISC, for Resource-Constraint modeling for urban sensing, which schedules the sensing tasks of sensor-equipped commercial vehicles based on the predictability of the vehicles' mobility patterns. In particular, we use commercial vehicles, including taxicabs, buses, and logistics trucks, as mobile sensors to sense urban phenomena, e.g., traffic, with their onboard vehicular sensors, e.g., dash-cam, lidar, and automotive radar. We implement RISC in the Chinese city of Shenzhen with one month of real-world data from (i) a taxi fleet with 14 thousand vehicles; (ii) a bus fleet with 13 thousand vehicles; and (iii) a truck fleet with 4 thousand vehicles. Further, we design an application, i.e., tracking suspect vehicles (e.g., hit-and-run vehicles), to evaluate the urban sensing performance of RISC based on data from a regular vehicle (i.e., personal car) fleet with 11 thousand vehicles. The evaluation results show that, compared to state-of-the-art solutions, RISC improves sensing coverage (i.e., the number of road segments covered by sensing vehicles) by 10% on average.
  3. challenge as it may introduce uncertainties into the data analysis. Recent advances in matrix completion have shown competitive imputation performance when applied to many real-world domains. However, there are two major limitations when applying matrix completion methods to spatial data. First, they make a strong assumption that the entries are missing-at-random, which may not hold for spatial data. Second, they may not effectively utilize the underlying spatial structure of the data. To address these limitations, this paper presents a novel clustered adversarial matrix factorization method to explore and exploit the underlying cluster structure of the spatial data in order to facilitate effective imputation. The proposed method utilizes an adversarial network to learn the joint probability distribution of the variables and improve the imputation performance for the missing entries that are not randomly sampled. 
  4. The problem of missingness in observational data is ubiquitous. When the confounders are missing at random, multiple imputation is commonly used; however, the method requires congeniality conditions for valid inferences, which may not be satisfied when estimating average causal treatment effects. Alternatively, fractional imputation, proposed by Kim (2011), has been used to handle missing values in a regression context. In this article, we develop fractional imputation methods for estimating the average treatment effects with confounders missing at random. We show that the fractional imputation estimator of the average treatment effect is asymptotically normal, which permits a consistent variance estimate. Via a simulation study, we compare fractional imputation’s accuracy and precision with those of multiple imputation.
  5. Network macroscopic fundamental diagrams (MFDs) have recently been shown to exist in real-world urban traffic networks. The existence of an MFD facilitates the modeling of urban traffic network dynamics at a regional level, which can be used to identify and refine large-scale network-wide control strategies. To be useful, MFD-based modeling frameworks require an estimate of the functional form of a network’s MFD. Analytical methods have been proposed to estimate a network’s MFD by abstracting the network as a single ring-road or corridor and modeling the flow–density relationship on that simplified element. However, these existing methods cannot account for the impact of turning traffic, as only a single corridor is considered. This paper proposes a method to estimate a network’s MFD when vehicles are allowed to turn into or out of a corridor. A two-ring abstraction is first used to analyze how turning will affect vehicle travel in a more general network, and then the model is further approximated using a single ring-road or corridor. This approximation is useful as it facilitates the application of existing variational theory-based methods (the stochastic method of cuts) to estimate the flow–density relationship on the corridor, while accounting for the stochastic nature of turning. Results of the approximation compared with a more realistic simulation that includes features that cannot be captured using variational theory, such as internal origins and destinations, suggest that this approximation works to estimate a network’s MFD when turning traffic is present.