Title: Advancing Temporal Multimodal Learning with Physics Informed Regularization
Estimating multimodal distributions of travel times (TT) from real-world data is critical for understanding and managing congestion. Mixture models can estimate the overall distribution when distinct peaks exist in the probability density function, but no transfer of mixture information under epistemic uncertainty across different spatiotemporal scales has been considered for capturing unobserved heterogeneity. In this paper, a physics-informed and -regularized prediction model is developed that shares observations among similarly distributed network segments across time and space. By grouping similar mixture models, the model uses a particular sample distribution at distant non-contiguous unexplored locations and improves TT prediction. Compared to traditional prediction without those updates, the proposed model's 19% performance improvement shows the benefit of indirect learning. Unlike traditional travel time prediction tools, the developed model can help traffic and planning agencies determine how far back in history to look and what sample size of historic data would be useful for current prediction.
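To make the mixture-model idea in the abstract concrete, the following is a minimal sketch, not the paper's method: it fits a two-component Gaussian mixture to synthetic travel times with plain expectation-maximization. The function names (`synthetic_travel_times`, `fit_gmm_1d`), the Gaussian component form, the number of components, and the free-flow and congested parameters are all assumptions chosen for illustration.

```python
import numpy as np


def synthetic_travel_times(n_free=700, n_congested=300, seed=0):
    """Draw hypothetical travel times (minutes) from two traffic regimes."""
    rng = np.random.default_rng(seed)
    free = rng.normal(10.0, 1.0, n_free)            # assumed free-flow regime
    congested = rng.normal(25.0, 3.0, n_congested)  # assumed congested regime
    return np.concatenate([free, congested])


def fit_gmm_1d(x, n_components=2, n_iter=200):
    """Fit a 1-D Gaussian mixture to x with plain EM (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Start the means at spread-out quantiles so the components begin apart.
    means = np.quantile(x, np.linspace(0.1, 0.9, n_components))
    variances = np.full(n_components, x.var())
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample,
        # computed in log space for numerical stability.
        diff = x[:, None] - means[None, :]
        log_pdf = -0.5 * (diff**2 / variances + np.log(2 * np.pi * variances))
        log_resp = np.log(weights) + log_pdf
        log_resp -= log_resp.max(axis=1, keepdims=True)
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0)
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return weights, means, variances


if __name__ == "__main__":
    w, m, v = fit_gmm_1d(synthetic_travel_times())
    for weight, mean, var in zip(w, m, v):
        print(f"weight={weight:.2f} mean={mean:.1f} min std={var ** 0.5:.1f} min")
```

Each fitted component corresponds to one peak of the travel time density. Sharing or grouping such fitted mixtures across network segments, as the abstract describes, is outside the scope of this sketch.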
Award ID(s):
1910397
PAR ID:
10465570
Journal Name:
2023 57th Annual Conference on Information Sciences and Systems (CISS)
Page Range / eLocation ID:
1 to 5
Sponsoring Org:
National Science Foundation
More Like this
  1. Estimating multimodal distributions of travel times (TT) from real-world data is critical for understanding and managing congestion. Mixture models can estimate the overall distribution when distinct peaks exist in the probability density function, but no transfer of mixture information under epistemic uncertainty across different spatiotemporal scales has been considered for capturing unobserved heterogeneity. In this paper, a physics-informed and -regularized (PIR) prediction model is developed that shares observations among similarly distributed network segments over time and space. By grouping similar mixture models, the model uses a particular sample distribution at distant non-contiguous unexplored locations and improves TT prediction. The model includes hierarchical Kalman filtering (KF) updates that use the traffic fundamental diagram to regulate any spurious correlation, and it estimates the mixture of TT distributions from observations at the current location and time sampled from the multimodal and multivariate TT distributions at other locations and times. To overcome the limitations of KF, this study developed a dynamic graph neural network (GCN) model that uses time-evolving spatial correlations. The KF model with PIR predicts traffic state with 19% more accuracy than the TMML model in Park et al. (2022), and the GCN model will further reduce the uncertainty in prediction. This study uses information gain from explored correlated links to obtain accurate predictions for unexplored ones.
  2. Effective real-time estimation of vehicle travel time using AVL (Automatic Vehicle Locators) has added a new dimension to smart city planning. In this paper, the authors use data collected over several months from a transit agency and show how these data can be used to learn patterns of travel time during specially planned events such as NFL (National Football League) games and music award ceremonies. The impact of NFL games is discussed along with other factors such as weather, traffic conditions, and distance, together with their relative importance to the prediction of travel time. Statistical learning models are used to predict travel time and subsequently assess the cascading effects of delay. Model performance is determined by predictive accuracy according to the out-of-sample error. In addition, the models help identify the most significant variables that influence delay in the transit system. To compare the actual and predicted travel times for days with special events, heat maps are generated showing the delay impacts in different time windows between two timepoint-segments in comparison to a non-game day. This work focuses on the prediction and visualization of delay in the public transit system and the analysis of its cascading effects on the entire transportation network. According to the study results, the authors are able to explain more than 80% of the variance in the bus travel time at each segment and can make future travel predictions during planned events with an out-of-sample error of 2.0 minutes using information on the bus schedule, traffic, weather, and scheduled events. According to the variable importance analysis, traffic information is the most significant factor in predicting delay in the transit system.
  3. Zhang, Aidong; Rangwala, Huzefa (Ed.)
    Zero-inflated, heavy-tailed spatiotemporal data are common across science and engineering, from climate science to meteorology and seismology. A central modeling objective in such settings is to forecast the intensity, frequency, and timing of extreme and non-extreme events; yet in the context of deep learning, this objective presents several key challenges. First, a deep learning framework applied to such data must unify a mixture of distributions characterizing the zero events, moderate events, and extreme events. Second, the framework must be capable of enforcing parameter constraints across each component of the mixture distribution. Finally, the framework must be flexible enough to accommodate any changes in the threshold used to define an extreme event after training. To address these challenges, we propose the Deep Extreme Mixture Model (DEMM), fusing a deep learning-based hurdle model with extreme value theory to enable point and distribution prediction of zero-inflated, heavy-tailed spatiotemporal variables. The framework enables users to dynamically set a threshold for defining extreme events at inference time without the need for retraining. We present an extensive experimental analysis applying DEMM to precipitation forecasting, and observe significant improvements in point and distribution prediction. All code is available at https://github.com/andrewmcdonald27/DeepExtremeMixtureModel.
  4. Early run-time prediction of co-running independent applications prior to application integration is challenging in multi-core processors. One of the most notable causes is interference at the main memory subsystem, which results in significant degradation in application performance and response time in comparison to standalone execution. Currently available techniques for run-time prediction, such as traditional cycle-accurate simulations, are slow, and analytical models are inaccurate and time-consuming to build. By contrast, existing machine-learning-based approaches for run-time prediction simply do not account for interference. In this paper, we use a machine-learning-based approach to train a model to correlate performance data (instructions and hardware performance counters) for a set of benchmark applications between the standalone and interference scenarios. After that, the trained model is used to predict the run-time of co-running applications in interference scenarios. In general, there is no straightforward one-to-one correspondence between samples obtained from the standalone and interference scenarios due to the different run-times, i.e., execution speeds. To address this, we developed a simple yet effective sample alignment algorithm, which is a key component in transforming interference prediction into a machine learning problem. In addition, we systematically identify the subset of features that have the highest positive impact on the model performance. Our approach is demonstrated to be effective and shows an average run-time prediction error as low as 0.3% and 0.1% for two co-running applications.
  5. Continuous-time event data are common in applications such as individual behavior data, financial transactions, and medical health records. Modeling such data can be very challenging, in particular for applications with many different types of events, since it requires a model to predict the event types as well as the time of occurrence. Recurrent neural networks that parameterize time-varying intensity functions are the current state-of-the-art for predictive modeling with such data. These models typically assume that all event sequences come from the same data distribution. However, in many applications event sequences are generated by different sources, or users, and their characteristics can be very different. In this paper, we extend the broad class of neural marked point process models to mixtures of latent embeddings, where each mixture component models the characteristic traits of a given user. Our approach relies on augmenting these models with a latent variable that encodes user characteristics, represented by a mixture model over user behavior that is trained via amortized variational inference. We evaluate our methods on four large real-world datasets and demonstrate systematic improvements from our approach over existing work for a variety of predictive metrics such as log-likelihood, next event ranking, and source-of-sequence identification. 
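Item 1 in the list above refers to Kalman filtering (KF) updates. As a rough illustration of the basic mechanism only, the sketch below shows the textbook scalar Kalman measurement update applied to a travel time estimate. It does not include the hierarchical structure, the traffic fundamental diagram, or the physics-informed regularization described in that abstract, and the function name `kalman_update` and all numbers are hypothetical.

```python
def kalman_update(mean, var, obs, obs_var):
    """Textbook scalar Kalman measurement update.

    mean, var : prior estimate of travel time and its variance
    obs, obs_var : new observation and its noise variance
    Returns the posterior mean and variance.
    """
    gain = var / (var + obs_var)            # Kalman gain in [0, 1]
    new_mean = mean + gain * (obs - mean)   # move the estimate toward the observation
    new_var = (1.0 - gain) * var            # uncertainty shrinks after the update
    return new_mean, new_var


if __name__ == "__main__":
    # Hypothetical prior: 10 min with variance 4; observation: 14 min with variance 4.
    print(kalman_update(10.0, 4.0, 14.0, 4.0))
```

With equal prior and observation variances, the gain is 0.5, so the posterior mean sits halfway between the prior and the observation and the variance is halved.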