Title: Few-shot Time-Series Forecasting with Application for Vehicular Traffic Flow
Few-shot machine learning attempts to predict outputs given only a very small number of training examples. The key idea behind most few-shot learning approaches is to pre-train the model on different but related classes of data for which a large number of instances are available. Few-shot learning has been most successfully demonstrated for classification problems using Siamese deep learning neural networks. It is less extensively applied to time-series forecasting. Few-shot forecasting is the task of predicting future values of a time-series even when only a small set of historic time-series is available. It has applications in domains where a long history of data is not available. This work describes deep neural network architectures for few-shot forecasting. All the architectures use a Siamese twin network approach to learn a difference function between pairs of time-series, rather than directly forecasting based on historical data as seen in traditional forecasting models. The networks are built using long short-term memory (LSTM) units. During forecasting, a model is able to forecast time-series types that were never seen in the training data by using the few available instances of the new time-series type as reference inputs. The proposed architectures are evaluated on vehicular traffic data collected in California from the Caltrans Performance Measurement System (PeMS). The models were trained with traffic flow data collected at specific locations and then evaluated by predicting traffic at different locations at different time horizons (0 to 12 hours). The Mean Absolute Error (MAE) was used as the evaluation metric and also as the loss function for training. The proposed architectures show lower prediction error than a baseline nearest neighbor forecast model. The prediction error increases at longer time horizons.
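As an illustration of the evaluation setup described in the abstract, the following is a minimal sketch of a nearest neighbor forecast scored with MAE. The abstract does not specify how the baseline was built, so everything here is an assumption: the function names, the use of MAE as the distance between the query history and each reference series, and the toy numbers, which are not PeMS data.

```python
from typing import List, Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Return the average of |y_true[i] - y_pred[i]| over all positions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def nearest_neighbor_forecast(
    query_history: Sequence[float],
    references: Sequence[Sequence[float]],
    horizon: int,
) -> List[float]:
    """Forecast by copying the continuation of the closest reference series.

    Assumption (not from the paper): each reference holds at least
    len(query_history) + horizon values. Its first part is compared with
    the query history and the following `horizon` values are the forecast.
    """
    n = len(query_history)
    for ref in references:
        if len(ref) < n + horizon:
            raise ValueError("reference series is too short for this horizon")
    # Pick the reference whose opening segment is closest to the query.
    best = min(references, key=lambda ref: mean_absolute_error(query_history, ref[:n]))
    return list(best[n:n + horizon])


# Toy example with made-up flow values (hypothetical, for illustration only).
reference_series = [
    [10, 12, 15, 20, 26, 30],
    [50, 48, 45, 40, 33, 30],
]
forecast = nearest_neighbor_forecast([11, 13, 16], reference_series, horizon=3)
error = mean_absolute_error([21, 25, 32], forecast)
print(forecast, round(error, 3))
```

In this sketch the few available reference series play the role of a lookup table. The Siamese approach in the abstract instead learns a difference function between pairs of series, which this baseline does not attempt.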
Award ID(s):
2125654
PAR ID:
10355166
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Information Reuse and Integration for Data Science (IRI), 2022 IEEE 23rd International Conference on
Page Range / eLocation ID:
20-26
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Zero shot time series forecasting is the challenge of forecasting future values of a time dependent sequence without having access to any historical data from the target series during model training. This setting differs from the traditional domain of time series forecasting, where models are typically trained using large volumes of historical data, from the same distribution. Zero shot time series forecasting models are designed to generalize to unseen time series by leveraging their knowledge learned from other, similar series during training. This work proposes two architectures designed for zero shot time series forecasting: zSiFT and zSHiFT. Both architectures use transformer models arranged in a Siamese network configuration. The zSHiFT architecture differs from the zSiFT by the introduction of a hierarchical transformer component to the Siamese network. These architectures are evaluated on vehicular traffic data in California available from the Caltrans Performance Measurement System (PeMS). The models were trained with traffic flow data collected in one region of California and then are evaluated by forecasting traffic in other regions. Forecast accuracy was evaluated at different time horizons (4 to 48 hours). The zSiFT model achieves a Mean Absolute Error (MAE) that is 8.3% lower than the baseline LSTM with attention mechanism model. The zSiFT model achieves an MAE which is 6.6% lower than zSHiFT’s MAE. 
  2. Forecasting time series data is an important subject in economics, business, and finance. Traditionally, several techniques have been used to forecast the next lag of time series data, such as univariate Autoregressive (AR), univariate Moving Average (MA), Simple Exponential Smoothing (SES), and, more notably, Autoregressive Integrated Moving Average (ARIMA) with its many variations. In particular, the ARIMA model has demonstrated strong precision and accuracy in predicting the next lags of time series. With recent advances in computational power and, more importantly, the development of more advanced machine learning approaches such as deep learning, new algorithms have been developed to analyze and forecast time series data. The research question investigated in this article is whether and how newly developed deep learning-based algorithms for forecasting time series data, such as “Long Short-Term Memory (LSTM)”, are superior to the traditional algorithms. The empirical studies conducted and reported in this article show that deep learning-based algorithms such as LSTM outperform traditional algorithms such as the ARIMA model. More specifically, the average reduction in error rates obtained by LSTM was between 84 and 87 percent when compared to ARIMA, indicating the superiority of LSTM to ARIMA. Furthermore, the number of training passes, known as “epochs” in deep learning, had no effect on the performance of the trained forecast model, which exhibited truly random behavior.
  3. Predicting workload behavior during execution is essential for dynamic resource optimization of processor systems. Early studies used simple prediction algorithms such as history tables. More recently, researchers have applied advanced machine learning regression techniques. Workload prediction can be cast as a time series forecasting problem. Time series forecasting is an active research area with recent advances that have not been studied in the context of workload prediction. In this paper, we first perform a comparative study of representative time series forecasting techniques to predict the dynamic workload of applications running on a CPU. We adapt state-of-the-art matrix profile and dynamic linear models (DLMs) not previously applied to workload prediction and compare them against traditional SVM and LSTM models that have been popular for handling non-stationary data. We find that all time series forecasting models struggle to predict abrupt workload changes. These changes occur because workloads go through phases, and prior work has studied workload phase detection, classification and prediction. We propose a novel approach that combines time series forecasting with phase prediction. We process each phase as a separate time series and train one forecasting model per phase. At runtime, forecasts from phase-specific models are selected and combined based on the predicted phase behavior. We apply our approach to forecasting of SPEC workloads running on a state-of-the-art Intel machine. Our results show that an LSTM-based phase-aware predictor can forecast workload CPI with less than 8% mean absolute error while reducing CPI error by more than 12% on average compared to a non-phase-aware approach.
  4. Representation Learning), a novel multimodal meta-learning framework for few-shot learning in heterogeneous systems, designed for science and engineering problems where entities share a common underlying forward model but exhibit heterogeneity due to entity-specific characteristics. TAM-RL leverages an amortized training process with a modulation network and a base network to learn task-specific modulation parameters, enabling efficient adaptation to new tasks with limited data. We evaluate TAM-RL on two real-world environmental datasets: Gross Primary Product (GPP) prediction and streamflow forecasting, demonstrating significant improvements over existing meta-learning methods. On the FLUXNET dataset, TAM-RL improves RMSE by 18.9% over MMAML with just one month of few-shot data, while for streamflow prediction, it achieves an 8.21% improvement with one year of data. Synthetic data experiments further validate TAM-RL’s superior performance in heterogeneous task distributions, outperforming the baselines in the most heterogeneous setting. Notably, TAM-RL offers substantial computational efficiency, with at least 3x faster training times compared to gradient-based meta-learning approaches while being much simpler to train due to reduced complexity. Ablation studies highlight the importance of pretraining and adaptation mechanisms in TAM-RL’s performance. Keywords: Representation Learning, meta-learning, few-shot learning, environmental applications, time-series. DOI:10.1137/1.9781611978520.2 
  5. Heatwaves are projected to increase in frequency and severity with global warming. Improved warning systems would help reduce the associated loss of lives, wildfires, power disruptions, and reduction in crop yields. In this work, we explore the potential for deep learning systems trained on historical data to forecast extreme heat on short, medium and subseasonal time scales. To this purpose, we train a set of neural weather models (NWMs) with convolutional architectures to forecast surface temperature anomalies globally, 1 to 28 days ahead, at ∼200-km resolution and on the cubed sphere. The NWMs are trained using the ERA5 reanalysis product and a set of candidate loss functions, including the mean-square error and exponential losses targeting extremes. We find that training models to minimize custom losses tailored to emphasize extremes leads to significant skill improvements in the heatwave prediction task, relative to NWMs trained on the mean-square-error loss. This improvement is accomplished with almost no skill reduction in the general temperature prediction task, and it can be efficiently realized through transfer learning, by retraining NWMs with the custom losses for a few epochs. In addition, we find that the use of a symmetric exponential loss reduces the smoothing of NWM forecasts with lead time. Our best NWM is able to outperform persistence in a regressive sense for all lead times and temperature anomaly thresholds considered, and shows positive regressive skill relative to the ECMWF subseasonal-to-seasonal control forecast after 2 weeks. Significance Statement: Heatwaves are projected to become stronger and more frequent as a result of global warming. Accurate forecasting of these events would enable the implementation of effective mitigation strategies. Here we analyze the forecast accuracy of artificial intelligence systems trained on historical surface temperature data to predict extreme heat events globally, 1 to 28 days ahead.
We find that artificial intelligence systems trained to focus on extreme temperatures are significantly more accurate at predicting heatwaves than systems trained to minimize errors in surface temperatures and remain equally skillful at predicting moderate temperatures. Furthermore, the extreme-focused systems compete with state-of-the-art physics-based forecast systems in the subseasonal range, while incurring a much lower computational cost. 