Title: Few-shot Time-Series Forecasting with Application for Vehicular Traffic Flow
Few-shot machine learning attempts to predict outputs given only a very small number of training examples. The key idea behind most few-shot learning approaches is to pre-train the model on different but related classes of data for which a large number of training instances are available. Few-shot learning has been most successfully demonstrated for classification problems using Siamese deep neural networks. It has been applied less extensively to time-series forecasting. Few-shot forecasting is the task of predicting future values of a time-series when only a small set of historical time-series is available. It has applications in domains where a long history of data is not available. This work describes deep neural network architectures for few-shot forecasting. All the architectures use a Siamese twin network approach to learn a difference function between pairs of time-series, rather than forecasting directly from historical data as traditional forecasting models do. The networks are built using long short-term memory (LSTM) units. During forecasting, a model is able to forecast time-series types that were never seen in the training data by using the few available instances of the new time-series type as reference inputs. The proposed architectures are evaluated on vehicular traffic data collected in California from the Caltrans Performance Measurement System (PeMS). The models were trained with traffic flow data collected at specific locations and then evaluated by predicting traffic at different locations at different time horizons (0 to 12 hours). The Mean Absolute Error (MAE) was used both as the evaluation metric and as the loss function for training. The proposed architectures show lower prediction error than a baseline nearest neighbor forecast model. The prediction error increases at longer time horizons.
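The abstract does not specify how its nearest neighbor baseline is built, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the few reference series are aligned in time with the query history, selects the reference whose opening window has the lowest MAE to that history, and returns that reference's value at the requested horizon. The function names, the array layout, and the toy numbers are all hypothetical.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Mean Absolute Error between two equally shaped arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def nearest_neighbor_forecast(history, references, horizon):
    """Forecast the value `horizon` steps after the end of `history`.

    history:    1-D array with the most recent observations of the new series.
    references: 2-D array (n_refs, length) holding the few reference series,
                assumed aligned so that column i matches history step i.
    horizon:    number of steps ahead (>= 1); an assumption of this sketch.
    Returns (forecast value, index of the chosen reference).
    """
    history = np.asarray(history, dtype=float)
    references = np.asarray(references, dtype=float)
    h = history.shape[0]
    target = h + horizon - 1  # column index of the step being forecast
    if horizon < 1 or target >= references.shape[1]:
        raise ValueError("horizon is outside the range covered by references")
    # MAE between the query window and the same window of every reference.
    distances = np.mean(np.abs(references[:, :h] - history), axis=1)
    best = int(np.argmin(distances))
    return float(references[best, target]), best


# Toy usage with made-up flow counts (not PeMS data).
references = np.array([
    [100, 120, 140, 160, 180, 200],
    [50, 55, 60, 65, 70, 75],
])
history = np.array([52, 54, 61, 66])
forecast, chosen = nearest_neighbor_forecast(history, references, horizon=2)
print(forecast, chosen)
```

A learned difference function, as described in the abstract, would replace the plain copy of the reference value with that value adjusted by a predicted difference between the query and the reference; this sketch corresponds to a predicted difference of zero.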
Award ID(s):
2125654
PAR ID:
10355166
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Information Reuse and Integration for Data Science (IRI), 2022 IEEE 23rd International Conference on
Page Range / eLocation ID:
20-26
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Forecasting time series data is an important subject in economics, business, and finance. Traditionally, several techniques have been used to forecast the next lag of time series data, such as univariate Autoregressive (AR), univariate Moving Average (MA), Simple Exponential Smoothing (SES), and, most notably, Autoregressive Integrated Moving Average (ARIMA) with its many variations. In particular, the ARIMA model has demonstrated strong precision and accuracy in predicting the next lags of time series. With recent advances in computational power and, more importantly, the development of more advanced machine learning approaches such as deep learning, new algorithms have been developed to analyze and forecast time series data. The research question investigated in this article is whether and how newly developed deep learning-based algorithms for forecasting time series data, such as Long Short-Term Memory (LSTM), are superior to the traditional algorithms. The empirical studies conducted and reported in this article show that deep learning-based algorithms such as LSTM outperform traditional algorithms such as the ARIMA model. More specifically, the average reduction in error rates obtained by LSTM was between 84 and 87 percent when compared to ARIMA, indicating the superiority of LSTM to ARIMA. Furthermore, the number of training passes, known as "epochs" in deep learning, had no effect on the performance of the trained forecast model, which exhibited truly random behavior.
  2. Predicting workload behavior during execution is essential for dynamic resource optimization of processor systems. Early studies used simple prediction algorithms such as history tables. More recently, researchers have applied advanced machine learning regression techniques. Workload prediction can be cast as a time series forecasting problem. Time series forecasting is an active research area with recent advances that have not been studied in the context of workload prediction. In this paper, we first perform a comparative study of representative time series forecasting techniques to predict the dynamic workload of applications running on a CPU. We adapt state-of-the-art matrix profile and dynamic linear models (DLMs) not previously applied to workload prediction and compare them against traditional SVM and LSTM models that have been popular for handling non-stationary data. We find that all time series forecasting models struggle to predict abrupt workload changes. These changes occur because workloads go through phases, and prior work has studied workload phase detection, classification, and prediction. We propose a novel approach that combines time series forecasting with phase prediction. We process each phase as a separate time series and train one forecasting model per phase. At runtime, forecasts from phase-specific models are selected and combined based on the predicted phase behavior. We apply our approach to forecasting of SPEC workloads running on a state-of-the-art Intel machine. Our results show that an LSTM-based phase-aware predictor can forecast workload CPI with less than 8% mean absolute error while reducing CPI error by more than 12% on average compared to a non-phase-aware approach.
  3. Abstract

    Heatwaves are projected to increase in frequency and severity with global warming. Improved warning systems would help reduce the associated loss of lives, wildfires, power disruptions, and reduction in crop yields. In this work, we explore the potential for deep learning systems trained on historical data to forecast extreme heat on short, medium and subseasonal time scales. To this purpose, we train a set of neural weather models (NWMs) with convolutional architectures to forecast surface temperature anomalies globally, 1 to 28 days ahead, at ∼200-km resolution and on the cubed sphere. The NWMs are trained using the ERA5 reanalysis product and a set of candidate loss functions, including the mean-square error and exponential losses targeting extremes. We find that training models to minimize custom losses tailored to emphasize extremes leads to significant skill improvements in the heatwave prediction task, relative to NWMs trained on the mean-square-error loss. This improvement is accomplished with almost no skill reduction in the general temperature prediction task, and it can be efficiently realized through transfer learning, by retraining NWMs with the custom losses for a few epochs. In addition, we find that the use of a symmetric exponential loss reduces the smoothing of NWM forecasts with lead time. Our best NWM is able to outperform persistence in a regressive sense for all lead times and temperature anomaly thresholds considered, and shows positive regressive skill relative to the ECMWF subseasonal-to-seasonal control forecast after 2 weeks.

    Significance Statement

    Heatwaves are projected to become stronger and more frequent as a result of global warming. Accurate forecasting of these events would enable the implementation of effective mitigation strategies. Here we analyze the forecast accuracy of artificial intelligence systems trained on historical surface temperature data to predict extreme heat events globally, 1 to 28 days ahead. We find that artificial intelligence systems trained to focus on extreme temperatures are significantly more accurate at predicting heatwaves than systems trained to minimize errors in surface temperatures and remain equally skillful at predicting moderate temperatures. Furthermore, the extreme-focused systems compete with state-of-the-art physics-based forecast systems in the subseasonal range, while incurring a much lower computational cost.

     
  4. Abstract: Load forecasting plays a crucial role in many aspects of electric power systems, including their economic and social benefits. Many previous studies have addressed load forecasting with time series approaches, including weather-load relationships. As one such approach to load prediction, this paper investigates different structures that aim to relate various daily parameters. These parameters include temperature, humidity, and solar radiation, which comprise the weather data. Along with natural phenomena such as weather, physical aspects such as traffic flow are also considered. Based on the relationship, a prediction algorithm is applied to check whether prediction error decreases when such external factors are considered. Electricity consumption data is collected from the City of Tallahassee utilities. Traffic counts are provided by the Florida Department of Transportation, and the weather data is obtained from the Tallahassee Regional Airport weather station. This paper aims to study and establish a cause-and-effect relationship between these variables using different causality models and to forecast load based on the external variables.
  5. Predicting extreme events in chaotic systems, characterized by rare but intensely fluctuating properties, is of great importance due to their impact on the performance and reliability of a wide range of systems. Some examples include weather forecasting, traffic management, power grid operations, and financial market analysis, to name a few. Methods of increasing sophistication have been developed to forecast events in these systems. However, the boundaries that define the maximum accuracy of forecasting tools are still largely unexplored from a theoretical standpoint. Here, we address the question: What is the minimum possible error in the prediction of extreme events in complex, chaotic systems? We derive the minimum probability of error in extreme event forecasting along with its information-theoretic lower and upper bounds. These bounds are universal for a given problem, in that they hold regardless of the modeling approach for extreme event prediction: from traditional linear regressions to sophisticated neural network models. The limits in predictability are obtained from the cost-sensitive Fano’s and Hellman’s inequalities using the Rényi entropy. The results are also connected to Takens’ embedding theorem using the information can’t hurt inequality. Finally, the probability of error for a forecasting model is decomposed into three sources: uncertainty in the initial conditions, hidden variables, and suboptimal modeling assumptions. The latter allows us to assess whether prediction models are operating near their maximum theoretical performance or if further improvements are possible. The bounds are applied to the prediction of extreme events in the Rössler system and the Kolmogorov flow. 