Few-shot machine learning attempts to predict outputs given only a very small number of training examples. The key idea behind most few-shot approaches is to pre-train the model on a large number of instances from a different but related class of data, one for which many training instances are available. Few-shot learning has been demonstrated most successfully for classification problems using Siamese deep neural networks, but it has been applied far less extensively to time-series forecasting. Few-shot forecasting is the task of predicting future values of a time series when only a small set of historic time series is available; it has applications in domains where a long history of data does not exist. This work describes deep neural network architectures for few-shot forecasting. All of the architectures use a Siamese twin network to learn a difference function between pairs of time series, rather than forecasting directly from historical data as traditional forecasting models do. The networks are built from long short-term memory (LSTM) units. At inference time, a model can forecast time-series types never seen in the training data by using the few available instances of the new type as reference inputs. The proposed architectures are evaluated on vehicular traffic data collected in California by the Caltrans Performance Measurement System (PeMS). The models were trained on traffic-flow data collected at specific locations and then evaluated by predicting traffic at different locations over time horizons from 0 to 12 hours. Mean absolute error (MAE) served as both the evaluation metric and the training loss. The proposed architectures achieve lower prediction error than a baseline nearest-neighbor forecast model, with error increasing at longer horizons.
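The abstract gives no implementation, but the twin-network idea it describes can be sketched in a few lines of PyTorch: a single shared LSTM encoder embeds both a reference series and a query series, their embedding difference feeds a small forecasting head, and nn.L1Loss matches the MAE objective named above. All module names, layer sizes, and the head design here are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SiameseLSTMForecaster(nn.Module):
    """Twin LSTM encoders with shared weights learn a difference
    function between a reference series and a query series; a linear
    head maps that difference to a forecast over `horizon` steps.
    Illustrative sketch -- sizes and head design are assumptions."""

    def __init__(self, n_features=1, hidden=64, horizon=12):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, horizon)

    def embed(self, x):
        # x: (batch, seq_len, n_features) -> last hidden state (batch, hidden)
        _, (h, _) = self.encoder(x)
        return h[-1]

    def forward(self, reference, query):
        # Shared weights: the same encoder embeds both members of the pair.
        diff = self.embed(query) - self.embed(reference)
        return self.head(diff)

model = SiameseLSTMForecaster()
ref = torch.randn(8, 48, 1)   # few available instances of the new type
qry = torch.randn(8, 48, 1)   # series whose future values are predicted
loss = nn.L1Loss()(model(ref, qry), torch.randn(8, 12))  # MAE, as above
loss.backward()
```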
Discrete Graph Structure Learning for Forecasting Multiple Time Series
Time series forecasting is an extensively studied subject in statistics, economics, and computer science. Exploration of the correlation and causation among the variables in a multivariate time series shows promise in enhancing the performance of a time series model. When using deep neural networks as forecasting models, we hypothesize that exploiting the pairwise information among multiple (multivariate) time series also improves their forecast. If an explicit graph structure is known, graph neural networks (GNNs) have been demonstrated as powerful tools to exploit the structure. In this work, we propose learning the structure simultaneously with the GNN if the graph is unknown. We cast the problem as learning a probabilistic graph model through optimizing the mean performance over the graph distribution. The distribution is parameterized by a neural network so that discrete graphs can be sampled differentiably through reparameterization. Empirical evaluations show that our method is simpler, more efficient, and better performing than a recently proposed bilevel learning approach for graph structure learning, as well as a broad array of forecasting models, either deep or non-deep learning based, and graph or non-graph based.
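The differentiable sampling step the abstract refers to can be illustrated with the Gumbel-softmax reparameterization. In the paper the per-edge logits are produced by a neural network over the time series; in this sketch they are a free parameter, and the exact parameterization used by the method may differ.

```python
import torch
import torch.nn.functional as F

def sample_adjacency(edge_logits, tau=0.5):
    """Differentiably sample a discrete adjacency matrix from per-edge
    Bernoulli logits via the Gumbel-softmax reparameterization.
    edge_logits: (n, n); in the paper these come from a neural network,
    here they are a free parameter for illustration."""
    # Stack logits for the two states (edge present / edge absent) ...
    two_class = torch.stack([edge_logits, -edge_logits], dim=-1)
    # ... and draw hard one-hot samples whose gradient flows through
    # the soft relaxation (straight-through estimator).
    samples = F.gumbel_softmax(two_class, tau=tau, hard=True)
    return samples[..., 0]  # (n, n) matrix of {0., 1.} edges

n = 5
logits = torch.zeros(n, n, requires_grad=True)
adj = sample_adjacency(logits)
adj.sum().backward()  # gradients reach the logits despite discreteness
```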
- Award ID(s): 1718738
- PAR ID: 10253603
- Date Published:
- Journal Name: Proceedings of International Conference on Learning Representations
- Page Range / eLocation ID: 1-14
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Graph-structured data are abundant in the real world. Among graph types, directed acyclic graphs (DAGs) are of particular interest to machine learning researchers, because many machine learning models, including neural networks and Bayesian networks, are realized as computations on DAGs. In this paper, we study deep generative models for DAGs and propose a novel DAG variational autoencoder (D-VAE). To encode DAGs into the latent space, we leverage graph neural networks and propose an asynchronous message passing scheme that encodes the computations on DAGs, rather than using existing simultaneous message passing schemes that encode only local graph structure. We demonstrate the effectiveness of D-VAE through two tasks: neural architecture search and Bayesian network structure learning. Experiments show that our model not only generates novel and valid DAGs, but also produces a smooth latent space that facilitates searching for better-performing DAGs through Bayesian optimization.
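The contrast between asynchronous and simultaneous message passing can be made concrete: visit nodes in topological order so that each node aggregates the *final* states of its predecessors, instead of every node updating in parallel from the previous round. The sum aggregation and GRU-cell update below are simplifying assumptions for a sketch, not necessarily the D-VAE design.

```python
import torch
import torch.nn as nn

class AsyncDAGEncoder(nn.Module):
    """Asynchronous message passing sketch: nodes are processed in
    topological order, so each node sees its predecessors' completed
    states. Sum aggregation + GRUCell update are assumptions."""

    def __init__(self, n_types, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(n_types, hidden)
        self.cell = nn.GRUCell(hidden, hidden)

    def forward(self, node_types, preds):
        # node_types: node type ids in topological order
        # preds[v]: indices of v's predecessors (all < v)
        states = []
        for v, t in enumerate(node_types):
            msg = (torch.stack([states[u] for u in preds[v]]).sum(0)
                   if preds[v] else torch.zeros(self.cell.hidden_size))
            states.append(self.cell(self.embed(torch.tensor(t)).unsqueeze(0),
                                    msg.unsqueeze(0)).squeeze(0))
        return states[-1]  # the sink's state summarizes the whole DAG

enc = AsyncDAGEncoder(n_types=4)
z = enc(node_types=[0, 1, 2, 3], preds=[[], [0], [0], [1, 2]])
```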
-
Over the past two decades, machine learning and deep learning techniques for forecasting solar flares have had great impact thanks to their ability to learn from a high-dimensional data space. However, the scarcity of high-quality data on flaring phenomena constrains such tasks. One way to tackle this complex problem is to train classifiers on multivariate time series of magnetic-field parameters. In this work, we compare popular multivariate time-series classifiers based on deep learning with commonly used machine learning classifiers (e.g., SVM). We explore the role of data augmentation in time-series-oriented flare prediction, particularly for the deep learning approaches. We apply four time-series data augmentation techniques, couple them with the selected multivariate time-series classifiers, and examine how each affects the outcome. We show that both the deep learning algorithms and the augmentation techniques improve classifier performance, and that the augmented classifiers outperform traditional flare forecasting techniques.
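The abstract does not name the four augmentation techniques; the two below, jittering and magnitude scaling, are common examples for multivariate time series and are shown only as a sketch of what such augmentation looks like.

```python
import numpy as np

def jitter(x, sigma=0.03):
    """Add Gaussian noise to every observation (a common time-series
    augmentation; whether it is among the paper's four is not stated)."""
    return x + np.random.normal(0.0, sigma, size=x.shape)

def scale(x, sigma=0.1):
    """Multiply each channel of a multivariate series by a random
    factor drawn around 1, preserving the temporal shape."""
    factors = np.random.normal(1.0, sigma, size=(1, x.shape[1]))
    return x * factors

series = np.random.randn(60, 25)   # (timesteps, magnetic-field parameters)
augmented = [jitter(series), scale(series)]
```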
-
In machine learning applications, data are often high-dimensional and intricately related. It is often of interest to find the underlying structure and Granger-causal relationships among the data and to represent these relationships with directed graphs. In this paper, we study multivariate time series in which each series is associated with a node of a graph, and the objective is to estimate the topology of a sparse graph that reflects how the nodes affect one another, if at all. We propose a novel fully Bayesian approach that places a sparsity-encouraging prior on the hyperparameters. The proposed method allows for nonlinear and multiple-lag relationships among the time series. It is based on Gaussian processes and treats the entries of the graph adjacency matrix as hyperparameters, using a modified automatic relevance determination (ARD) kernel to learn the mapping from selected past data to current data along the edges of the graph. We show that the resulting adjacency matrix captures the intrinsic structure of the graph and answers causality-related questions. Numerical tests show that the proposed method performs comparably to or better than state-of-the-art methods.
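One plausible reading of "adjacency entries as kernel hyperparameters" is an RBF-ARD kernel in which each adjacency entry gates the relevance of one candidate input series: a zero entry switches that series' lagged values off entirely. The paper's exact kernel form may differ; this is a sketch under that assumption.

```python
import numpy as np

def gated_ard_kernel(X1, X2, adj_row, lengthscales, variance=1.0):
    """RBF kernel with automatic relevance determination, modified so
    adjacency entries gate which input dimensions contribute.
    X1: (n, d), X2: (m, d) lagged inputs, one dimension per candidate
    series; adj_row, lengthscales: (d,). Sketch, not the paper's form."""
    w = adj_row / (lengthscales ** 2)               # per-dimension relevance
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2 * w).sum(-1)
    return variance * np.exp(-0.5 * d2)

X = np.random.randn(10, 3)
K = gated_ard_kernel(X, X, adj_row=np.array([1.0, 0.0, 1.0]),
                     lengthscales=np.ones(3))       # (10, 10) Gram matrix
```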
-
Rank-position forecasting in car racing is a challenging problem for deep learning models over time-series data. It features highly complex global dependencies among the racing cars, uncertainty resulting from existing and external factors, and data scarcity. Existing methods, including statistical models, machine learning regression models, and several state-of-the-art deep forecasting models, all perform poorly on this problem. Through an elaborate analysis of pit-stop events, we find it critical to decompose the cause-and-effect relationship and to model the rank position and pit-stop events separately. In choosing among different neural network sub-models, we find that the model with the weakest assumptions on the global dependency structure performs best. Based on these observations, we propose RankNet, a combination of an encoder-decoder network and a separate multilayer perceptron network that delivers probabilistic forecasts of pit-stop events and rank positions in car racing. With further feature optimizations, RankNet demonstrates a significant performance improvement: MAE improves by 19% in the two-lap forecasting task and by 7% in the stint forecasting task over the best baseline, and the model is also more stable when adapting to unseen new data. Details of the model optimizations and performance profiling are presented. This work provides useful insight into applying neural networks to forecasting racing cars and sheds light on solutions to similar challenging forecasting problems.
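The two-branch decomposition the abstract describes can be sketched as follows: a sequence model with a probabilistic head (Gaussian mean and scale per forecast step) stands in for the paper's encoder-decoder rank forecaster, while a separate MLP scores pit-stop probability from the latest lap's features. Layer sizes, heads, and feature dimensions are assumptions, not RankNet's specification.

```python
import torch
import torch.nn as nn

class RankNetSketch(nn.Module):
    """Illustrative two-branch model in the spirit of the abstract:
    a probabilistic rank forecast plus a separate pit-stop scorer.
    All sizes and heads are assumptions."""

    def __init__(self, n_features=8, hidden=32, horizon=2):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.rank_head = nn.Linear(hidden, 2 * horizon)   # mean, log-scale
        self.pit_mlp = nn.Sequential(nn.Linear(n_features, hidden),
                                     nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, history):
        # history: (batch, laps, n_features)
        _, (h, _) = self.encoder(history)
        mean, log_scale = self.rank_head(h[-1]).chunk(2, dim=-1)
        pit_logit = self.pit_mlp(history[:, -1])  # pit stop from latest lap
        return mean, log_scale.exp(), pit_logit

model = RankNetSketch()
mean, scale, pit = model(torch.randn(4, 20, 8))   # 4 cars, 20-lap history
```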