Today’s data plane network telemetry systems enable network operators to capture fine-grained data streams of many different network traffic features (e.g., loss or flow arrival rate) at line rate. This capability facilitates data-driven approaches to network management and motivates leveraging either statistical or machine learning models (e.g., for forecasting network data streams) for automating various network management tasks. However, current studies on network automation-related problems are in general not concerned with issues that arise when deploying these models in practice (e.g., (re)training overhead). In this paper, we examine various training-related aspects that affect the accuracy and overhead (and thus feasibility) of both LSTM and SARIMA, two popular types of models used for forecasting real-world network data streams in telemetry systems. In particular, we study the impact of the size, choice, and recency of the training data on accuracy and overhead and explore using separate models for different segments of a data stream (e.g., per-hour models). Using two real-world data streams, we show that (i) per-hour LSTM models exhibit high accuracy after training with only 24 hours of data, (ii) the accuracy of LSTM models does not depend on the recency of the training data (i.e., no frequent (re)training is required), (iii) SARIMA models can have comparable or lower accuracy than LSTM models, and (iv) certain segments of the data streams are inherently more challenging to forecast than others. While the specific findings reported in this paper depend on the considered data streams and specified models, we argue that irrespective of the data streams at hand, a similar examination of training-related aspects is needed before deploying any statistical or machine learning model in practice.
A Study on Graph-Structured Recurrent Neural Networks and Sparsification with Application to Epidemic Forecasting
We study epidemic forecasting on real-world health data by a graph-structured recurrent neural network (GSRNN). We achieve state-of-the-art forecasting accuracy on the benchmark CDC dataset. To improve model efficiency, we sparsify the network weights via a transformed-ℓ1 penalty without losing prediction accuracy in numerical experiments.
- Award ID(s): 1737770
- PAR ID: 10138641
- Date Published:
- Journal Name: WCGO 2019: Optimization of Complex Systems: Theory, Models, Algorithms and Applications
- Volume: 991
- Page Range / eLocation ID: 730-739
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Accurately forecasting well-being may enable people to make desirable behavioral changes that could improve their future well-being. In this paper, we evaluate how well an automated model can forecast the next-day’s well-being (specifically focusing on stress, health, and happiness) from static models (support vector machine and logistic regression) and time-series models (long short-term memory neural network models (LSTM)) using the previous seven days of physiological, mobile phone, and behavioral survey data. We especially examine how using only a portion of the day’s data (e.g. just night-time, or just daytime) influences the forecasting accuracy. The results show that accuracy is improved, across every condition tested, by using an LSTM instead of using static models. We find that daytime-only physiology data from wearable sensors, using an LSTM, can provide an accurate forecast of tomorrow’s well-being using students’ daily life data (stress: 80.4%, health: 86.0%, and happiness: 79.1%), achieving the same accuracy as using data collected from around the clock. These findings are valuable steps toward developing a practical and convenient well-being forecasting system.
-
Zero shot time series forecasting is the challenge of forecasting future values of a time dependent sequence without having access to any historical data from the target series during model training. This setting differs from the traditional domain of time series forecasting, where models are typically trained using large volumes of historical data, from the same distribution. Zero shot time series forecasting models are designed to generalize to unseen time series by leveraging their knowledge learned from other, similar series during training. This work proposes two architectures designed for zero shot time series forecasting: zSiFT and zSHiFT. Both architectures use transformer models arranged in a Siamese network configuration. The zSHiFT architecture differs from the zSiFT by the introduction of a hierarchical transformer component to the Siamese network. These architectures are evaluated on vehicular traffic data in California available from the Caltrans Performance Measurement System (PeMS). The models were trained with traffic flow data collected in one region of California and then are evaluated by forecasting traffic in other regions. Forecast accuracy was evaluated at different time horizons (4 to 48 hours). The zSiFT model achieves a Mean Absolute Error (MAE) that is 8.3% lower than the baseline LSTM with attention mechanism model. The zSiFT model achieves an MAE which is 6.6% lower than zSHiFT’s MAE.
-
We introduce the Discrete-Temporal Sobolev Network (DTSN), a neural network loss function that assists dynamical system forecasting by minimizing variational differences between the network output and the training data via a temporal Sobolev norm. This approach is entirely data-driven, architecture agnostic, and does not require derivative information from the estimated system. The DTSN is particularly well suited to chaotic dynamical systems as it minimizes noise in the network output which is crucial for such sensitive systems. For our test cases we consider discrete approximations of the Lorenz-63 system and the Chua circuit. For the network architectures we use the Long Short-Term Memory (LSTM) and the Transformer. The performance of the DTSN is compared with the standard MSE loss for both architectures, as well as with the Physics Informed Neural Network (PINN) loss for the LSTM. The DTSN loss is shown to substantially improve accuracy for both architectures, while requiring less information than the PINN and without noticeably increasing computational time, thereby demonstrating its potential to improve neural network forecasting of dynamical systems.
-
An efficient and cost‐effective near‐field tsunami warning system is crucial for coastal communities. The existing tsunami forecasting system is based on offshore Deep‐Ocean Assessment and Reporting of Tsunamis and Global Navigation Satellite System (GNSS) buoys which are not affordable for many countries. A potential cost‐effective solution is to utilize position data from ships traveling in coastal and offshore regions. In this study, we examine the feasibility of using ship‐borne GNSS data in tsunami forecasting. We carry out synthetic experiments by applying a data assimilation (DA) method with ship position (elevation and velocity) data. Our findings show that the DA method can recover the reference model with high accuracy if a dense network of ship elevation data is used. However, the use of ship velocity data alone is unable to recover the reference model. In addition, we carried out sensitivity studies of the DA method to the ship spatial distribution. We find that a 20 km gap between the ships works well in terms of accuracy and computational time for the example source model that we explored. The highest accuracy is obtained when data from a sufficient number of ships traveling in and around the tsunami source area are available.