Title: A Study on Graph-Structured Recurrent Neural Networks and Sparsification with Application to Epidemic Forecasting
We study epidemic forecasting on real-world health data using a graph-structured recurrent neural network (GSRNN). We achieve state-of-the-art forecasting accuracy on the benchmark CDC dataset. To improve model efficiency, we sparsify the network weights via a transformed-ℓ1 penalty without losing prediction accuracy in numerical experiments.
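The abstract names the transformed-ℓ1 penalty but does not define it. As a rough illustration only, the sketch below uses the form commonly given in the transformed-ℓ1 literature, ρ_a(w) = (a + 1)|w| / (a + |w|), summed over the weights. The function name `transformed_l1`, the default `a=1.0`, and the use of NumPy are assumptions made for this example; how the authors attach the penalty to the GSRNN training loss is not stated in this record.

```python
import numpy as np

def transformed_l1(w, a=1.0):
    """Sum of (a + 1)|w_i| / (a + |w_i|) over all entries of w.

    `a` > 0 is a shape parameter (hypothetical default): large `a`
    makes the penalty behave like the l1 norm, small `a` makes it
    approach a count of nonzero entries.
    """
    mag = np.abs(np.asarray(w, dtype=float))
    return float(np.sum((a + 1.0) * mag / (a + mag)))

weights = np.array([0.5, -2.0, 0.0])
print(transformed_l1(weights, a=1.0))    # intermediate regime
print(transformed_l1(weights, a=1e6))    # close to the l1 norm of weights
print(transformed_l1(weights, a=1e-9))   # close to the number of nonzeros
```

In a training loop, a term like this would typically be scaled by a regularization coefficient and added to the forecasting loss, after which small weights are driven toward zero and can be pruned.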
Award ID(s):
1737770
PAR ID:
10138641
Journal Name:
WCGO 2019: Optimization of Complex Systems: Theory, Models, Algorithms and Applications
Volume:
991
Page Range / eLocation ID:
730-739
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Today’s data plane network telemetry systems enable network operators to capture fine-grained data streams of many different network traffic features (e.g., loss or flow arrival rate) at line rate. This capability facilitates data-driven approaches to network management and motivates leveraging either statistical or machine learning models (e.g., for forecasting network data streams) for automating various network management tasks. However, current studies on network automation-related problems are in general not concerned with issues that arise when deploying these models in practice (e.g., (re)training overhead). In this paper, we examine various training-related aspects that affect the accuracy and overhead (and thus feasibility) of both LSTM and SARIMA, two popular types of models used for forecasting real-world network data streams in telemetry systems. In particular, we study the impact of the size, choice, and recency of the training data on accuracy and overhead and explore using separate models for different segments of a data stream (e.g., per-hour models). Using two real-world data streams, we show that (i) per-hour LSTM models exhibit high accuracy after training with only 24 hours of data, (ii) the accuracy of LSTM models does not depend on the recency of the training data (i.e., no frequent (re)training is required), (iii) SARIMA models can have comparable or lower accuracy than LSTM models, and (iv) certain segments of the data streams are inherently more challenging to forecast than others. While the specific findings reported in this paper depend on the considered data streams and specified models, we argue that irrespective of the data streams at hand, a similar examination of training-related aspects is needed before deploying any statistical or machine learning model in practice.
  2. Accurately forecasting well-being may enable people to make desirable behavioral changes that could improve their future well-being. In this paper, we evaluate how well an automated model can forecast the next-day’s well-being (specifically focusing on stress, health, and happiness) from static models (support vector machine and logistic regression) and time-series models (long short-term memory neural network models (LSTM)) using the previous seven days of physiological, mobile phone, and behavioral survey data. We especially examine how using only a portion of the day’s data (e.g. just night-time, or just daytime) influences the forecasting accuracy. The results show that accuracy is improved, across every condition tested, by using an LSTM instead of using static models. We find that daytime-only physiology data from wearable sensors, using an LSTM, can provide an accurate forecast of tomorrow’s well-being using students’ daily life data (stress: 80.4%, health: 86.0%, and happiness: 79.1%), achieving the same accuracy as using data collected from around the clock. These findings are valuable steps toward developing a practical and convenient well-being forecasting system. 
  3. We introduce the Discrete-Temporal Sobolev Network (DTSN), a neural network loss function that assists dynamical system forecasting by minimizing variational differences between the network output and the training data via a temporal Sobolev norm. This approach is entirely data-driven, architecture agnostic, and does not require derivative information from the estimated system. The DTSN is particularly well suited to chaotic dynamical systems as it minimizes noise in the network output which is crucial for such sensitive systems. For our test cases we consider discrete approximations of the Lorenz-63 system and the Chua circuit. For the network architectures we use the Long Short-Term Memory (LSTM) and the Transformer. The performance of the DTSN is compared with the standard MSE loss for both architectures, as well as with the Physics Informed Neural Network (PINN) loss for the LSTM. The DTSN loss is shown to substantially improve accuracy for both architectures, while requiring less information than the PINN and without noticeably increasing computational time, thereby demonstrating its potential to improve neural network forecasting of dynamical systems. 
  4. An efficient and cost‐effective near‐field tsunami warning system is crucial for coastal communities. The existing tsunami forecasting system is based on offshore Deep‐Ocean Assessment and Reporting of Tsunamis and Global Navigation Satellite System (GNSS) buoys which are not affordable for many countries. A potential cost‐effective solution is to utilize position data from ships traveling in coastal and offshore regions. In this study, we examine the feasibility of using ship‐borne GNSS data in tsunami forecasting. We carry out synthetic experiments by applying a data assimilation (DA) method with ship position (elevation and velocity) data. Our findings show that the DA method can recover the reference model with high accuracy if a dense network of ship elevation data is used. However, the use of ship velocity data alone is unable to recover the reference model. In addition, we carried out sensitivity studies of the DA method to the ship spatial distribution. We find that a 20 km gap between the ships works well in terms of accuracy and computational time for the example source model that we explored. The highest accuracy is obtained when data from a sufficient number of ships traveling in and around the tsunami source area are available.
  5. Citations of scientific papers and patents reveal the knowledge flow and usually serve as the metric for evaluating their novelty and impacts in the field. Citation Forecasting thus has various applications in the real world. Existing works on citation forecasting typically exploit the sequential properties of citation events, without exploring the citation network. In this paper, we propose to explore both the citation network and the related citation event sequences which provide valuable information for future citation forecasting. We propose a novel Citation Network and Event Sequence (CINES) Model to encode signals in the citation network and related citation event sequences into various types of embeddings for decoding to the arrivals of future citations. Moreover, we propose a temporal network attention and three alternative designs of bidirectional feature propagation to aggregate the retrospective and prospective aspects of publications in the citation network, coupled with the citation event sequence embeddings learned by a two-level attention mechanism for the citation forecasting. We evaluate our models and baselines on both a U.S. patent dataset and a DBLP dataset. Experimental results show that our models outperform the state-of-the-art methods, i.e., RMTPP, CYAN-RNN, Intensity-RNN, and PC-RNN, reducing the forecasting error by 37.76% - 75.32%. 