To predict rare extreme events using deep neural networks, one encounters the so-called small data problem because even long-term observations often contain few extreme events. Here, we investigate a model-assisted framework where the training data are obtained from numerical simulations, as opposed to observations, with adequate samples from extreme events. However, to ensure the trained networks are applicable in practice, the training is not performed on the full simulation data; instead, we only use a small subset of observable quantities, which can be measured in practice. We investigate the feasibility of this model-assisted framework on three different dynamical systems (Rössler attractor, FitzHugh–Nagumo model, and a turbulent fluid flow) and three different deep neural network architectures (feedforward, long short-term memory, and reservoir computing). In each case, we study the prediction accuracy, robustness to noise, reproducibility under repeated training, and sensitivity to the type of input data. In particular, we find long short-term memory networks to be most robust to noise and to yield relatively accurate predictions, while requiring minimal fine-tuning of the hyperparameters.
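The data-generation step of such a model-assisted setup can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the Rössler parameters (a = 0.2, b = 0.2, c = 5.7), the choice of x as the only observable, the definition of an extreme event as z exceeding its 95th percentile, and the names `simulate` and `make_dataset` are all assumptions made here for illustration.

```python
import numpy as np


def rossler_rhs(state, a=0.2, b=0.2, c=5.7):
    """Right-hand side of the Rossler system (parameter values are assumed)."""
    x, y, z = state
    return np.array([-y - z, x + a * y, b + z * (x - c)])


def simulate(n_steps, dt=0.01, state0=(1.0, 1.0, 0.0)):
    """Integrate the system with classical fourth-order Runge-Kutta."""
    traj = np.empty((n_steps, 3))
    s = np.array(state0, dtype=float)
    for i in range(n_steps):
        traj[i] = s
        k1 = rossler_rhs(s)
        k2 = rossler_rhs(s + 0.5 * dt * k1)
        k3 = rossler_rhs(s + 0.5 * dt * k2)
        k4 = rossler_rhs(s + dt * k3)
        s = s + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return traj


def make_dataset(traj, window=50, horizon=100, quantile=0.95):
    """Build (input, label) pairs from the simulated trajectory.

    Inputs use only the observable x (assumed to be the measurable quantity).
    A label is True if z exceeds the threshold within the next `horizon` steps.
    """
    x_obs = traj[:, 0]
    z = traj[:, 2]
    threshold = np.quantile(z, quantile)
    n = len(traj) - window - horizon
    inputs = np.stack([x_obs[i:i + window] for i in range(n)])
    labels = np.array(
        [z[i + window:i + window + horizon].max() > threshold for i in range(n)]
    )
    return inputs, labels, threshold


traj = simulate(20000)
inputs, labels, threshold = make_dataset(traj)
print(inputs.shape, labels.mean())
```

The full state is used only to label events; the network input is restricted to the observable, which mirrors the constraint described in the abstract. Any of the three architectures could then be trained on `inputs` and `labels`.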
- Award ID(s): 2051010
- PAR ID: 10365806
- Publisher / Repository: American Institute of Physics
- Date Published:
- Journal Name: Chaos: An Interdisciplinary Journal of Nonlinear Science
- Volume: 32
- Issue: 4
- ISSN: 1054-1500
- Page Range / eLocation ID: Article No. 043112
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Unmanned Aerial Vehicles (UAVs) are prone to several cyber-attacks, including Global Positioning System (GPS) spoofing attacks. Numerous studies have applied Artificial Intelligence techniques to detect, classify, and mitigate these attacks; however, most of the proposed techniques suffer from low detection rates, high misdetection rates, and high bias. To fill this gap, in this paper, we propose three supervised deep learning techniques, namely Deep Neural Network, U Neural Network, and Long Short-Term Memory. These models are evaluated in terms of Accuracy, Detection Rate, Misdetection Rate, False Alarm Rate, Training Time per Sample, Prediction Time, and Memory Size. The simulation results indicate that the U Neural Network outperforms the other models with an accuracy of 98.80%, a probability of detection of 98.85%, a misdetection rate of 1.15%, a false alarm rate of 1.8%, a training time per sample of 0.22 seconds, a prediction time of 0.2 seconds, and a memory size of 199.87 MiB. In addition, the results show that the Long Short-Term Memory model performs worst among the three models at detecting these attacks on UAVs.
-
Geospatio-temporal data are pervasive across numerous application domains. These rich datasets can be harnessed to predict extreme events such as disease outbreaks, flooding, crime spikes, etc. However, since the extreme events are rare, predicting them is a hard problem. Statistical methods based on extreme value theory provide a systematic way for modeling the distribution of extreme values. In particular, the generalized Pareto distribution (GPD) is useful for modeling the distribution of excess values above a certain threshold. However, applying such methods to large-scale geospatio-temporal data is a challenge due to the difficulty in capturing the complex spatial relationships between extreme events at multiple locations. This paper presents a deep learning framework for long-term prediction of the distribution of extreme values at different locations. We highlight its computational challenges and present a novel framework that combines convolutional neural networks with deep set and GPD. We demonstrate the effectiveness of our approach on a real-world dataset for modeling extreme climate events.
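For reference, the GPD mentioned above has the following standard form for the excess $y = x - u$ over a threshold $u$. The abstract does not spell it out; the symbols $u$, $\sigma$ (scale), and $\xi$ (shape) are notation introduced here.

```latex
G_{\xi,\sigma}(y) =
\begin{cases}
1 - \left(1 + \dfrac{\xi y}{\sigma}\right)^{-1/\xi}, & \xi \neq 0, \\[2ex]
1 - \exp\!\left(-\dfrac{y}{\sigma}\right), & \xi = 0,
\end{cases}
\qquad y \ge 0,\ \sigma > 0,
```

with the additional constraint $y \le -\sigma/\xi$ when $\xi < 0$.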
-
Recurrent neural networks have led to breakthroughs in natural language processing and speech recognition. Here we show that recurrent networks, specifically long short-term memory networks, can also capture the temporal evolution of chemical/biophysical trajectories. Our character-level language model learns a probabilistic model of 1-dimensional stochastic trajectories generated from higher-dimensional dynamics. The model captures Boltzmann statistics and also reproduces kinetics across a spectrum of timescales. We demonstrate how training the long short-term memory network is equivalent to learning a path entropy, and that its embedding layer, instead of representing contextual meaning of characters, here exhibits a nontrivial connectivity between different metastable states in the underlying physical system. We demonstrate our model’s reliability through different benchmark systems and a force spectroscopy trajectory for a multi-state riboswitch. We anticipate that our work represents a stepping stone in the understanding and use of recurrent neural networks for understanding the dynamics of complex stochastic molecular systems.
-
In this study, we predicted the log returns of the top 10 cryptocurrencies based on market cap, using univariate and multivariate machine learning methods such as recurrent neural networks, deep learning neural networks, Holt’s exponential smoothing, autoregressive integrated moving average, ForecastX, and long short-term memory networks. The multivariate long short-term memory networks performed better than the univariate machine learning methods in terms of the prediction error measures.
-
Few-shot machine learning attempts to predict outputs given only a very small number of training examples. The key idea behind most few-shot learning approaches is to pre-train the model on different but related classes of data for which a large number of training instances are available. Few-shot learning has been most successfully demonstrated for classification problems using Siamese deep learning neural networks. Few-shot learning is less extensively applied to time-series forecasting. Few-shot forecasting is the task of predicting future values of a time-series even when only a small set of historic time-series is available. Few-shot forecasting has applications in domains where a long history of data is not available. This work describes deep neural network architectures for few-shot forecasting. All the architectures use a Siamese twin network approach to learn a difference function between pairs of time-series, rather than directly forecasting based on historical data as seen in traditional forecasting models. The networks are built using long short-term memory (LSTM) units. During forecasting, a model is able to forecast time-series types that were never seen in the training data by using the few available instances of the new time-series type as reference inputs. The proposed architectures are evaluated on vehicular traffic data collected in California from the Caltrans Performance Measurement System (PeMS). The models were trained with traffic flow data collected at specific locations and then evaluated by predicting traffic at different locations at different time horizons (0 to 12 hours). The Mean Absolute Error (MAE) was used as the evaluation metric and also as the loss function for training. The proposed architectures show lower prediction error than a baseline nearest neighbor forecast model. The prediction error increases at longer time horizons.