This paper advances machine learning (ML)-based streamflow prediction by strategically selecting rainfall events, introducing a new loss function, and addressing rainfall forecast uncertainties. Focusing on the Iowa River Basin, we applied the stochastic storm transposition (SST) method to create realistic rainfall events, which were input into a hydrological model to generate corresponding streamflow data for training and testing deterministic and probabilistic ML models. Long short-term memory (LSTM) networks were employed to predict streamflow up to 12 h ahead. An active learning approach was used to identify the most informative rainfall events, reducing data generation effort. Additionally, we introduced a novel asymmetric peak loss function to improve peak streamflow prediction accuracy. Incorporating rainfall forecast uncertainties, our probabilistic LSTM model provided uncertainty quantification for streamflow predictions. Performance evaluation using different metrics improved the accuracy and reliability of our models. These contributions enhance flood forecasting and decision-making while significantly reducing computational time and costs.
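The abstract above does not give the form of its asymmetric peak loss, so the following is only a generic sketch of how such a loss could be built, not the authors' function. It assumes a squared-error base, a heavier penalty for under-prediction, and extra weight on time steps whose observed flow exceeds a quantile threshold. The name `asymmetric_peak_loss` and the parameters `under_weight`, `peak_quantile`, and `peak_weight` are hypothetical.

```python
import numpy as np


def asymmetric_peak_loss(y_true, y_pred, under_weight=2.0,
                         peak_quantile=0.9, peak_weight=3.0):
    """Weighted squared error (illustrative assumption, not the paper's loss).

    - Under-predictions (y_pred < y_true) are multiplied by `under_weight`.
    - Samples whose observed value is at or above the `peak_quantile`
      quantile of y_true are additionally multiplied by `peak_weight`.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true

    # Asymmetry: missing flow is penalized more than overshooting it.
    weights = np.where(err < 0, under_weight, 1.0)

    # Peak emphasis: up-weight the highest observed flows.
    threshold = np.quantile(y_true, peak_quantile)
    weights = weights * np.where(y_true >= threshold, peak_weight, 1.0)

    return float(np.mean(weights * err ** 2))


if __name__ == "__main__":
    observed = [1.0, 1.0, 1.0, 10.0]          # one peak in a low-flow series
    too_low = [1.0, 1.0, 1.0, 8.0]            # peak under-predicted by 2
    too_high = [1.0, 1.0, 1.0, 12.0]          # peak over-predicted by 2
    print(asymmetric_peak_loss(observed, too_low))
    print(asymmetric_peak_loss(observed, too_high))
```

With these assumed weights, missing a peak from below costs more than overshooting it by the same amount, which is the behavior an asymmetric peak loss is meant to encourage during training.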
Closing in on Hydrologic Predictive Accuracy: Combining the Strengths of High‐Fidelity and Physics‐Agnostic Models
Abstract Applications of process‐based models (PBM) for predictions are confounded by multiple uncertainties and computational burdens, resulting in appreciable errors. A novel modeling framework combining a high‐fidelity PBM with surrogate and machine learning (ML) models is developed to tackle these challenges and applied to streamflow prediction. A surrogate model permits high computational efficiency of a PBM solution at a minimal loss of accuracy. A novel probabilistic ML model partitions the PBM‐surrogate prediction errors into reducible and irreducible types, quantifying their distributions, which arise from both explicitly perceived uncertainties (such as parametric) and those that are entirely hidden to the modeler (not included or unexpected). Using this approach, we demonstrate a substantial improvement in streamflow predictive accuracy for a case study of an urbanized watershed. Such a framework provides an efficient solution combining the strengths of high‐fidelity and physics‐agnostic models for a wide range of prediction problems in geosciences.
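As a rough illustration of the error-partitioning idea in the abstract, the sketch below fits a model of the surrogate's residual and treats the predictable part as reducible error and the leftover scatter as irreducible. It is a minimal sketch under stated assumptions, not the paper's probabilistic ML model: a plain linear least-squares fit stands in for the ML component, the leftover error is summarized by a single standard deviation, and the names `fit_error_model` and `corrected_prediction` are hypothetical.

```python
import numpy as np


def fit_error_model(features, surrogate_pred, observed):
    """Fit a linear model of the surrogate residual (assumed stand-in for ML).

    Returns the regression coefficients (the reducible, predictable part of
    the error) and the standard deviation of what remains (treated here as
    the irreducible part).
    """
    residual = np.asarray(observed, float) - np.asarray(surrogate_pred, float)
    # Design matrix with an intercept column; works for 1-D or 2-D features.
    X = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(X, residual, rcond=None)
    leftover = residual - X @ coef
    sigma = leftover.std(ddof=X.shape[1])
    return coef, sigma


def corrected_prediction(features, surrogate_pred, coef, sigma):
    """Add the predicted (reducible) error back to the surrogate output."""
    X = np.column_stack([np.ones(len(features)), features])
    mean = np.asarray(surrogate_pred, float) + X @ coef
    return mean, sigma


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, 300)              # synthetic driver, not real data
    truth = 3.0 + x
    surrogate = truth - (0.5 + 2.0 * x)         # surrogate with a systematic bias
    observed = truth + rng.normal(0.0, 0.1, x.size)

    coef, sigma = fit_error_model(x, surrogate, observed)
    mean, sigma = corrected_prediction(x, surrogate, coef, sigma)
    print("raw RMSE:      ", np.sqrt(np.mean((surrogate - observed) ** 2)))
    print("corrected RMSE:", np.sqrt(np.mean((mean - observed) ** 2)))
    print("leftover sigma:", sigma)
```

The design choice worth noting is the split itself: whatever the error model can explain from the inputs is removed from the prediction, while the remaining spread is reported as uncertainty rather than corrected.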
- Award ID(s): 2053429
- PAR ID: 10490270
- Publisher / Repository: Wiley
- Date Published:
- Journal Name: Geophysical Research Letters
- Volume: 50
- Issue: 17
- ISSN: 0094-8276
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Many mechanical engineering applications call for multiscale computational modeling and simulation. However, solving for complex multiscale systems remains computationally onerous due to the high dimensionality of the solution space. Recently, machine learning (ML) has emerged as a promising solution that can serve as a surrogate for, accelerate, or augment traditional numerical methods. Pioneering work has demonstrated that ML provides solutions to governing systems of equations with comparable accuracy to those obtained using direct numerical methods, but with significantly faster computational speed. These high-speed, high-fidelity estimations can facilitate the solving of complex multiscale systems by providing a better initial solution to traditional solvers. This paper provides a perspective on the opportunities and challenges of using ML for complex multiscale modeling and simulation. We first outline the current state-of-the-art ML approaches for simulating multiscale systems and highlight some of the landmark developments. Next, we discuss current challenges for ML in multiscale computational modeling, such as data and discretization dependence, interpretability, and data sharing and collaborative platform development. Finally, we suggest several potential research directions for the future.
-
Abstract We introduce the concept of decision‐focused surrogate modeling for solving computationally challenging nonlinear optimization problems in real‐time settings. The proposed data‐driven framework seeks to learn a simpler, for example, convex, surrogate optimization model that is trained to minimize the decision prediction error, which is defined as the difference between the optimal solutions of the original and the surrogate optimization models. The learning problem, formulated as a bilevel program, can be viewed as a data‐driven inverse optimization problem to which we apply a decomposition‐based solution algorithm from previous work. We validate our framework through numerical experiments involving the optimization of common nonlinear chemical processes such as chemical reactors, heat exchanger networks, and material blending systems. We also present a detailed comparison of decision‐focused surrogate modeling with standard data‐driven surrogate modeling methods and demonstrate that our approach is significantly more data‐efficient while producing simple surrogate models with high decision prediction accuracy.
-
Prediction of flotation efficiency of metal sulfides using an original hybrid machine learning model
Abstract The froth flotation process is extensively used for selective separation of base metal sulfides from uneconomic mineral resources. Reliable prediction of process outcomes (metal recovery and grade) is vital to ensure peak performance. This work employs an innovative hybrid machine learning (ML) model, constructed by combining the random forest model and the firefly algorithm, to predict the froth flotation efficiency of galena and chalcopyrite in relation to various experimental process parameters. The hybrid model's prediction performance was rigorously evaluated and compared against four different standalone ML models. The outcomes of this study illustrate that the hybrid ML model can predict process outcomes with high fidelity while consistently outperforming the standalone ML models.
-
The widespread integration of deep neural networks in developing data-driven surrogate models for high-fidelity simulations of complex physical systems highlights the critical necessity for robust uncertainty quantification techniques and credibility assessment methodologies, ensuring the reliable deployment of surrogate models in consequential decision-making. This study presents the Occam Plausibility Algorithm for surrogate models (OPAL-surrogate), providing a systematic framework to uncover predictive neural network-based surrogate models within the large space of potential models, including various neural network classes and choices of architecture and hyperparameters. The framework is grounded in hierarchical Bayesian inferences and employs model validation tests to evaluate the credibility and prediction reliability of the surrogate models under uncertainty. Leveraging these principles, OPAL-surrogate introduces a systematic and efficient strategy for balancing the trade-off between model complexity, accuracy, and prediction uncertainty. The effectiveness of OPAL-surrogate is demonstrated through two modeling problems: the deformation of porous materials for building insulation and turbulent combustion flow for ablation of solid fuels within hybrid rocket motors.

