

Title: MELPF version 1: Modeling Error Learning based Post-Processor Framework for Hydrologic Models Accuracy Improvement
Abstract. This paper studies how to improve the accuracy of hydrologic models using machine-learning models as post-processors and explores the possibility of reducing the workload of creating an accurate hydrologic model by removing the calibration step. Developing an accurate hydrologic model is often challenging because of the time-consuming model calibration procedure and the nonstationarity of hydrologic data. Our findings show that the errors of hydrologic models are correlated with model inputs. Thus motivated, we propose a modeling-error-learning-based post-processor framework that leverages this correlation to improve the accuracy of a hydrologic model. The key idea is to predict the differences (errors) between the observed values and the hydrologic model predictions using machine-learning techniques. To tackle the nonstationarity of hydrologic data, a moving-window-based machine-learning approach is proposed to enhance the machine-learning error predictions by identifying the local stationarity of the data using a stationarity measure developed from the Hilbert–Huang transform. Two hydrologic models, the Precipitation–Runoff Modeling System (PRMS) and the Hydrologic Modeling System (HEC-HMS), are used to evaluate the proposed framework. Two case studies exhibit the improved performance over the original models using multiple statistical metrics.
Award ID(s):
1838024
NSF-PAR ID:
10129638
Author(s) / Creator(s):
Date Published:
Journal Name:
Geoscientific Model Development
Volume:
12
Issue:
9
ISSN:
1991-9603
Page Range / eLocation ID:
4115 to 4131
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Machine learning (ML) has been applied to space weather problems with increasing frequency in recent years, driven by an influx of in-situ measurements and a desire to improve modeling and forecasting capabilities throughout the field. Space weather originates from solar perturbations and comprises the resulting complex variations within the numerous systems between the Sun and Earth. These systems are often tightly coupled and not well understood. This creates a need for skillful models with knowledge about the confidence of their predictions. One example of such a dynamical system highly impacted by space weather is the thermosphere, the neutral region of Earth’s upper atmosphere. Our inability to forecast it has severe repercussions for satellite drag and for computing the probability of collision between two space objects in low Earth orbit (LEO) for decision making in space operations. Even with an (assumed) perfect forecast of model drivers, our incomplete knowledge of the system often results in inaccurate thermospheric neutral mass density predictions. Continuing efforts are being made to improve model accuracy, but density models rarely provide estimates of confidence in their predictions. In this work, we propose two techniques to develop nonlinear ML regression models that predict thermospheric density while providing robust and reliable uncertainty estimates: Monte Carlo (MC) dropout and direct prediction of the probability distribution, both using the negative logarithm of predictive density (NLPD) loss function. We show the performance capabilities for models trained on both local and global datasets. We show that the NLPD loss provides similar results for both techniques, but the direct probability distribution prediction method has a much lower computational cost. For the global model regressed on the Space Environment Technologies High Accuracy Satellite Drag Model (HASDM) density database, we achieve errors of approximately 11% on independent test data with well-calibrated uncertainty estimates. Using an in-situ CHAllenging Minisatellite Payload (CHAMP) density dataset, models developed using both techniques provide test errors on the order of 13%. The CHAMP models, on validation and test data, are within 2% of perfect calibration for the twenty prediction intervals tested. We show that this model can also be used to obtain global density predictions with uncertainties at a given epoch.
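    The NLPD loss mentioned above can be written down concretely for a Gaussian predictive distribution. The sketch below is a minimal, stand-alone version assuming the model outputs a mean and a standard deviation for each sample; it is illustrative and not the authors' implementation.

    # Sketch of a Gaussian NLPD (negative log predictive density) loss for
    # uncertainty-aware regression; illustrative, not the authors' code.
    import numpy as np

    def gaussian_nlpd(y_true, mu, sigma):
        """Mean negative log-likelihood of y_true under N(mu, sigma^2)."""
        sigma = np.maximum(sigma, 1e-6)  # guard against zero predicted variance
        return np.mean(0.5 * np.log(2.0 * np.pi * sigma**2)
                       + (y_true - mu)**2 / (2.0 * sigma**2))

    # A well-calibrated prediction scores lower (better) than an overconfident
    # prediction with the same mean error.
    y = np.array([1.0, 2.0, 3.0])
    print(gaussian_nlpd(y, y + 0.1, np.full(3, 0.10)))  # calibrated
    print(gaussian_nlpd(y, y + 0.1, np.full(3, 0.01)))  # overconfident -> much larger loss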

     
  2. Abstract

    Background

    Improving the prediction ability of a human-machine interface (HMI) is critical to accomplishing a bio-inspired or model-based control strategy for rehabilitation interventions, which are of increasing interest for assisting limb function after neurological injuries. A fundamental role of the HMI is to accurately predict human intent by mapping signals from a mechanical sensor or surface electromyography (sEMG) sensor. These sensors are limited to measuring the resulting limb force or movement or the neural signal evoking the force. Because the intermediate mapping in the HMI also depends on muscle contractility, there is a motivation to include architectural features of the muscle as surrogates of dynamic muscle movement, thus further improving the HMI’s prediction accuracy.

    Objective

    The purpose of this study is to investigate a non-invasive sEMG and ultrasound (US) imaging-driven Hill-type neuromuscular model (HNM) for net ankle joint plantarflexion moment prediction. We hypothesize that fusing signals from sEMG and US imaging yields a more accurate net plantarflexion moment prediction than either sEMG or US imaging alone.

    Methods

    Ten young non-disabled participants walked on a treadmill at speeds of 0.50, 0.75, 1.00, 1.25, and 1.50 m/s. The proposed HNM consists of two muscle-tendon units. The muscle activation for each unit was calculated as a weighted summation of the normalized sEMG signal and normalized muscle thickness signal from US imaging. The HNM calibration was performed under both single-speed mode and inter-speed mode, and then the calibrated HNM was validated across all walking speeds.
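    As a concrete illustration of the activation fusion described above, the sketch below combines a normalized sEMG envelope and a normalized US-derived muscle thickness signal with a single weighting parameter. The signals, weight value, and normalization are illustrative assumptions rather than the study's calibration procedure.

    # Sketch of a weighted fusion of normalized sEMG and US-derived muscle thickness
    # into a single activation signal; the weight w would be found during HNM calibration.
    import numpy as np

    def normalize(x):
        """Scale a signal to [0, 1] over the trial."""
        return (x - x.min()) / (x.max() - x.min() + 1e-12)

    def fused_activation(semg_envelope, us_thickness, w=0.6):
        """Activation = w * normalized sEMG + (1 - w) * normalized US thickness."""
        return w * normalize(semg_envelope) + (1.0 - w) * normalize(us_thickness)

    # Example with synthetic gait-cycle signals
    t = np.linspace(0.0, 1.0, 200)
    semg = np.abs(np.sin(2 * np.pi * t)) + 0.05 * np.random.default_rng(1).random(200)
    thickness = 1.0 + 0.2 * np.sin(2 * np.pi * t - 0.3)
    activation = fused_activation(semg, thickness)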

    Results

    On average, the normalized moment prediction root mean square error was reduced by 14.58% (p = 0.012) and 36.79% (p < 0.001) with the proposed HNM when compared to sEMG-driven and US imaging-driven HNMs, respectively. Also, the calibrated models with data from the inter-speed mode were more robust than those from single-speed modes for the moment prediction.

    Conclusions

    The proposed sEMG-US imaging-driven HNM can significantly improve the net plantarflexion moment prediction accuracy across multiple walking speeds. The findings imply that the proposed HNM can potentially be used in bio-inspired control strategies for rehabilitative devices due to its superior prediction accuracy.

     
  3. This article describes the design, development, and evaluation of an undergraduate learning module that builds students’ skills in how data analysis and numerical modeling can be used to analyze and design water resources engineering projects. The module follows a project-based approach by using a hydrologic restoration project in a coastal basin in south Louisiana, USA. The module has two main phases, a feasibility analysis phase and a hydraulic design phase, and follows an active learning approach where students perform a set of quantitative learning activities that involve extensive data and modeling analyses. The module is designed using open resources, including online datasets, hydraulic simulation models, and geographical information system software that are typically used by the engineering industry and research communities. Upon completing the module, students develop skills that involve model formulation, parameter calibration, sensitivity analysis, and the use of data and models to assess and design a proposed hydrologic engineering project. Guided by a design-based research framework, the implementation and evaluation of the module focused primarily on assessing students’ perceptions of the module's usability and its design attributes, the perceived contribution of the module to their learning, and their overall receptiveness to the module and how it affects their interest in the subject and future careers. Following an improvement-focused evaluation approach, the design attributes found most critical to students included the use of user-support resources and self-checking mechanisms. These aspects were identified as key features that facilitate students’ self-learning and independent completion of tasks, while still enriching their learning experiences when using data- and modeling-rich applications. Evaluation data showed that the following attributes contributed the most to students’ learning and potential value for future careers: application of modern engineering data analysis; use of real-world hydrologic datasets; and appreciation of uncertainties and challenges imposed by data scarcity. The evaluation results were used to formulate a set of guiding principles on how to design effective and conducive undergraduate learning experiences that adopt technology-enhanced, data-based, and modeling-based strategies, on how to enhance users’ experiences with free and open-source engineering analysis tools, and on how to strike a pedagogical balance between module complexity, student engagement, and flexibility to fit within existing curricula limitations.
  4. Abstract

    The World Magnetic Model (WMM) is a geomagnetic main field model that is widely used for navigation by governments, industry, and the general public. In recent years, the model has been derived using high-accuracy magnetometer data from the Swarm mission. This study explores the possibility of developing future WMMs in the post-Swarm era using data from the Iridium satellite constellation. Iridium magnetometers are primarily used for attitude control, so they are not designed to produce the same level of accuracy as magnetic data from scientific missions. Iridium magnetometer errors range from 30 nT quantization error to errors of hundreds of nT caused by spacecraft contamination and calibration uncertainty, whereas Swarm measurements are accurate to about 1 nT. The calibration uncertainty in the Iridium measurements is identified as a major error source, and a method is developed to calibrate the spacecraft measurements using data from the subset of the INTERMAGNET observatory network that produces quasi-definitive data on a regular basis. After calibration, the Iridium data produced main field models with approximately 20 nT average error and 40 nT maximum error as compared to the CHAOS-7.2 model. For many scientific and precision navigation applications, highly accurate Swarm-like measurements are still necessary; however, the Iridium-based models were shown to meet the WMM error tolerances, indicating that Iridium is a viable data source for future WMMs.
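    The calibration step described above can be pictured with a very simple per-axis model: solve for a scale factor and offset that bring the spacecraft measurements into least-squares agreement with reference field values. The sketch below uses synthetic numbers and a plain linear fit; the study's actual calibration against INTERMAGNET quasi-definitive data is more involved.

    # Sketch of a per-axis bias-and-scale calibration: find (scale, offset) so that
    # scale * B_measured + offset best matches reference field values (in nT).
    import numpy as np

    def calibrate_axis(b_measured, b_reference):
        """Least-squares fit of scale and offset mapping measurements to the reference."""
        A = np.column_stack([b_measured, np.ones_like(b_measured)])
        (scale, offset), *_ = np.linalg.lstsq(A, b_reference, rcond=None)
        return scale, offset

    # Synthetic example: measurements with a 2% scale error and a ~30 nT offset
    rng = np.random.default_rng(0)
    b_true = 20000.0 + 5000.0 * rng.random(1000)                 # reference values, nT
    b_meas = 1.02 * b_true + 30.0 + rng.normal(0.0, 10.0, 1000)  # corrupted measurements
    scale, offset = calibrate_axis(b_meas, b_true)
    b_calibrated = scale * b_meas + offset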

  5. Heavy rainfall leads to severe flooding problems with catastrophic socio-economic impacts worldwide. Hydrologic forecasting models have been applied to provide alerts of extreme flood events and reduce damage, yet they are still subject to many uncertainties due to the complexity of hydrologic processes and errors in the forecasted timing and intensity of floods. This study demonstrates the efficacy of using eXtreme Gradient Boosting (XGBoost), a state-of-the-art machine learning (ML) model, to forecast gauge stage levels at 5-min intervals with various look-out time windows. A flood alert system (FAS) built upon the XGBoost models is evaluated on two historical flooding events for a flood-prone watershed in Houston, Texas. The predicted stage values from the FAS are compared with observed values, demonstrating good performance in terms of statistical metrics (RMSE and KGE). This study further compares the performance of two scenarios with different input data settings for the FAS: (1) using data from the gauges within the study area only and (2) including data from additional gauges outside the study area. The results suggest that models that use the gauge information within the study area only (Scenario 1) are sufficient and advantageous in terms of their accuracy in predicting the arrival times of the floods. One of the benefits of the FAS outlined in this study is that the XGBoost-based FAS can run in a continuous mode to automatically detect floods without requiring an external starting trigger, as is usually required by conventional event-based flood alert systems. This paper illustrates a data-driven FAS framework as a prototype that stakeholders can utilize, based solely on their gauge information, for local flood warning and mitigation practices.
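    To make the forecasting setup concrete, the sketch below trains an XGBoost regressor to predict the gauge stage one hour ahead from lagged 5-min stage and rainfall readings. The lag length, lead time, feature layout, and synthetic data are illustrative assumptions and not the study's FAS configuration.

    # Sketch of an XGBoost stage forecaster: predict the gauge stage 'horizon' steps
    # ahead (12 steps of 5 min = 1 h) from lagged stage and rainfall readings.
    # Feature layout, horizons, and synthetic data are illustrative only.
    import numpy as np
    from xgboost import XGBRegressor

    def make_supervised(stage, rain, n_lags=12, horizon=12):
        """Build lagged feature vectors and the target stage 'horizon' steps ahead."""
        X, y = [], []
        for t in range(n_lags, len(stage) - horizon):
            X.append(np.concatenate([stage[t - n_lags:t], rain[t - n_lags:t]]))
            y.append(stage[t + horizon])
        return np.array(X), np.array(y)

    # Synthetic 5-min records standing in for a gauge within the study area
    rng = np.random.default_rng(0)
    rain = np.clip(rng.normal(0.0, 1.0, 5000), 0.0, None)
    stage = np.convolve(rain, np.exp(-np.arange(50) / 10.0), mode="full")[:5000]

    X, y = make_supervised(stage, rain)
    model = XGBRegressor(n_estimators=300, max_depth=6, learning_rate=0.05)
    model.fit(X[:4000], y[:4000])
    forecast = model.predict(X[4000:])  # stage forecasts one hour ahead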