Abstract: Neural networks (NNs) have become an important tool for prediction tasks—both regression and classification—in environmental science. Since many environmental-science problems involve life-or-death decisions and policy making, it is crucial to provide not only predictions but also an estimate of the uncertainty in those predictions. Until recently, very few tools were available to provide uncertainty quantification (UQ) for NN predictions. In recent years, however, the computer-science field has developed numerous UQ approaches, and several research groups are exploring how to apply them in environmental science. We provide an accessible introduction to six of these UQ approaches, then focus on tools for the next step, namely answering the question: Once we obtain an uncertainty estimate (using any approach), how do we know whether it is good or bad? To answer this question, we highlight four evaluation graphics and eight evaluation scores that are well suited for evaluating and comparing uncertainty estimates (NN based or otherwise) for environmental-science applications. We demonstrate the UQ approaches and UQ-evaluation methods for two real-world problems: 1) estimating vertical profiles of atmospheric dewpoint (a regression task) and 2) predicting convection over Taiwan based on Himawari-8 satellite imagery (a classification task). We also provide Jupyter notebooks with Python code for implementing the UQ approaches and UQ-evaluation methods discussed herein. This article provides the environmental-science community with the knowledge and tools to start incorporating the large number of emerging UQ methods into their research.

Significance Statement: Neural networks are used for many environmental-science applications, some involving life-or-death decision making. In recent years, new methods have been developed to provide much-needed uncertainty estimates for NN predictions. We seek to accelerate the adoption of these methods in the environmental-science community with an accessible introduction to 1) methods for computing uncertainty estimates in NN predictions and 2) methods for evaluating such estimates.
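The abstract does not list its specific evaluation graphics and scores, so the sketch below only illustrates one commonly used diagnostic for regression UQ, the spread-skill relationship, assuming an ensemble of predictions is available; the function and variable names are illustrative, not taken from the article's notebooks.

```python
import numpy as np

def spread_skill(y_true, ens_preds, n_bins=10):
    """Bin cases by ensemble spread and compare mean spread to the RMSE of the
    ensemble mean in each bin; a well-calibrated ensemble lies near the 1:1 line."""
    mean = ens_preds.mean(axis=0)           # ensemble-mean prediction per case
    spread = ens_preds.std(axis=0)          # ensemble spread (std dev) per case
    edges = np.quantile(spread, np.linspace(0.0, 1.0, n_bins + 1))
    mean_spread, rmse = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (spread >= lo) & (spread <= hi)
        if mask.any():
            mean_spread.append(spread[mask].mean())
            rmse.append(np.sqrt(np.mean((mean[mask] - y_true[mask]) ** 2)))
    return np.array(mean_spread), np.array(rmse)

# Toy usage: a 50-member ensemble for 1,000 regression cases.
rng = np.random.default_rng(0)
truth = rng.normal(size=1000)
ensemble = truth + rng.normal(scale=1.0, size=(50, 1000))
print(spread_skill(truth, ensemble))
```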
This content will become publicly available on June 30, 2026
Uncertainty-guided Active Learning with Theoretical Bounds on Label Complexity
In recent years, neural networks (NNs) have been embraced by several scientific and engineering disciplines for diverse modeling and inference applications. The importance of quantifying the confidence in NN predictions has grown with the increasing adoption of these decision models. Nevertheless, conventional NNs do not furnish uncertainty estimates with their predictions and are therefore poorly calibrated. Uncertainty quantification techniques offer probability distributions or confidence intervals (CIs) to represent the uncertainty associated with NN predictions, instead of presenting only point predictions/estimates. Once the uncertainty in an NN is quantified, it is crucial to leverage this information to modify training objectives and improve the accuracy and reliability of the corresponding decision models. This work presents a novel framework that uses knowledge of the input and output uncertainties in an NN to guide the querying process in active learning. We also derive lower and upper bounds on label complexity. The efficacy of the proposed framework is established through experiments on classification and regression tasks.
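The abstract does not spell out the paper's querying rule or its label-complexity bounds; the sketch below shows only a generic uncertainty-guided acquisition step (predictive entropy over stochastic forward passes), with hypothetical names, and should not be read as the authors' framework.

```python
import numpy as np

def query_most_uncertain(prob_samples, k=10):
    """Select the k pool indices whose mean predicted class distribution has the
    highest entropy. prob_samples: (n_passes, n_pool, n_classes) array of class
    probabilities from stochastic forward passes (e.g., MC dropout)."""
    mean_probs = prob_samples.mean(axis=0)                           # (n_pool, n_classes)
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=1)
    return np.argsort(entropy)[-k:]                                  # indices to label next

# Toy usage: 20 stochastic passes over a pool of 500 samples with 3 classes.
rng = np.random.default_rng(1)
logits = rng.normal(size=(20, 500, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)  # softmax per pass
print(query_most_uncertain(probs, k=5))
```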
- Award ID(s): 2129617
- PAR ID: 10657830
- Publisher / Repository: ACM Transactions on Probabilistic Machine Learning
- Date Published:
- Journal Name: ACM Transactions on Probabilistic Machine Learning
- Volume: 1
- Issue: 2
- ISSN: 2836-8924
- Page Range / eLocation ID: 1 to 21
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Cyber-physical systems are starting to adopt neural network (NN) models for a variety of smart sensing applications. While several efforts seek better NN architectures to improve system performance, few attempts have been made to study how these systems should be deployed in the field. Proper deployment is critical to achieving ideal performance, but current practice is largely empirical, via trial and error, and lacks a measure of quality. Sensing quality should reflect the impact on the performance of the NN models that drive machine perception tasks. However, traditional approaches either evaluate statistical differences that exist objectively or model quality subjectively via human perception. In this work, we propose an efficient sensing-quality measure that requires only limited data samples, using a smart voice sensing system as an example. We adopt recent techniques in uncertainty evaluation for NNs to estimate audio sensing quality. Intuitively, a deployment at a better sensing location should lead to less uncertainty in NN predictions. We design SQEE, Sensing Quality Evaluation at the Edge for NN models, which constructs a model ensemble through Monte-Carlo dropout and estimates posterior total uncertainty via average conditional entropy. We collected data from three indoor environments, with a total of 148 transmitting-receiving (t-r) locations tested and more than 7,000 examples evaluated. SQEE achieves the best top-1 ranking accuracy (whether the measure finds the best spot for deployment) in comparison with other uncertainty strategies. We implemented SQEE on a ReSpeaker to study its real-world efficacy. Experimental results show that SQEE can effectively evaluate the data collected from each t-r location pair within 30 seconds and achieve an average top-3 ranking accuracy of over 94%. We further discuss generalization of our framework to other sensing schemes.
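As a rough illustration of the uncertainty measure described above (average conditional entropy over a Monte-Carlo-dropout ensemble), here is a minimal NumPy sketch; the array shapes, names, and toy ranking are assumptions for demonstration, not SQEE's implementation.

```python
import numpy as np

def average_conditional_entropy(prob_samples):
    """Average entropy of the predicted class distribution across Monte-Carlo
    dropout passes. prob_samples: (n_passes, n_examples, n_classes)."""
    ent = -np.sum(prob_samples * np.log(prob_samples + 1e-12), axis=-1)  # per pass/example
    return ent.mean(axis=0)   # (n_examples,): lower values suggest a better sensing spot

# Toy usage: score two candidate deployment locations from their predictions.
rng = np.random.default_rng(2)
loc_scores = {}
for loc in ("spot_a", "spot_b"):
    logits = rng.normal(size=(30, 200, 4))                      # 30 passes, 200 clips, 4 classes
    probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
    loc_scores[loc] = average_conditional_entropy(probs).mean()
print(min(loc_scores, key=loc_scores.get))                      # lower mean entropy ranks higher
```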
In hydrology, modeling streamflow remains a challenging task because of the limited availability of basin-characteristic information such as soil geology and geomorphology. These characteristics may be noisy due to measurement errors or may be missing altogether. To overcome this challenge, we propose a knowledge-guided, probabilistic inverse modeling method for recovering physical characteristics from streamflow and weather data, which are more readily available. We compare our framework with state-of-the-art inverse models for estimating river-basin characteristics. We also show that these estimates improve streamflow modeling compared with using the original basin-characteristic values. Our method offers a 3% improvement in R² for the inverse model (basin-characteristic estimation) and a 6% improvement for the forward model (streamflow prediction). Our framework also offers improved explainability, since it can quantify uncertainty in both the inverse and the forward model. Uncertainty quantification plays a pivotal role in improving the explainability of machine learning models by providing additional insights into the reliability and limitations of model predictions. In our analysis, we assess the quality of the uncertainty estimates. Compared with baseline uncertainty quantification methods, our framework offers a 10% improvement in the dispersion of epistemic uncertainty and a 13% improvement in coverage rate. This information can help stakeholders understand the level of uncertainty associated with the predictions and provides a more comprehensive view of the potential outcomes.
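Coverage rate, one of the UQ-quality metrics mentioned above, is simple to illustrate; the predictive mean/standard deviation and the 95% interval choice in the sketch below are assumptions for demonstration only, not this paper's setup.

```python
import numpy as np

def coverage_rate(y_true, lower, upper):
    """Fraction of observations falling inside their predicted intervals."""
    return np.mean((y_true >= lower) & (y_true <= upper))

# Toy usage: 95% central intervals from a hypothetical predictive mean and std dev.
rng = np.random.default_rng(3)
obs = rng.normal(size=365)                  # e.g., daily streamflow anomalies
mu, sigma = np.zeros(365), np.ones(365)     # hypothetical predictive mean / std
print(coverage_rate(obs, mu - 1.96 * sigma, mu + 1.96 * sigma))  # ~0.95 if well calibrated
```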
Traditional deep neural networks (NNs) have contributed significantly to state-of-the-art classification performance in various application domains. However, NNs do not account for the inherent uncertainty in data associated with class probabilities, and misclassification under uncertainty can easily introduce high risk in real-world decision making (e.g., misclassifying objects on roads can lead to serious accidents). Unlike Bayesian NNs, which infer uncertainty indirectly through weight uncertainties, evidential NNs (ENNs) have recently been proposed to explicitly model the uncertainty of class probabilities and use it for classification tasks. An ENN formulates NN predictions as subjective opinions and learns, from data, a deterministic NN that collects the evidence forming those opinions. However, the ENN is trained as a black box, without explicitly considering the different root causes of inherent uncertainty in data, such as vacuity (uncertainty due to a lack of evidence) or dissonance (uncertainty due to conflicting evidence). By considering this multidimensional uncertainty, we propose a novel uncertainty-aware evidential NN called WGAN-ENN (WENN) for the out-of-distribution (OOD) detection problem. We take a hybrid approach that combines a Wasserstein Generative Adversarial Network (WGAN) with ENNs to jointly train a model with prior knowledge of a certain class, which has high vacuity for OOD samples. Through extensive empirical experiments on both synthetic and real-world datasets, we demonstrate that WENN's uncertainty estimates can significantly help distinguish OOD samples from boundary samples. WENN outperformed other competitive counterparts in OOD detection.
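For intuition about vacuity, the sketch below computes the subjective-logic vacuity of Dirichlet opinions formed from non-negative class evidence (alpha_k = e_k + 1, vacuity u = K / sum of alpha); the example evidence values are made up, and this is not the WENN code.

```python
import numpy as np

def vacuity(evidence):
    """Vacuity of Dirichlet opinions built from non-negative class evidence.
    evidence: (n_examples, n_classes); high vacuity means little total evidence."""
    alpha = evidence + 1.0
    return evidence.shape[1] / alpha.sum(axis=1)

# Toy usage: an in-distribution-like and an OOD-like example (3 classes).
ev = np.array([[40.0, 2.0, 1.0],    # plenty of evidence  -> low vacuity
               [0.2, 0.1, 0.3]])    # almost no evidence  -> high vacuity (OOD-like)
print(vacuity(ev))
```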
Abstract: Environmental decisions with substantial social and environmental implications are regularly informed by model predictions, which carry inevitable uncertainty. The selection of a set of model predictions to inform a decision is usually based on model performance, measured by goodness-of-fit metrics. Yet goodness-of-fit metrics have a questionable relationship to a model's value to end users, particularly when the validation data are themselves uncertain. For example, decisions based on flow-frequency models are not necessarily improved by adopting the models with the best overall goodness of fit. We propose an alternative model-evaluation approach based on the conditional value of sample information, first defined in 1961, which has found extensive use in sampling-design optimization but which has not previously been used for model evaluation. The metric uses observations from a validation set to estimate the expected monetary costs associated with model prediction uncertainties. A model is considered superior to alternatives only if (i) its predictions reduce these costs and (ii) sufficient validation data are available to distinguish its performance from that of alternative models. By describing prediction uncertainties in monetary terms, the metric facilitates the communication of prediction uncertainty by end users, supporting the inclusion of uncertainty analysis in decision making.
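The conditional value of sample information itself is not defined in the abstract; the sketch below only illustrates the underlying idea of scoring competing models by expected monetary cost on a validation set, with a purely hypothetical cost function and data.

```python
import numpy as np

def expected_cost(y_obs, y_pred, cost_fn):
    """Average monetary cost of acting on a model's predictions over a
    validation set; cost_fn(observed, predicted) prices one decision."""
    return np.mean([cost_fn(o, p) for o, p in zip(y_obs, y_pred)])

# Toy usage: asymmetric costs for under- vs. over-predicting a design flood level.
def flood_cost(observed, predicted):
    shortfall = max(observed - predicted, 0.0)   # under-design: damage cost
    excess = max(predicted - observed, 0.0)      # over-design: construction cost
    return 10.0 * shortfall + 1.0 * excess

rng = np.random.default_rng(4)
obs = rng.gumbel(loc=100.0, scale=20.0, size=200)      # hypothetical annual flow maxima
model_a = obs + rng.normal(0.0, 5.0, 200)              # less uncertain model
model_b = obs + rng.normal(0.0, 15.0, 200)             # more uncertain model
print(expected_cost(obs, model_a, flood_cost), expected_cost(obs, model_b, flood_cost))
```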