skip to main content


This content will become publicly available on March 25, 2025

Title: Uncertainty Regularized Evidential Regression

The Evidential Regression Network (ERN) represents a novel approach that integrates deep learning with Dempster-Shafer's theory to predict a target and quantify the associated uncertainty. Guided by the underlying theory, specific activation functions must be employed to enforce non-negative values, which is a constraint that compromises model performance by limiting its ability to learn from all samples. This paper provides a theoretical analysis of this limitation and introduces an improvement to overcome it. Initially, we define the region where the models can't effectively learn from the samples. Following this, we thoroughly analyze the ERN and investigate this constraint. Leveraging the insights from our analysis, we address the limitation by introducing a novel regularization term that empowers the ERN to learn from the whole training set. Our extensive experiments substantiate our theoretical findings and demonstrate the effectiveness of the proposed solution.

 
more » « less
Award ID(s):
2045848 2319450
NSF-PAR ID:
10518773
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Proceedings of the AAAI Conference on Artificial Intelligence
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
38
Issue:
15
ISSN:
2159-5399
Page Range / eLocation ID:
16460 to 16468
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We extend the learning from demonstration paradigm by providing a method for learning unknown constraints shared across tasks, using demonstrations of the tasks, their cost functions, and knowledge of the system dynamics and control constraints. Given safe demonstrations, our method uses hit-and-run sampling to obtain lower cost, and thus unsafe, trajectories. Both safe and unsafe trajectories are used to obtain a consistent representation of the unsafe set via solving an integer program. Our method generalizes across system dynamics and learns a guaranteed subset of the constraint. In addition, by leveraging a known parameterization of the constraint, we modify our method to learn parametric constraints in high dimensions. We also provide theoretical analysis on what subset of the constraint and safe set can be learnable from safe demonstrations. We demonstrate our method on linear and nonlinear system dynamics, show that it can be modified to work with suboptimal demonstrations, and that it can also be used to learn constraints in a feature space. 
    more » « less
  2. Summary

    The classical theory of stomatal optimization stipulates that stomata should act to maximize photosynthesis while minimizing transpiration. This theory, despite its remarkable success in reproducing empirical patterns, does not account for the fact that the available water to plants is dynamically regulated by plants themselves, and that plants compete for water in most locations.

    Here, we develop an alternative theory in which plants maximize the expected carbon gain under stochastic rainfall in a competitive environment. We further incorporate xylem hydraulic limitation as an additional constraint to transpiration and evaluate its impacts on stomatal optimization by incorporating the direct carbon cost of xylem recovery and the opportunity cost of reduced future photosynthesis as a result of irrecoverable xylem damage.

    We predict stomatal behaviour to be more conservative with a higher cost induced by xylem damage. By varying the unit carbon cost and extent of xylem recovery, characterizing the direct and opportunity cost of xylem damage, respectively, our model can reproduce several key patterns of stomatal‐hydraulic trait covariations.

    By addressing the key elements of water limitation in plant gas exchange simultaneously, including plants’ self‐regulation of water availability, competition for water and hydraulic risk, our study provides a comprehensive theoretical basis for understanding stomatal behaviour.

     
    more » « less
  3. Daumé III, Hal ; Singh, Aarti (Ed.)
    Thompson sampling for multi-armed bandit problems is known to enjoy favorable performance in both theory and practice. However, its wider deployment is restricted due to a significant computational limitation: the need for samples from posterior distributions at every iteration. In practice, this limitation is alleviated by making use of approximate sampling methods, yet provably incorporating approximate samples into Thompson Sampling algorithms remains an open problem. In this work we address this by proposing two efficient Langevin MCMC algorithms tailored to Thompson sampling. The resulting approximate Thompson Sampling algorithms are efficiently implementable and provably achieve optimal instance-dependent regret for the Multi-Armed Bandit (MAB) problem. To prove these results we derive novel posterior concentration bounds and MCMC convergence rates for log-concave distributions which may be of independent interest. 
    more » « less
  4. Understanding the neural basis of the remarkable human cognitive capacity to learn novel concepts from just one or a few sensory experiences constitutes a fundamental problem. We propose a simple, biologically plausible, mathematically tractable, and computationally powerful neural mechanism for few-shot learning of naturalistic concepts. We posit that the concepts that can be learned from few examples are defined by tightly circumscribed manifolds in the neural firing-rate space of higher-order sensory areas. We further posit that a single plastic downstream readout neuron learns to discriminate new concepts based on few examples using a simple plasticity rule. We demonstrate the computational power of our proposal by showing that it can achieve high few-shot learning accuracy on natural visual concepts using both macaque inferotemporal cortex representations and deep neural network (DNN) models of these representations and can even learn novel visual concepts specified only through linguistic descriptors. Moreover, we develop a mathematical theory of few-shot learning that links neurophysiology to predictions about behavioral outcomes by delineating several fundamental and measurable geometric properties of neural representations that can accurately predict the few-shot learning performance of naturalistic concepts across all our numerical simulations. This theory reveals, for instance, that high-dimensional manifolds enhance the ability to learn new concepts from few examples. Intriguingly, we observe striking mismatches between the geometry of manifolds in the primate visual pathway and in trained DNNs. We discuss testable predictions of our theory for psychophysics and neurophysiological experiments.

     
    more » « less
  5. Many cyber attack actions can be observed but the observables often exhibit intricate feature dependencies, non-homogeneity, and potential for rare yet critical samples. This work tests the ability to model and synthesize cyber intrusion alerts through Generative Adversarial Networks (GANs), which explore the feature space through reconciling between randomly generated samples and the given data that reflects a mixture of diverse attack behaviors. Through a comprehensive analysis using Jensen-Shannon Divergence (JSD), conditional and joint entropy, and mode drops and additions, we show that the Wasserstein-GAN with Gradient Penalty and Mutual Information (WGAN-GPMI) is more effective in learning to generate realistic alerts than models without Mutual Information constraints. The added Mutual Information constraint pushes the model to explore the feature space more thoroughly and increases the generation of low probability yet critical alert features. By mapping alerts to a set of attack stages it is shown that the output of these low probability alerts has a direct contextual meaning for cyber security analysts. Overall, our results show the promising novel use of GANs to learn from limited yet diverse intrusion alerts to generate synthetic ones that emulate critical dependencies, opening the door to data driven network threat models. 
    more » « less