Search for: All records

Creators/Authors contains: "Jha, Susmit"


  1. Several methods have recently been developed for computing attributions of a neural network's prediction over the input features. However, these existing attribution methods are noisy and not robust to small perturbations of the input. This paper uses the recently identified connection between dynamical systems and residual neural networks to show that attributions computed over neural stochastic differential equations (SDEs) are less noisy, visually sharper, and quantitatively more robust. Using dynamical systems theory, we theoretically analyze the robustness of these attributions. We also experimentally demonstrate the efficacy of our approach by computing attributions for ImageNet images using ResNet-50, WideResNet-101, and ResNeXt-101 models and showing that the resulting attributions are smoother, visually sharper, and quantitatively more robust. (A minimal illustration of noise-averaged attributions appears after this list.)
  2. Deep neural networks (DNNs) are known to produce incorrect predictions with very high confidence on out-of-distribution inputs (OODs). This limitation is one of the key challenges in the adoption of DNNs in high-assurance systems such as autonomous driving, air traffic management, and medical diagnosis. The challenge has received significant attention recently, and several techniques have been developed to detect inputs where the model’s prediction cannot be trusted. These techniques detect OODs as data points with either high epistemic uncertainty or high aleatoric uncertainty. We demonstrate the difference in the detection ability of these techniques and propose an ensemble approach for detecting OODs as data points with high uncertainty (epistemic or aleatoric). We perform experiments on vision datasets with multiple DNN architectures, achieving state-of-the-art results in most cases. (A small uncertainty-decomposition sketch appears after this list.)
  3. The functions of an autonomous system can generally be partitioned into those concerned with perception and those concerned with action. Perception builds and maintains an internal model of the world (i.e., the system's environment) that is used to plan and execute actions to accomplish a goal established by human supervisors. Accordingly, assurance decomposes into two parts: a) ensuring that the model is an accurate representation of the world as it changes through time and b) ensuring that the actions are safe (and effective), given the model. Both perception and action may employ AI, including machine learning (ML), and these present challenges to assurance. However, it is usually feasible to guard the actions with traditionally engineered and assured monitors, and thereby ensure safety, given the model. Thus, the model becomes the central focus for assurance. We propose an architecture and methods to ensure the accuracy of models derived from sensors whose interpretation uses AI and ML. Rather than derive the model from sensors bottom-up, we reverse the process and use the model to predict sensor interpretation. Small prediction errors indicate the world is evolving as expected, and the model is updated accordingly. Large prediction errors indicate surprise, which may be due to errors in sensing or interpretation, or unexpected changes in the world (e.g., a pedestrian steps into the road). The former initiate error masking or recovery, while the latter requires revision to the model. Higher-level AI functions assist in diagnosis and execution of these tasks. Although this two-level architecture, where the lower level does "predictive processing" and the upper performs more reflective tasks, both focused on maintenance of a world model, is derived from engineering considerations, it also matches a widely accepted theory of human cognition. (A toy sketch of the prediction-error gating appears after this list.)
  4. We present TrojDRL, a tool for exploring and evaluating backdoor attacks on deep reinforcement learning agents. TrojDRL exploits the sequential nature of deep reinforcement learning (DRL) and considers different gradations of threat models. We show that untargeted attacks on state-of-the-art actor-critic algorithms can circumvent existing defenses built on the assumption of backdoors being targeted. We evaluated TrojDRL on a broad set of DRL benchmarks and showed that the attacks require poisoning as little as 0.025% of the training data. Compared with existing work on backdoor attacks against classification models, TrojDRL provides a first step towards understanding the vulnerability of DRL agents. (A toy data-poisoning sketch appears after this list.)
  5. Given a Boolean formula ϕ(x) in conjunctive normal form (CNF), the density of states counts the number of variable assignments that violate exactly e clauses, for all values of e. Thus, the density of states is a histogram of the number of unsatisfied clauses over all possible assignments. This computation generalizes both the maximum-satisfiability (MAX-SAT) and model counting problems and not only provides insight into the entire solution space, but also yields a measure of the hardness of the problem instance. Consequently, it is typically infeasible for real-world instances, even with state-of-the-art algorithms. While finding an exact answer is a computationally intensive task, we propose a novel approach for estimating the density of states based on concentration-of-measure inequalities. The methodology results in a quadratic unconstrained binary optimization (QUBO) problem, which is particularly amenable to quantum annealing-based solutions. We present the overall approach and compare results from the D-Wave quantum annealer against the best-known classical algorithms, such as the Hamze-de Freitas-Selby (HFS) algorithm and satisfiability modulo theories (SMT) solvers. (A brute-force sketch of the density-of-states computation appears after this list.)
  6. ML algorithms or models, especially deep neural networks (DNNs), have shown significant promise in several areas. However, researchers have recently demonstrated that ML algorithms, especially DNNs, are vulnerable to adversarial examples (slightly perturbed samples that cause misclassification). The existence of adversarial examples has hindered the deployment of ML algorithms in safety-critical sectors, such as security. Several defenses against adversarial examples exist in the literature. One important class is manifold-based defenses, in which a sample is "pulled back" onto the data manifold before classification. These defenses rely on the manifold assumption (data lie on a manifold of lower dimension than the input space) and use a generative model to approximate the input distribution. This paper asks the following question: do the generative models used in manifold-based defenses need to be topology-aware? Our paper suggests the answer is yes. We provide theoretical and empirical evidence to support our claim. (A toy manifold-projection sketch appears after this list.)
  7. In this tutorial, we present our recent work on building trusted, resilient, and interpretable AI models by combining symbolic methods developed for automated reasoning with connectionist learning methods that use deep neural networks. The increasing adoption of artificial intelligence and machine learning in systems, including safety-critical systems, has created a pressing need for scalable techniques that can be used to establish trust in their safe behavior, resilience to adversarial attacks, and interpretability to enable human audits. The tutorial comprises three components: a review of techniques for verification of neural networks, methods for using geometric invariants to defend against adversarial attacks, and techniques for extracting logical symbolic rules by reverse engineering machine learning models. These techniques form the core of TRINITY, the Trusted, Resilient, and Interpretable AI framework being developed at SRI. We identify the key challenges in building the TRINITY framework and report recent results on each of these three fronts.
  8. Shared intentionality is a critical component in developing conscious AI agents capable of collaboration, self-reflection, deliberation, and reasoning. We formulate inference of shared intentionality as an inverse reinforcement learning problem with logical reward specifications. We show how the approach can infer task descriptions from demonstrations. We also extend our approach to actively convey intentionality. We demonstrate the approach on a simple grid-world example.
  9. We present an approach for the synthesis and verification of neural network controllers for closed-loop dynamical systems modeled as ordinary differential equations. Feedforward neural networks are ubiquitous for approximating functions, especially in the machine learning literature. The proposed verification technique attempts to construct an over-approximation of the system trajectories using a combination of tools such as Sherlock and Flow*. In addition to computing reach sets, we also incorporate counterexamples, or bad traces, into the synthesis phase of the controller. We alternate between verification and counterexample generation until the procedure outputs a fully verified controller or training fails to produce a neural network compliant with the desired specifications. We demonstrate the effectiveness of our approach on a suite of benchmarks ranging from 2 to 17 variables. (A toy counterexample-guided synthesis loop appears after this list.)
  10. We propose a novel passive learning approach, TeLEx, to infer signal temporal logic (STL) formulas that characterize the behavior of a dynamical system using only observed signal traces of the system. First, we present a template-driven learning approach that requires two inputs: a set of observed traces and a template STL formula. The unknown parameters in the template can include time bounds of the temporal operators as well as the thresholds in the inequality predicates. TeLEx finds values of the unknown parameters such that the synthesized STL property is satisfied by all the provided traces and is tight. This requirement of tightness is essential to generating interesting properties when only positive examples are provided and there is no option to actively query the dynamical system to discover the boundaries of legal behavior. We propose a novel quantitative semantics for satisfaction of STL properties which enables TeLEx to learn tight STL properties without multidimensional optimization. The proposed metric is also smooth, which is critical to enabling gradient-based numerical optimization engines and produces a 30x to 100x speed-up with respect to state-of-the-art gradient-free optimization. Second, we present a novel technique for automatically learning the structure of the STL formula by incrementally constructing more complex formulas guided by the robustness metric of subformulas. We demonstrate the effectiveness of the overall approach for learning STL formulas from only positive examples on a set of synthetic and real-world benchmarks. (A small smooth-robustness sketch appears after this list.)
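
The following is a minimal, illustrative sketch related to item 1, not the authors' neural-SDE construction: it averages finite-difference saliency attributions of a toy score function over noise-perturbed copies of the input (SmoothGrad-style), which shows in miniature why noise averaging yields smoother attributions. The toy weights, noise scale, and sample count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = np.array([0.7, -1.3, 2.1, 0.4])      # toy "network" weights (an assumption)

def score(x):
    """Toy differentiable score standing in for a network's class logit."""
    return float(np.tanh(x) @ W)

def saliency(x, eps=1e-4):
    """Finite-difference gradient of the score w.r.t. each input feature."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (score(x + e) - score(x - e)) / (2 * eps)
    return g

x = np.array([0.2, -0.5, 1.0, 0.3])
plain = saliency(x)
# Noise-averaged attribution: average the saliency over perturbed copies of x.
averaged = np.mean([saliency(x + 0.1 * rng.normal(size=x.shape))
                    for _ in range(200)], axis=0)
print("plain    :", np.round(plain, 3))
print("averaged :", np.round(averaged, 3))
```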
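
For item 2, a minimal sketch of one common way to separate epistemic from aleatoric uncertainty over a deep ensemble; the mutual-information decomposition, the thresholds, and the random logits standing in for trained members are assumptions, not necessarily the paper's exact detectors.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p):
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def ood_flags(member_logits, epi_thresh=0.1, ale_thresh=2.0):
    """member_logits: shape (n_members, n_inputs, n_classes).
    total uncertainty = entropy of the averaged prediction
    aleatoric         = expected entropy of the individual members
    epistemic         = total - aleatoric (mutual information / disagreement)
    An input is flagged OOD if either uncertainty exceeds its threshold."""
    probs = softmax(member_logits)            # (M, N, C)
    total = entropy(probs.mean(axis=0))       # (N,)
    aleatoric = entropy(probs).mean(axis=0)   # (N,)
    epistemic = total - aleatoric             # (N,)
    return (epistemic > epi_thresh) | (aleatoric > ale_thresh)

rng = np.random.default_rng(1)
logits = rng.normal(size=(3, 5, 10))          # 3 members, 5 inputs, 10 classes
print(ood_flags(logits))
```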
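
For item 3, a toy sketch of the prediction-error gating idea only: a hypothetical scalar world model predicts the next sensor reading, small errors update the model, intermediate errors are treated as sensing/interpretation faults, and large errors signal surprise. The thresholds and toy dynamics are assumptions, not the proposed architecture.

```python
LOW, HIGH = 0.5, 2.0                  # assumed error thresholds
state, velocity = 0.0, 1.0            # toy world model: constant-velocity motion

def handle(reading, state, velocity):
    predicted = state + velocity      # the model's prediction of the sensor value
    error = abs(reading - predicted)
    if error < LOW:
        return "update model", reading        # world evolving as expected
    elif error < HIGH:
        return "mask / recover", predicted    # likely sensing/interpretation error
    else:
        return "revise model", reading        # surprise: unexpected change in the world

for reading in [1.1, 3.0, 3.4, 9.0]:
    action, state = handle(reading, state, velocity)
    print(f"reading {reading:4.1f} -> {action}")
```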
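
For item 4, an illustrative sketch of the poisoning mechanics only, not TrojDRL itself: a trigger patch is stamped into roughly 0.025% of toy training observations and tied to an attacker-chosen action. The frame shape, trigger, and target action are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def poison(observations, actions, rate=0.00025, target_action=3):
    """Stamp a small trigger into a tiny fraction of observations and
    associate those frames with an attacker-chosen action."""
    obs, acts = observations.copy(), actions.copy()
    n = obs.shape[0]
    k = max(1, int(rate * n))                 # about 0.025% of the training data
    idx = rng.choice(n, size=k, replace=False)
    obs[idx, :3, :3] = 1.0                    # 3x3 trigger patch in one corner
    acts[idx] = target_action
    return obs, acts, idx

obs = rng.random((10_000, 32, 32), dtype=np.float32)   # toy frames
acts = rng.integers(0, 6, size=10_000)                 # toy action labels
p_obs, p_acts, idx = poison(obs, acts)
print("poisoned", len(idx), "of", len(obs), "frames")
```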
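
For item 5, a brute-force sketch of the density-of-states definition on a tiny CNF instance; this is feasible only because the formula is small, and the example formula is an assumption.

```python
from itertools import product

def density_of_states(n_vars, clauses):
    """hist[e] = number of assignments violating exactly e clauses.
    Clauses are lists of nonzero ints (DIMACS-style): literal v means
    variable v is positive, -v means it is negated (1-indexed)."""
    hist = [0] * (len(clauses) + 1)
    for assignment in product([False, True], repeat=n_vars):
        unsat = sum(
            not any((assignment[abs(l) - 1] if l > 0 else not assignment[abs(l) - 1])
                    for l in clause)
            for clause in clauses
        )
        hist[unsat] += 1
    return hist

# (x1 v -x2) & (x2 v x3) & (-x1 v -x3): a tiny example formula
clauses = [[1, -2], [2, 3], [-1, -3]]
print(density_of_states(3, clauses))   # -> [2, 6, 0, 0]
```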
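
For item 6, a toy illustration of the "pull back onto the manifold" step, with a linear decoder standing in for the generative model; the 2-D latent space, the linear classifier, and the perturbation are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(10, 2))                  # decoder: latent (2-D) -> input (10-D)
w = rng.normal(size=10)                       # toy linear classifier weights

def pull_back(x):
    """Project x onto the decoder's range: argmin_z ||A z - x||, then decode."""
    z, *_ = np.linalg.lstsq(A, x, rcond=None)
    return A @ z

def classify(x):
    return int(w @ x > 0)

x_clean = A @ np.array([1.0, -0.5])           # a point on the 2-D manifold
x_adv = x_clean + 0.3 * rng.normal(size=10)   # stand-in for a perturbed input
print("direct      :", classify(x_adv))
print("pulled back :", classify(pull_back(x_adv)))
```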
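
For item 9, a toy counterexample-guided loop in the spirit of the synthesis/verification alternation: a scalar linear gain stands in for the neural controller and dense simulation stands in for Sherlock/Flow* reachability, so this is a sketch of the loop structure only, with all constants assumed.

```python
import numpy as np

def simulate(k, x0, dt=0.01, steps=500):
    """Forward-Euler rollout of the plant xdot = x + u with controller u = -k x."""
    x = x0
    for _ in range(steps):
        u = -k * x
        x = x + dt * (x + u)
    return abs(x)

def find_counterexample(k, bound=0.05):
    """Stand-in for reach-set analysis: search initial states for a violation."""
    for x0 in np.linspace(-1.0, 1.0, 201):
        if simulate(k, x0) > bound:
            return x0
    return None

k = 0.5
for it in range(20):
    cex = find_counterexample(k)
    if cex is None:
        print(f"verified controller with gain k={k:.2f}")
        break
    k += 0.5                                   # "retrain" using the counterexample
    print(f"iter {it}: counterexample x0={cex:+.2f}, new gain k={k:.2f}")
```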
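
For item 10, a minimal sketch of a smooth (soft-min) robustness surrogate for the template G_[0,T](x > c); the paper's tightness-oriented quantitative semantics differs in detail, and the trace, temperature, and thresholds below are assumptions.

```python
import numpy as np

def robustness(trace, c):
    """Standard robustness of G_[0,T](x > c): the worst margin over the trace."""
    return np.min(trace - c)

def soft_robustness(trace, c, temp=0.05):
    """Smooth soft-min surrogate (a lower bound), usable with gradient-based search."""
    r = trace - c
    return -temp * np.log(np.sum(np.exp(-r / temp)))

t = np.linspace(0.0, 10.0, 200)
trace = 1.0 + 0.3 * np.sin(t)                 # observed signal
for c in (0.5, 0.7):
    print(c, robustness(trace, c), round(soft_robustness(trace, c), 4))
# The tightest threshold for this template is the largest c that still holds:
print("tightest c ~", trace.min())
```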