

Title: Incorporating Label Uncertainty in Understanding Adversarial Robustness
A fundamental question in adversarial machine learning is whether a robust classifier exists for a given task. A line of research has made progress toward this goal by studying concentration of measure, but we argue that standard concentration fails to fully characterize the intrinsic robustness of a classification problem, since it ignores data labels, which are essential to any classification task. Building on a novel definition of label uncertainty, we empirically demonstrate that error regions induced by state-of-the-art models tend to have much higher label uncertainty than randomly selected subsets. This observation motivates us to adapt a concentration estimation algorithm to account for label uncertainty, resulting in more accurate intrinsic robustness measures for benchmark image classification problems.
Award ID(s): 1804603
NSF-PAR ID: 10333900
Journal Name: International Conference on Learning Representations (ICLR)
Volume: 2022
Format(s): Medium: X
Sponsoring Org: National Science Foundation
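To make the label-uncertainty comparison above concrete, here is a minimal sketch, not the authors' implementation: it contrasts the average label uncertainty of a model's error region against randomly selected subsets, assuming per-example human label counts (e.g., CIFAR-10H-style soft labels) are available, and using entropy as a stand-in uncertainty measure. All arrays here are synthetic placeholders.

```python
# Sketch: compare label uncertainty of an error region vs. random subsets.
# Assumes soft labels as per-example annotator counts; data below is synthetic.
import numpy as np

def label_uncertainty(counts):
    """Per-example entropy of the empirical human label distribution."""
    p = counts / counts.sum(axis=1, keepdims=True)
    logp = np.log(p, out=np.zeros_like(p), where=p > 0)
    return -(p * logp).sum(axis=1)

def mean_uncertainty(indices, counts):
    return label_uncertainty(counts[indices]).mean()

rng = np.random.default_rng(0)
human_label_counts = rng.integers(0, 10, size=(1000, 10)) + 1  # hypothetical
preds = rng.integers(0, 10, size=1000)                         # hypothetical
majority = human_label_counts.argmax(axis=1)

# Error region: examples where the classifier disagrees with the majority label
error_region = np.flatnonzero(preds != majority)
random_subset = rng.choice(1000, size=error_region.size, replace=False)

print("error region :", mean_uncertainty(error_region, human_label_counts))
print("random subset:", mean_uncertainty(random_subset, human_label_counts))
```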
More Like This
  1. The increasing uncertainty of distributed energy resources raises the risk of transient events in power systems. To capture event dynamics, Phasor Measurement Unit (PMU) data is widely utilized due to its high resolution. Notably, Machine Learning (ML) methods can process PMU data with feature learning techniques to identify events. However, existing ML-based methods face the following challenges, stemming from characteristics of both the measurements and the labels: (1) PMU streams are large, with redundancy and correlations across the temporal, spatial, and measurement-type dimensions, yet existing work cannot effectively uncover these structural correlations to remove redundancy and learn useful features. (2) Event labels are scarce, yet most models focus on learning with labeled data and risk being non-robust to different system conditions. To overcome these issues, we propose an approach called Kernelized Tensor Decomposition and Classification with Semi-supervision (KTDC-Se). First, we show that the key is to tensorize the data, filter information via decomposition, and learn discriminative features via classification; this enables efficient exploitation of structural correlations via high-dimensional tensors. Second, KTDC-Se can incorporate rich unlabeled data to seek decomposed tensors invariant to varying operational conditions. Third, we make KTDC-Se a joint model of decomposition and classification so that neither step is selected with bias. Finally, to boost model accuracy, we add kernels for non-linear feature learning. We demonstrate the superiority of KTDC-Se over state-of-the-art methods for event identification using PMU data.
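As a rough illustration of the tensorize-then-decompose idea above (a sketch only; the actual KTDC-Se jointly learns decomposition and classification with kernels and semi-supervision), one can stack PMU streams into a time x location x measurement-type tensor and extract low-rank mode factors via SVDs of the mode unfoldings:

```python
# Sketch: tensorize PMU streams and extract HOSVD-style mode factors.
# Shapes and data are illustrative, not from the paper.
import numpy as np

def unfold(tensor, mode):
    """Mode-n matricization of a tensor."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd_factors(tensor, ranks):
    """Leading left singular vectors of each mode unfolding."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(U[:, :r])
    return factors

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 20, 3))   # 120 time steps, 20 PMUs, 3 channels
A, B, C = hosvd_factors(X, ranks=(8, 5, 2))

# Project the stream onto the factors to get a compact core of event features
core = np.einsum('tlc,tr,ls,cu->rsu', X, A, B, C)
print(core.shape)  # (8, 5, 2)
```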
  2. We investigate how disagreement in natural language inference (NLI) annotation arises. We developed a taxonomy of disagreement sources with 10 categories spanning 3 high-level classes. We found that some disagreements stem from uncertainty in sentence meaning, while others arise from annotator biases and task artifacts, leading to different interpretations of the label distribution. We explore two modeling approaches for detecting items with potential disagreement: a 4-way classification with a "Complicated" label in addition to the three standard NLI labels, and a multilabel classification approach. We found that the multilabel approach is more expressive and gives better recall of the possible interpretations in the data.
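A toy sketch of the two modeling approaches compared above, using scikit-learn on hypothetical features; the data and labels here are illustrative, not the paper's:

```python
# Sketch: 4-way classification vs. multilabel classification for NLI disagreement.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))          # stand-in sentence-pair features

# (a) 4-way: entailment / neutral / contradiction / "Complicated"
y4 = rng.integers(0, 4, size=200)
clf4 = LogisticRegression(max_iter=1000).fit(X, y4)

# (b) multilabel: each item may admit several of the three NLI labels
Y = rng.integers(0, 2, size=(200, 3))       # one indicator column per NLI label
multi = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)

# The multilabel model can flag items with more than one plausible label
probs = multi.predict_proba(X[:5])
print((probs > 0.5).sum(axis=1))            # predicted labels per item
```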
  3. Many recent works have shown that adversarial examples that fool classifiers can be found by minimally perturbing a normal input. Recent theoretical results, starting with Gilmer et al. (2018), show that if the inputs are drawn from a concentrated metric probability space, then adversarial examples with small perturbation are inevitable. A concentrated space has the property that any subset with Ω(1) (e.g., 1/100) measure, according to the imposed distribution, has small distance to almost all (e.g., 99/100) of the points in the space. It is not clear, however, whether these theoretical results apply to actual distributions such as images. This paper presents a method for empirically measuring and bounding the concentration of a concrete dataset, and the method is proven to converge to the actual concentration. We use it to empirically estimate the intrinsic robustness to ℓ∞ and ℓ2 perturbations of several image classification benchmarks.
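To illustrate the quantity at stake, the following sketch measures the empirical ε-expansion of a candidate subset A under a chosen metric, i.e., the fraction of samples within distance ε of A. This is only the inner measurement step; the paper's actual estimator additionally searches over a structured family of candidate error regions. The data below is a synthetic stand-in.

```python
# Sketch: empirical epsilon-expansion of a set A over a sampled dataset.
import numpy as np
from scipy.spatial import cKDTree

def expansion(A_points, all_points, eps, p=np.inf):
    """Fraction of samples within distance eps of A (p=inf gives l_inf balls)."""
    tree = cKDTree(A_points)
    d, _ = tree.query(all_points, k=1, p=p)
    return float((d <= eps).mean())

rng = np.random.default_rng(0)
data = rng.random((5000, 32))            # stand-in for flattened images in [0,1]
A = data[rng.choice(5000, size=50, replace=False)]   # a measure-0.01 subset
print(expansion(A, data, eps=0.1))       # share of data covered by A's eps-balls
```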
  4. Keathley, H.; Enos, J.; Parrish, M. (Eds.)
    The role of human-machine teams in society is increasing as big data and computing power explode. One popular approach to AI is deep learning, which is useful for classification, feature identification, and predictive modeling. However, deep learning models often suffer from inadequate transparency and poor explainability. One aspect of human systems integration is the design of interfaces that support human decision-making. AI models embed multiple types of uncertainty, which may be difficult for users to understand, so humans who use these tools need to understand how much to trust the AI. This study evaluates one simple approach to communicating uncertainty: a visual confidence bar ranging from 0 to 100%. We perform a human-subject online experiment using an existing image recognition deep learning model to test the effect of (1) providing single vs. multiple recommendations from the AI and (2) including uncertainty information. For each image, participants described the subject in an open textbox and rated their confidence in their answers. Performance was evaluated at four levels of accuracy, ranging from an exact match with the image label to the correct category of the image. The results suggest that AI recommendations increase accuracy, even if the human and AI have different definitions of accuracy. In addition, providing multiple ranked recommendations, with or without the confidence bar, increases operator confidence and reduces perceived task difficulty. More research is needed to determine how people approach uncertain information from an AI system and to develop effective visualizations for communicating uncertainty.
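As a simple illustration of the display idea studied above (a hypothetical rendering, not the study's actual interface), one can print top-k recommendations alongside a 0-100% confidence bar; the labels and scores below are made up:

```python
# Sketch: top-k recommendations with a textual confidence bar.
import numpy as np

def show_recommendations(labels, probs, k=3, width=20):
    order = np.argsort(probs)[::-1][:k]
    for i in order:
        filled = int(round(probs[i] * width))
        bar = '#' * filled + '-' * (width - filled)
        print(f"{labels[i]:<18} [{bar}] {probs[i] * 100:5.1f}%")

labels = ['golden retriever', 'labrador', 'beagle', 'tabby cat']
probs = np.array([0.62, 0.21, 0.10, 0.07])   # hypothetical softmax scores
show_recommendations(labels, probs)
```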
  5. How can we collect the most useful labels to learn a model selection policy when presented with arbitrary heterogeneous data streams? In this paper, we formulate this task as a contextual active model selection problem, where at each round the learner receives an unlabeled data point along with a context. The goal is to output the best model for any given context without obtaining an excessive number of labels. In particular, we focus on the task of selecting pre-trained classifiers and propose a contextual active model selection algorithm (CAMS), which relies on a novel uncertainty sampling query criterion defined on a given policy class for adaptive model selection. In comparison to prior art, our algorithm does not assume a globally optimal model. We provide rigorous theoretical analysis of the regret and query complexity under both adversarial and stochastic settings. Our experiments on several benchmark classification datasets demonstrate the algorithm's effectiveness in terms of both regret and query complexity. Notably, to achieve the same accuracy, CAMS incurs less than 10% of the label cost of the best online model selection baselines on CIFAR10.
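A stripped-down sketch in the spirit of CAMS, not the paper's algorithm: maintain exponential weights over pre-trained models and query a label only when the weighted vote distribution shows high disagreement. The predictions and labels below are random placeholders.

```python
# Sketch: uncertainty-driven label queries for online model selection.
import numpy as np

rng = np.random.default_rng(0)
n_models, n_rounds, n_classes = 5, 1000, 10
w = np.ones(n_models) / n_models        # weights over pre-trained models
eta, n_queries = 0.5, 0

for t in range(n_rounds):
    # hypothetical per-round model predictions and true label
    preds = rng.integers(0, n_classes, size=n_models)
    y = rng.integers(0, n_classes)

    # weighted vote distribution over class labels
    p = np.bincount(preds, weights=w, minlength=n_classes)
    p = p / p.sum()

    # query a label with probability given by the normalized vote entropy
    nz = p[p > 0]
    disagreement = -(nz * np.log(nz)).sum() / np.log(n_classes)
    if rng.random() < disagreement:
        n_queries += 1
        losses = (preds != y).astype(float)
        w = w * np.exp(-eta * losses)   # exponential-weights update
        w = w / w.sum()

print(f"queried {n_queries}/{n_rounds} labels; top model weight {w.max():.3f}")
```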