Title: Empirically Measuring Concentration: Fundamental Limits on Intrinsic Robustness
Many recent works have shown that adversarial examples that fool classifiers can be found by minimally perturbing a normal input. Recent theoretical results, starting with Gilmer et al. (2018), show that if the inputs are drawn from a concentrated metric probability space, then adversarial examples with small perturbation are inevitable. A concentrated space has the property that any subset with Ω(1) measure (e.g., 1/100), according to the imposed distribution, lies within small distance of almost all (e.g., 99/100) of the points in the space. It is not clear, however, whether these theoretical results apply to actual distributions such as images. This paper presents a method for empirically measuring and bounding the concentration of a concrete dataset, and proves that its estimates converge to the actual concentration. We use it to empirically estimate the intrinsic robustness of several image classification benchmarks to ℓ∞ and ℓ2 perturbations.
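To make the measured quantity concrete, here is a minimal sketch of the ε-expansion of a set under ℓ∞ distance, estimated by Monte Carlo over a finite sample. This only illustrates the definition of concentration above; it is not the paper's algorithm, which additionally searches for a worst-case subset and comes with a convergence guarantee, and the data and parameters below are made up for illustration.

```python
import numpy as np

def expansion_measure(data, subset_mask, eps):
    """Monte Carlo estimate of the measure of the eps-expansion of a set A
    under l_inf distance: the fraction of samples within eps of A."""
    A = data[subset_mask]
    near = [np.min(np.max(np.abs(A - x), axis=1)) <= eps for x in data]
    return np.mean(near)

# Toy illustration: 1000 samples in a 20-dimensional unit cube, with A
# chosen as an arbitrary subset of measure ~1/100.
rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 20))
mask = np.zeros(1000, dtype=bool)
mask[:10] = True
for eps in (0.1, 0.2, 0.3):
    print(f"eps = {eps}: expansion measure ~ {expansion_measure(X, mask, eps):.3f}")
```

A concentrated space is one where this expansion measure is close to 1 even for small ε, for every choice of A with constant measure.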
Award ID(s):
1804603
NSF-PAR ID:
10110799
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
ICLR Workshop on Debugging Machine Learning Models
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Large-scale training of modern deep learning models heavily relies on publicly available data on the web. This potentially unauthorized usage of online data raises concerns about data privacy. To tackle this issue, recent works aim to make data unlearnable for deep learning models by adding small, specially designed noise. However, these methods are vulnerable to adversarial training (AT) and/or are computationally heavy. In this work, we propose a novel, model-free, Convolution-based Unlearnable DAtaset (CUDA) generation technique. CUDA is generated using controlled class-wise convolutions with filters that are randomly generated via a private key. CUDA encourages the network to learn the relation between filters and labels rather than the informative features needed to classify the clean data. We develop theoretical analysis demonstrating that CUDA can successfully poison Gaussian mixture data by reducing the clean-data performance of the optimal Bayes classifier. We also empirically demonstrate the effectiveness of CUDA on various datasets (CIFAR-10, CIFAR-100, ImageNet-100, and Tiny-ImageNet) and architectures (ResNet-18, VGG-16, Wide ResNet-34-10, DenseNet-121, DeIT, EfficientNetV2-S, and MobileNetV2). Our experiments show that CUDA is robust to various data augmentations and training approaches such as smoothing, AT with different budgets, transfer learning, and fine-tuning. For instance, training a ResNet-18 on ImageNet-100 CUDA achieves only 8.96%, 40.08%, and 20.58% clean test accuracy with empirical risk minimization (ERM), ℓ∞ AT, and ℓ2 AT, respectively. Here, ERM on the clean training data achieves a clean test accuracy of 80.66%. CUDA exhibits an unlearnability effect with ERM even when only a fraction of the training dataset is perturbed. Furthermore, we show that CUDA is robust to adaptive defenses designed specifically to break it.
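As a rough sketch of the class-wise convolution idea (not the paper's exact recipe; the filter size, normalization, and grayscale input format are assumptions made for this example):

```python
import numpy as np
from scipy.signal import convolve2d

def make_unlearnable(images, labels, num_classes, key=1234, k=3):
    """Blur each image with a fixed random filter chosen by its class.

    images: (n, H, W) grayscale array in [0, 1]; labels: (n,) int array.
    One k x k filter per class is generated from a private key, so the
    filter-label association becomes a shortcut the network can learn
    instead of the informative features of the clean data.
    """
    rng = np.random.default_rng(key)                    # the "private key"
    filters = rng.uniform(0.0, 1.0, size=(num_classes, k, k))
    filters /= filters.sum(axis=(1, 2), keepdims=True)  # preserve brightness
    out = np.empty_like(images)
    for i, (img, y) in enumerate(zip(images, labels)):
        out[i] = convolve2d(img, filters[y], mode="same", boundary="symm")
    return out

# Usage on random stand-in data (10 classes of 32 x 32 "images"):
imgs = np.random.default_rng(0).uniform(size=(8, 32, 32))
ys = np.arange(8) % 10
poisoned = make_unlearnable(imgs, ys, num_classes=10)
```

Because the transformation is a fixed convolution per class rather than learned per-image noise, it is model-free and cheap to apply at dataset scale.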
  2. Drop condensation and evaporation driven by gradients in vapor concentration are important in both engineering and natural systems. One interesting natural example is transpiration on plant leaves. Most of the water in the inner space of the leaves escapes through stomata, at a rate that depends on the surface topography and the difference in vapor concentration inside and just outside the leaves. Previous research on vapor flux over various surfaces has focused on numerically solving the vapor diffusion equation or on scaling arguments based on the simple solution for a flat surface. In the present work, we present and discuss simple analytical solutions for various 2D surface shapes (e.g., semicylinder, semiellipse, hair). We solve the diffusion equation using complex potential theory, which provides analytical solutions for the vapor concentration and flux. We find that a high mass flux of vapor forms near the top of the microstructures while a low mass flux develops near the stomata at the leaf surface. Such a low vapor flux near the stomata may affect transpiration in two ways. First, condensed droplets on the stomata will not grow, owing to the low mass flux of vapor, and so will not inhibit gas exchange through the stomatal opening. Second, the low mass flux from the atmosphere will facilitate the release of highly concentrated vapor from the substomatal space.
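As one concrete instance of the complex-potential method, the sketch below evaluates the classical Laplace solution for a semicylindrical bump of radius R on a flat substrate under a uniform far-field concentration gradient G (a simplifying boundary condition assumed for this example); the surface flux follows from the derivative of the complex potential w(z) = z + R²/z.

```python
import numpy as np

# Concentration field c = c0 + G * Im(w(z)) with w(z) = z + R**2 / z:
# Im(w) vanishes on both the flat substrate (y = 0) and the bump |z| = R,
# so the whole surface is an iso-concentration boundary, and c grows
# linearly with height far away.
R, G = 1.0, 1.0

def surface_flux(theta):
    """|flux| on the bump surface z = R*exp(i*theta), equal to G*|w'(z)|."""
    z = R * np.exp(1j * theta)
    return G * np.abs(1.0 - R**2 / z**2)

for deg in (5, 30, 60, 90):
    f = surface_flux(np.radians(deg))
    print(f"theta = {deg:2d} deg: flux / flat-surface flux = {f / G:.3f}")
# The flux peaks at 2G at the top of the bump (theta = 90 deg) and
# vanishes toward the base (theta -> 0), matching the high flux near
# microstructure tips and the low flux near the leaf surface described above.
```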
  3. The tropical tropopause layer (TTL) is a sea of vertical motions. Convectively generated gravity waves create vertical winds on scales of a few to thousands of kilometers as they propagate in a stable atmosphere. Turbulence from gravity wave breaking, radiatively driven convection, and Kelvin–Helmholtz instabilities stirs up the TTL on the kilometer scale. TTL cirrus clouds, which moderate the water vapor concentration in the TTL and stratosphere, form in the cold phases of large-scale (> 100 km) wave activity. It has been proposed in several modeling studies that small-scale (< 100 km) vertical motions control the ice crystal number concentration and the dehydration efficiency of TTL cirrus clouds. Here, we present the first observational evidence for this. High-rate vertical winds measured by aircraft are a valuable and underutilized tool for constraining small-scale TTL vertical wind variability, examining its impacts on TTL cirrus clouds, and evaluating atmospheric models. We use 20 Hz data from five National Aeronautics and Space Administration (NASA) campaigns to quantify small-scale vertical wind variability in the TTL and to see how it varies with ice water content, distance from deep convective cores, and height in the TTL. We find that 1 Hz vertical winds are well represented by a normal distribution with a standard deviation of 0.2–0.4 m s−1. Consistent with a previous observational study that analyzed two of the five aircraft campaigns we analyze here, we find that turbulence is enhanced over the tropical west Pacific and within 100 km of convection, and that it is most common in the lower TTL (14–15.5 km) closer to deep convection and in the upper TTL (15.5–17 km) further from deep convection. An algorithm to classify turbulence and long-wavelength (5 km < λ < 100 km) and short-wavelength (λ < 5 km) gravity wave activity during level flight legs is applied to data from the Airborne Tropical TRopopause EXperiment (ATTREX). The most commonly sampled conditions are (1) a quiescent atmosphere with negligible small-scale vertical wind variability, (2) long-wavelength gravity wave activity (LW GWA), and (3) LW GWA with turbulence. Turbulence rarely occurs in the absence of gravity wave activity. Cirrus clouds with ice crystal number concentrations exceeding 20 L−1 and ice water content exceeding 1 mg m−3 are rare in a quiescent atmosphere but about 20 times more likely when there is gravity wave activity and 50 times more likely when there is also turbulence, confirming the results of the aforementioned modeling studies. Our observational analysis shows that small-scale gravity waves strongly influence the ice crystal number concentration and ice water content within TTL cirrus clouds. Global storm-resolving models have recently been run with horizontal grid spacing between 1 and 10 km, which is sufficient to resolve some small-scale gravity wave activity. We evaluate simulated vertical wind spectra (10–100 km) from four global storm-resolving simulations with horizontal grid spacing of 3–5 km against aircraft observations from ATTREX. We find that all four models have too little resolved vertical wind at horizontal wavelengths between 10 and 100 km and thus too little small-scale gravity wave activity, although the bias is much less pronounced in global SAM than in the other models. We expect that this deficit in small-scale gravity wave activity significantly limits the realism of simulated ice microphysics in these models and that improved representation requires moving to finer horizontal and vertical grid spacing.
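A minimal sketch of the two wind diagnostics described above, run on synthetic data: the standard deviation of block-averaged 1 Hz winds, and the fraction of vertical-wind variance at 10–100 km horizontal wavelengths. The sample rate matches the 20 Hz data mentioned; the assumed airspeed of 200 m s−1, which converts frequency to wavelength, is a placeholder rather than a campaign value.

```python
import numpy as np

fs, speed = 20.0, 200.0        # sample rate (Hz); assumed airspeed (m/s)
rng = np.random.default_rng(1)
w = 0.3 * rng.standard_normal(int(fs * 3600))  # ~1 h of synthetic wind (m/s)

# (1) Distribution of 1 Hz winds: block-average the 20 Hz series to 1 Hz.
w1 = w.reshape(-1, int(fs)).mean(axis=1)
print(f"1 Hz standard deviation: {w1.std():.2f} m/s")  # cf. 0.2-0.4 m/s above

# (2) Fraction of vertical-wind variance at 10-100 km wavelengths.
freqs = np.fft.rfftfreq(len(w), d=1.0 / fs)
power = np.abs(np.fft.rfft(w)) ** 2
wavelength = np.divide(speed, freqs,
                       out=np.full_like(freqs, np.inf), where=freqs > 0)
band = (wavelength >= 10e3) & (wavelength <= 100e3)
print(f"variance fraction at 10-100 km: {power[band].sum() / power[1:].sum():.3f}")
```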
  4. Given a matrix D describing the pairwise dissimilarities of a data set, a common task is to embed the data points into Euclidean space. The classical multidimensional scaling (cMDS) algorithm is a widespread method for doing this. However, theoretical analysis of the robustness of the algorithm and an in-depth analysis of its performance on non-Euclidean metrics are lacking. In this paper, we derive a formula, based on the eigenvalues of a matrix obtained from D, for the Frobenius norm of the difference between D and the metric Dcmds returned by cMDS. This error analysis leads us to the conclusion that when the derived matrix has a significant number of negative eigenvalues, then ∥D − Dcmds∥F, after initially decreasing, will eventually increase as we increase the dimension. Hence, counterintuitively, the quality of the embedding degrades as the dimension grows. We empirically verify that the Frobenius norm increases with the dimension for a variety of non-Euclidean metrics. We also show on several benchmark datasets that this degradation in the embedding causes the classification accuracy of both simple (e.g., 1-nearest neighbor) and complex (e.g., multi-layer neural net) classifiers to decrease as we increase the embedding dimension. Finally, our analysis leads us to a new, efficiently computable algorithm that returns a matrix Dl that is at least as close to the original distances as Dt (the Euclidean metric closest in ℓ2 distance). While Dl is not a metric, when given as input to cMDS instead of D, it empirically results in solutions whose distance to D does not increase as the dimension grows, and whose classification accuracy degrades less than that of the cMDS solution.
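For reference, a compact sketch of cMDS and the Frobenius error discussed above, evaluated here on ℓ1 (cityblock) dissimilarities as one example of a non-Euclidean metric; the data and dimensions are illustrative.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def cmds_error(D, dims):
    """||D - Dcmds||_F of the classical MDS embedding vs. dimension.

    D: (n, n) symmetric dissimilarity matrix with zero diagonal."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    vals, vecs = vals[::-1], vecs[:, ::-1]    # eigenvalues, largest first
    errs = []
    for k in dims:
        pos = np.clip(vals[:k], 0.0, None)    # drop negative eigenvalues
        X = vecs[:, :k] * np.sqrt(pos)        # k-dimensional embedding
        errs.append(np.linalg.norm(D - squareform(pdist(X)), "fro"))
    return errs

rng = np.random.default_rng(0)
D = squareform(pdist(rng.standard_normal((60, 30)), metric="cityblock"))
for k, e in zip((2, 10, 30, 59), cmds_error(D, (2, 10, 30, 59))):
    print(f"dim {k:2d}: ||D - Dcmds||_F = {e:.2f}")
```

When B has many negative eigenvalues, the clipped spectrum is exactly where the error formula in the paper locates the dimension-dependent degradation.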
  5. Asynchronous Gibbs sampling has recently been shown to be a fast-mixing and accurate method for estimating probabilities of events on a small number of variables of a graphical model satisfying Dobrushin's condition (De Sa et al., 2016). We investigate whether it can be used to accurately estimate expectations of functions of all the variables of the model. Under the same condition, we show that the synchronous (sequential) and asynchronous Gibbs samplers can be coupled so that the expected Hamming distance between their (multivariate) samples remains bounded by O(τ log n), where n is the number of variables in the graphical model and τ is a measure of the asynchronicity. A similar bound holds for any constant power of the Hamming distance. Hence, the expectation of any function that is Lipschitz with respect to a power of the Hamming distance can be estimated with a bias that grows logarithmically in n. Going beyond Lipschitz functions, we consider the bias arising from asynchronicity in estimating the expectation of polynomial functions of all variables in the model. Using recent concentration-of-measure results, we show that the bias introduced by the asynchronicity is of smaller order than the standard deviation of the function value already present in the true model. We perform experiments on a multi-processor machine to empirically illustrate our theoretical findings.
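The toy sketch below couples a sequential Gibbs sampler with a stale-read stand-in for the asynchronous one on a ring Ising model, sharing random choices between the two chains and tracking their Hamming distance. The model, β, and staleness τ are made up for illustration, and real asynchronous Gibbs interleaves concurrent updates rather than reading with a fixed delay.

```python
import numpy as np

n, beta, tau, steps = 100, 0.3, 5, 20000   # ring Ising model, weak coupling
rng = np.random.default_rng(0)
x_sync = np.ones(n, dtype=int)             # sequential chain's state
x_async = np.ones(n, dtype=int)            # "asynchronous" chain's state
history = [x_async.copy()]                 # past states for stale reads

def p_plus(x, i):
    """P(x_i = +1 | neighbors) for the ring Ising model at inverse temp beta."""
    field = beta * (x[(i - 1) % n] + x[(i + 1) % n])
    return 1.0 / (1.0 + np.exp(-2.0 * field))

dists = []
for t in range(steps):
    i = rng.integers(n)        # both chains update the same site ...
    u = rng.uniform()          # ... with the same randomness (a coupling)
    x_sync[i] = 1 if u < p_plus(x_sync, i) else -1
    stale = history[max(0, len(history) - 1 - tau)]  # tau-step-old state
    x_async[i] = 1 if u < p_plus(stale, i) else -1
    history.append(x_async.copy())
    dists.append(int(np.sum(x_sync != x_async)))
print("mean Hamming distance:", np.mean(dists[steps // 2:]))
```

Under Dobrushin-type weak coupling, the averaged Hamming distance in such a coupling stays small, which is the mechanism behind the O(τ log n) bound stated above.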