Children’s automatic speech recognition (ASR) remains difficult, in part because of data scarcity, especially for kindergarten-aged children. When data are scarce, models tend to overfit the training data, so a good starting point for training is essential. Recently, meta-learning has been proposed to learn a model initialization (MI) for ASR tasks in different languages, and it performs well when the model is adapted to an unseen language. However, MI is vulnerable to overfitting on the training tasks (learner overfitting), and it is unknown whether MI generalizes to other low-resource tasks. In this paper, we validate the effectiveness of MI for children’s ASR and attempt to alleviate learner overfitting. To apply model-agnostic meta-learning (MAML), we treat children’s speech at each age as a separate task. To counter learner overfitting, we propose a task-level augmentation method that simulates new ages using frequency warping techniques. Detailed experiments show the impact of task augmentation at each age for kindergarten-aged speech. Our approach achieves a relative word error rate (WER) improvement of 51% over the baseline system without augmentation or initialization.
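As a hedged illustration of the task-augmentation idea (not the paper's exact procedure), the sketch below applies a simple linear frequency warp to spectrograms to synthesize additional "age" tasks for meta-training; the warping factors and the helper names are assumptions chosen purely for illustration.

```python
# Minimal sketch: VTLP-style frequency warping to simulate new "age" tasks.
import numpy as np

def warp_spectrogram(spec: np.ndarray, alpha: float) -> np.ndarray:
    """Linearly warp the frequency axis of a (freq_bins, frames) spectrogram by factor alpha.

    alpha > 1 shifts spectral content toward lower bins (longer vocal tract);
    alpha < 1 shifts it upward (shorter vocal tract).
    """
    n_bins = spec.shape[0]
    src_bins = np.arange(n_bins)
    warped_positions = np.clip(src_bins * alpha, 0, n_bins - 1)
    warped = np.stack(
        [np.interp(warped_positions, src_bins, spec[:, t]) for t in range(spec.shape[1])],
        axis=1,
    )
    return warped

def make_augmented_tasks(age_tasks: dict, alphas=(0.9, 1.1)) -> dict:
    """Create synthetic 'age' tasks by warping every utterance of each real age task."""
    augmented = dict(age_tasks)
    for age, utterances in age_tasks.items():
        for alpha in alphas:
            augmented[f"{age}_warp{alpha}"] = [warp_spectrogram(u, alpha) for u in utterances]
    return augmented
```

Each synthetic task can then be used alongside the real age tasks during MAML meta-training.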
Learning to Generate Image Source-Agnostic Universal Adversarial Perturbations
Adversarial perturbations are critical for certifying the robustness of deep learning models. A “universal adversarial perturbation” (UAP) can simultaneously attack multiple images, and thus offers a more unified threat model, obviating the need for an image-wise attack algorithm. However, existing UAP generators are underdeveloped when images are drawn from different image sources (e.g., with different image resolutions). Towards authentic universality across image sources, we take a novel view of UAP generation as a customized instance of “few-shot learning”, which leverages bilevel optimization and learning-to-optimize (L2O) techniques to generate UAPs with an improved attack success rate (ASR). We begin with the popular model-agnostic meta-learning (MAML) framework to meta-learn a UAP generator. However, MAML alone does not directly provide universal attacks across image sources, which requires us to integrate it with another meta-learning framework, L2O. The resulting scheme for meta-learning a UAP generator (i) has better performance (50% higher ASR) than baselines such as Projected Gradient Descent, (ii) is more efficient (37% faster) than the vanilla L2O and MAML frameworks (when applicable), and (iii) can simultaneously handle UAP generation for different victim models and data sources.
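To make the bilevel structure concrete, the following is a minimal sketch, assuming a single fixed image resolution and a first-order (Reptile-style) outer update with PGD-like inner adaptation per image source; it omits the paper's L2O component and learned generator network, and `sample_source_batch` is a placeholder data interface, not part of any real API.

```python
# First-order sketch of meta-learning a shared universal perturbation across image sources.
import torch

def attack_loss(model, images, labels, delta):
    # Untargeted attack: maximize the classification loss on perturbed images.
    logits = model((images + delta).clamp(0.0, 1.0))
    return -torch.nn.functional.cross_entropy(logits, labels)

def meta_train_uap(model, sources, image_shape, epsilon=8 / 255,
                   inner_steps=5, inner_lr=1e-2, meta_lr=0.1, meta_iters=100):
    delta_meta = torch.zeros(image_shape)  # shared initialization of the UAP
    for _ in range(meta_iters):
        source = sources[torch.randint(len(sources), (1,)).item()]
        images, labels = source.sample_source_batch()       # placeholder data loader
        delta = delta_meta.clone().requires_grad_(True)
        for _ in range(inner_steps):                         # inner adaptation (PGD-like)
            loss = attack_loss(model, images, labels, delta)
            grad, = torch.autograd.grad(loss, delta)
            with torch.no_grad():
                delta -= inner_lr * grad.sign()
                delta.clamp_(-epsilon, epsilon)
        # First-order meta-update: move the shared UAP toward the adapted one.
        with torch.no_grad():
            delta_meta += meta_lr * (delta - delta_meta)
    return delta_meta
```

A new image source would then be attacked by running only the inner adaptation loop from the returned `delta_meta`.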
- Award ID(s):
- 1932351
- PAR ID:
- 10378384
- Date Published:
- Journal Name:
- Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22)
- Page Range / eLocation ID:
- 1714 to 1720
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
With rich visual data, such as images, becoming readily associated with items, visually-aware recommendation systems (VARS) have been widely used in different applications. Recent studies have shown that VARS are vulnerable to item-image adversarial attacks, which add human-imperceptible perturbations to the clean images associated with those items. Attacks on VARS pose new security challenges to a wide range of applications, such as e-commerce and social media, where VARS are widely used. How to secure VARS from such adversarial attacks becomes a critical problem. Currently, there is still a lack of systematic studies on how to design defense strategies against visual attacks on VARS. In this article, we attempt to fill this gap by proposing an adversarial image denoising and detection framework to secure VARS. Our proposed method can simultaneously (1) secure VARS from adversarial attacks characterized by local perturbations by image denoising based on global vision transformers; and (2) accurately detect adversarial examples using a novel contrastive learning approach. Meanwhile, our framework is designed to be used as both a filter and a detector so that they can be jointly trained to improve the flexibility of our defense strategy to a variety of attacks and VARS models. Our approach is uniquely tailored for VARS, addressing the distinct challenges in scenarios where adversarial attacks can differ across industries, for instance, causing misclassification in e-commerce or misrepresentation in real estate. We have conducted extensive experimental studies with two popular attack methods (FGSM and PGD). Our experimental results on two real-world datasets show that our defense strategy against visual attacks is effective and outperforms existing methods on different attacks. Moreover, our method demonstrates high accuracy in detecting adversarial examples, complementing its robustness across various types of adversarial attacks.
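The sketch below illustrates only the inference-time flow of a filter-plus-detector defense under these assumptions: `denoiser` and `encoder` are placeholder modules standing in for the paper's vision-transformer denoiser and contrastively trained detector, and the cosine-distance threshold rule is a simplification of the detection criterion.

```python
# Sketch of a "filter + detector" pipeline for item images.
import torch

def filter_and_detect(image: torch.Tensor, denoiser, encoder, threshold: float):
    with torch.no_grad():
        cleaned = denoiser(image)                        # filter: remove local perturbations
        z_raw = torch.nn.functional.normalize(encoder(image), dim=-1)
        z_clean = torch.nn.functional.normalize(encoder(cleaned), dim=-1)
        # Detector: a large embedding shift after denoising suggests an adversarial input.
        shift = 1.0 - (z_raw * z_clean).sum(dim=-1)      # cosine distance
        is_adversarial = shift > threshold
    return cleaned, is_adversarial
```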
This paper considers the trajectory design problem for unmanned aerial vehicles (UAVs) via meta-reinforcement learning. It is assumed that the UAV can move in different directions to explore a specific area and collect data from the ground nodes (GNs) located in the area. The goal of the UAV is to reach the destination and maximize the total data collected along the trajectory while avoiding collisions with other UAVs. In the literature on UAV trajectory design, vanilla learning algorithms are typically used to train a task-specific model that provides near-optimal solutions for a specific spatial distribution of the GNs. However, this approach requires retraining from scratch when the locations of the GNs change. In this work, we propose a meta-reinforcement learning framework that incorporates Model-Agnostic Meta-Learning (MAML). Instead of training task-specific models, we train a common initialization for different distributions of GNs and different channel conditions. Starting from this initialization, only a few gradient descent steps are required to adapt to a new task with a different GN distribution and channel condition. Additionally, we explore when the proposed MAML framework is preferred and can outperform the compared algorithms.
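As an illustration of the adaptation step described above, a minimal sketch follows; it assumes placeholder callables `collect_rollout` and `policy_gradient_loss` and does not reproduce the paper's meta-training loop or reward design.

```python
# Sketch of adapting a meta-learned policy initialization to a new GN distribution.
import copy
import torch

def adapt_policy(meta_policy, task_env, collect_rollout, policy_gradient_loss,
                 adapt_steps=3, lr=1e-2):
    policy = copy.deepcopy(meta_policy)                    # keep the shared initialization intact
    optimizer = torch.optim.SGD(policy.parameters(), lr=lr)
    for _ in range(adapt_steps):
        rollouts = collect_rollout(policy, task_env)       # fly a few episodes in the new task
        loss = policy_gradient_loss(policy, rollouts)      # e.g., a REINFORCE-style surrogate
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return policy
```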
Learning to optimize (L2O) has gained increasing popularity, as it automates the design of optimizers through data-driven approaches. However, current L2O methods often suffer from poor generalization performance in at least two respects: (i) applying the L2O-learned optimizer to unseen optimizees, in terms of lowering their loss function values (optimizer generalization, or “generalizable learning of optimizers”); and (ii) the test performance of an optimizee (itself a machine learning model), trained by the optimizer, in terms of accuracy on unseen data (optimizee generalization, or “learning to generalize”). While optimizer generalization has been studied recently, optimizee generalization (learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper. We first theoretically establish an implicit connection between the local entropy and the Hessian, and hence unify their roles in the handcrafted design of generalizable optimizers as equivalent metrics of the flatness of the loss landscape. We then propose to incorporate these two metrics as flatness-aware regularizers into the L2O framework in order to meta-train optimizers to learn to generalize, and we theoretically show that such generalization ability can be learned during the L2O meta-training process and then transferred to the optimizee loss function. Extensive experiments consistently validate the effectiveness of our proposals, with substantially improved generalization on multiple sophisticated L2O models and diverse optimizees.
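As a hedged stand-in for one of the two flatness metrics named above (the Hessian), the sketch below computes a Hutchinson estimate of the Hessian trace of an optimizee loss, which could be added to a training objective as a flatness-aware regularizer; it is illustrative only and not the paper's exact regularizer.

```python
# Hutchinson estimate of the Hessian trace as a simple flatness proxy.
import torch

def hessian_trace_estimate(loss, params, n_samples=4):
    """Estimate tr(H) of `loss` w.r.t. `params` with Rademacher probes; differentiable."""
    params = [p for p in params if p.requires_grad]
    grads = torch.autograd.grad(loss, params, create_graph=True)
    trace = 0.0
    for _ in range(n_samples):
        probes = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]  # entries in {-1, +1}
        grad_dot_v = sum((g * v).sum() for g, v in zip(grads, probes))
        hvps = torch.autograd.grad(grad_dot_v, params, create_graph=True)     # Hessian-vector products
        trace = trace + sum((h * v).sum() for h, v in zip(hvps, probes))
    return trace / n_samples

# Example (hypothetical): total = task_loss + 1e-3 * hessian_trace_estimate(task_loss, model.parameters())
```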
Deep neural network (DNN) models for computer vision tasks (object detection and classification) are widely used in autonomous vehicles, such as driverless cars and unmanned aerial vehicles. However, DNN models have been shown to be vulnerable to adversarial image perturbations, and the generation of adversarial examples against DNN inference has been actively studied in recent years. Such generation typically relies on optimizations that take an entire image frame as the decision variable; hence, given a new image, the computationally expensive optimization must start over, because nothing is learned across the independent optimizations. Very few approaches have been developed for attacking online image streams while taking into account the underlying physical dynamics of autonomous vehicles, their mission, and the environment. This article presents a multi-level reinforcement learning framework that can effectively generate adversarial perturbations to misguide autonomous vehicles’ missions. Whereas existing image attacks against autonomous vehicles repeat the optimization for every image frame, this framework removes the need for fully converged optimization at every frame. Using multi-level reinforcement learning, we integrate a state estimator and a generative adversarial network that produces the adversarial perturbations. Because the reinforcement learning agent, which consists of the state estimator, an actor, and a critic, uses only image streams, the proposed framework can misguide the vehicle and increase the adversary’s reward without knowing the states of the vehicle or the environment. Simulation studies and a robot demonstration validate the proposed framework’s performance.
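A schematic sketch of the online-attack idea follows: a learned generator maps a state estimate, derived from the image stream alone, to a bounded perturbation, so no per-frame optimization is needed. The `state_estimator` argument, the network sizes, and the bound `epsilon` are illustrative assumptions; the paper's actor-critic and GAN training loop is not reproduced.

```python
# Sketch of perturbing an online image stream with a learned generator (no per-frame optimization).
import math
import torch
import torch.nn as nn

class PerturbationGenerator(nn.Module):
    """Maps an estimated state vector to a bounded image perturbation."""
    def __init__(self, state_dim: int, image_shape, epsilon: float = 8 / 255):
        super().__init__()
        self.image_shape = tuple(image_shape)
        self.epsilon = epsilon
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, math.prod(self.image_shape)),
            nn.Tanh(),                                    # keeps the raw output in (-1, 1)
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.epsilon * self.net(state).view(-1, *self.image_shape)

def attack_stream(frames, state_estimator, generator):
    """Perturb each incoming frame using only the image stream (no access to true vehicle state)."""
    for frame in frames:                                  # frame: (C, H, W) tensor in [0, 1]
        with torch.no_grad():
            state = state_estimator(frame.unsqueeze(0))   # estimate state from images only
            perturbed = frame + generator(state).squeeze(0)
        yield perturbed.clamp(0.0, 1.0)
```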