We study adversarially robust transfer learning, wherein, given labeled data on multiple (source) tasks, the goal is to train a model with small robust error on a previously unseen (target) task. In particular, we consider a multi-task representation learning (MTRL) setting, i.e., we assume that the source and target tasks admit a simple (linear) predictor on top of a shared representation (e.g., the final hidden layer of a deep neural network). In this general setting, we provide rates on the excess adversarial (transfer) risk for Lipschitz losses and smooth nonnegative losses. These rates show that learning a representation using adversarial training on diverse tasks helps protect against inference-time attacks in data-scarce environments. Additionally, we provide novel rates for the single-task setting.
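The adversarial-training recipe the abstract alludes to (train the linear head on worst-case perturbed inputs so that inference-time attacks are less effective) can be sketched in a few lines. The following is a minimal numpy sketch under our own assumptions: a logistic loss, an l-infinity-bounded one-step FGSM-style attack, and data to which the fixed shared representation has already been applied. It is illustrative only, not the paper's algorithm.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(w, x, y, eps):
    """One-step sign attack on the logistic loss log(1 + exp(-y * w.x))."""
    grad_x = -y * sigmoid(-y * (x @ w)) * w      # gradient of the loss w.r.t. x
    return x + eps * np.sign(grad_x)             # l_inf-bounded perturbation

def adv_train_linear(X, y, eps=0.1, lr=0.1, epochs=200, seed=0):
    """Adversarially train a linear head on a fixed shared representation X,
    with labels y in {-1, +1}."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            xa = fgsm(w, xi, yi, eps)            # attack the current model
            m = yi * (xa @ w)
            w -= lr * (-yi * sigmoid(-m) * xa)   # SGD step on the adv. example
    return w
```

Intuitively, training against the perturbed points pushes the linear head toward a larger margin, which is what makes the same attack less effective at inference time.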
Towards Explainable Networked Prediction
Networked prediction has attracted considerable research attention in recent years. Compared with the traditional learning setting, networked prediction is harder to understand due to its coupled, multi-level nature: the learning process propagates top-down through the underlying network from the macro level (the entire learning system), to the meso level (learning tasks), to the micro level (individual learning examples). Meanwhile, the networked prediction setting also offers rich context to explain the learning process through a multi-aspect lens, including the training examples (e.g., which examples are most influential), the learning tasks (e.g., which tasks are most important), and the task network (e.g., which task connections are key). Thus, we propose a multi-aspect, multi-level approach to explain networked prediction. The key idea is to efficiently quantify the influence on different levels of the learning system due to the perturbation of various aspects. The proposed method offers two distinctive advantages: (1) multi-aspect, multi-level: it explains networked prediction from multiple aspects (i.e., example-task-network) at multiple levels (i.e., macro-meso-micro); (2) efficiency: it has linear complexity, efficiently evaluating the influence of changes to the networked prediction without retraining.
- PAR ID: 10099223
- Journal Name: CIKM '18 Proceedings of the 27th ACM International Conference on Information and Knowledge Management
- Page Range / eLocation ID: 1819 to 1822
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Numerous computer-based collaborative learning environments have been developed to support collaborative problem-solving. Yet, understanding the complexity and dynamic nature of the collaboration process remains a challenge. This is particularly true in open-ended immersive learning environments, where students navigate both physical and virtual spaces, pursuing diverse paths to solve problems. In response, we aimed to unpack these complex collaborative learning processes by investigating 16 groups of college students (n = 77) who used an immersive astronomy simulation in their introductory astronomy course. Our specific focus is on joint attention as a multi-level indicator to index collaboration. To examine the interplay between joint attention and other multimodal traces (conceptual discussions and gestures) in students’ interactions with peers and the simulation, we employed a multi-granular approach encompassing macro-level correlations, meso-level network trends, and micro-level qualitative insights from vignettes to capture nuances at each level. Distinct multimodal engagement patterns emerged between low- and high-achieving groups, evolving over time across a series of tasks. Our findings contribute to the understanding of timely joint attention and emphasize the importance of individual exploration during the early stages of collaborative problem-solving, demonstrating its contribution to productive knowledge co-construction. Overall, this research provides valuable insights into the complexities of collaboration dynamics within and beyond digital spaces. The empirical evidence we present lays a strong foundation for developing instructional designs aimed at fostering productive collaboration in immersive learning environments.
-
Machine learning on graph-structured data has attracted much research interest due to its ubiquity in real-world data. However, how to efficiently represent graph data in a general way remains an open problem. Traditional methods use handcrafted graph features in tabular form but suffer from the need for domain expertise and from information loss. Graph representation learning overcomes these defects by automatically learning continuous representations from graph structures, but it requires abundant training labels, which are often hard to obtain for graph-level prediction problems. In this work, we demonstrate that, when available, the domain expertise used for designing handcrafted graph features can improve graph-level representation learning when training labels are scarce. Specifically, we propose a multi-task knowledge distillation method. By incorporating network-theory-based graph metrics as auxiliary tasks, we show on both synthetic and real datasets that the proposed multi-task learning method can improve the prediction performance of the original learning task, especially when the training data size is small.
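The auxiliary-task idea can be sketched concretely: compute cheap network-theory metrics from each graph's adjacency matrix and add them as extra regression targets alongside the main label. The specific metrics and the loss below are our own illustrative choices, not the ones used in the paper.

```python
import numpy as np

def graph_metrics(A):
    """Network-theory auxiliary targets from a binary adjacency matrix
    (n > 1 nodes assumed): edge density plus normalized degree statistics."""
    n = A.shape[0]
    deg = A.sum(axis=1)
    density = A.sum() / (n * (n - 1))
    return np.array([density, deg.mean() / (n - 1), deg.std() / (n - 1)])

def multitask_loss(z, y_main, y_aux, W_main, W_aux, alpha=0.5):
    """Shared graph representation z feeds a main head W_main and auxiliary
    metric heads W_aux; alpha weights the auxiliary (distillation) signal."""
    main = (z @ W_main - y_main) ** 2
    aux = ((z @ W_aux - y_aux) ** 2).sum()
    return main + alpha * aux
```

When labels for the main task are scarce, the auxiliary targets are free to compute for every graph, so they keep the shared representation informative.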
-
Recent advances in multi-rotor vehicle control and the miniaturization of hardware, sensing, and battery technologies have enabled cheap, practical designs of micro air vehicles for civilian and hobby applications. In parallel, several applications are being envisioned that bring together a swarm of multiple networked micro air vehicles to accomplish large tasks in coordination. However, it is still very challenging to deploy multiple micro air vehicles concurrently. To address this challenge, we have developed an open software/hardware platform called the University at Buffalo’s Airborne Networking and Communications Testbed (UB-ANC) and an associated emulation framework called the UB-ANC Emulator. In this paper, we present the UB-ANC Emulator, which combines multi-micro air vehicle planning and control with high-fidelity network simulation, enabling practitioners to design micro air vehicle swarm applications in software with a seamless transition to deployment on actual hardware. We demonstrate the UB-ANC Emulator’s accuracy against experimental data collected in two mission scenarios: a simple mission with three networked micro air vehicles and a sophisticated coverage path planning mission with a single micro air vehicle. To accurately reflect the performance of a micro air vehicle swarm, where communication links are subject to interference and packet losses and where protocols at the data link, network, and transport layers affect network throughput, latency, and reliability, we integrate the open-source discrete-event network simulator ns-3 into the UB-ANC Emulator. We demonstrate through node-to-node and end-to-end measurements how the UB-ANC Emulator can simulate multiple networked micro air vehicles with accurate modeling of mobility, control, wireless channel characteristics, and the network protocols defined in ns-3.
-
Deep Neural Networks (DNNs) can forget knowledge about earlier tasks when learning new tasks, a phenomenon known as catastrophic forgetting. To learn new tasks without forgetting, mask-based learning methods (e.g., Piggyback) have recently been proposed that learn only a binary element-wise mask while keeping the backbone model fixed. However, a binary mask has limited modeling capacity for new tasks. A more recent compress-grow-based method (CPG) achieves better accuracy on new tasks by partially training the backbone model, but at orders-of-magnitude higher training cost, which makes it infeasible to deploy in popular state-of-the-art edge/mobile learning. The primary goal of this work is to simultaneously achieve fast and high-accuracy multi-task adaptation in a continual learning setting. Thus motivated, we propose a new training method called Kernel-wise Soft Mask (KSM), which learns a kernel-wise hybrid binary and real-value soft mask for each task. Such a soft mask can be viewed as a superposition of a binary mask and a properly scaled real-value tensor, which offers richer representation capability while meeting the objective of low hardware overhead. We validate KSM on multiple benchmark datasets against recent state-of-the-art methods (e.g., Piggyback, PackNet, CPG), showing improvements in both accuracy and training cost.
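The superposition view of the soft mask can be illustrated directly. Below is a minimal numpy sketch of applying a per-kernel mask that is a binary mask plus a scaled real-valued tensor; the function name, shapes, and scale are our own illustrative assumptions, not KSM's actual parameterization or training rule.

```python
import numpy as np

def kernelwise_soft_mask(kernels, binary_mask, residual, scale=0.1):
    """Apply a kernel-wise soft mask to conv kernels of shape (k, kh, kw).
    The soft mask is a superposition of a binary on/off mask and a
    properly scaled real-valued tensor, one scalar per kernel."""
    soft = binary_mask + scale * residual        # hybrid binary + real mask
    return kernels * soft[:, None, None]         # broadcast over each kernel
```

Because only the per-task mask is learned while the backbone kernels stay fixed, each new task adds a small number of parameters, which is what keeps adaptation fast.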