skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Indirectly Supervised Natural Language Processing
This tutorial targets researchers and practitioners who are interested in ML technologies for NLP from indirect supervision. In particular, we will present a diverse thread of indirect supervision studies that try to answer the following questions: (i) when and how can we provide supervision for a target task T, if all we have is data that corresponds to a “related” task T′? (ii) humans do not use exhaustive supervision; they rely on occasional feedback, and learn from incidental signals from various sources; how can we effectively incorporate such supervision in machine learning? (iii) how can we leverage multi-modal supervision to help NLP? To the end, we will discuss several lines of research that address those challenges, including (i) indirect supervision from T ′ that handles T with outputs spanning from a moderate size to an open space, (ii) the use of sparsely occurring and incidental signals, such as partial labels, noisy labels, knowledge-based constraints, and cross-domain or cross-task annotations—all having statistical associations with the task, (iii) principled ways to measure and understand why these incidental signals can contribute to our target tasks, and (iv) indirect supervision from vision-language signals. We will conclude the tutorial by outlining directions for further investigation.  more » « less
Award ID(s):
2105329
PAR ID:
10440670
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 6: Tutorial Abstracts)
Page Range / eLocation ID:
32 to 40
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This tutorial targets researchers and practitioners who are interested in AI and ML technologies for structural information extraction (IE) from unstructured textual sources. Particularly, this tutorial will provide audience with a systematic introduction to recent advances of IE, by answering several important research questions. These questions include (i) how to develop an robust IE system from noisy, insufficient training data, while ensuring the reliability of its prediction? (ii) how to foster the generalizability of IE through enhancing the system’s cross-lingual, cross-domain, cross-task and cross-modal transferability? (iii) how to precisely support extracting structural information with extremely fine-grained, diverse and boundless labels? (iv) how to further improve IE by leveraging indirect supervision from other NLP tasks, such as NLI, QA or summarization, and pre-trained language models? (v) how to acquire knowledge to guide the inference of IE systems? We will discuss several lines of frontier research that tackle those challenges, and will conclude the tutorial by outlining directions for further investigation. 
    more » « less
  2. We explore the effect of auxiliary labels in improving the classification accuracy of wearable sensor-based human activity recognition (HAR) systems, which are primarily trained with the supervision of the activity labels (e.g. running, walking, jumping). Supplemental meta-data are often available during the data collection process such as body positions of the wearable sensors, subjects' demographic information (e.g. gender, age), and the type of wearable used (e.g. smartphone, smart-watch). This information, while not directly related to the activity classification task, can nonetheless provide auxiliary supervision and has the potential to significantly improve the HAR accuracy by providing extra guidance on how to handle the introduced sample heterogeneity from the change in domains (i.e positions, persons, or sensors), especially in the presence of limited activity labels. However, integrating such meta-data information in the classification pipeline is non-trivial - (i) the complex interaction between the activity and domain label space is hard to capture with a simple multi-task and/or adversarial learning setup, (ii) meta-data and activity labels might not be simultaneously available for all collected samples. To address these issues, we propose a novel framework Conditional Domain Embeddings (CoDEm). From the available unlabeled raw samples and their domain meta-data, we first learn a set of domain embeddings using a contrastive learning methodology to handle inter-domain variability and inter-domain similarity. To classify the activities, CoDEm then learns the label embeddings in a contrastive fashion, conditioned on domain embeddings with a novel attention mechanism, enforcing the model to learn the complex domain-activity relationships. We extensively evaluate CoDEm in three benchmark datasets against a number of multi-task and adversarial learning baselines and achieve state-of-the-art performance in each avenue. 
    more » « less
  3. In this work, we consider a setting where the goal is to achieve adversarial robustness on a target task, given only unlabeled training data from the task distribution, by leveraging a labeled training data from a different yet related source task distribution. The absence of the labels on training data for the target task poses a unique challenge as conventional adversarial robustness defenses cannot be directly applied. To address this challenge, we first bound the adversarial population 0-1 robust loss on the target task in terms of (i) empirical 0-1 loss on the source task, (ii) joint loss on source and target tasks of an ideal classifier, and (iii) a measure of worst-case domain divergence. Motivated by this bound, we develop a novel unified defense framework called Divergence-Aware adveRsarial Training (DART), which can be used in conjunction with a variety of standard UDA methods; e.g., DANN. DART is applicable to general threat models, including the popular \ell_p-norm model, and does not require heuristic regularizers or architectural changes. We also release DomainRobust, a testbed for evaluating robustness of UDA models to adversarial attacks. DomainRobust consists of 4 multidomain benchmark datasets (with 46 source-target pairs) and 7 meta-algorithms with a total of 11 variants. Our large-scale experiments demonstrate that, on average, DART significantly enhances model robustness on all benchmarks compared to the state of the art, while maintaining competitive standard accuracy. The relative improvement in robustness from DART reaches up to 29.2% on the source-target domain pairs considered. 
    more » « less
  4. This tutorial will introduce our Accessibility Learning Labs (ALL). The objectives of this collaborative project with The National Technical Institute for the Deaf (NTID) are to both inform participants about foundational topics in accessibility and to demonstrate the importance of creating accessible software. The labs enable easy classroom inclusion by providing instructors all necessary materials including lecture and activity slides and videos. Each lab addresses an accessibility issue and contains: I) Relevant background information on the examined issue II) An example web-based application containing the accessibility problem III) A process to emulate this accessibility problem IV) Details about how to repair the problem from a technical perspective V) Incidents from people who encountered this accessibility issue and how it has impacted their life. The labs may be easily integrated into a wide variety of curriculum at high schools (9-12), and in undergraduate and graduate courses. The labs will be easily adoptable due to their selfcontained nature and their inclusion of all necessary instructional material (e.g., slides, quizzes, etc.). No special software is required to use any portion of the labs since they are web-based and are able to run on any computer with a reasonably recent web browser. There are currently four available labs on the topics of: Colorblindness, Hearing, Blindness and Dexterity. Material is available on our website: http://all.rit.edu This tutorial will provide an overview of the created labs and usage instructions and information for adaptors. 
    more » « less
  5. Graph-theoretic algorithms and graph machine learning models are essential tools for addressing many real-life problems, such as social network analysis and bioinformatics. To support large-scale graph analytics, graph-parallel systems have been actively developed for over one decade, such as Google’s Pregel and Spark’s GraphX, which (i) promote a think-like-a-vertex computing model and target (ii) iterative algorithms and (iii) those problems that output a value for each vertex. However, this model is too restricted for supporting the rich set of heterogeneous operations for graph analytics and machine learning that many real applications demand. In recent years, two new trends emerge in graph-parallel systems research: (1) a novel think-like-a-task computing model that can efficiently support the various computationally expensive problems of subgraph search; and (2) scalable systems for learning graph neural networks. These systems effectively complement the diversity needs of graph-parallel tools that can flexibly work together in a comprehensive graph processing pipeline for real applications, with the capability of capturing structural features. This tutorial will provide an effective categorization of the recent systems in these two directions based on their computing models and adopted techniques, and will review the key design ideas of these systems. 
    more » « less