Title: Interactive Task Learning from GUI-Grounded Natural Language Instructions and Demonstrations
We summarize our past five years of work on designing, building, and studying Sugilite, an interactive task learning agent that can learn new tasks and relevant associated concepts interactively from the user's natural language instructions and demonstrations, leveraging the graphical user interfaces (GUIs) of third-party mobile apps. Through its multi-modal and mixed-initiative approaches to human-AI interaction, Sugilite made important contributions to improving the usability, applicability, generalizability, flexibility, robustness, and shareability of interactive task learning agents. Sugilite also represents a new human-AI interaction paradigm for interactive task learning, in which existing app GUIs serve as a medium for users to communicate their intents to an AI agent rather than as interfaces for users to interact with the underlying computing services. In this chapter, we describe the Sugilite system, explain the design and implementation of its key features, and show a prototype in the form of a conversational assistant on Android.
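
To make the idea of GUI-grounded task learning concrete, the following is a minimal Python sketch of how a demonstrated task might be stored as a parameterized script of GUI actions and replayed with new argument values. The data model, attribute names, and example app are hypothetical illustrations, not Sugilite's actual implementation.

    # Hypothetical sketch (not Sugilite's actual data model): representing a
    # demonstrated GUI task as a parameterized script of UI actions.
    from dataclasses import dataclass, field
    from typing import Dict, List


    @dataclass
    class GuiAction:
        """One recorded step: an operation on a GUI element, matched by its attributes."""
        operation: str                      # e.g., "click" or "set_text"
        element_query: Dict[str, str]       # attributes used to find the element at runtime
        argument: str = ""                  # text to enter, possibly a task parameter


    @dataclass
    class TaskScript:
        """A learned task: an ordered list of GUI actions with user-exposed parameters."""
        name: str
        parameters: List[str] = field(default_factory=list)
        actions: List[GuiAction] = field(default_factory=list)

        def instantiate(self, values: Dict[str, str]) -> List[GuiAction]:
            """Fill in parameter values (e.g., a different drink) before replay."""
            steps = []
            for a in self.actions:
                arg = values.get(a.argument, a.argument)
                steps.append(GuiAction(a.operation, dict(a.element_query), arg))
            return steps


    # Example: a demonstrated "order a drink" task, generalized over the drink name.
    order_drink = TaskScript(
        name="order_drink",
        parameters=["drink"],
        actions=[
            GuiAction("click", {"text": "Order", "app": "com.example.coffeeshop"}),
            GuiAction("set_text", {"resource_id": "search_box"}, argument="drink"),
            GuiAction("click", {"text": "Add to cart"}),
        ],
    )
    print(order_drink.instantiate({"drink": "iced latte"})[1].argument)  # -> iced latte
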
Award ID(s):
1814472
PAR ID:
10302231
Author(s) / Creator(s):
Date Published:
Journal Name:
The AAAI-20 Workshop on Intelligent Process Automation (IPA-20)
Page Range / eLocation ID:
215 to 223
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Regardless of how much data artificial intelligence agents have available, agents will inevitably encounter previously unseen situations in real-world deployments. Reacting to novel situations by acquiring new information from other people—socially situated learning—is a core faculty of human development. Unfortunately, socially situated learning remains an open challenge for artificial intelligence agents because they must learn how to interact with people to seek out the information that they lack. In this article, we formalize the task of socially situated artificial intelligence—agents that seek out new information through social interactions with people—as a reinforcement learning problem where the agent learns to identify meaningful and informative questions via rewards observed through social interaction. We manifest our framework as an interactive agent that learns how to ask natural language questions about photos as it broadens its visual intelligence on a large photo-sharing social network. Unlike active-learning methods, which implicitly assume that humans are oracles willing to answer any question, our agent adapts its behavior based on observed norms of which questions people are or are not interested in answering. Through an 8-month deployment in which our agent interacted with 236,000 social media users, our agent improved its performance at recognizing new visual information by 112%. A controlled field experiment confirmed that our agent outperformed an active-learning baseline by 25.6%. This work advances opportunities for continuously improving artificial intelligence (AI) agents that better respect norms in open social environments. (A toy sketch of this ask-and-learn loop appears after this list.)
  2. Nonverbal task learning is defined here as a variant of interactive task learning in which an agent learns the definition of a new task without any verbal information such as task instructions. Instead, the agent must 1) learn the task definition using only a single solved example problem as its training input, and then 2) generalize this definition in order to successfully parse new problems. In this paper, we present a conceptual framework for nonverbal task learning, and we compare and contrast this type of learning with existing learning paradigms in AI. We also discuss nonverbal task learning in the context of nonverbal human intelligence tests, which are standardized tests designed to be given without any verbal instructions so that they can be used by people with language difficulties. (A toy induce-then-generalize sketch appears after this list.)
  3. Across a wide variety of domains, artificial agents that can adapt and personalize to users have the potential to improve and transform how social services are provided. Because of the need for personalized interaction data to drive this process, long-term (or longitudinal) interactions between users and agents, which unfold over a series of distinct interaction sessions, have attracted substantial research interest. In recognition of the expanded scope and structure of a long-term interaction, researchers are also adjusting the personalization models and algorithms used, orienting toward “continual learning” methods, which do not assume a stationary modeling target and explicitly account for the temporal context of training data. In parallel, researchers have also studied the effect of “multitask personalization,” an approach in which an agent interacts with users over multiple different task contexts throughout the course of a long-term interaction and learns personalized models of a user that are transferable across these tasks. In this paper, we unite these two paradigms under the framework of “Lifelong Personalization,” analyzing the effect of multitask personalization applied to dynamic, non-stationary targets. We extend the multitask personalization approach to the more complex and realistic scenario of modeling dynamic learners over time, focusing in particular on interactive scenarios in which the modeling agent plays an active role in teaching the student whose knowledge the agent is simultaneously attempting to model. Inspired by the way in which agents use active learning to select new training data based on domain context, we augment a Gaussian Process-based multitask personalization model with a mechanism to actively and continually manage its own training data, allowing a modeling agent to remove or reduce the weight of observed data from its training set based on interactive context cues. We evaluate this method in a series of simulation experiments comparing different approaches to continual and multitask learning on simulated student data. We expect this method to substantially improve learning in Gaussian Process models in dynamic domains, establishing Gaussian Processes as another flexible modeling tool for long-term Human-Robot Interaction (HRI) studies. (A minimal data-pruning sketch appears after this list.)
  4. People form perceptions and interpretations of AI through external sources prior to their interaction with new technology. For example, shared anecdotes and media stories influence prior beliefs that may or may not accurately represent the true nature of AI systems. We hypothesize that people's prior perceptions and beliefs will affect human-AI interactions and usage behaviors when they use new applications. This paper presents a user experiment exploring the interplay between users' pre-existing beliefs about AI technology, individual differences, and previously established sources of cognitive bias from first impressions with an interactive AI application. We employed questionnaire measures as features to categorize users into profiles based on their prior beliefs and attitudes about technology. In addition, participants were assigned to one of two controlled conditions designed to evoke either positive or negative first impressions during an AI-assisted judgment task using an interactive application. The experiment and results provide empirical evidence that profiling users by surveying their prior beliefs and individual differences can be a beneficial approach for mitigating bias (and/or unanticipated usage), rather than seeking one-size-fits-all solutions. (A hypothetical profile-assignment sketch appears after this list.)
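
For the socially situated learning work in item 1, the following is a simplified Python sketch of the core idea of treating social responses as rewards: an agent chooses among question templates and learns, bandit-style, which kinds of questions people actually answer. The templates, answer rates, and epsilon-greedy rule are invented stand-ins, not the paper's agent or data.

    # Simplified sketch (not the paper's actual agent): learning which questions to
    # ask by treating social responses as rewards, bandit-style.
    import random
    from collections import defaultdict

    question_templates = [
        "What is the name of this {thing}?",
        "Where was this {thing} taken?",
        "Rate this {thing} from 1 to 10.",   # people may ignore this kind of question
    ]

    # Running estimate of the answer rate (reward) for each question template.
    counts = defaultdict(int)
    reward_sums = defaultdict(float)

    def simulated_answer_rate(template: str) -> float:
        """Stand-in for real social feedback: some questions get answered more often."""
        return {0: 0.6, 1: 0.5, 2: 0.1}[question_templates.index(template)]

    random.seed(0)
    for step in range(2000):
        # Epsilon-greedy choice between exploring and asking the best-known question.
        if random.random() < 0.1 or not counts:
            t = random.choice(question_templates)
        else:
            t = max(question_templates,
                    key=lambda q: reward_sums[q] / counts[q] if counts[q] else 0.0)
        reward = 1.0 if random.random() < simulated_answer_rate(t) else 0.0  # answered?
        counts[t] += 1
        reward_sums[t] += reward

    best = max(question_templates, key=lambda q: reward_sums[q] / max(counts[q], 1))
    print("Question the agent learned to prefer:", best)
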
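For the nonverbal task learning framework in item 2, here is a toy Python illustration of inducing a task definition from a single solved example and then generalizing it to new problems. The candidate rule set is an invented example, not the paper's formalism.

    # Toy illustration (not the paper's framework): inducing a task definition from a
    # single solved example, then applying it to new problems.
    CANDIDATE_RULES = {
        "reverse": lambda xs: list(reversed(xs)),
        "sort": sorted,
        "double": lambda xs: [2 * x for x in xs],
    }

    def induce_rule(example_input, example_output):
        """Keep only the candidate rules consistent with the one solved example."""
        return [name for name, fn in CANDIDATE_RULES.items()
                if list(fn(example_input)) == list(example_output)]

    # One solved example problem is the entire training input.
    consistent = induce_rule([3, 1, 2], [1, 2, 3])
    print("Rules consistent with the example:", consistent)        # ['sort']

    # Generalization: apply the induced definition to a new, unseen problem.
    rule = CANDIDATE_RULES[consistent[0]]
    print("Solution to new problem [9, 4, 7]:", rule([9, 4, 7]))    # [4, 7, 9]
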
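For the lifelong personalization work in item 3, the following is a minimal Python sketch of a Gaussian Process regressor that actively manages its own training set by discarding stale observations of a drifting (non-stationary) student. The kernel, recency-based pruning rule, and simulated student are simplifying assumptions, not the paper's model.

    # Minimal sketch (assumptions: a plain GP regressor and a recency-based pruning
    # rule; not the paper's model): a learner model that actively drops stale
    # observations from its own training set as the student changes over time.
    import numpy as np

    def rbf_kernel(A, B, length_scale=1.0):
        """Squared-exponential kernel between row vectors in A and B."""
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / length_scale ** 2)

    class PrunedGP:
        """GP regressor whose training set is continually and actively managed."""
        def __init__(self, noise=0.1, max_points=30):
            self.noise, self.max_points = noise, max_points
            self.X, self.y, self.t = [], [], []   # inputs, targets, observation times

        def observe(self, x, y, t):
            self.X.append(x); self.y.append(y); self.t.append(t)
            # Active data management: keep only the most recent observations, on the
            # assumption that older ones describe an outdated student state.
            if len(self.X) > self.max_points:
                self.X, self.y, self.t = (list(v[-self.max_points:])
                                          for v in (self.X, self.y, self.t))

        def predict(self, Xq):
            X, y = np.array(self.X), np.array(self.y)
            K = rbf_kernel(X, X) + self.noise ** 2 * np.eye(len(X))
            Ks = rbf_kernel(np.array(Xq), X)
            return Ks @ np.linalg.solve(K, y)

    # Simulated non-stationary student: their skill on a task drifts over time.
    rng = np.random.default_rng(0)
    gp = PrunedGP()
    for t in range(100):
        x = rng.uniform(0, 1, size=2)                    # task-context features
        skill = 0.2 + 0.006 * t                          # the student improves (drift)
        gp.observe(x, skill * x.sum() + rng.normal(0, 0.05), t)

    print("Predicted performance on a new task context:", gp.predict([[0.5, 0.5]]))
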
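For the prior-beliefs study in item 4, here is a hypothetical Python sketch of assigning users to profiles from questionnaire responses and selecting a mitigation strategy per profile. The survey items, thresholds, and mitigation text are invented for illustration and are not taken from the study.

    # Hypothetical sketch (survey items and thresholds are invented, not from the
    # study): assigning users to prior-belief profiles from questionnaire scores and
    # choosing a first-impression mitigation accordingly.
    from statistics import mean

    def profile_user(responses):
        """responses: Likert ratings (1-7) for items about trust in / familiarity with AI."""
        trust = mean(responses["trust_in_ai"])
        familiarity = mean(responses["familiarity_with_ai"])
        if trust >= 5 and familiarity >= 5:
            return "optimistic_expert"       # may over-rely on the AI's suggestions
        if trust <= 3:
            return "skeptic"                 # may dismiss correct AI suggestions
        return "neutral"

    MITIGATION = {
        "optimistic_expert": "show uncertainty estimates and occasional counterexamples",
        "skeptic": "lead with a transparent explanation and an accuracy track record",
        "neutral": "default onboarding",
    }

    user = {"trust_in_ai": [6, 7, 6], "familiarity_with_ai": [5, 6, 5]}
    p = profile_user(user)
    print(p, "->", MITIGATION[p])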