End-user-programmable intelligent agents that can learn new tasks and concepts from users’ explicit instructions are highly desirable. This paper presents our progress on expanding the capabilities of such agents in the areas of task applicability, task generalizability, user intent disambiguation, and support for IoT devices through our multi-modal approach of combining programming by demonstration (PBD) with learning from natural language instructions. Our future directions include facilitating better script reuse and sharing, and supporting greater user expressiveness in instructions.

APPINITE: A Multi-Modal Interface for Specifying Data Descriptions in Programming by Demonstration Using Natural Language Instructions

A key challenge for generalizing programming-by-demonstration (PBD) scripts is the data description problem: when a user demonstrates performing an action, the system needs to determine features for describing this action and the target object in a way that reflects the user's intention for the action. However, prior approaches for creating data descriptions in PBD systems have problems with usability, applicability, feasibility, transparency, and/or user control. Our APPINITE system introduces a multimodal interface with which users can specify data descriptions verbally using natural language instructions. APPINITE guides users to describe their intentions for the demonstrated actions through mixed-initiative conversations, and constructs data descriptions for these actions from the natural language instructions. Our evaluation showed that APPINITE is easy to use and effective in creating scripts for tasks that would otherwise be difficult to create with prior PBD systems, due to ambiguous data descriptions in demonstrations on GUIs.
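To make the data description problem concrete, here is a minimal sketch assuming a toy model of GUI elements: a demonstrated tap is disambiguated by filtering candidate elements with a predicate derived from the verbal instruction. Every name here (UIElement, its price feature, build_data_description) is invented for illustration and is not APPINITE's actual API.

```python
# A minimal sketch, assuming a toy GUI model: disambiguate a demonstrated
# tap by filtering candidate elements with predicates derived from the
# user's verbal instruction. All names and features are hypothetical.
from dataclasses import dataclass

@dataclass
class UIElement:
    text: str
    class_name: str
    list_index: int
    price: float  # hypothetical feature extracted from the screen

def build_data_description(predicates, candidates):
    """Keep only candidates satisfying every instruction-derived predicate;
    a unique survivor becomes the data description for the action."""
    matches = [e for e in candidates if all(p(e) for p in predicates)]
    return matches[0] if len(matches) == 1 else None  # None: still ambiguous

# The user tapped "Cafe B", which is both the second item and the cheapest;
# replaying the script on new data requires knowing which property was meant.
candidates = [
    UIElement("Cafe A", "ListItem", 0, 12.0),
    UIElement("Cafe B", "ListItem", 1, 8.0),
]

# Predicate derived from the instruction "choose the cheapest one".
cheapest = min(e.price for e in candidates)
predicates = [lambda e: e.price == cheapest]

print(build_data_description(predicates, candidates))  # -> Cafe B's element
```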
- Award ID(s): 1814472
- PAR ID: 10106726
- Date Published: 2018
- Journal Name: 2018 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC'18)
- Page Range / eLocation ID: 105 to 114
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Though conditionals are an integral component of programming, providing an easy means of creating conditionals remains a challenge for programming-by-demonstration (PBD) systems for task automation. We hypothesize that a promising method for implementing conditionals in such systems is to incorporate the use of verbal instructions. Verbal instructions supplied concurrently with demonstrations have been shown to improve the generalizability of PBD. However, the challenge of supporting conditional creation using this multi-modal approach has not been addressed. In this extended abstract, we present our study on understanding how end users describe conditionals in natural language for mobile app tasks. We conducted a formative study with 56 participants, asking them to verbally describe conditionals in different settings for 9 sample tasks and to invent conditional tasks. Participant responses were analyzed using open coding and revealed that, in the context of mobile apps, end users often omit desired else statements when explaining conditionals, sometimes use ambiguous concepts in expressing conditionals, and often desire to implement complex conditionals. Based on these findings, we discuss the implications for designing a multimodal PBD interface to support the creation of conditionals.
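Reflecting the finding that users often omit the else branch, here is a hypothetical sketch (not tied to any existing PBD system) of a conditional representation that triggers a mixed-initiative follow-up question when the else branch is missing.

```python
# Hypothetical sketch: represent a parsed verbal conditional and ask a
# follow-up question when the else branch, which end users often leave
# out, is missing. Names and structure are invented for illustration.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Conditional:
    condition: str                            # e.g. "it is raining"
    then_actions: List[str]                   # demonstrated for the true case
    else_actions: Optional[List[str]] = None  # frequently omitted in speech

def clarify_else(cond: Conditional) -> str:
    """Mixed-initiative follow-up instead of silently assuming 'do nothing'."""
    if cond.else_actions is None:
        return f"What should I do when it is not the case that {cond.condition}?"
    return "OK, the conditional is fully specified."

c = Conditional("it is raining", ["open the weather app", "order an umbrella"])
print(clarify_else(c))  # prompts the user for the omitted else branch
```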
Grounding natural language instructions on the web to perform previously unseen tasks enables accessibility and automation. We introduce a task and dataset to train AI agents from open-domain, step-by-step instructions originally written for people. We build RUSS (Rapid Universal Support Service) to tackle this problem. RUSS consists of two models: First, a BERT-LSTM with pointers parses instructions to ThingTalk, a domain-specific language we design for grounding natural language on the web. Then, a grounding model retrieves the unique IDs of any webpage elements requested in ThingTalk. RUSS may interact with the user through a dialogue (e.g. ask for an address) or execute a web operation (e.g. click a button) inside the web runtime. To augment training, we synthesize natural language instructions mapped to ThingTalk. Our dataset consists of 80 different customer service problems from help websites, with a total of 741 step-by-step instructions and their corresponding actions. RUSS achieves 76.7% end-to-end accuracy predicting agent actions from single instructions. It outperforms state-of-the-art models that directly map instructions to actions without ThingTalk. Our user study shows that RUSS is preferred by actual users over web navigation.
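The following toy sketch mirrors RUSS's two-stage structure, with keyword matching standing in for the BERT-LSTM parser; the "ThingTalk" strings and element dictionary below are simplified stand-ins, not the paper's actual language or API.

```python
# A toy sketch of a parse-then-ground pipeline in the spirit of RUSS.
# Keyword matching replaces the BERT-LSTM parser, and the ThingTalk-like
# strings are simplified stand-ins for the real domain-specific language.

def parse_to_thingtalk(instruction: str) -> str:
    """Stage 1: map a natural-language step to a ThingTalk-like operation."""
    if "click" in instruction:
        return '@web.click(element="submit button")'
    if "enter" in instruction:
        return '@web.enter(element="address field", value=ask_user("address"))'
    raise ValueError("unsupported instruction")

def ground(operation: str, dom_elements: dict) -> str:
    """Stage 2: retrieve the unique ID of the element the operation names."""
    for element_id, description in dom_elements.items():
        if description in operation:
            return element_id
    raise LookupError("no matching element on the page")

dom = {"btn-42": "submit button", "input-7": "address field"}
op = parse_to_thingtalk("click the submit button")
print(op, "->", ground(op, dom))  # '@web.click(...)' -> btn-42
```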
The Mixed-Reality Integrated Learning Environment (MILE) developed at Florida State University is a virtual reality based, inclusive and immersive e-learning environment that promotes engaging and effective learning interactions for a diversified learner population. MILE uses a large number of interactive Non-Player Characters (NPCs) to represent diverse research-based learner archetypes and groups, and to prompt and provide feedback for in situ teaching practice. The NPC scripts in MILE are written in Linden Scripting Language (LSL), and can be quite complex, creating a significant challenge in the development and maintenance of the system. To address this challenge, we develop NPC_GEN, an automatic NPC script generation tool that takes high-level NPC descriptions as input and automatically produces LSL scripts for NPCs. In this work, we introduce NPCDL, a language that we design for NPC_GEN to give high-level descriptions of NPCs, describe how NPC_GEN translates an NPCDL description into an LSL script, and report a user study of NPC_GEN. The results of our user study indicate that with minimal training, non-technical people are able to write and modify NPCDL descriptions, which can then be used to generate LSL scripts for the NPCs: the development and maintenance of NPCs is greatly simplified with NPC_GEN.
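To make the idea of generating LSL from a high-level description concrete, here is a hypothetical template-based sketch; the dictionary input is invented for illustration and does not follow the paper's actual NPCDL syntax.

```python
# Hypothetical sketch of template-based LSL generation in the style of
# NPC_GEN. The dictionary input is invented for illustration and does not
# reproduce the paper's NPCDL syntax.

NPC_TEMPLATE = """default
{{
    state_entry()
    {{
        llSay(0, "{greeting}");
    }}
    touch_start(integer n)
    {{
        llSay(0, "{response}");
    }}
}}"""

def generate_lsl(npc_description: dict) -> str:
    """Translate a high-level NPC description into an LSL script."""
    return NPC_TEMPLATE.format(**npc_description)

npc = {
    "greeting": "Welcome to MILE!",
    "response": "Let's practice this teaching scenario together.",
}
print(generate_lsl(npc))
```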
We study continual learning for natural language instruction generation, by observing human users' instruction execution. We focus on a collaborative scenario, where the system both acts and delegates tasks to human users using natural language. We compare user execution of generated instructions to the original system intent as an indication of the system's success in communicating its intent. We show how to use this signal to improve the system's ability to generate instructions via contextual bandit learning. In interaction with real users, our system demonstrates dramatic improvements in its ability to generate language over time.
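As a rough sketch of this learning signal, the code below uses a plain (non-contextual) epsilon-greedy bandit for brevity, rather than the contextual bandit the abstract describes: each arm is a candidate instruction phrasing, and the reward is whether the user's execution matched the system's intent.

```python
# A minimal sketch of learning from instruction execution. For brevity it
# uses a non-contextual epsilon-greedy bandit instead of the contextual
# bandit described above; arms are candidate instruction phrasings.
import random

class EpsilonGreedyBandit:
    def __init__(self, n_arms: int, epsilon: float = 0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms  # running mean reward per arm

    def select(self) -> int:
        if random.random() < self.epsilon:
            return random.randrange(len(self.counts))  # explore
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, arm: int, reward: float) -> None:
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

bandit = EpsilonGreedyBandit(n_arms=3)
arm = bandit.select()
# Reward: did the user's execution of the generated instruction match the
# system's original intent?
execution_matched_intent = True  # observed from the user's behavior
bandit.update(arm, 1.0 if execution_matched_intent else 0.0)
```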