Title: Integrated Learning of Dialog Strategies and Semantic Parsing
Abstract: Natural language understanding and dialog management are two integral components of interactive dialog systems. Previous research has used machine learning techniques to individually optimize these components, with different forms of direct and indirect supervision. We present an approach to integrate the learning of both a dialog strategy using reinforcement learning, and a semantic parser for robust natural language understanding, using only natural dialog interaction for supervision. Experimental results on a simulated task of robot instruction demonstrate that joint learning of both components improves dialog performance over learning either of these components alone.
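The joint-learning loop described above can be sketched in miniature: a dialog policy trained with tabular Q-learning over a coarse parser-confidence state, and a toy memorizing parser retrained on pairs harvested from successful dialogs. Everything below (ToyParser, DialogPolicy, the reward scheme) is an illustrative assumption, not the paper's implementation.

```python
# Illustrative sketch only: a tabular Q-learning dialog policy plus a toy
# semantic parser, jointly improved from simulated dialog interaction.
import random
from collections import defaultdict

ACTIONS = ["confirm", "ask_rephrase", "execute"]

class ToyParser:
    """Stand-in semantic parser: memorizes utterance -> meaning pairs."""
    def __init__(self, seed_pairs):
        self.table = dict(seed_pairs)

    def parse(self, utterance):
        # Return (meaning, confidence); unseen utterances get zero confidence.
        meaning = self.table.get(utterance)
        return meaning, (1.0 if meaning is not None else 0.0)

    def retrain(self, induced_pairs):
        # Fold in (utterance, meaning) pairs induced from successful dialogs.
        self.table.update(induced_pairs)

class DialogPolicy:
    """Tabular Q-learning over a coarse state: binned parser confidence."""
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.2):
        self.q = defaultdict(float)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)          # explore
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, s, a, reward, s_next):
        best_next = max(self.q[(s_next, a2)] for a2 in ACTIONS)
        self.q[(s, a)] += self.alpha * (reward + self.gamma * best_next
                                        - self.q[(s, a)])

def run_turn(policy, parser, utterance, true_meaning):
    """One simulated turn; harvests parser training data on success."""
    meaning, conf = parser.parse(utterance)
    state = "high_conf" if conf > 0.5 else "low_conf"
    action = policy.act(state)
    if action == "execute" and meaning == true_meaning:
        reward, induced = 1.0, {utterance: true_meaning}   # success: harvest pair
    elif action == "execute":
        reward, induced = -1.0, {}                         # acted on a bad parse
    else:
        reward, induced = -0.1, {}                         # cost of an extra turn
    policy.update(state, action, reward, "terminal")
    parser.retrain(induced)

parser = ToyParser({"go to the kitchen": "goto(kitchen)"})
policy = DialogPolicy()
for _ in range(200):
    run_turn(policy, parser, "go to the kitchen", "goto(kitchen)")
```

Over many simulated turns, the policy learns when executing is safe versus when clarification pays off, while successful executions supply new parser training data; this is the joint-improvement dynamic the abstract reports.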
Award ID(s):
1637736
NSF-PAR ID:
10025851
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this work, we present methods for using human-robot dialog to improve language understanding for a mobile robot agent. The agent parses natural language to underlying semantic meanings and uses robotic sensors to create multi-modal models of perceptual concepts like red and heavy. The agent can be used for showing navigation routes, delivering objects to people, and relocating objects from one location to another. We use dialog clarification questions both to understand commands and to generate additional parsing training data. The agent employs opportunistic active learning to select questions about how words relate to objects, improving its understanding of perceptual concepts. We evaluated this agent on Amazon Mechanical Turk. After training on data induced from conversations, the agent reduced the number of dialog questions it asked while receiving higher usability ratings. Additionally, we demonstrated the agent on a robotic platform, where it learned new perceptual concepts on the fly while completing a real-world task.
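As a hedged illustration of the opportunistic active learning step in item 1, the sketch below selects, from the (word, object) queries available in the current context, the pair whose perceptual classifier is least certain. The classifier interface and uncertainty score are assumptions for illustration only.

```python
# Hypothetical sketch: opportunistic active learning as uncertainty
# sampling over (word, object) queries available in the current context.
def select_query(candidates, classifiers):
    """candidates: list of (word, object_features) pairs askable right now;
    classifiers: dict mapping word -> callable giving P(word applies)."""
    def uncertainty(pair):
        word, features = pair
        p = classifiers[word](features)
        return -abs(p - 0.5)          # probability near 0.5 = most uncertain
    return max(candidates, key=uncertainty)

# Stub perceptual classifiers for the concepts "red" and "heavy":
classifiers = {"red": lambda f: 0.9, "heavy": lambda f: 0.55}
candidates = [("red", {"hue": 0.02}), ("heavy", {"mass_kg": 1.3})]
print(select_query(candidates, classifiers))
# -> ('heavy', {'mass_kg': 1.3}): the classifier is least sure about "heavy"
```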
  2. Recent years have witnessed the emergence of conversational systems, including both physical devices and mobile-based applications, such as Amazon Echo, Google Now, Microsoft Cortana, Apple Siri, and many others. Both the research community and industry believe that conversational systems will have a major impact on human-computer interaction, and specifically, the IR community has begun to focus on Conversational Search. Conversational search based on user-system dialog differs from conventional search in two major ways: 1) the user and system can interact for multiple semantically coherent rounds on a task through natural language dialog, and 2) it becomes possible for the system to understand user needs, or to help users clarify their needs, by asking appropriate questions of the users directly. In this paper, we propose and evaluate a unified conversational search framework. Specifically, we define the major components for conversational search, assemble them into a unified framework, and test an implementation of the framework in a conversational product search scenario on Amazon. To accomplish this, we propose the Multi-Memory Network (MMN) architecture, which is end-to-end trainable on large-scale collections of e-commerce user reviews. The system is capable of asking aspect-based questions in the right order so as to understand user needs, while (personalized) search is conducted during the conversation and results are provided once the system is sufficiently confident. Experiments on real-world user purchasing data verified the advantages of conversational search over conventional search algorithms in terms of standard evaluation measures such as NDCG.
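The ask-until-confident loop in item 2 can be caricatured as below: maintain scores over products, update them with each answered aspect question, and return results once the leading product clears a confidence threshold. This toy stand-in is not the Multi-Memory Network; the names, scoring, and threshold are invented.

```python
# Toy stand-in for the conversational search loop (not the actual MMN):
# ask aspect questions until the top product's score clears a threshold.
def conversational_search(products, aspects, answer_fn, threshold=0.8):
    """products: dict name -> dict aspect -> value;
    answer_fn: callable aspect -> preferred value (a simulated user)."""
    scores = {name: 0.0 for name in products}
    for aspect in aspects:                      # a real system would rank aspects
        preferred = answer_fn(aspect)
        for name, attrs in products.items():
            if attrs.get(aspect) == preferred:
                scores[name] += 1.0 / len(aspects)
        best = max(scores, key=scores.get)
        if scores[best] >= threshold:           # confident enough: stop asking
            return best, scores
    return max(scores, key=scores.get), scores

products = {"phone_a": {"color": "black", "size": "large"},
            "phone_b": {"color": "white", "size": "large"}}
best, _ = conversational_search(products, ["color", "size"],
                                {"color": "black", "size": "large"}.get)
print(best)  # -> phone_a
```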
  3. A Natural Language Interface (NLI) enables the use of human languages to interact with computer systems, including smart phones and robots. Compared to other types of interfaces, such as command line interfaces (CLIs) or graphical user interfaces (GUIs), NLIs stand to enable more people to access functionality behind databases or APIs, as they only require knowledge of natural languages. Many NLI applications involve structured data for the domain (e.g., hotel booking, product search, and factual question answering). Thus, to fully process user questions, understanding of structured data is crucial for the model in addition to natural language comprehension. In this paper, we study neural network methods for building Natural Language Interfaces (NLIs), with a focus on learning structured data representations that can generalize to novel data sources and schemata not seen at training time. Specifically, we review two tasks related to natural language interfaces: i) semantic parsing, where we focus on text-to-SQL for database access, and ii) task-oriented dialog systems for API access. We survey representative methods for text-to-SQL and task-oriented dialog tasks, focusing on representing and incorporating structured data. Lastly, we present two of our original studies on structured data representation methods for NLIs that enable access to i) databases and ii) visualization APIs.
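One basic structured-data representation technique in the text-to-SQL line of work surveyed in item 3 is schema linking: tagging question tokens that match the column names of a possibly unseen schema, so the encoder input carries schema information. The tagging scheme below is a simplified assumption, not any specific system's.

```python
# Simplified schema linking for text-to-SQL: mark question tokens that
# exactly or partially match column names, so the encoding carries
# schema information for databases unseen at training time.
def schema_link(question, columns):
    links = []
    lowered = [c.lower() for c in columns]
    for token in question.lower().split():
        if token in lowered:
            links.append((token, "EXACT_COLUMN"))
        elif any(token in c for c in lowered):
            links.append((token, "PARTIAL_COLUMN"))
        else:
            links.append((token, "NONE"))
    return links

print(schema_link("show hotel name and price",
                  ["hotel_name", "price", "rating"]))
# [('show', 'NONE'), ('hotel', 'PARTIAL_COLUMN'), ('name', 'PARTIAL_COLUMN'),
#  ('and', 'NONE'), ('price', 'EXACT_COLUMN')]
```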
  4. This work deals with the challenge of learning and reasoning over language and vision data for related downstream tasks such as visual question answering (VQA) and natural language for visual reasoning (NLVR). We design a novel cross-modality relevance module that is used in an end-to-end framework to learn the relevance representation between components of various input modalities under the supervision of a target task, which is more generalizable to unobserved data than merely reshaping the original representation space. In addition to modeling the relevance between textual entities and visual entities, we model the higher-order relevance between entity relations in the text and object relations in the image. Our proposed approach shows competitive performance on two different language and vision tasks using public benchmarks and improves on the published state-of-the-art results. The alignments of input spaces and their relevance representations learned on the NLVR task boost the training efficiency of the VQA task.
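A minimal sketch of the first-order part of a cross-modality relevance computation like the one in item 4: a token-by-object relevance matrix from scaled dot products, normalized per token. The shapes and the softmax choice are assumptions; the paper's module additionally models higher-order relation-to-relation relevance, which is omitted here.

```python
# Minimal first-order cross-modality relevance: scaled dot-product scores
# between text token embeddings and visual object embeddings.
import numpy as np

def relevance_matrix(text_emb, visual_emb):
    """text_emb: (n_tokens, d); visual_emb: (n_objects, d).
    Returns (n_tokens, n_objects) relevance, softmax-normalized per token."""
    d = text_emb.shape[1]
    scores = text_emb @ visual_emb.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
R = relevance_matrix(rng.normal(size=(5, 16)), rng.normal(size=(3, 16)))
print(R.shape, R.sum(axis=1))  # (5, 3); each row sums to 1.0
```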
  5. Recent systems for converting natural language descriptions into regular expressions (regexes) have achieved some success, but typically deal with short, formulaic text and can only produce simple regexes. Real-world regexes are complex, hard to describe with brief sentences, and sometimes require examples to fully convey the user’s intent. We present a framework for regex synthesis in this setting, where both natural language (NL) and examples are available. First, a semantic parser (either grammar-based or neural) maps the natural language description into an intermediate sketch, which is an incomplete regex containing holes to denote missing components. Then a program synthesizer searches over the regex space defined by the sketch and finds a regex that is consistent with the given string examples. Our semantic parser can be trained purely from weak supervision based on the correctness of the synthesized regex, or it can leverage heuristically derived sketches. We evaluate on two prior datasets (Kushman and Barzilay, 2013; Locascio et al., 2016) and a real-world dataset from Stack Overflow. Our system achieves state-of-the-art performance on the prior datasets and solves 57% of the real-world dataset, which existing neural systems completely fail on.
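The sketch-then-synthesize pipeline of item 5 is easy to illustrate in miniature: fill the hole in a regex sketch with candidate fragments and keep only fills consistent with the positive and negative string examples. The hole marker and candidate pool below are invented for illustration.

```python
# Tiny illustration of sketch-guided regex synthesis: fill the hole "□"
# in a sketch with candidate fragments and keep fills consistent with
# the positive/negative string examples.
import re

def synthesize(sketch, candidates, positives, negatives):
    for fill in candidates:
        regex = sketch.replace("□", fill)
        try:
            ok = (all(re.fullmatch(regex, s) for s in positives) and
                  not any(re.fullmatch(regex, s) for s in negatives))
        except re.error:
            continue                      # skip ill-formed completions
        if ok:
            return regex
    return None

# NL: "three digits, then a dash, then letters" -> sketch with one hole.
sketch = r"\d{3}-□"
print(synthesize(sketch, [r"\d+", r"[a-z]+", r"[A-Z]+"],
                 positives=["123-abc"], negatives=["123-456"]))
# -> "\d{3}-[a-z]+"
```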