skip to main content


Title: A Few-Shot Semantic Parser for Wizard-of-Oz Dialogues with the Precise ThingTalk Representation
Previous attempts to build effective semantic parsers for Wizard-of-Oz (WOZ) conversations suffer from the difficulty in acquiring a high-quality, manually annotated training set. Approaches based only on dialogue synthesis are insufficient, as dialogues generated from state-machine based models are poor approximations of real-life conversations. Furthermore, previously proposed dialogue state representations are ambiguous and lack the precision necessary for building an effective agent.This paper proposes a new dialogue representation and a sample-efficient methodology that can predict precise dialogue states in WOZ conversations. We extended the ThingTalk representation to capture all information an agent needs to respond properly. Our training strategy is sample-efficient: we combine (1) few-shot data sparsely sampling the full dialogue space and (2) synthesized data covering a subset space of dialogues generated by a succinct state-based dialogue model. The completeness of the extended ThingTalk language is demonstrated with a fully operational agent, which is also used in training data synthesis. We demonstrate the effectiveness of our methodology on MultiWOZ 3.0, a reannotation of the MultiWOZ 2.1 dataset in ThingTalk. ThingTalk can represent 98% of the test turns, while the simulator can emulate 85% of the validation set. We train a contextual semantic parser using our strategy, and obtain 79% turn-by-turn exact match accuracy on the reannotated test set.  more » « less
Award ID(s):
1900638
PAR ID:
10427007
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
Findings of the Association for Computational Linguistics: ACL 2022
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Issue:
May 2022
Page Range / eLocation ID:
4021 to 4034
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Robust state tracking for task-oriented dialogue systems currently remains restricted to a few popular languages. This paper shows that given a large-scale dialogue data set in one language, we can automatically produce an effective semantic parser for other languages using machine translation. We propose automatic translation of dialogue datasets with alignment to ensure faithful translation of slot values and eliminate costly human supervision used in previous benchmarks. We also propose a new contextual semantic parsing model, which encodes the formal slots and values, and only the last agent and user utterances. We show that the succinct representation reduces the compounding effect of translation errors, without harming the accuracy in practice. We evaluate our approach on several dialogue state tracking benchmarks. On RiSAWOZ, CrossWOZ, CrossWOZ-EN, and MultiWOZ-ZH datasets we improve the state of the art by 11%, 17%, 20%, and 0.3% in joint goal accuracy. We present a comprehensive error analysis for all three datasets showing erroneous annotations can lead to misguided judgments on the quality of the model. Finally, we present RiSAWOZ English and German datasets, created using our translation methodology. On these datasets, accuracy is within 11% of the original showing that high-accuracy multilingual dialogue datasets are possible without relying on expensive human annotations. We release our datasets and software open source. 
    more » « less
  2. Task-oriented Dialogue (ToD) agents are mostly limited to a few widely-spoken languages, mainly due to the high cost of acquiring training data for each language. Existing low-cost approaches that rely on cross-lingual embeddings or naive machine translation sacrifice a lot of accuracy for data efficiency, and largely fail in creating a usable dialogue agent. We propose automatic methods that use ToD training data in a source language to build a high-quality functioning dialogue agent in another target language that has no training data (i.e. zero-shot) or a small training set (i.e. fewshot). Unlike most prior work in cross-lingual ToD that only focuses on Dialogue State Tracking (DST), we build an end-to-end agent. We show that our approach closes the accuracy gap between few-shot and existing fullshot methods for ToD agents. We achieve this by (1) improving the dialogue data representation, (2) improving entity-aware machine translation, and (3) automatic filtering of noisy translations. We evaluate our approach on the recent bilingual dialogue dataset BiToD. In Chinese to English transfer, in the zero-shot setting, our method achieves 46.7% and 22.0% in Task Success Rate (TSR) and Dialogue Success Rate (DSR) respectively. In the few-shot setting where 10% of the data in the target language is used, we improve the state-of-the-art by 15.2% and 14.0%, coming within 5% of full-shot training. 
    more » « less
  3. In this paper, we apply the contribution model of grounding to a corpus of human-human peer-mentoring dialogues. From this analysis, we propose effective turn-taking strategies for human-robot interaction with a teachable robot. Specifically, we focus on (1) how robots can encourage humans to present and (2) how robots can signal that they are going to begin a new presentation. We evaluate the strategies against a corpus of human-robot dialogues and offer three guidelines for teachable robots to follow to achieve more human-like collaborative dialogue. 
    more » « less
  4. Abstract

    Body part terms (BPTs) are used extensively in Mesoamerican languages to name object parts. The process through which BPTs might be extended to refer to a part of an object and further serve as a relator in describing the relation between objects in space has often been attributed to metaphorical processes. This study proposes an alternative analysis following a Structure–Mapping Theory approach (Gentner, 1983, inter alia), based on data from Diidxazá (Isthmus Zapotec, Otomanguean) obtained through elicitation and experimental tasks. The data show that structure mapping does not depend on a 1:1 match of attributes; frequency of use shed light on principles that constrain the semantic extension of most BPTs; a core set of six BPTs are extended by abstraction of the set of intersecting axes of the body. The detailed nature of this study enables an analysis of the mental representations underlying the semantic extension of BPTs. This in turn elucidates on aspects of the relation between spatial language and spatial cognition. In addition, this study allows us to address questions about the categorial status of BPTs in Diidxazá and their lexicographic representation.

     
    more » « less
  5. Existing watermarked generation algorithms employ token-level designs and therefore, are vulnerable to paraphrase attacks. To address this issue, we introduce watermarking on the semantic representation of sentences. We propose SemStamp, a robust sentence-level semantic watermarking algorithm that uses locality-sensitive hashing (LSH) to partition the semantic space of sentences. The algorithm encodes and LSH-hashes a candidate sentence generated by a language model, and conducts rejection sampling until the sampled sentence falls in watermarked partitions in the semantic embedding space. To test the paraphrastic robustness of watermarking algorithms, we propose a {``}bigram paraphrase{''} attack that produces paraphrases with small bigram overlap with the original sentence. This attack is shown to be effective against existing token-level watermark algorithms, while posing only minor degradations to SemStamp. Experimental results show that our novel semantic watermark algorithm is not only more robust than the previous state-of-the-art method on various paraphrasers and domains, but also better at preserving the quality of generation. 
    more » « less