Title: Annotating low-confidence questions improves classifier performance
This paper compares methods to select data for annotation in order to improve a classifier used in a question-answering dialogue system. With a classifier trained on 1,500 questions, adding 300 training questions on which the classifier is least confident results in consistently improved performance, whereas adding 300 arbitrarily selected training questions does not yield consistent improvement, and sometimes even degrades performance. The paper uses a new method for comparative evaluation of classifiers for dialogue, which scores each classifier based on the number of appropriate responses retrieved.
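The abstract does not include an implementation, but the selection step it describes (picking the questions on which the current classifier is least confident) is straightforward to sketch. The snippet below is a minimal illustration assuming a scikit-learn-style classifier that exposes predict_proba; the vectorizer, unlabeled pool, and variable names are hypothetical, not the authors' code.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def select_least_confident(classifier, vectorizer, unlabeled_questions, k=300):
    """Return the k questions on which the classifier is least confident."""
    X = vectorizer.transform(unlabeled_questions)
    probs = classifier.predict_proba(X)       # (n_questions, n_classes)
    confidence = probs.max(axis=1)            # probability of the top class
    lowest = np.argsort(confidence)[:k]       # indices of the k least confident
    return [unlabeled_questions[i] for i in lowest]

# Hypothetical usage: fit on the initial 1,500 annotated questions, then pick
# 300 low-confidence questions from an unannotated pool to send for annotation.
# vectorizer = TfidfVectorizer().fit(seed_questions)
# classifier = LogisticRegression(max_iter=1000).fit(
#     vectorizer.transform(seed_questions), seed_labels)
# to_annotate = select_least_confident(classifier, vectorizer, unlabeled_pool, k=300)
```

In a full loop of the kind the abstract describes, the selected questions would be annotated, added to the training set, and the classifier retrained.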
Award ID(s): 1852583
NSF-PAR ID: 10313591
Journal Name: Proceedings of the 25th Workshop on the Semantics and Pragmatics of Dialogue - Poster Abstracts
Sponsoring Org: National Science Foundation
More Like this
  1. Computer-based job interview training, including virtual reality (VR) simulations, has gained popularity in recent years as a way to support autistic individuals, who face significant challenges and barriers in finding and maintaining employment. Although popular, these training systems often fail to capture the complexity and dynamism of an employment interview: dialogue management for the virtual conversational agent either relies on choosing from a menu of prespecified answers, or dialogue processing is based on keyword extraction from the interviewee's transcribed speech, which ties it to the interview script. We address this limitation through automated dialogue act classification via transfer learning, which allows intent to be recognized from user speech independently of the interview domain. We also redress the lack of training data for a domain-general job interview dialogue act classifier by providing an original dataset of responses to interview questions from 22 autistic participants within a virtual job interview platform. Participants' responses to a customized interview script were transcribed to text and annotated according to a custom 13-class dialogue act scheme. The best classifier was a fine-tuned bidirectional encoder representations from transformers (BERT) model, with an F1 score of 87%.
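No training code accompanies this abstract; the sketch below shows one common way to fine-tune BERT for a 13-class dialogue act scheme using the Hugging Face Transformers Trainer API. The checkpoint, hyperparameters, and dataset variables are illustrative assumptions, not the authors' setup.

```python
# Sketch of fine-tuning BERT for 13-way dialogue act classification with the
# Hugging Face Transformers Trainer API; hyperparameters are illustrative only.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

NUM_DIALOGUE_ACTS = 13  # size of the custom dialogue act scheme described above

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=NUM_DIALOGUE_ACTS)

def tokenize(batch):
    # Truncate/pad transcribed interview responses to a fixed length for batching.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

# `train_ds` and `eval_ds` are assumed to be datasets.Dataset objects with
# "text" and "label" columns standing in for the annotated interview responses.
# train_ds = train_ds.map(tokenize, batched=True)
# eval_ds = eval_ds.map(tokenize, batched=True)

args = TrainingArguments(output_dir="dialogue-act-bert", num_train_epochs=3,
                         per_device_train_batch_size=16, learning_rate=2e-5)
# trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()
```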
  2. Exploration of a design space is the first step in identifying sets of high-performing solutions to complex engineering problems. For this purpose, Bayesian network classifiers (BNCs) have been shown to be effective for mapping regions of interest in the design space, even when those regions of interest exhibit complex topologies. However, identifying sets of desirable solutions can be difficult with a BNC when attempting to map a space where high-performance designs are spread sparsely among a disproportionately large number of low-performance designs, resulting in an imbalanced classifier. In this paper, a method is presented that utilizes probabilities of class membership for known training points, combined with interpolation between those points, to generate synthetic high-performance points in a design space. By adding synthetic design points into the BNC training set, a designer can rebalance an imbalanced classifier and improve classification accuracy throughout the space. For demonstration, this approach is applied to an acoustics metamaterial design problem with a sparse design space characterized by a combination of discrete and continuous design variables. Paper No: DETC2018-85274 
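The paper's exact procedure is not reproduced here; the snippet below is only a generic sketch of the idea of interpolating between known high-performance (minority-class) design points, using their class-membership probabilities as interpolation weights, to generate synthetic points for rebalancing the training set. The function name, weighting scheme, and variables are assumptions.

```python
import numpy as np

def synthesize_minority_points(X_high, p_high, n_new=100, seed=None):
    """Interpolate between pairs of known high-performance design points.

    X_high : (n, d) array of high-performance (minority-class) designs.
    p_high : (n,)  array of class-membership probabilities for those designs,
             used to bias each interpolation toward the more confident endpoint.
    """
    X_high = np.asarray(X_high, dtype=float)
    p_high = np.asarray(p_high, dtype=float)
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X_high), size=n_new)
    j = rng.integers(0, len(X_high), size=n_new)
    # Interpolation weight favors the endpoint with higher membership probability.
    w = p_high[i] / (p_high[i] + p_high[j])
    X_synth = w[:, None] * X_high[i] + (1.0 - w)[:, None] * X_high[j]
    # For mixed discrete/continuous design spaces, discrete coordinates would
    # additionally need to be snapped back to valid values before training.
    return X_synth

# Hypothetical usage: label the synthetic points as high-performance and append
# them to the Bayesian network classifier's training set to rebalance it.
# X_train = np.vstack([X_train, synthesize_minority_points(X_high, p_high, 200)])
# y_train = np.concatenate([y_train, np.ones(200, dtype=int)])
```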
  3. Ground truth depth information is necessary for many computer vision tasks. Collecting this information is challenging, especially for outdoor scenes. In this work, we propose utilizing single-view depth prediction neural networks pre-trained on synthetic scenes to generate relative depth, which we call pseudo-depth. This approach is a less expensive option, as the pre-trained neural network obtains accurate depth information from synthetic scenes, which does not require any expensive sensor equipment and takes less time. We measure the usefulness of pseudo-depth from pre-trained neural networks by training indoor/outdoor binary classifiers with and without it. We also compare the difference in accuracy between using pseudo-depth and ground truth depth. We experimentally show that adding pseudo-depth to training achieves a 4.4% performance boost over the non-depth baseline model on DIODE, a large standard test dataset, retaining 63.8% of the performance boost achieved from training a classifier on RGB and ground truth depth. It also boosts performance by 1.3% on another dataset, SUN397, for which ground truth depth is not available. Our result shows that it is possible to take information obtained from a model pre-trained on synthetic scenes and successfully apply it beyond the synthetic domain to real-world data.
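As a rough illustration of how pseudo-depth can be fed to such a classifier, the sketch below concatenates a predicted relative-depth map with the RGB image as a fourth input channel of a small PyTorch binary classifier. The depth model and the network architecture here are placeholders, not the models used in the paper.

```python
import torch
import torch.nn as nn

class RGBDClassifier(nn.Module):
    """Binary indoor/outdoor classifier over RGB plus a pseudo-depth channel."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 2)  # indoor vs. outdoor

    def forward(self, rgb, pseudo_depth):
        # pseudo_depth: relative depth predicted by a network pre-trained on
        # synthetic scenes, concatenated with RGB as a fourth input channel.
        x = torch.cat([rgb, pseudo_depth], dim=1)          # (B, 4, H, W)
        return self.head(self.features(x).flatten(1))      # (B, 2) logits

# Hypothetical usage with a frozen, pre-trained single-view depth model:
# with torch.no_grad():
#     pseudo_depth = depth_model(rgb)                      # (B, 1, H, W)
# logits = RGBDClassifier()(rgb, pseudo_depth)
```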
  4. Communication between humans and mobile agents is becoming increasingly important as such agents are widely deployed in our daily lives. Vision-and-Dialogue Navigation is one of the tasks that evaluate an agent's ability to interact with humans for assistance and to navigate based on natural language responses. In this paper, we explore the Navigation from Dialogue History (NDH) task, which is based on the Cooperative Vision-and-Dialogue Navigation (CVDN) dataset, and present a state-of-the-art model built upon Vision-Language transformers. However, despite achieving competitive performance, we find that the agent in the NDH task is not evaluated appropriately by the primary metric, Goal Progress. By analyzing the performance mismatch between Goal Progress and other metrics (e.g., normalized Dynamic Time Warping) from our state-of-the-art model, we show that NDH's sub-path based task setup (i.e., navigating a partial trajectory based on its corresponding subset of the full dialogue) does not provide the agent with enough supervision signal towards the goal region. Therefore, we propose a new task setup called NDH-Full, which takes the full dialogue and the whole navigation path as one instance. We present a strong baseline model and show initial results on this new task. We further describe several approaches that we tried in order to improve model performance (based on curriculum learning, pre-training, and data augmentation), suggesting potentially useful training methods for this new NDH-Full task.
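The analysis above hinges on the gap between Goal Progress and path-fidelity metrics such as normalized Dynamic Time Warping. As a generic illustration (not the benchmark's official evaluation code), Goal Progress is typically computed as the reduction in distance to the goal between the start and end of the agent's path:

```python
import numpy as np

def goal_progress(start, end, goal, dist):
    """Goal Progress: how much closer the agent ends to the goal than it started.

    `dist` is a distance function over the environment (e.g., geodesic distance
    on the navigation graph); this is a generic illustration, not the CVDN/NDH
    benchmark's official evaluation script.
    """
    return dist(start, goal) - dist(end, goal)

# Hypothetical usage with straight-line distance standing in for graph distance.
euclidean = lambda a, b: float(np.linalg.norm(np.asarray(a) - np.asarray(b)))
print(goal_progress(start=(0.0, 0.0), end=(3.0, 4.0), goal=(6.0, 8.0), dist=euclidean))
# -> 5.0 (the agent finished 5 units closer to the goal than it started)
```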
  5. Mitrovic, A.; Bosch, N. (Eds.)
    Regular expression (regex) coding has advantages for text analysis. Humans are often able to quickly construct intelligible coding rules with high precision; that is, researchers can identify words and word patterns that correctly classify examples of a particular concept. It is also often easy to identify false positives and improve the regex classifier so that positive items are accurately captured. However, ensuring that a regex list is complete is a bigger challenge, because the concepts to be identified in data are often sparsely distributed, which makes it difficult to identify examples of false negatives. For this reason, regex-based classifiers suffer from low recall; that is, they often miss items that should be classified as positive. In this paper, we provide a neural network solution to this problem by identifying a negative reversion set, in which false negative items occur much more frequently than in the data set as a whole. Thus, the regex classifier can be improved more quickly by adding missing regexes based on the false negatives found in the negative reversion set. This study used an existing data set collected from a simulation-based learning environment for which researchers had previously defined six codes and developed classifiers with validated regex lists. We randomly constructed incomplete (partial) regex lists and used neural network models to identify negative reversion sets in which the frequency of false negatives increased from a range of 3%-8% in the full data set to a range of 12%-52% in the negative reversion set. Based on this finding, we propose an interactive coding mechanism in which human-developed regex classifiers provide input for training machine learning algorithms, and the machine learning algorithms "smartly" select highly suspected false negative items so that humans can develop regex classifiers more quickly.
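As a minimal illustration of the interaction described above, the sketch below labels items with a regex list and then uses a separately trained probabilistic classifier to surface regex-negative items that the model still scores as likely positive, i.e., a candidate negative reversion set for human review. The regexes, model interface, and threshold are placeholder assumptions, not the study's artifacts.

```python
import re

def negative_reversion_set(texts, regexes, model, threshold=0.5):
    """Regex-negative items that a trained classifier still scores as likely positive.

    `model` is assumed to expose predict_proba over raw texts (e.g., a
    scikit-learn Pipeline of TfidfVectorizer + LogisticRegression); the
    regexes, model, and threshold are placeholders.
    """
    patterns = [re.compile(r, re.IGNORECASE) for r in regexes]
    # Items the (possibly incomplete) regex list labels as negative.
    regex_negative = [t for t in texts if not any(p.search(t) for p in patterns)]
    probs = model.predict_proba(regex_negative)[:, 1]  # P(positive) per item
    # False negatives of the regex list should be concentrated here, so these
    # items are handed to a human coder to identify missing regexes.
    return [t for t, p in zip(regex_negative, probs) if p >= threshold]
```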