Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
We propose a learning system in which language is grounded in visual percepts without specific pre-defined categories of terms. We present a unified generative method to acquire a shared semantic/visual embedding that enables the learning of language about a wide range of real-world objects. We evaluate the efficacy of this learning by predicting the semantics of objects and comparing the performance with neural and non-neural inputs. We show that this generative approach exhibits promising results in language grounding without pre-specifying visual categories under low resource settings. Our experiments demonstrate that this approach is generalizable to multilingual, highly varied datasets.more » « less
-
Ordering the selection of training data using active learning can lead to improvements in learning efficiently from smaller corpora. We present an exploration of active learning approaches applied to three grounded language problems of varying complexity in order to analyze what methods are suitable for improving data efficiency in learning. We present a method for analyzing the complexity of data in this joint problem space, and report on how characteristics of the underlying task, along with design decisions such as feature selection and classification model, drive the results. We observe that representativeness, along with diversity, is crucial in selecting data samples.more » « less
-
null (Ed.)While grounded language learning, or learning the meaning of language with respect to the physical world in which a robot operates, is a major area in human-robot interaction studies, most research occurs in closed worlds or domain-constrained settings. We present a system in which language is grounded in visual percepts without using categorical constraints by combining CNN-based visual featurization with natural language labels. We demonstrate results comparable to those achieved using handcrafted features for specific traits, a step towards moving language grounding into the space of fully open world recognition.more » « less
-
null (Ed.)Learning the meaning of grounded language---language that references a robot’s physical environment and perceptual data---is an important and increasingly widely studied problem in robotics and human-robot interaction. However, with a few exceptions, research in robotics has focused on learning groundings for a single natural language pertaining to rich perceptual data. We present experiments on taking an existing natural language grounding system designed for English and applying it to a novel multilingual corpus of descriptions of objects paired with RGB-D perceptual data. We demonstrate that this specific approach transfers well to different languages, but also present possible design constraints to consider for grounded language learning systems intended for robots that will function in a variety of linguistic settings.more » « less
-
Grounded language acquisition is concerned with learning the meaning of language as it applies to the physical world. As robots become more capable and ubiquitous, there is an increasing need for non-specialists to interact with and control them, and natural language is an intuitive, flexible, and customizable mechanism for such communication. At the same time, physically embodied agents offer a way to learn to understand natural language in the context of the world to which it refers. This paper gives an overview of the research area, selected recent advances, and some future directions and challenges that remain.more » « less
-
There has been substantial work in recent years on grounded language acquisition, in which a model is learned that relates linguistic constructs to the perceivable world. While powerful, this approach is frequently hindered by ambiguities and omissions found in natural language. One such omission is the lack of negative descriptions of objects. We describe an unsupervised system that learns visual classifiers associated with words, using semantic similarity to automatically choose negative examples from a corpus of perceptual and linguistic data. We evaluate the effectiveness of each stage as well as the system's performance on the overall learning task.more » « less
An official website of the United States government

Full Text Available