A Spoken Language Dataset of Descriptions for Speech-Based Grounded Language Learning
Grounded language acquisition is a major area of research combining aspects of natural language processing, computer vision, and signal processing, compounded by domain issues requiring sample efficiency and other deployment constraints. In this work, we present a multimodal dataset of RGB+depth objects with spoken as well as textual descriptions. We analyze the differences between the two types of descriptive language and our experiments demonstrate that the different modalities affect learning. This will enable researchers studying the intersection of robotics, NLP, and HCI to better investigate how the multiple modalities of image, depth, text, speech, and transcription interact, as well as how differences in the vernacular of these modalities impact results.
- PAR ID: 10382787
- Date Published:
- Journal Name: Advances in neural information processing systems
- ISSN: 1049-5258
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Preference tuning is a crucial process for aligning deep generative models with human preferences. This survey offers a thorough overview of recent advancements in preference tuning and the integration of human feedback. The paper is organized into three main sections: 1) introduction and preliminaries: an introduction to reinforcement learning frameworks, preference tuning tasks, models, and datasets across various modalities: language, speech, and vision, as well as different policy approaches, 2) in-depth exploration of each preference tuning approach: a detailed analysis of the methods used in preference tuning, and 3) applications, discussion, and future directions: an exploration of the applications of preference tuning in downstream tasks, including evaluation methods for different modalities, and an outlook on future research directions. Our objective is to present the latest methodologies in preference tuning and model alignment, enhancing the understanding of this field for researchers and practitioners. We hope to encourage further engagement and innovation in this area. Additionally, we provide a GitHub link: https://github.com/hanyang1999/Preference-Tuning-with-Human-Feedback.
- A goal of early research on language processing was to characterize what is universal about language. Much of the past research focused on native speakers because the native language has been considered as providing privileged truths about acquisition, comprehension, and production. Populations or circumstances that deviated from these idealized norms were of interest but not regarded as essential to our understanding of language. In the past two decades, there has been a marked change in our understanding of how variation in language experience may inform the central and enduring questions about language. There is now evidence for significant plasticity in language learning beyond early childhood, and variation in language experience has been shown to influence both language learning and processing. In this paper, we feature what we take to be the most exciting recent new discoveries suggesting that variation in language experience provides a lens into the linguistic, cognitive, and neural mechanisms that enable language processing.
- The study of how bilingualism is linked to cognitive processing, including executive functioning, has historically focused on comparing bilinguals to monolinguals across a range of tasks. These group comparisons presume to capture relatively stable cognitive traits and have revealed important insights about the architecture of the language processing system that could not have been gleaned from studying monolinguals alone. However, there are drawbacks to using a group-comparison, or Traits, approach. In this theoretical review, we outline some limitations of treating executive functions as stable traits and of treating bilinguals as a uniform group when compared to monolinguals. To build on what we have learned from group comparisons, we advocate for an emerging complementary approach to the question of cognition and bilingualism. Using an approach that compares bilinguals to themselves under different linguistic or cognitive contexts allows researchers to ask questions about how language and cognitive processes interact based on dynamically fluctuating cognitive and neural states. A States approach, which has already been used by bilingualism researchers, allows for cause-and-effect hypotheses and shifts our focus from questions of group differences to questions of how varied linguistic environments influence cognitive operations in the moment and how fluctuations in cognitive engagement impact language processing.

