Title: A Look "Inside" Children's Real-time Processing of Spatial Prepositions
A wealth of evidence indicates that children use their developing linguistic knowledge to incrementally interpret speech and predict upcoming reference to objects. For verbs, determiners, case-markers, and adjectives, hearing linguistic information that sufficiently constrains referent choice leads to anticipatory eye-movements. There is, however, limited evidence about whether children also use spatial prepositions predictively. This is surprising and theoretically important: spatial prepositions provide abstract semantic information that must interface with spatial properties of, and relations between, objects in the world. Making this connection may develop late because of the complex mapping required. In a visual-world eye-tracking task, we find that adults and 4-year-olds hearing 'inside' (but not 'near') look predictively to objects that afford the property of containment. We conclude that children make predictions about the geometric properties of objects from spatial terms that specify these properties, suggesting real-time use of language to guide analysis of objects in the visual world.
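
As a rough illustration of the anticipatory-look analysis a visual-world study of this kind involves, the Python sketch below computes, per trial, the proportion of looks directed to the containment-affording object in a window after the preposition is heard but before the noun disambiguates. The column names, time window, and toy data are assumptions for illustration, not the paper's actual coding or analysis.

    import pandas as pd

    # Hypothetical gaze samples, one row per eye-tracking sample. The column
    # names (subject, condition, trial, time_ms, aoi) and the toy values are
    # placeholders, not the paper's actual coding scheme.
    gaze = pd.DataFrame({
        "subject":   ["s1", "s1", "s1", "s2", "s2", "s2"],
        "condition": ["inside", "inside", "near", "inside", "near", "near"],
        "trial":     [1, 1, 2, 1, 2, 2],
        "time_ms":   [250, 600, 400, 300, 500, 700],  # ms from preposition onset
        "aoi":       ["container", "container", "other", "container", "other", "container"],
    })

    # Anticipatory window: after the preposition is heard, before the noun can
    # disambiguate. The 200-800 ms bounds are illustrative, not the study's window.
    window = gaze[(gaze.time_ms >= 200) & (gaze.time_ms <= 800)]

    # Per-trial proportion of looks to the containment-affording object; the
    # prediction account implies more such looks after 'inside' than after 'near'.
    props = (
        window.assign(to_container=window.aoi.eq("container"))
              .groupby(["subject", "condition", "trial"])["to_container"]
              .mean()
              .reset_index()
    )
    print(props.groupby("condition")["to_container"].mean())
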
Award ID(s): 2313939
PAR ID: 10534393
Publisher / Repository: Proceedings of the Annual Meeting of the Cognitive Science Society
Volume: 46
ISSN: 1069-7977
Subject(s) / Keyword(s): Spatial prepositions; Sentence processing; Language development; Eye-tracking
Sponsoring Org: National Science Foundation
More Like this
  1. The goal of this article is to enable robots to perform robust task execution following human instructions in partially observable environments. A robot’s ability to interpret and execute commands is fundamentally tied to its semantic world knowledge. Commonly, robots use exteroceptive sensors, such as cameras or LiDAR, to detect entities in the workspace and infer their visual properties and spatial relationships. However, semantic world properties are often visually imperceptible. We posit the use of non-exteroceptive modalities including physical proprioception, factual descriptions, and domain knowledge as mechanisms for inferring semantic properties of objects. We introduce a probabilistic model that fuses linguistic knowledge with visual and haptic observations into a cumulative belief over latent world attributes to infer the meaning of instructions and execute the instructed tasks in a manner robust to erroneous, noisy, or contradictory evidence. In addition, we provide a method that allows the robot to communicate knowledge dissonance back to the human as a means of correcting errors in the operator’s world model. Finally, we propose an efficient framework that anticipates possible linguistic interactions and infers the associated groundings for the current world state, thereby bootstrapping both language understanding and generation. We present experiments on manipulators for tasks that require inference over partially observed semantic properties, and evaluate our framework’s ability to exploit expressed information and knowledge bases to facilitate convergence, and generate statements to correct declared facts that were observed to be inconsistent with the robot’s estimate of object properties. 
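
To make the evidence-fusion idea concrete, here is a minimal Python sketch of the generic Bayesian pattern such a model builds on: linguistic, haptic, and visual observations each update a belief over a latent attribute, so contradictory evidence shifts the posterior rather than breaking the estimate. The attribute, likelihood values, and update order are illustrative assumptions, not the probabilistic model the article describes.

    # Minimal sketch of fusing independent evidence sources into a belief over a
    # latent binary attribute (e.g., "this container is full"). The likelihood
    # values below are illustrative placeholders, not the paper's learned models.

    def update(prior, p_obs_if_true, p_obs_if_false):
        """One Bayesian update of P(attribute = True) given an observation."""
        numerator = p_obs_if_true * prior
        denominator = numerator + p_obs_if_false * (1.0 - prior)
        return numerator / denominator

    belief = 0.5                         # uninformative prior over the attribute
    belief = update(belief, 0.90, 0.20)  # operator's factual description supports it
    belief = update(belief, 0.70, 0.40)  # haptic cue (e.g., lifting felt heavy) supports it
    belief = update(belief, 0.30, 0.60)  # a contradictory visual cue merely lowers the belief
    print(f"P(attribute = True) = {belief:.2f}")

Because each update multiplies in a likelihood, any single erroneous or noisy observation is tempered by the accumulated evidence from the other modalities.
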
  2. We investigate the roles of linguistic and sensory experience in the early-produced visual, auditory, and abstract words of congenitally blind toddlers, deaf toddlers, and typically sighted/hearing peers. We also assess the role of language access by comparing early word production in children learning English or American Sign Language (ASL) from birth, versus at a delay. Using parental report data on child word production from the MacArthur-Bates Communicative Development Inventory, we found evidence that while children produced words referring to imperceptible referents before age 2, such words were less likely to be produced relative to words with perceptible referents. For instance, blind (vs. sighted) children said fewer highly visual words like “blue” or “see”; deaf signing (vs. hearing) children produced fewer auditory signs like HEAR. Additionally, in spoken English and ASL, children who received delayed language access were less likely to produce words overall. These results demonstrate and begin to quantify how linguistic and sensory access may influence which words young children produce.
  3. Objects and places are foundational spatial domains represented in human symbolic expressions, like drawings, which show a prioritization of depicting small-scale object-shape information over the large-scale navigable place information in which objects are situated. Is there a similar object-over-place bias in language? Across six experiments, adults and 3- to 4-year-old children were asked either to extend a novel noun in a labeling phrase, to extend a novel noun in a prepositional phrase, or to simply match pictures. To dissociate specific object and place information from more general figure and ground information, participants either saw scenes with both place information (a room) and object information (a block in the room), or scenes with two kinds of object information that matched the figure-ground relations of the room and block by presenting an open container with a smaller block inside. While adults showed a specific object-over-place bias in both extending novel noun labels and matching, they did not show this bias in extending novel nouns following prepositions. Young children showed this bias in extending novel noun labels only. Spatial domains may thus confer specific and foundational biases for word learning that may change through development in a way that is similar to that of other word-learning biases about objects, like the shape bias. These results expand the symbolic scope of prior studies on object biases in drawing to object biases in language, and they expand the spatial domains of prior studies characterizing the language of objects and places.
  4. Purpose: This study examined whether 2-year-olds are better able to acquire novel verb meanings when they appear in varying linguistic contexts, including both content nouns and pronouns, as compared to when the contexts are consistent, including only content nouns. Additionally, differences between typically developing toddlers and late talkers were explored. Method: Forty-seven English-acquiring 2-year-olds (n = 14 late talkers, n = 33 typically developing) saw scenes of actors manipulating objects. These actions were labeled with novel verbs. In the varied condition, children heard sentences containing both content nouns and pronouns (e.g., “The girl is ziffing the truck. She is ziffing it!”). In the consistent condition, children heard the verb an equal number of times, but only with content nouns (e.g., “The girl is ziffing the truck. The girl is ziffing the truck!”). At test, children were shown two new scenes and were asked to find the novel verb's referent. Children's eye gaze was analyzed as a measure of learning. Results: Mixed-effects regression analyses revealed that children looked more toward the correct scene in the consistent condition than the varied condition. This difference was more pronounced for late talkers than for typically developing children. Conclusion: To acquire an initial representation of a new verb's meaning, children, particularly late talkers, benefit more from hearing the verb in consistent linguistic contexts than in varying contexts.
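
A minimal sketch of the kind of mixed-effects analysis reported there, using simulated data and statsmodels: looking proportions modeled with a condition-by-group interaction and a random intercept per child. The column names, simulated effect sizes, and formula are assumptions for illustration, not the authors' data or exact model.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Simulated trial-level data standing in for the real dataset: proportion of
    # looks to the correct scene, by condition (consistent vs. varied) and group
    # (late talker vs. typically developing). All values are illustrative.
    rng = np.random.default_rng(1)
    rows = []
    for subj in range(40):
        group = "late_talker" if subj < 14 else "typical"
        for cond in ["consistent", "varied"]:
            base = 0.60 if cond == "consistent" else 0.52
            if group == "late_talker" and cond == "varied":
                base -= 0.05  # larger condition effect for late talkers
            for _ in range(4):
                rows.append({
                    "subject": f"s{subj}",
                    "group": group,
                    "condition": cond,
                    "prop_correct": float(np.clip(rng.normal(base, 0.10), 0, 1)),
                })
    df = pd.DataFrame(rows)

    # Condition-by-group interaction on looking proportions, with a random
    # intercept per child.
    model = smf.mixedlm("prop_correct ~ condition * group", data=df, groups=df["subject"])
    print(model.fit().summary())
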
  5. Some animals, including humans, use stereoscopic vision, which reconstructs spatial information about the environment from the disparity between images captured by eyes in two separate, adjacent locations. Like other sensory information, such stereoscopic information is expected to influence attentional selection. We develop a biologically plausible model of binocular vision to study its effect on bottom-up visual attention, i.e., visual saliency. In our model, the scene is organized in terms of proto-objects on which attention acts, rather than on unbound sets of elementary features. We show that taking the stereoscopic information into account improves the model's prediction of human eye movements, with statistically significant differences.
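
As a toy illustration of combining disparity with bottom-up saliency, the sketch below modulates a simple center-surround contrast map by a normalized nearness map derived from disparity. It is a simplified stand-in for the idea, not the proto-object model the abstract describes.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    # Toy stand-in for folding disparity into bottom-up saliency: a center-surround
    # contrast map from image intensity, modulated by a normalized disparity map so
    # that nearer regions gain salience. Parameter values and the weighting scheme
    # are illustrative assumptions.

    def saliency_with_disparity(intensity, disparity, w_disp=0.5):
        """intensity, disparity: 2-D float arrays of the same shape."""
        center = gaussian_filter(intensity, sigma=2)
        surround = gaussian_filter(intensity, sigma=8)
        contrast = np.abs(center - surround)                  # 2-D feature contrast
        near = (disparity - disparity.min()) / (np.ptp(disparity) + 1e-8)
        combined = (1 - w_disp) * contrast + w_disp * contrast * near
        return combined / (combined.max() + 1e-8)

    rng = np.random.default_rng(0)
    sal = saliency_with_disparity(rng.random((64, 64)), rng.random((64, 64)))
    print(sal.shape, float(sal.max()))
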