Abstract Teaching a new concept through gestures—hand movements that accompany speech—facilitates learning above‐and‐beyond instruction through speech alone (e.g., Singer & Goldin‐Meadow, 2005). However, the mechanisms underlying this phenomenon are still under investigation. Here, we use eye tracking to explore one often proposed mechanism—gesture's ability to direct visual attention. Behaviorally, we replicate previous findings: Children perform significantly better on a posttest after learning through Speech+Gesture instruction than through Speech Alone instruction. Using eye tracking measures, we show that children who watch a math lesson with gesture do allocate their visual attention differently from children who watch a math lesson without gesture—they look more to the problem being explained, less to the instructor, and are more likely to synchronize their visual attention with information presented in the instructor's speech (i.e., follow along with speech) than children who watch the no‐gesture lesson. The striking finding is that, even though these looking patterns positively predict learning outcomes, the patterns do not mediate the effects of training condition (Speech Alone vs. Speech+Gesture) on posttest success. We find instead a complex relation between gesture and visual attention in which gesture moderates the impact of visual looking patterns on learning—following along with speech predicts learning for children in the Speech+Gesture condition, but not for children in the Speech Alone condition. Gesture's beneficial effects on learning thus come not merely from its ability to guide visual attention, but also from its ability to synchronize with speech and affect what learners glean from that speech.
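The abstract's key contrast is between mediation (looking patterns carrying the effect of training condition on posttest performance) and moderation (training condition changing how strongly a looking pattern predicts posttest performance). The sketch below is not the authors' analysis; it uses simulated data and made-up variable names only to show how a moderation test is commonly expressed as a condition-by-predictor interaction term in a regression model.

```python
# Minimal sketch (simulated data, hypothetical variable names): a moderation
# model tests whether a looking-pattern measure predicts learning differently
# across conditions, via a condition x predictor interaction term.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 120

# condition: 0 = Speech Alone, 1 = Speech+Gesture
# follow_speech: proportion of looks synchronized with the instructor's speech
# posttest: score on the post-instruction test
condition = rng.integers(0, 2, n)
follow_speech = rng.uniform(0, 1, n)
posttest = 1.0 + 0.8 * condition + 1.5 * condition * follow_speech + rng.normal(0, 0.5, n)

df = pd.DataFrame({"condition": condition,
                   "follow_speech": follow_speech,
                   "posttest": posttest})

# The interaction term asks whether following along with speech predicts
# posttest performance more strongly in one condition than the other.
moderation = smf.ols("posttest ~ condition * follow_speech", data=df).fit()
print(moderation.summary().tables[1])
```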
There is more to gesture than meets the eye: Visual attention to gesture's referents cannot account for its facilitative effects during math instruction.
Teaching a new concept with gestures – hand movements that accompany speech – facilitates learning above-and-beyond instruction through speech alone (e.g., Singer & Goldin-Meadow, 2005). However, the mechanisms underlying this phenomenon are still being explored. Here, we use eye tracking to explore one mechanism – gesture's ability to direct visual attention. We examine how children allocate their visual attention during a mathematical equivalence lesson that either contains gesture or does not. We show that gesture instruction improves posttest performance, and additionally that gesture does change how children visually attend to instruction: children look more to the problem being explained, and less to the instructor. However, looking patterns alone cannot explain gesture's effect, as posttest performance is not predicted by any of our looking-time measures. These findings suggest that gesture does guide visual attention, but that attention alone cannot account for its facilitative learning effects.
- PAR ID: 10025977
- Date Published:
- Journal Name: Proceedings of the 37th Annual Meeting of the Cognitive Science Society
- Page Range / eLocation ID: 2141-2146
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
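The looking-time measures described in the abstracts above (looks to the problem being explained vs. to the instructor) are typically computed as proportions of dwell time within areas of interest. Below is a minimal, hypothetical sketch of that computation; the AOI labels, fixation records, and column names are illustrative assumptions, not the study's actual eye-tracking pipeline.

```python
# Minimal sketch (hypothetical data): proportion of looking time per area of
# interest (AOI), e.g. the problem on the board vs. the instructor, computed
# from per-fixation records.
import pandas as pd

# Each row is one fixation: which child, which AOI it landed on, and its duration (ms).
fixations = pd.DataFrame({
    "child":    ["c1", "c1", "c1", "c2", "c2", "c2"],
    "aoi":      ["problem", "instructor", "problem", "instructor", "problem", "instructor"],
    "duration": [420, 180, 350, 600, 200, 310],
})

# Total dwell time per child and AOI, normalized within each child so the
# measures are proportions of that child's total looking time.
dwell = fixations.groupby(["child", "aoi"])["duration"].sum()
proportions = dwell / dwell.groupby(level="child").transform("sum")
print(proportions)
```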
More Like this
-
Abstract Psycholinguistic research on children's early language environments has revealed many potential challenges for language acquisition. One is that in many cases, referents of linguistic expressions are hard to identify without prior knowledge of the language. Likewise, the speech signal itself varies substantially in clarity, with some productions being very clear, and others being phonetically reduced, even to the point of uninterpretability. In this study, we sought to better characterize the language‐learning environment of American English‐learning toddlers by testing how well phonetic clarity and referential clarity align in infant‐directed speech. Using an existing Human Simulation Paradigm (HSP) corpus with referential transparency measurements and adding new measures of phonetic clarity, we found that the phonetic clarity of words' first mentions significantly predicted referential clarity (how easy it was to guess the intended referent from visual information alone) at that moment. Thus, when parents' speech was especially clear, the referential semantics were also clearer. This suggests that young children could use the phonetics of speech to identify globally valuable instances that support better referential hypotheses, by homing in on clearer instances and filtering out less‐clear ones. Such multimodal "gems" offer special opportunities for early word learning.
Research Highlights:
- In parent‐infant interaction, parents' referential intentions are sometimes clear and sometimes unclear; likewise, parents' pronunciation is sometimes clear and sometimes quite difficult to understand.
- We find that clearer referential instances go along with clearer phonetic instances, more so than expected by chance.
- Thus, there are globally valuable instances ("gems") from which children could learn about words' pronunciations and words' meanings at the same time.
- Homing in on clear phonetic instances and filtering out less‐clear ones would help children identify these multimodal "gems" during word learning.
-
This study examined the immediate effects of mask-wearing on infant selective visual attention to audiovisual speech in familiar and unfamiliar languages. Infants distribute their selective attention to regions of a speaker's face differentially based on their age and language experience. However, the potential impact wearing a face mask may have on infants' selective attention to audiovisual speech has not been systematically studied. We utilized eye tracking to examine the proportion of infant looking time to the eyes and mouth of a masked or unmasked actress speaking in a familiar or unfamiliar language. Six-month-old and 12-month-old infants (n = 42, 55% female, 91% White Non-Hispanic/Latino) were shown videos of an actress speaking in a familiar language (English) with and without a mask on, as well as videos of the same actress speaking in an unfamiliar language (German) with and without a mask. Overall, infants spent more time looking at the unmasked presentations compared to the masked presentations. Regardless of language familiarity or age, infants spent more time looking at the mouth area of an unmasked speaker and they spent more time looking at the eyes of a masked speaker. These findings indicate mask-wearing has immediate effects on the distribution of infant selective attention to different areas of the face of a speaker during audiovisual speech.
-
Augmented reality (AR) headsets are being utilized in different task-based domains (e.g., healthcare, education) for both adults and children. However, prior work has mainly examined the applicability of AR headsets instead of how to design the visual information being displayed. It is essential to study how visual information should be presented in AR headsets to maximize task performance for both adults and children. Therefore, we conducted two studies (adults vs. children) analyzing distinct design combinations of critical and secondary textual information during a procedural assembly task. We found that while the design of information did not affect adults' task performance, the location of information had a direct effect on children's task performance. Our work contributes a new understanding of how to design textual information in AR headsets to aid adults' and children's task performance. In addition, we identify specific differences in how textual information should be designed for adults versus children.
-
Automatic speech recognition (ASR) systems for children have lagged behind in performance when compared to adult ASR. The exact problems and evaluation methods for child ASR have not yet been fully investigated. Recent work from the robotics community suggests that ASR for kindergarten speech is especially difficult, even though this age group may benefit most from voice-based educational and diagnostic tools. Our study focused on ASR performance for specific grade levels (K-10) using a word identification task. Grade-specific ASR systems were evaluated, with particular attention placed on the evaluation of kindergarten-aged children (5-6 years old). Experiments included investigation of grade-specific interactions with triphone models using feature space maximum likelihood linear regression (fMLLR), vocal tract length normalization (VTLN), and subglottal resonance (SGR) normalization. Our results indicate that kindergarten ASR performs dramatically worse than even 1st grade ASR, likely due to large speech variability at that age. As such, ASR systems may require targeted evaluations on kindergarten speech rather than being evaluated under the guise of "child ASR." Additionally, results show that systems trained in matched conditions on kindergarten speech may be less suitable than mismatched-grade training with 1st grade speech. Finally, we analyzed the phonetic errors made by the kindergarten ASR.
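The last abstract above evaluates grade-specific recognizers on a word identification task. As a rough illustration of that kind of per-grade scoring (not the study's actual setup), the sketch below computes word error rate from (reference, hypothesis) transcript pairs; the transcripts and grade labels are invented, and the acoustic-modeling steps named in the abstract (fMLLR, VTLN, SGR normalization) are outside its scope.

```python
# Minimal sketch (hypothetical transcripts): word error rate (WER) per grade
# level, computed from reference/hypothesis sentence pairs. Real hypotheses
# would come from the grade-specific ASR systems under evaluation.
from typing import List, Tuple

def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions + insertions + deletions)."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)]

def wer(pairs: List[Tuple[str, str]]) -> float:
    """WER over (reference, hypothesis) sentence pairs: total errors / total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    words = sum(len(r.split()) for r, _ in pairs)
    return errors / words

# Hypothetical per-grade results from a word identification task.
results = {
    "kindergarten": [("the cat sat", "the cat that"), ("red ball", "red all")],
    "grade_1":      [("the cat sat", "the cat sat"), ("red ball", "red ball")],
}
for grade, pairs in results.items():
    print(f"{grade}: WER = {wer(pairs):.2f}")
```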