Title: Children and adults produce distinct technology- and human-directed speech
Abstract: This study compares how English-speaking adults and children from the United States adapt their speech when talking to a real person and a smart speaker (Amazon Alexa) in a psycholinguistic experiment. Overall, participants produced more effortful speech when talking to a device (longer duration and higher pitch). These differences also varied by age: children produced even higher pitch in device-directed speech, suggesting a stronger expectation to be misunderstood by the system. In support of this, we see that after a staged recognition error by the device, children increased pitch even more. Furthermore, both adults and children displayed the same degree of variation in their responses for whether “Alexa seems like a real person or not”, further indicating that children’s conceptualization of the system’s competence shaped their register adjustments, rather than an increased anthropomorphism response. This work speaks to models on the mechanisms underlying speech production, and human–computer interaction frameworks, providing support for routinized theories of spoken interaction with technology.
Award ID(s):
1911855
PAR ID:
10521304
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Scientific Reports
Volume:
14
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper investigates users’ speech rate adjustments during conversations with an Amazon Alexa socialbot in response to situational (in-lab vs. at-home) and communicative (ASR comprehension errors) factors. We collected user interaction studies and measured speech rate at each turn in the conversation and in baseline productions (collected prior to the interaction). Overall, we find that users slow their speech rate when talking to the bot, relative to their pre-interaction productions, consistent with hyperarticulation. Speakers use an even slower speech rate in the in-lab setting (relative to at-home). We also see evidence for turn-level entrainment: the user follows the directionality of Alexa’s changes in rate in the immediately preceding turn. Yet, we do not see differences in hyperarticulation or entrainment in response to ASR errors, or on the basis of user ratings of the interaction. Overall, this work has implications for human-computer interaction and theories of linguistic adaptation and entrainment.
  2. Ruban, Nersisson (Ed.)
    Though it is often taken as a truism that communication contributes to organizational productivity, there are surprisingly few empirical studies documenting a relationship between observable interaction and productivity. This is because comprehensive, direct observation of communication in organizational settings is notoriously difficult. In this paper, we report a method for extracting network and speech characteristics data from audio recordings of participants talking with each other in real time. We use this method to analyze communication and productivity data from seventy-nine employees working within a software engineering organization who had their speech recorded during working hours for a period of approximately 3 years. From the speech data, we infer when any two individuals are talking to each other and use this information to construct a communication graph for the organization for each week. We use the spectral and temporal characteristics of the produced speech and the structure of the resultant communication graphs to predict the productivity of the group, as measured by the number of lines of code produced. The results indicate that the most important speech and network features for predicting productivity include those that measure the number of unique people interacting within the organization, the frequency of interactions, and the topology of the communication network. 
  3. Understanding people's attitudes towards robots and how those attitudes are affected by exposure to robots is essential to the effective design and development of social robots. Although researchers have been studying attitudes towards robots among adults and even children for more than a decade, little work has assessed attitudes among teens, a highly vulnerable population that presents unique opportunities and challenges for social robots. Our work aims to close this gap. In this paper we present findings from several participatory robot interaction and design sessions with 136 teenagers who completed a modified version of the Negative Attitudes Towards Robots Scale (NARS) before participation in a robot interaction. Our data reveal that most teens 1) are highly optimistic about the helpfulness of robots and 2) do not feel nervous talking with a robot, but also 3) do not trust a robot with their data. Ninety teens also completed a post-interaction survey and reported a significant change in the emotional attitudes subscale of the NARS. We discuss the implications of our findings on the design of social robots for teens.
  4. Children use popular web search tools, which are generally designed for adult users. Because children have different developmental needs than adults, these tools may not always adequately support their search for information. Moreover, even though search tools offer support to help in query formulation, these too are aimed at adults and may hinder children rather than help them. This calls for the examination of existing technologies in this area, to better understand what remains to be done when it comes to facilitating query-formulation tasks for young users. In this paper, we investigate interaction elements of query formulation, including query suggestion algorithms, for children. The primary goals of our research efforts are to: (i) examine existing plug-ins and interfaces that explicitly aid children's query formulation; (ii) investigate children's interactions with suggestions offered by a general-purpose query suggestion strategy vs. a counterpart designed with children in mind; and (iii) identify, via participatory design sessions, children's preferences for tools and strategies that can help them find information and guide them through the query formulation process. Our analysis shows that existing tools do not meet children's needs and expectations; the outcomes of our work can guide researchers and developers as they implement query formulation strategies for children.
  5. More and more, humans are engaging with voice-activated artificially intelligent (voice-AI) systems that have names (e.g., Alexa), apparent genders, and even emotional expression; they are in many ways a growing ‘social’ presence. But to what extent do people display sociolinguistic attitudes, developed from human-human interaction, toward these disembodied text-to-speech (TTS) voices? And how might they vary based on the cognitive traits of the individual user? The current study addresses these questions, testing native English speakers’ judgments for 6 traits (intelligent, likeable, attractive, professional, human-like, and age) for a naturally-produced female human voice and the US-English default Amazon Alexa voice. Following exposure to the voices, participants completed these ratings for each speaker, as well as the Autism Quotient (AQ) survey, to assess individual differences in cognitive processing style. Results show differences in individuals’ ratings of the likeability and human-likeness of the human and AI talkers based on AQ score. Results suggest that humans transfer social assessment of human voices to voice-AI, but that the way they do so is mediated by their own cognitive characteristics.