

Title: Seeing is Believing: Improving the Perceived Trust in Visually Embodied Alexa in Augmented Reality
Voice-activated Intelligent Virtual Assistants (IVAs) such as Amazon Alexa offer a natural and realistic form of interaction that pursues the level of social interaction among real humans. The user experience with such technologies depends to a large degree on the perceived trust in and reliability of the IVA. In this poster, we explore the effects of a three-dimensional embodied representation of Amazon Alexa in Augmented Reality (AR) on the user’s perceived trust in her being able to control Internet of Things (IoT) devices in a smart home environment. We present a preliminary study and discuss the potential of positive effects in perceived trust due to the embodied representation compared to a voice-only condition.
Award ID(s):
1800961
NSF-PAR ID:
10105860
Journal Name:
17th IEEE International Symposium on Mixed and Augmented Reality
Page Range / eLocation ID:
204 to 205
Sponsoring Org:
National Science Foundation
More Like this
  1. Voice assistants embodied in smart speakers (e.g., Amazon Echo, Google Home) enable voice-based interaction that does not necessarily rely on expertise with mobile or desktop computing. Hence, these voice assistants offer new opportunities to different populations, including individuals who are not interested or able to use traditional computing devices such as computers and smartphones. To understand how older adults who use technology infrequently perceive and use these voice assistants, we conducted a 3-week field deployment of the Amazon Echo Dot in the homes of seven older adults. While some types of usage dropped over the 3-week period (e.g., playing music), we observed consistent usage for finding online information. Given that much of this information was health-related, this finding emphasizes the need to revisit concerns about credibility of information with this new interaction medium. Although features to support memory (e.g., setting timers, reminders) were initially perceived as useful, the actual usage was unexpectedly low due to reliability concerns. We discuss how these findings apply to other user groups along with design implications and recommendations for future work on voice-user interfaces. 
  2. Voice assistants embodied in smart speakers (e.g., Amazon Echo, Google Home) enable conversational interaction that does not necessarily rely on expertise with mobile or desktop computing. Hence, these voice assistants offer new opportunities to different populations, including individuals who are not interested or able to use traditional computing devices such as computers and smartphones. To understand how older adults who use technology infrequently perceive and use these voice assistants, we conducted a three-week field deployment of the Amazon Echo Dot in the homes of seven older adults. Participants described increased confidence using digital technology and found the conversational voice interfaces easy to use. While some types of usage dropped over the three-week period (e.g., playing music), we observed consistent usage for finding online information. Given that much of this information was health-related, this finding emphasizes the need to revisit concerns about credibility of information with this new interaction medium. Although features to support memory (e.g., setting timers, reminders) were initially perceived as useful, the actual usage was unexpectedly low due to reliability concerns. We discuss how these findings apply to other user groups along with design implications and recommendations for future work on voice user interfaces. 
  3. The present study compares how individuals perceive gradient acoustic realizations of emotion produced by a human voice versus an Amazon Alexa text-to-speech (TTS) voice. We manipulated semantically neutral sentences spoken by both talkers with identical emotional synthesis methods, using three levels of increasing ‘happiness’ (0%, 33%, 66% ‘happier’). On each trial, listeners (native speakers of American English, n=99) rated a given sentence on two scales to assess dimensions of emotion: valence (negative-positive) and arousal (calm-excited). Participants also rated the Alexa voice on several parameters to assess anthropomorphism (e.g., naturalness, human-likeness). Results showed that the emotion manipulations led to increases in perceived positive valence and excitement. Yet, the effect differed by interlocutor: increasing ‘happiness’ manipulations led to larger changes for the human voice than the Alexa voice. Additionally, we observed individual differences in perceived valence/arousal based on participants’ anthropomorphism scores. Overall, this line of research can speak to theories of computer personification and elucidate our changing relationship with voice-AI technology.
  4. Smart speakers such as Amazon Echo present promising opportunities for exploring voice interaction in the domain of in-home exercise tracking. In this work, we examine if and how voice interaction complements and augments a mobile app in promoting consistent exercise. We designed and developed TandemTrack, which combines a mobile app and an Alexa skill to support exercise regimen, data capture, feedback, and reminder. We then conducted a four-week between-subjects study deploying TandemTrack to 22 participants who were instructed to follow a short daily exercise regimen: one group used only the mobile app and the other group used both the app and the skill. We collected rich data on individuals' exercise adherence and performance, and their use of voice and visual interactions, while examining how TandemTrack as a whole influenced their exercise experience. Reflecting on these data, we discuss the benefits and challenges of incorporating voice interaction to assist daily exercise, and implications for designing effective multimodal systems to support self-tracking and promote consistent exercise.
  5. More and more, humans are engaging with voice-activated artificially intelligent (voice-AI) systems that have names (e.g., Alexa), apparent genders, and even emotional expression; they are in many ways a growing ‘social’ presence. But to what extent do people display sociolinguistic attitudes, developed from human-human interaction, toward these disembodied text-to-speech (TTS) voices? And how might they vary based on the cognitive traits of the individual user? The current study addresses these questions, testing native English speakers’ judgments for 6 traits (intelligent, likeable, attractive, professional, human-like, and age) for a naturally-produced female human voice and the US-English default Amazon Alexa voice. Following exposure to the voices, participants completed these ratings for each speaker, as well as the Autism Quotient (AQ) survey, to assess individual differences in cognitive processing style. Results show differences in individuals’ ratings of the likeability and human-likeness of the human and AI talkers based on AQ score. Results suggest that humans transfer social assessment of human voices to voice-AI, but that the way they do so is mediated by their own cognitive characteristics.