Abstract The enhancement hypothesis suggests that deaf individuals are more vigilant to visual emotional cues than hearing individuals. The present eye-tracking study examined ambient–focal visual attention when encoding affect from dynamically changing emotional facial expressions. Deaf (n = 17) and hearing (n = 17) individuals watched emotional facial expressions that in 10-s animations morphed from a neutral expression to one of happiness, sadness, or anger. The task was to recognize emotion as quickly as possible. Deaf participants tended to be faster than hearing participants in affect recognition, but the groups did not differ in accuracy. In general, happy faces were more accurately and more quickly recognized than faces expressing anger or sadness. Both groups demonstrated longer average fixation duration when recognizing happiness in comparison to anger and sadness. Deaf individuals directed their first fixations less often to the mouth region than the hearing group. During the last stages of emotion recognition, deaf participants exhibited more focal viewing of happy faces than negative faces. This pattern was not observed among hearing individuals. The analysis of visual gaze dynamics, switching between ambient and focal attention, was useful in studying the depth of cognitive processing of emotional information among deaf and hearing individuals.
Exploring the Social Influence of Virtual Humans Unintentionally Conveying Conflicting Emotions
The expression of human emotion is integral to social interaction, and in virtual reality it is increasingly common to develop virtual avatars that attempt to convey emotions by mimicking these visual and aural cues, i.e., the facial and vocal expressions. However, errors in (or the absence of) facial tracking can result in the rendering of incorrect facial expressions on these virtual avatars. For example, a virtual avatar may speak with a happy or unhappy vocal inflection while their facial expression remains otherwise neutral. In circumstances where there is conflict between the avatar's facial and vocal expressions, it is possible that users will incorrectly interpret the avatar's emotion, which may have unintended consequences in terms of social influence or in terms of the outcome of the interaction. In this paper, we present a human-subjects study (N = 22) aimed at understanding the impact of conflicting facial and vocal emotional expressions. Specifically, we explored three levels of emotional valence (unhappy, neutral, and happy) expressed in both visual (facial) and aural (vocal) forms. We also investigate three levels of head scale (down-scaled, accurate, and up-scaled) to evaluate whether head scale affects user interpretation of the conveyed emotion. We find significant effects of different multimodal expressions on happiness and trust perception, while no significant effect was observed for head scale. Evidence from our results suggests that facial expressions have a stronger impact than vocal expressions. Additionally, as the difference between the two expressions increases, the multimodal expression becomes less predictable. For example, for the happy-looking and happy-sounding multimodal expression, we expect and see high happiness ratings and high trust; however, if one of the two expressions changes, this mismatch makes the expression less predictable. We discuss the relationships, implications, and guidelines for social applications that aim to leverage multimodal social cues.
- Award ID(s): 1800961
- PAR ID: 10442473
- Date Published:
- Journal Name: 2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR)
- Page Range / eLocation ID: 571 to 580
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
This paper demonstrates the utility of ambient-focal attention and pupil dilation dynamics to describe visual processing of emotional facial expressions. Pupil dilation and focal eye movements reflect deeper cognitive processing and thus shed more light on the dynamics of emotional expression recognition. Socially anxious individuals (N = 24) and non-anxious controls (N = 24) were asked to recognize emotional facial expressions that gradually morphed from a neutral expression to one of happiness, sadness, or anger in 10-sec animations. Anxious cohorts exhibited more ambient face scanning than their non-anxious counterparts. We observed a positive relationship between focal fixations and pupil dilation, indicating deeper processing of viewed faces, but only by non-anxious participants, and only during the last phase of emotion recognition. Group differences in the dynamics of ambient-focal attention support the hypothesis of vigilance to emotional expression processing by socially anxious individuals. We discuss the results by referring to current literature on cognitive psychopathology.
-
In general, people tend to identify the emotions of others from their facial expressions; however, recent findings suggest that we may be more accurate when we hear someone’s voice than when we look only at their facial expression. The study reported in the paper examined whether these findings hold true for animated agents. A total of 37 subjects participated in the study: 19 males, 14 females, and 4 of non-specified gender. Subjects were asked to view 18 video stimuli; 9 clips featured a male agent and 9 clips a female agent. Each agent showed 3 different facial expressions (happy, angry, neutral), each one paired with 3 different voice lines spoken in 3 different tones (happy, angry, neutral). Hence, in some clips the agent’s tone of voice and facial expression were congruent, while in others they were not. Subjects answered questions regarding the emotion they believed the agent was feeling and rated the emotion intensity, typicality, and sincerity. Findings showed that emotion recognition rate and ratings of emotion intensity, typicality, and sincerity were highest when the agent’s face and voice were congruent. However, when the channels were incongruent, subjects identified the emotion more accurately from the agent’s facial expression than from the tone of voice.
-
Extended reality (XR) technologies, such as virtual reality (VR) and augmented reality (AR), provide users, their avatars, and embodied agents a shared platform to collaborate in a spatial context. Although traditional face-to-face communication is limited by users’ proximity, meaning that another human’s non-verbal embodied cues become more difficult to perceive the farther one is away from that person, researchers and practitioners have started to look into ways to accentuate or amplify such embodied cues and signals to counteract the effects of distance with XR technologies. In this article, we describe and evaluate the Big Head technique, in which a human’s head in VR/AR is scaled up relative to their distance from the observer as a mechanism for enhancing the visibility of non-verbal facial cues, such as facial expressions or eye gaze. To better understand and explore this technique, we present two complementary human-subject experiments in this article. In our first experiment, we conducted a VR study with a head-mounted display to understand the impact of increased or decreased head scales on participants’ ability to perceive facial expressions as well as their sense of comfort and feeling of “uncanniness” over distances of up to 10 m. We explored two different scaling methods and compared perceptual thresholds and user preferences. Our second experiment was performed in an outdoor AR environment with an optical see-through head-mounted display. Participants were asked to estimate facial expressions and eye gaze, and identify a virtual human over large distances of 30, 60, and 90 m. In both experiments, our results show significant differences in minimum, maximum, and ideal head scales for different distances and tasks related to perceiving faces, facial expressions, and eye gaze, and we also found that participants were more comfortable with slightly bigger heads at larger distances. We discuss our findings with respect to the technologies used, and we discuss implications and guidelines for practical applications that aim to leverage XR-enhanced facial cues.
-
Facial expressions of emotions by people with visual impairment and blindness via video conferencing
Many people, including those with visual impairment and blindness, take advantage of video conferencing tools to meet people. Video conferencing tools enable them to share facial expressions, which are considered one of the most important aspects of human communication. This study aims to advance knowledge of how those with visual impairment and blindness share their facial expressions of emotions virtually. This study invited a convenience sample of 28 adults with visual impairment and blindness to Zoom video conferencing. The participants were instructed to pose facial expressions of basic human emotions (anger, fear, disgust, happiness, surprise, neutrality, calmness, and sadness), which were video recorded. The facial expressions were analyzed using the Facial Action Coding System (FACS), which encodes the movement of specific facial muscles called Action Units (AUs). This study found that there was a particular set of AUs significantly engaged in expressing each emotion, except for sadness. Individual differences were also found in AUs influenced by the participants’ visual acuity levels and emotional characteristics such as valence and arousal levels. The research findings are anticipated to serve as the foundation of knowledge, contributing to developing emotion-sensing technologies for those with visual impairment and blindness.

