Positive interpersonal relationships require shared understanding along with a sense of rapport. A key facet of rapport is mirroring and convergence of facial expression and body language, known as nonverbal synchrony. We examined nonverbal synchrony in a study of 29 heterosexual romantic couples, in which audio, video, and bracelet accelerometer were recorded during three conversations. We extracted facial expression, body movement, and acoustic-prosodic features to train neural network models that predicted the nonverbal behaviors of one partner from those of the other. Recurrent models (LSTMs) outperformed feed-forward neural networks and other chance baselines. The models learned behaviors encompassing facial responses, speech-related facial movements, and head movement. However, they did not capture fleeting or periodic behaviors, such as nodding, head turning, and hand gestures. Notably, a preliminary analysis of clinical measures showed greater association with our model outputs than correlation of raw signals. We discuss potential uses of these generative models as a research tool to complement current analytical methods along with real-world applications (e.g., as a tool in therapy).
more »
« less
Facial expressions contribute more than body movements to conversational outcomes in avatar-mediated virtual environments
Abstract This study focuses on the individual and joint contributions of two nonverbal channels (i.e., face and upper body) in avatar mediated-virtual environments. 140 dyads were randomly assigned to communicate with each other via platforms that differentially activated or deactivated facial and bodily nonverbal cues. The availability of facial expressions had a positive effect on interpersonal outcomes. More specifically, dyads that were able to see their partner’s facial movements mapped onto their avatars liked each other more, formed more accurate impressions about their partners, and described their interaction experiences more positively compared to those unable to see facial movements. However, the latter was only true when their partner’s bodily gestures were also available and not when only facial movements were available. Dyads showed greater nonverbal synchrony when they could see their partner’s bodily and facial movements. This study also employed machine learning to explore whether nonverbal cues could predict interpersonal attraction. These classifiers predicted high and low interpersonal attraction at an accuracy rate of 65%. These findings highlight the relative significance of facial cues compared to bodily cues on interpersonal outcomes in virtual environments and lend insight into the potential of automatically tracked nonverbal cues to predict interpersonal attitudes.
more »
« less
- Award ID(s):
- 1800922
- PAR ID:
- 10203191
- Publisher / Repository:
- Nature Publishing Group
- Date Published:
- Journal Name:
- Scientific Reports
- Volume:
- 10
- Issue:
- 1
- ISSN:
- 2045-2322
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
null (Ed.)We show that the human voice has complex acoustic qualities that are directly coupled to peripheral musculoskeletal tensioning of the body, such as subtle wrist movements. In this study, human vocalizers produced a steady-state vocalization while rhythmically moving the wrist or the arm at different tempos. Although listeners could only hear and not see the vocalizer, they were able to completely synchronize their own rhythmic wrist or arm movement with the movement of the vocalizer which they perceived in the voice acoustics. This study corroborates recent evidence suggesting that the human voice is constrained by bodily tensioning affecting the respiratory–vocal system. The current results show that the human voice contains a bodily imprint that is directly informative for the interpersonal perception of another’s dynamic physical states.more » « less
-
Displaying emotional states is an important part of nonverbal communication that can facilitate successful interactions. Facial expressions have been studied for their emotional expression, but this work looks at the capacity of body movements to convey different emotions. This work first generates a large set of nonverbal behaviors with a variety of torso and arm properties on a humanoid robot, Quori. Participants in a user study evaluated how much each movement displayed each of eight different emotions. Results indicate that specific movement properties are associated with particular emotions; such as leaning backward and arms held high displaying surprise and leaning forward displaying sadness. Understanding the emotions associated with certain movements can allow for the design of more appropriate behaviors during interactions with humans and could improve people’s perception of the robot.more » « less
-
In a future of pervasive augmented reality (AR), AR systems will need to be able to efficiently draw or guide the attention of the user to visual points of interest in their physical-virtual environment. Since AR imagery is overlaid on top of the user's view of their physical environment, these attention guidance techniques must not only compete with other virtual imagery, but also with distracting or attention-grabbing features in the user's physical environment. Because of the wide range of physical-virtual environments that pervasive AR users will find themselves in, it is difficult to design visual cues that “pop out” to the user without performing a visual analysis of the user's environment, and changing the appearance of the cue to stand out from its surroundings. In this paper, we present an initial investigation into the potential uses of dichoptic visual cues for optical see-through AR displays, specifically cues that involve having a difference in hue, saturation, or value between the user's eyes. These types of cues have been shown to be preattentively processed by the user when presented on other stereoscopic displays, and may also be an effective method of drawing user attention on optical see-through AR displays. We present two user studies: one that evaluates the saliency of dichoptic visual cues on optical see-through displays, and one that evaluates their subjective qualities. Our results suggest that hue-based dichoptic cues or “Forbidden Colors” may be particularly effective for these purposes, achieving significantly lower error rates in a pop out task compared to value-based and saturation-based cues.more » « less
-
Extended reality (XR) technologies, such as virtual reality (VR) and augmented reality (AR), provide users, their avatars, and embodied agents a shared platform to collaborate in a spatial context. Although traditional face-to-face communication is limited by users’ proximity, meaning that another human’s non-verbal embodied cues become more difficult to perceive the farther one is away from that person, researchers and practitioners have started to look into ways to accentuate or amplify such embodied cues and signals to counteract the effects of distance with XR technologies. In this article, we describe and evaluate the Big Head technique, in which a human’s head in VR/AR is scaled up relative to their distance from the observer as a mechanism for enhancing the visibility of non-verbal facial cues, such as facial expressions or eye gaze. To better understand and explore this technique, we present two complimentary human-subject experiments in this article. In our first experiment, we conducted a VR study with a head-mounted display to understand the impact of increased or decreased head scales on participants’ ability to perceive facial expressions as well as their sense of comfort and feeling of “uncannniness” over distances of up to 10 m. We explored two different scaling methods and compared perceptual thresholds and user preferences. Our second experiment was performed in an outdoor AR environment with an optical see-through head-mounted display. Participants were asked to estimate facial expressions and eye gaze, and identify a virtual human over large distances of 30, 60, and 90 m. In both experiments, our results show significant differences in minimum, maximum, and ideal head scales for different distances and tasks related to perceiving faces, facial expressions, and eye gaze, and we also found that participants were more comfortable with slightly bigger heads at larger distances. We discuss our findings with respect to the technologies used, and we discuss implications and guidelines for practical applications that aim to leverage XR-enhanced facial cues.more » « less
An official website of the United States government
