Title: The Power of Voice to Convey Emotion in Multimedia Instructional Messages
This study examines an aspect of the role of emotion in multimedia learning, i.e., whether participants can recognize the instructor’s positive or negative emotion based on hearing short clips of only the instructor’s voice just as well as when also seeing an embodied onscreen agent. Participants viewed 16 short video clips from a statistics lecture in which an animated instructor, conveying a happy, content, frustrated, or bored emotion, stands next to a slide as she lectures (agent present) or uses only her voice (agent absent). For each clip, participants rated the instructor on five-point scales for how happy, content, frustrated, and bored the instructor seemed. First, for happy, content, and bored instructors, participants were just as accurate in rating emotional tone based on voice only as with voice plus onscreen agent. This supports the voice hypothesis, which posits that voice is a powerful source of social-emotional information. Second, participants rated happy and content instructors higher on happy and content scales and rated frustrated and bored instructors higher on frustrated and bored scales. This supports the positivity hypothesis, which posits that people are particularly sensitive to the positive or negative tone of multimedia instructional messages.
Award ID(s):
1821894
NSF-PAR ID:
10341385
Journal Name:
International journal of artificial intelligence in education
ISSN:
1560-4292
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    The positivity principle states that people learn better from instructors who display positive emotions rather than negative emotions. In two experiments, students viewed a short video lecture on a statistics topic in which an instructor stood next to a series of slides as she lectured and then they took either an immediate test (Experiment 1) or a delayed test (Experiment 2). In a between-subjects design, students saw an instructor who used her voice, body movement, gesture, facial expression, and eye gaze to display one of four emotions while lecturing: happy (positive/active), content (positive/passive), frustrated (negative/active), or bored (negative/passive). First, learners were able to recognize the emotional tone of the instructor in an instructional video lecture, particularly by more strongly rating a positive instructor as displaying positive emotions and a negative instructor as displaying negative emotions (in Experiments 1 and 2). Second, concerning building a social connection during learning, learners rated a positive instructor as more likely to facilitate learning, more credible, and more engaging than a negative instructor (in Experiments 1 and 2). Third, concerning cognitive engagement during learning, learners reported paying more attention during learning for a positive instructor than a negative instructor (in Experiments 1 and 2). Finally, concerning learning outcome, learners who had a positive instructor scored higher than learners who had a negative instructor on a delayed posttest (Experiment 2) but not an immediate posttest (Experiment 1). Overall, there is evidence for the positivity principle and the cognitive-affective model of e-learning from which it is derived.

  2. In general, people tend to identify the emotions of others from their facial expressions; however, recent findings suggest that we may be more accurate when we hear someone’s voice than when we look only at their facial expression. The study reported in the paper examined whether these findings hold true for animated agents. A total of 37 subjects participated in the study: 19 males, 14 females, and 4 of unspecified gender. Subjects were asked to view 18 video stimuli; 9 clips featured a male agent and 9 clips a female agent. Each agent showed 3 different facial expressions (happy, angry, neutral), each one paired with 3 different voice lines spoken in 3 different tones (happy, angry, neutral). Hence, in some clips the agent’s tone of voice and facial expression were congruent, while in others they were not. Subjects answered questions regarding the emotion they believed the agent was feeling and rated the emotion’s intensity, typicality, and sincerity. Findings showed that the emotion recognition rate and ratings of emotion intensity, typicality, and sincerity were highest when the agent’s face and voice were congruent. However, when the channels were incongruent, subjects identified the emotion more accurately from the agent’s facial expression than from the tone of voice.
  3. This study examined how well people can recognize and relate to animated pedagogical agents of varying ethnicities/races and genders. For both Study 1 (realistic-style agents) and Study 2 (cartoon-style agents), participants viewed brief video clips of virtual agents of varying racial/ethnic categories and gender types and then identified their race/ethnicity and gender and rated how human-like and likable the agent appeared. Participants were highly accurate in identifying Black and White agents but were less accurate for Asian, Indian, and Hispanic agents. Participants were accurate in recognizing gender differences. Participants rated all types of agents as moderately human-like, except for White agents. Likability ratings were lowest for White and male agents. The same pattern of results was obtained across two independent studies with different participants and different onscreen agents, which indicates that the results are not solely due to one specific set of agents. Consistent with the Media Equation Hypothesis and the Alliance Hypothesis, this work shows that people are sensitive to the race/ethnicity and gender of onscreen agents and relate to them differently. These findings have implications for how to design animated pedagogical agents for improved multimedia learning environments in the future and serve as a crucial first step in highlighting the possibility and feasibility of incorporating diverse onscreen virtual agents into educational computer software.

  4. The goal of this research is to identify key affective body gestures that can clearly convey four emotions, namely happy, content, bored, and frustrated, in animated characters that lack facial features. Two studies were conducted: a first to identify affective body gestures from a series of videos, and a second to validate the gestures as representative of the four emotions. Videos were created using motion capture data of four actors portraying the four targeted emotions and mapping the data to two 3D character models, one male and one female. In the first study the researchers identified body gestures that are commonly produced by individuals when they experience each of the four emotions. In the second study the researchers tested four sets of identified body gestures, one set for each emotion. The animated gestures were mapped to the 3D character models, and 91 participants were asked to identify the emotional state conveyed by the characters through the body gestures. The study identified six gestures that were shown to have an acceptable recognition rate of at least 80% for three of the four emotions tested. Contentment was the only emotion that was not conveyed clearly by the identified body gestures. The gender of the character had a significant effect on recognition rates across all emotions.
  5. The overall goal of our research is to develop a system of intelligent multimodal affective pedagogical agents that are effective for different types of learners (Adamo et al., 2021). While most of the research on pedagogical agents tends to focus on the cognitive aspects of online learning and instruction, this project explores the less-studied role of affective (or emotional) factors. We aim to design believable animated agents that can convey realistic, natural emotions through speech, facial expressions, and body gestures and that can react to the students’ detected emotional states with emotional intelligence. Within the context of this goal, the specific objective of the work reported in the paper was to examine the extent to which the agents’ facial micro-expressions affect students’ perception of the agents’ emotions and their naturalness. Micro-expressions are very brief facial expressions that occur when a person either deliberately or unconsciously conceals an emotion being felt (Ekman & Friesen, 1969). Our assumption is that if the animated agents display facial micro-expressions in addition to macro-expressions, they will convey higher expressive richness and naturalness to the viewer, as “the agents can possess two emotional streams, one based on interaction with the viewer and the other based on their own internal state, or situation” (Queiroz et al., 2014, p. 2). The work reported in the paper involved two studies with human subjects. The objectives of the first study were to examine whether people can recognize micro-expressions (in isolation) in animated agents, and whether there are differences in recognition based on the agent’s visual style (e.g., stylized versus realistic).
The objectives of the second study were to investigate whether people can recognize the animated agents’ micro-expressions when integrated with macro-expressions; the extent to which the presence of macro- + micro-expressions affects the perceived expressivity and naturalness of the animated agents; the extent to which exaggerating the micro-expressions (e.g., increasing the amplitude of the animated facial displacements) affects emotion recognition and perceived agent naturalness and emotional expressivity; and whether there are differences based on the agent’s design characteristics. In the first study, 15 participants watched eight micro-expression animations representing four different emotions (happy, sad, fear, surprised). Four animations featured a stylized agent and four a realistic agent. For each animation, subjects were asked to identify the agent’s emotion conveyed by the micro-expression. In the second study, 234 participants watched three sets of eight animation clips (24 clips in total, 12 clips per agent). Four animations for each agent featured the character performing macro-expressions only, four featured the character performing macro- + micro-expressions without exaggeration, and four featured the agent performing macro- + micro-expressions with exaggeration. Participants were asked to recognize the true emotion of the agent and rate the emotional expressivity and naturalness of the agent in each clip using a 5-point Likert scale. We have collected all the data and completed the statistical analysis. Findings and discussion, implications for research and practice, and suggestions for future work will be reported in the full paper.

References
Adamo, N., Benes, B., Mayer, R., Lei, X., Meyer, Z., & Lawson, A. (2021). Multimodal affective pedagogical agents for different types of learners. In: Russo, D., Ahram, T., Karwowski, W., Di Bucchianico, G., & Taiar, R. (Eds.), Intelligent Human Systems Integration 2021. IHSI 2021. Advances in Intelligent Systems and Computing, vol. 1322. Springer, Cham. https://doi.org/10.1007/978-3-030-68017-6_33
Ekman, P., & Friesen, W. V. (1969, February). Nonverbal leakage and clues to deception. Psychiatry, 32(1), 88–106. https://doi.org/10.1080/00332747.1969.11023575
Queiroz, R. B., Musse, S. R., & Badler, N. I. (2014). Investigating macroexpressions and microexpressions in computer graphics animated faces. Presence, 23(2), 191–208. http://dx.doi.org/10.1162/