

Title: Deep Learning-Based Emotion Recognition from Real-Time Videos
We introduce a novel framework for detecting emotional states from facial expressions, targeted at learning environments. Our framework is based on a deep convolutional neural network that classifies the emotions of people captured through a webcam. For the classification outcome we adopt Russell’s model of core affect, in which any particular emotion can be placed in one of four quadrants: pleasant-active, pleasant-inactive, unpleasant-active, and unpleasant-inactive. We gathered data from several datasets, normalized it, and used it to train the deep learning model. We use the fully connected layers of the VGG_S network, which was trained on manually labeled human facial expressions. We tested our application by splitting the data 80:20 and retraining the model. The overall test accuracy across all detected emotions was 66%. We have a working application that reports the user’s emotional state at about five frames per second on a standard laptop computer with a webcam. The emotional state detector will be integrated into an affective pedagogical agent system, where it will provide feedback to an intelligent animated educational tutor.
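The sketch below illustrates what such a real-time webcam pipeline could look like; it is not the authors' code. It assumes OpenCV for capture and face detection and PyTorch for inference, and the checkpoint file quadrant_cnn.pt is a hypothetical placeholder for a VGG-style network fine-tuned on labeled expression data; the preprocessing (input size, normalization, channel order) would need to match however that network was actually trained.

```python
# Minimal sketch of a real-time emotion-quadrant pipeline (illustrative, not the authors' code).
import cv2
import torch
import torch.nn.functional as F

QUADRANTS = ["pleasant-active", "pleasant-inactive",
             "unpleasant-active", "unpleasant-inactive"]

def classify_frame(frame, face_detector, model, device="cpu"):
    """Detect the largest face in a BGR frame and return its predicted quadrant."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])       # keep the largest detection
    face = cv2.resize(frame[y:y + h, x:x + w], (224, 224))   # VGG-style input size (assumed)
    tensor = torch.from_numpy(face).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():
        probs = F.softmax(model(tensor.to(device)), dim=1)
    return QUADRANTS[int(probs.argmax())]

if __name__ == "__main__":
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    model = torch.load("quadrant_cnn.pt", map_location="cpu")  # hypothetical checkpoint
    model.eval()
    cap = cv2.VideoCapture(0)                                  # default webcam
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        label = classify_frame(frame, detector, model)
        if label:
            print(label)  # this per-frame estimate would feed the pedagogical-agent system
```

Running the loop on every captured frame, as above, is roughly how a laptop webcam could sustain the reported few-frames-per-second reporting rate without any GPU-specific optimization.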
Award ID(s):
1821894
NSF-PAR ID:
10276161
Journal Name:
Kurosu M. (eds) Human-Computer Interaction. Multimodal and Natural Interaction. HCII 2020. Lecture Notes in Computer Science
Volume:
12182
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Affective studies provide essential insights for emotion recognition and tracking. In traditional open-loop structures, a lack of knowledge about the internal emotional state makes the system incapable of adjusting stimuli parameters and automatically responding to changes in the brain. To address this issue, we propose to use facial electromyogram measurements as biomarkers to infer the internal hidden brain state and use it as feedback to close the loop. In this research, we develop a systematic way to track and control emotional valence, which codes emotions as being pleasant or unpleasant. Hence, we conduct a simulation study by modeling and tracking the subject's emotional valence dynamics using state-space approaches. We employ Bayesian filtering to estimate the person-specific model parameters along with the hidden valence state, using continuous and binary features extracted from experimental electromyogram measurements. Moreover, we utilize a mixed-filter estimator to infer the hidden brain state in a real-time simulation environment. We close the loop with a fuzzy logic controller in two categories of regulation: inhibition and excitation. By designing a control action, we aim to automatically reflect any required adjustments within the simulation and reach the desired emotional state levels. Final results demonstrate that, by making use of physiological data, the proposed controller could effectively regulate the estimated valence state. Ultimately, we envision the outcomes of this research supporting alternative forms of self-therapy that use wearable machine-interface architectures capable of mitigating periods of pervasive emotions and maintaining daily well-being and welfare.
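As a minimal illustration of the state-space idea above, the sketch below simulates a hidden valence state with random-walk dynamics and tracks it from a single noisy continuous feature using a scalar Kalman filter. The paper's mixed filter additionally fuses binary EMG features and estimates person-specific parameters; this sketch keeps only the continuous branch, and every parameter value is an illustrative assumption rather than a fitted one.

```python
# Bayesian (Kalman) tracking of a simulated hidden valence state -- illustrative only.
import numpy as np

rng = np.random.default_rng(0)

# Assumed dynamics: x_t = a*x_{t-1} + w_t, w_t ~ N(0, q)
# Assumed continuous EMG feature: y_t = x_t + v_t, v_t ~ N(0, r)
a, q, r, T = 0.98, 0.01, 0.25, 200

# Simulate ground-truth valence and noisy observations
x_true = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x_true[t] = a * x_true[t - 1] + rng.normal(0, np.sqrt(q))
    y[t] = x_true[t] + rng.normal(0, np.sqrt(r))

# Kalman filter: predict, then update with each observed feature
x_hat, p = 0.0, 1.0
estimates = []
for t in range(T):
    x_pred = a * x_hat                  # predicted state
    p_pred = a * a * p + q              # predicted variance
    k = p_pred / (p_pred + r)           # Kalman gain
    x_hat = x_pred + k * (y[t] - x_pred)
    p = (1 - k) * p_pred
    estimates.append(x_hat)

rmse = np.sqrt(np.mean((np.array(estimates) - x_true) ** 2))
print(f"valence tracking RMSE: {rmse:.3f}")
```

In a closed-loop setup, the per-step estimate x_hat is what a downstream controller (a fuzzy logic controller in the work above) would act on to push the valence toward a desired level.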
  2. The overall goal of our research is to develop a system of intelligent multimodal affective pedagogical agents that are effective for different types of learners (Adamo et al., 2021). While most of the research on pedagogical agents tends to focus on the cognitive aspects of online learning and instruction, this project explores the less-studied role of affective (or emotional) factors. We aim to design believable animated agents that can convey realistic, natural emotions through speech, facial expressions, and body gestures, and that can react to the students’ detected emotional states with emotional intelligence. Within the context of this goal, the specific objective of the work reported in the paper was to examine the extent to which the agents’ facial micro-expressions affect students’ perception of the agents’ emotions and their naturalness. Micro-expressions are very brief facial expressions that occur when a person either deliberately or unconsciously conceals an emotion being felt (Ekman & Friesen, 1969). Our assumption is that if the animated agents display facial micro-expressions in addition to macro-expressions, they will convey higher expressive richness and naturalness to the viewer, as “the agents can possess two emotional streams, one based on interaction with the viewer and the other based on their own internal state, or situation” (Queiroz et al., 2014, p. 2). The work reported in the paper involved two studies with human subjects. The objectives of the first study were to examine whether people can recognize micro-expressions (in isolation) in animated agents, and whether there are differences in recognition based on the agent’s visual style (e.g., stylized versus realistic). The objectives of the second study were to investigate whether people can recognize the animated agents’ micro-expressions when integrated with macro-expressions; the extent to which the presence of micro- plus macro-expressions affects the perceived expressivity and naturalness of the animated agents; the extent to which exaggerating the micro-expressions (e.g., increasing the amplitude of the animated facial displacements) affects emotion recognition and perceived agent naturalness and emotional expressivity; and whether there are differences based on the agent’s design characteristics. In the first study, 15 participants watched eight micro-expression animations representing four different emotions (happy, sad, fear, surprised). Four animations featured a stylized agent and four a realistic agent. For each animation, subjects were asked to identify the agent’s emotion conveyed by the micro-expression. In the second study, 234 participants watched three sets of eight animation clips (24 clips in total, 12 clips per agent). Four animations for each agent featured the character performing macro-expressions only, four featured the character performing macro- plus micro-expressions without exaggeration, and four featured the agent performing macro- plus micro-expressions with exaggeration. Participants were asked to recognize the true emotion of the agent and rate the emotional expressivity and naturalness of the agent in each clip using a 5-point Likert scale. We have collected all the data and completed the statistical analysis. Findings and discussion, implications for research and practice, and suggestions for future work will be reported in the full paper.
References
Adamo, N., Benes, B., Mayer, R., Lei, X., Meyer, Z., & Lawson, A. (2021). Multimodal Affective Pedagogical Agents for Different Types of Learners. In: Russo D., Ahram T., Karwowski W., Di Bucchianico G., Taiar R. (eds) Intelligent Human Systems Integration 2021. IHSI 2021. Advances in Intelligent Systems and Computing, 1322. Springer, Cham. https://doi.org/10.1007/978-3-030-68017-6_33
Ekman, P., & Friesen, W. V. (1969, February). Nonverbal leakage and clues to deception. Psychiatry, 32(1), 88–106. https://doi.org/10.1080/00332747.1969.11023575
Queiroz, R. B., Musse, S. R., & Badler, N. I. (2014). Investigating macroexpressions and microexpressions in computer graphics animated faces. Presence, 23(2), 191–208. http://dx.doi.org/10.1162/
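As a purely illustrative sketch of how trial-level responses from the second study described in item 2 might be summarized (per-condition recognition accuracy and mean expressivity/naturalness ratings), the snippet below assumes a hypothetical CSV export; the file name and column names are assumptions, not the authors' actual data format or analysis.

```python
# Hypothetical summary of Study 2 responses by agent style and expression condition.
import pandas as pd

# Assumed columns: participant, agent_style ("stylized"/"realistic"),
# condition ("macro", "macro+micro", "macro+micro_exag"), true_emotion,
# chosen_emotion, expressivity (1-5), naturalness (1-5).
df = pd.read_csv("study2_responses.csv")          # hypothetical export
df["correct"] = df["true_emotion"] == df["chosen_emotion"]

summary = (df.groupby(["agent_style", "condition"])
             .agg(recognition_acc=("correct", "mean"),
                  expressivity=("expressivity", "mean"),
                  naturalness=("naturalness", "mean"))
             .round(2))
print(summary)
```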

     
  3. Abstract

    Effective interactions between humans and robots are vital to achieving shared tasks in collaborative processes. Robots can utilize diverse communication channels to interact with humans, such as hearing, speech, sight, touch, and learning. Our focus, amidst the various means of interactions between humans and robots, is on three emerging frontiers that significantly impact the future directions of human–robot interaction (HRI): (i) human–robot collaboration inspired by human–human collaboration, (ii) brain-computer interfaces, and (iii) emotional intelligent perception. First, we explore advanced techniques for human–robot collaboration, covering a range of methods from compliance and performance-based approaches to synergistic and learning-based strategies, including learning from demonstration, active learning, and learning from complex tasks. Then, we examine innovative uses of brain-computer interfaces for enhancing HRI, with a focus on applications in rehabilitation, communication, and brain state and emotion recognition. Finally, we investigate emotional intelligence in robotics, focusing on translating human emotions to robots via facial expressions, body gestures, and eye-tracking for fluid, natural interactions. Recent developments in these emerging frontiers and their impact on HRI are detailed and discussed. We highlight contemporary trends and emerging advancements in the field. Ultimately, this paper underscores the necessity of a multimodal approach in developing systems capable of adaptive behavior and effective interaction between humans and robots, thus offering a thorough understanding of the diverse modalities essential for maximizing the potential of HRI.

     
  4. Abstract

    The positivity principle states that people learn better from instructors who display positive emotions rather than negative emotions. In two experiments, students viewed a short video lecture on a statistics topic in which an instructor stood next to a series of slides as she lectured and then they took either an immediate test (Experiment 1) or a delayed test (Experiment 2). In a between-subjects design, students saw an instructor who used her voice, body movement, gesture, facial expression, and eye gaze to display one of four emotions while lecturing: happy (positive/active), content (positive/passive), frustrated (negative/active), or bored (negative/passive). First, learners were able to recognize the emotional tone of the instructor in an instructional video lecture, particularly by more strongly rating a positive instructor as displaying positive emotions and a negative instructor as displaying negative emotions (in Experiments 1 and 2). Second, concerning building a social connection during learning, learners rated a positive instructor as more likely to facilitate learning, more credible, and more engaging than a negative instructor (in Experiments 1 and 2). Third, concerning cognitive engagement during learning, learners reported paying more attention during learning for a positive instructor than a negative instructor (in Experiments 1 and 2). Finally, concerning learning outcome, learners who had a positive instructor scored higher than learners who had a negative instructor on a delayed posttest (Experiment 2) but not an immediate posttest (Experiment 1). Overall, there is evidence for the positivity principle and the cognitive-affective model of e-learning from which it is derived.

     
  5. Recognizing the affective state of children with autism spectrum disorder (ASD) in real-world settings poses challenges due to varying head poses, illumination levels, occlusion, and a lack of datasets annotated with emotions in in-the-wild scenarios. Understanding the emotional state of children with ASD is crucial for providing personalized interventions and support. Existing methods often rely on controlled lab environments, limiting their applicability to real-world scenarios. Hence, a framework that enables the recognition of affective states in children with ASD in uncontrolled settings is needed. This paper presents a framework for recognizing the affective state of children with ASD in an in-the-wild setting using heart rate (HR) information. More specifically, an algorithm is developed that can classify a participant’s emotion as positive, negative, or neutral by analyzing the heart rate signal acquired from a smartwatch. The heart rate data are obtained in real time using a smartwatch application while the child learns to code a robot and interacts with an avatar. The avatar assists the child in developing communication skills and programming the robot. This paper also presents a semi-automated annotation technique for the heart rate data based on facial expression recognition. The HR signal is analyzed to extract features that capture the emotional state of the child. Additionally, the performance of a raw-HR-signal-based emotion classification algorithm is compared with a classification approach based on features extracted from HR signals using the discrete wavelet transform (DWT). The experimental results demonstrate that the proposed method achieves performance comparable to state-of-the-art HR-based emotion recognition techniques, despite being conducted in an uncontrolled setting rather than a controlled lab environment. The framework presented in this paper contributes to real-world affect analysis of children with ASD using HR information. By enabling emotion recognition in uncontrolled settings, this approach has the potential to improve the monitoring and understanding of the emotional well-being of children with ASD in their daily lives.
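The sketch below illustrates the DWT-feature branch described above: decompose a windowed heart-rate signal with a discrete wavelet transform, summarize each sub-band, and feed the features to a standard classifier. The wavelet ('db4'), window length, classifier choice, and the synthetic placeholder data are illustrative assumptions, not the paper's exact settings or results.

```python
# DWT features from HR windows + a generic classifier (illustrative sketch only).
import numpy as np
import pywt                                      # PyWavelets
from sklearn.ensemble import RandomForestClassifier

def dwt_features(hr_window, wavelet="db4", level=3):
    """Return per-sub-band mean/std/energy features for one HR window."""
    coeffs = pywt.wavedec(hr_window, wavelet, level=level)
    feats = []
    for c in coeffs:                             # approximation + detail bands
        feats.extend([np.mean(c), np.std(c), np.sum(c ** 2) / len(c)])
    return np.array(feats)

# Placeholder data: 120 synthetic 64-sample HR windows with dummy labels
# (0 = negative, 1 = neutral, 2 = positive); real labels would come from annotation.
rng = np.random.default_rng(0)
windows = rng.normal(75, 5, size=(120, 64))
labels = rng.integers(0, 3, size=120)
X = np.vstack([dwt_features(w) for w in windows])

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[:90], labels[:90])
print("held-out accuracy:", clf.score(X[90:], labels[90:]))
```

The raw-signal baseline mentioned in the abstract would simply pass the HR window itself (rather than dwt_features of it) to the classifier, which is what the feature-based variant is compared against.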

     