To understand the genuine emotions expressed by humans during social interactions, it is necessary to recognize the subtle changes on the face (micro-expressions) demonstrated by an individual. Facial micro-expressions are brief, rapid, spontaneous gestures and non-voluntary facial muscle movements beneath the skin. Therefore, it is a challenging task to classify facial micro-expressions. This paper presents an end-to-end novel three-stream graph attention network model to capture the subtle changes on the face and recognize micro-expressions (MEs) by exploiting the relationship between optical flow magnitude, optical flow direction, and the node locations features. A facial graph representational structure is used to extract the spatial and temporal information using three frames. The varying dynamic patch size of optical flow features is used to extract the local texture information across each landmark point. The network only utilizes the landmark points location features and optical flow information across these points and generates good results for the classification of MEs. A comprehensive evaluation of SAMM and the CASME II datasets demonstrates the high efficacy, efficiency, and generalizability of the proposed approach and achieves better results than the state-of-the-art methods.
more »
« less
Micro-Expression Classification based on Landmark Relations with Graph Attention Convolutional Network
Facial micro-expressions are brief, rapid, spontaneous gestures of the facial muscles that express an individual's genuine emotions. Because of their short duration and subtlety, detecting and classifying these micro-expressions by humans and machines is difficult. In this paper, a novel approach is proposed that exploits relationships between landmark points and the optical flow patch for the given landmark points. It consists of a two-stream graph attention convolutional network that extracts the relationships between the landmark points and local texture using an optical flow patch. A graph structure is built to draw out temporal information using the triplet of frames. One stream is for node feature location, and the other one is for a patch of optical-flow information. These two streams (node location stream and optical flow stream) are fused for classification. The results are shown on, CASME II and SAMM, publicly available datasets, for three classes and five classes of micro-expressions. The proposed approach outperforms the state-of-the-art methods for 3 and 5 categories of expressions.
more »
« less
- Award ID(s):
- 1911197
- PAR ID:
- 10279826
- Date Published:
- Journal Name:
- Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops
- Page Range / eLocation ID:
- 1511-1520
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Facial micro-expressions (MEs) refer to subtle, transient, and involuntary muscle movements expressing a per-son’s true feelings. This paper presents a novel two-stream relational edge-node graph attention network-based approach to classify MEs in a video by selecting the high-intensity frames and edge-node features that can provide valuable information about the relationship between nodes and structural information in a graph structure. The pa-per examines the impact of different edge-node features and their relationships on the graphs. The first step involves extracting high-intensity-emotion frames from the video using optical flow. Second, node feature embeddings are calculated using the node location coordinate features and the patch size information of the optical flow across each node location. Additionally, we obtain the global and local structural similarity score using the jaccard’s similarity score and radial basis function as the edge features. Third, a self-attention graph pooling layer helps to remove the nodes with lower attention scores based on the top-k selection. As the final step, the network employs a two-stream edge-node graph attention network that focuses on finding correlations among the edge and node features, such as landmark coordinates, optical flow, and global and local edge features. A three-frame graph structure is designed to obtain spatio-temporal information. For 3 and 5 expression classes, the results are compared for SMIC and CASME II databases.more » « less
-
This work is motivated by the need to automate the analysis of parent-infant interactions to better understand the existence of any potential behavioral patterns useful for the early diagnosis of autism spectrum disorder (ASD). It presents an approach for synthesizing the facial expression exchanges that occur during parent-infant interactions. This is accomplished by developing a novel approach that uses landmarks when synthesizing changing facial expressions. The proposed model consists of two components: (i) The first is a landmark converter that receives a set of facial landmarks and the target emotion as input and outputs a set of new landmarks transformed to match the emotion. (ii) The second component involves an image converter that takes in an input image, a target landmark and a target emotion and outputs a face transformed to match the input emotion. The inclusion of landmarks in the generation process proves useful in the generation of baby facial expressions; babies have somewhat different facial musculature and facial dynamics than adults. This paper presents a realistic-looking matrix of changing facial expressions sampled from a 2-D emotion continuum (valence and arousal) and displays successfully transferred facial expressions from real-life mother-infant dyads to novel ones.more » « less
-
IntroductionEarly and accurate diagnosis of autism spectrum disorder (ASD) is crucial for effective intervention, yet it remains a significant challenge due to its complexity and variability. Micro-expressions are rapid, involuntary facial movements indicative of underlying emotional states. It is unknown whether micro-expression can serve as a valid bio-marker for ASD diagnosis. MethodsThis study introduces a novel machine-learning (ML) framework that advances ASD diagnostics by focusing on facial micro-expressions. We applied cutting-edge algorithms to detect and analyze these micro-expressions from video data, aiming to identify distinctive patterns that could differentiate individuals with ASD from typically developing peers. Our computational approach included three key components: (1) micro-expression spotting using Shallow Optical Flow Three-stream CNN (SOFTNet), (2) feature extraction via Micron-BERT, and (3) classification with majority voting of three competing models (MLP, SVM, and ResNet). ResultsDespite the sophisticated methodology, the ML framework's ability to reliably identify ASD-specific patterns was limited by the quality of video data. This limitation raised concerns about the efficacy of using micro-expressions for ASD diagnostics and pointed to the necessity for enhanced video data quality. DiscussionOur research has provided a cautious evaluation of micro-expression diagnostic value, underscoring the need for advancements in behavioral imaging and multimodal AI technology to leverage the full capabilities of ML in an ASD-specific clinical context.more » « less
-
null (Ed.)Multiplex networks are complex graph structures in which a set of entities are connected to each other via multiple types of relations, each relation representing a distinct layer. Such graphs are used to investigate many complex biological, social, and technological systems. In this work, we present a novel semi-supervised approach for structure-aware representation learning on multiplex networks. Our approach relies on maximizing the mutual information between local node-wise patch representations and label correlated structure-aware global graph representations to model the nodes and cluster structures jointly. Specifically, it leverages a novel cluster-aware, node-contextualized global graph summary generation strategy for effective joint-modeling of node and cluster representations across the layers of a multiplex network. Empirically, we demonstrate that the proposed architecture outperforms state-of-the-art methods in a range of tasks: classification, clustering, visualization, and similarity search on seven real-world multiplex networks for various experiment settings.more » « less
An official website of the United States government

