

Title: Infant AFAR: Automated facial action recognition in infants
Abstract

Automated detection of facial action units in infants is challenging. Infant faces have different proportions, less texture, fewer wrinkles and furrows, and unique facial actions relative to adults. For these and related reasons, action unit (AU) detectors that are trained on adult faces may generalize poorly to infant faces. To train and test AU detectors for infant faces, we trained convolutional neural networks (CNNs) on adult video databases and fine-tuned these networks on two large, manually annotated infant video databases that differ in context, head pose, illumination, video resolution, and infant age. AUs were those central to the expression of positive and negative emotion. AU detectors trained in infants greatly outperformed ones trained previously in adults. Training AU detectors across infant databases afforded greater robustness to between-database differences than did training database-specific AU detectors and outperformed the previous state of the art in infant AU detection. The resulting AU detection system, which we refer to as Infant AFAR (Automated Facial Action Recognition), is available to the research community for further testing and applications in infant emotion, social interaction, and related topics.
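The transfer-learning recipe the abstract describes (pretrain a CNN on adult AU data, then fine-tune on infant video) can be summarized in a short sketch. The following is a minimal, hypothetical PyTorch version, not the authors' implementation; the backbone, the number of AUs, the frozen layers, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the fine-tuning strategy described above: a CNN
# pretrained elsewhere is adapted for multi-label AU detection on
# infant face crops. All specifics here are assumptions.
import torch
import torch.nn as nn
from torchvision import models

NUM_AUS = 9  # number of AUs central to positive/negative affect (assumption)

# Stand-in for a network trained on adult AU databases; here we
# start from an ImageNet backbone instead.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_AUS)  # multi-label head

# Freeze early layers; fine-tune only the last block and the head.
for name, param in backbone.named_parameters():
    if not (name.startswith("layer4") or name.startswith("fc")):
        param.requires_grad = False

criterion = nn.BCEWithLogitsLoss()  # one sigmoid unit per AU
optimizer = torch.optim.Adam(
    (p for p in backbone.parameters() if p.requires_grad), lr=1e-4
)

def fine_tune_step(frames: torch.Tensor, au_labels: torch.Tensor) -> float:
    """frames: (B, 3, 224, 224) face crops; au_labels: (B, NUM_AUS) in {0, 1}."""
    optimizer.zero_grad()
    logits = backbone(frames)
    loss = criterion(logits, au_labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because each AU is an independent binary target, a sigmoid-per-AU head with binary cross-entropy is the natural loss for this kind of multi-label detection, which is why the sketch uses it.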

 
NSF-PAR ID: 10366906
Publisher / Repository: Springer Science + Business Media
Journal Name: Behavior Research Methods
Volume: 55
Issue: 3
ISSN: 1554-3528
Pages: 1024-1035
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    Although still-face effects are well-studied, little is known about the degree to which the Face-to-Face/Still-Face (FFSF) procedure is associated with the production of intense affective displays. Duchenne smiling expresses more intense positive affect than non-Duchenne smiling, while Duchenne cry-faces express more intense negative affect than non-Duchenne cry-faces. Forty 4-month-old infants and their mothers completed the FFSF, and key affect-indexing facial Action Units (AUs) were coded by expert Facial Action Coding System coders for the first 30 s of each FFSF episode. Computer vision software, automated facial affect recognition (AFAR), identified AUs for the entire 2-min episodes. Expert coding and AFAR produced similar infant and mother Duchenne and non-Duchenne FFSF effects, highlighting the convergent validity of automated measurement. Substantive AFAR analyses indicated that both infant Duchenne and non-Duchenne smiling declined from the face-to-face (FF) to the still-face (SF) episode, but only Duchenne smiling increased from the SF to the reunion (RE) episode. In similar fashion, the magnitude of mother Duchenne smiling changes over the FFSF was 2–4 times greater than that of non-Duchenne smiling changes. Duchenne expressions appear to be a sensitive index of intense infant and mother affective valence that is accessible to automated measurement and may be a target for future FFSF research.
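    As a concrete illustration, Duchenne and non-Duchenne displays can be derived from frame-level AU occurrence codes such as those AFAR outputs. The sketch below follows common FACS-based practice (AU12 indexing smiles, AU20 indexing cry-faces, AU6 co-occurrence marking the Duchenne variant); the AU mapping and data layout are assumptions, not the paper's pipeline.

    ```python
    # Hypothetical sketch: label each video frame from binary AU codes.
    import pandas as pd

    def classify_expressions(au_frames: pd.DataFrame) -> pd.Series:
        """au_frames: one row per frame, binary columns 'AU6', 'AU12', 'AU20'."""
        smile = au_frames["AU12"] == 1      # lip-corner puller
        cry_face = au_frames["AU20"] == 1   # lip stretcher
        duchenne = au_frames["AU6"] == 1    # cheek raiser (orbicularis oculi)
        labels = pd.Series("neutral/other", index=au_frames.index)
        labels[smile & duchenne] = "Duchenne smile"
        labels[smile & ~duchenne] = "non-Duchenne smile"
        # If smile and cry-face AUs co-occur, the cry-face label wins
        # (a simplification for this sketch).
        labels[cry_face & duchenne] = "Duchenne cry-face"
        labels[cry_face & ~duchenne] = "non-Duchenne cry-face"
        return labels
    ```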

     
  2. Abstract

    Functional brain networks are assessed differently earlier versus later in development: infants are almost universally scanned asleep, whereas adults are typically scanned awake. Observed differences between infant and adult functional networks may thus reflect differing states of consciousness rather than or in addition to developmental changes. We explore this question by comparing functional networks in functional magnetic resonance imaging (fMRI) scans of infants during natural sleep and awake movie-watching. As a reference, we also scanned adults during awake rest and movie-watching. Whole-brain functional connectivity was more similar within the same state (sleep and movie in infants; rest and movie in adults) compared with across states. Indeed, a classifier trained on patterns of functional connectivity robustly decoded infant state and even generalized to adults; interestingly, a classifier trained on adult state did not generalize as well to infants. Moreover, overall similarity between infant and adult functional connectivity was modulated by adult state (stronger for movie than rest) but not infant state (same for sleep and movie). Nevertheless, the connections that drove this similarity, particularly in the frontoparietal control network, were modulated by infant state. In sum, infant functional connectivity differs between sleep and movie states, highlighting the value of awake fMRI for studying functional networks over development.
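    A minimal sketch of this kind of state-decoding analysis is given below, assuming parcel-wise fMRI time series and scikit-learn; the parcellation, arrays, and classifier choice are illustrative assumptions rather than the study's actual pipeline.

    ```python
    # Hypothetical sketch: vectorize whole-brain functional connectivity
    # and train a linear classifier to decode scan state.
    import numpy as np
    from sklearn.svm import LinearSVC

    def connectivity_features(timeseries: np.ndarray) -> np.ndarray:
        """timeseries: (n_timepoints, n_parcels) -> upper-triangle FC vector."""
        fc = np.corrcoef(timeseries.T)      # (n_parcels, n_parcels) correlations
        iu = np.triu_indices_from(fc, k=1)  # drop diagonal and duplicates
        return fc[iu]

    def train_state_decoder(X_infant: np.ndarray, y_infant: np.ndarray) -> LinearSVC:
        """X_infant: one FC vector per scan; y_infant: 0 = sleep, 1 = movie.
        The fitted decoder can then be evaluated on adult scans
        (0 = rest, 1 = movie) to test cross-age generalization."""
        clf = LinearSVC(C=1.0)
        clf.fit(X_infant, y_infant)
        return clf
    ```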

     
  3. Agents must monitor their partners' affective states continuously in order to understand and engage in social interactions. However, methods for evaluating affect recognition do not account for changes in classification performance that may occur during occlusions or transitions between affective states. This paper addresses temporal patterns in affect classification performance in the context of an infant-robot interaction, where infants’ affective states contribute to their ability to participate in a therapeutic leg movement activity. To support robustness to facial occlusions in video recordings, we trained infant affect recognition classifiers using both facial and body features. Next, we conducted an in-depth analysis of our best-performing models to evaluate how performance changed over time as the models encountered missing data and changing infant affect. During time windows when features were extracted with high confidence, a unimodal model trained on facial features achieved the same optimal performance as multimodal models trained on both facial and body features. However, multimodal models outperformed unimodal models when evaluated on the entire dataset. Additionally, model performance was weakest when predicting an affective state transition and improved after multiple predictions of the same affective state. These findings emphasize the benefits of incorporating body features in continuous affect recognition for infants. Our work highlights the importance of evaluating variability in model performance both over time and in the presence of missing data when applying affect recognition to social interactions. 
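    One simple way to make a per-frame affect model robust to facial occlusion, in the spirit of the face-plus-body approach described above, is confidence-gated fusion. This is a hypothetical sketch; the feature dimensions, confidence threshold, and masking strategy are assumptions rather than the paper's method.

    ```python
    # Hypothetical sketch: concatenate face and body features per frame,
    # zeroing face features when the tracker's confidence is low.
    import numpy as np

    FACE_DIM, BODY_DIM = 136, 50  # e.g., landmarks / pose keypoints (assumption)
    CONF_THRESHOLD = 0.8          # minimum face-tracker confidence (assumption)

    def fuse_frame(face_feats, face_conf: float, body_feats: np.ndarray) -> np.ndarray:
        """Return the fused feature vector for one frame."""
        if face_feats is None or face_conf < CONF_THRESHOLD:
            face_feats = np.zeros(FACE_DIM)  # treat an occluded face as missing
        return np.concatenate([face_feats, body_feats])
    ```

    Masking rather than dropping occluded frames keeps the input shape fixed, so the same downstream classifier can run continuously, which matters when evaluating performance over time as this paper does.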
  4. We devised and evaluated a multi-modal machine learning-based system to analyze videos of school classrooms for "positive climate" and "negative climate", which are two dimensions of the Classroom Assessment Scoring System (CLASS). School classrooms are highly cluttered audiovisual scenes containing many overlapping faces and voices. Due to the difficulty of labeling them (reliable coding requires weeks of training) and their sensitive nature (students and teachers may be in stressful or potentially embarrassing situations), CLASS-labeled classroom video datasets are scarce, and their labels are sparse (just a few labels per 15-minute video clip). Thus, the overarching challenge was how to harness modern deep perceptual architectures despite the paucity of labeled data. By training low-level CNN-based facial attribute detectors (facial expression & adult/child) as well as a direct audio-to-climate regressor, and by integrating low-level information over time using a Bi-LSTM, we constructed automated detectors of positive and negative classroom climate with accuracy (10-fold cross-validation Pearson correlation on 241 CLASS-labeled videos) of 0.40 and 0.51, respectively. These numbers are superior to what we obtained using shallower architectures. This work represents the first automated system designed to detect specific dimensions of the CLASS.
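    The described architecture (low-level per-frame detectors whose outputs are integrated over time by a Bi-LSTM feeding a climate regressor) might look roughly like the following PyTorch sketch; the feature size, pooling choice, and hyperparameters are illustrative assumptions, not the authors' implementation.

    ```python
    # Hypothetical sketch: a bidirectional LSTM integrates per-frame
    # facial-attribute and audio features and regresses a climate score.
    import torch
    import torch.nn as nn

    class ClimateRegressor(nn.Module):
        def __init__(self, feat_dim: int = 128, hidden: int = 64):
            super().__init__()
            self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                                  bidirectional=True)
            self.head = nn.Linear(2 * hidden, 1)  # scalar climate score

        def forward(self, frame_feats: torch.Tensor) -> torch.Tensor:
            """frame_feats: (batch, time, feat_dim) low-level per-frame features."""
            out, _ = self.bilstm(frame_feats)  # (batch, time, 2 * hidden)
            pooled = out.mean(dim=1)           # average over the clip
            return self.head(pooled).squeeze(-1)
    ```

    Temporal pooling over the whole clip matches the label granularity here: CLASS labels are sparse (a few per clip), so the model must map many frames to a single score.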
  5. Abstract

    An attention bias to threat has been linked to psychosocial outcomes across development, including anxiety (Pérez-Edgar, K., Bar-Haim, Y., McDermott, J. M., Chronis-Tuscano, A., Pine, D. S., & Fox, N. A. (2010). Attention biases to threat and behavioral inhibition in early childhood shape adolescent social withdrawal. Emotion (Washington, D.C.), 10(3), 349). Although some attention biases to threat are normative, it remains unclear how these biases diverge into maladaptive patterns of emotion processing for some infants. Here, we examined the relation between household stress, maternal anxiety, and attention bias to threat in a longitudinal sample of infants tested at 4, 8, and 12 months. Infants were presented with a passive viewing eye-tracking task in which angry, happy, or neutral facial configurations appeared in one of the four corners of a screen. We measured infants' latency to fixate each target image and collected measures of parental anxiety and daily hassles at each timepoint. Intensity of daily parenting hassles moderated patterns of attention bias to threat in infants over time. Infants exposed to heightened levels of parental hassles became slower to detect angry (but not happy) facial configurations compared with neutral faces between 4 and 12 months of age, regardless of parental anxiety. Our findings highlight the potential impact of the environment on the development of infants' early threat processing and the need to further investigate how early environmental factors shape the development of infant emotion processing.
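    For illustration only, latency-to-fixate bias scores of the kind described here are straightforward to compute from trial-level eye-tracking data; the column names and the bias definition (angry minus neutral latency) below are assumptions, not the study's analysis code.

    ```python
    # Hypothetical sketch: compute a threat bias score from trial data.
    import pandas as pd

    def threat_bias(trials: pd.DataFrame) -> float:
        """trials: one row per trial, with columns 'emotion'
        ('angry' / 'happy' / 'neutral') and 'latency_ms' (time from
        target onset to first fixation on the face)."""
        mean_latency = trials.groupby("emotion")["latency_ms"].mean()
        # Positive values = slower to fixate angry than neutral faces.
        return float(mean_latency["angry"] - mean_latency["neutral"])
    ```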

     