

Title: A Cautionary Note on Predicting Social Judgments from Faces with Deep Neural Networks
Abstract: People spontaneously infer other people’s psychology from faces, encompassing inferences of their affective states, cognitive states, and stable traits such as personality. Although these judgments are often invalid, they nonetheless bias many social decisions. Their importance and ubiquity have made them popular targets for automated prediction using deep convolutional neural networks (DCNNs). Here, we investigated the applicability of this approach: how well does it generalize, and what biases does it introduce? We compared three distinct sets of features (from a face identification DCNN, from an object recognition DCNN, and from facial geometry) and tested their predictions on multiple out-of-sample datasets. Across judgments and datasets, features from both pre-trained DCNNs provided better predictions than did facial geometry. However, predictions using object recognition DCNN features were not robust to superficial cues (e.g., color and hair style). Importantly, predictions using face identification DCNN features were not specific: models trained to predict one social judgment (e.g., trustworthiness) also significantly predicted other social judgments (e.g., femininity and criminality), in some cases with even higher accuracy than the judgment of interest itself. Models trained to predict affective states (e.g., happy) also significantly predicted judgments of stable traits (e.g., sociable), and vice versa. Our analysis pipeline not only provides a flexible and efficient framework for predicting affective and social judgments from faces but also highlights the dangers of such automated predictions: correlated but unintended judgments can drive the predictions of the intended judgments.
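To make the cautionary point concrete, the following minimal sketch illustrates the cross-judgment specificity test: a linear model is trained on face features to predict one judgment and then scored against every judgment. The feature matrix, judgment labels, and ratings are synthetic placeholders standing in for DCNN activations and human ratings; they are not the authors' data or models.

```python
# Minimal sketch of cross-judgment prediction, assuming pre-extracted
# face features; all arrays are synthetic placeholders.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_faces, n_dims = 300, 128
features = rng.normal(size=(n_faces, n_dims))        # stand-in for DCNN activations
judgments = ["trustworthy", "feminine", "criminal"]  # hypothetical judgment labels
ratings = {j: rng.normal(size=n_faces) for j in judgments}

# Train one ridge model per judgment, then score its out-of-fold
# predictions against every judgment to probe specificity.
for trained_on in judgments:
    model = RidgeCV(alphas=np.logspace(-3, 3, 13))
    preds = cross_val_predict(model, features, ratings[trained_on], cv=10)
    for tested_on in judgments:
        r = np.corrcoef(preds, ratings[tested_on])[0, 1]
        print(f"trained on {trained_on:>11} -> tested on {tested_on:>11}: r = {r:+.2f}")
```

In the paper's setting, high off-diagonal scores (training on one judgment, scoring well on another) are exactly the non-specificity the abstract warns about.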
Award ID(s):
1840756
NSF-PAR ID:
10295673
Author(s) / Creator(s):
Date Published:
Journal Name:
Affective Science
ISSN:
2662-2041
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Autism spectrum disorder (ASD) is characterized by difficulties in social processes, interactions, and communication, yet the neurocognitive bases underlying these difficulties are unclear. Here, we triangulated the ‘trans-diagnostic’ approach to personality, social trait judgments of faces, and neurophysiology to investigate (1) the relative position of autistic traits in a comprehensive social-affective personality space, and (2) the distinct associations between the social-affective personality dimensions and social trait judgment from faces in individuals with ASD and neurotypical individuals. We collected personality and facial judgment data from a large sample of online participants (N = 89 self-identified ASD; N = 307 neurotypical controls). Factor analysis with 33 subscales of 10 social-affective personality questionnaires identified a 4-dimensional personality space. This analysis revealed that ASD and control participants did not differ significantly along the personality dimensions of empathy and prosociality, antisociality, or social agreeableness. However, the ASD participants exhibited a weaker association between prosocial personality dimensions and judgments of facial trustworthiness and warmth than the control participants. Neurophysiological data likewise indicated a weaker association in ASD participants between personality and neuronal representations of trustworthiness and warmth from faces. These results suggest that the atypical association between social-affective personality and social trait judgment from faces may contribute to the social and affective difficulties associated with ASD.
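As a rough illustration of the analysis logic, the sketch below reduces questionnaire subscales to a 4-dimensional factor space and compares the personality-judgment association across groups. All arrays are synthetic placeholders, and treating the first factor as the "prosocial" dimension is a hypothetical stand-in for the paper's labeled dimensions.

```python
# Sketch: factor-analyze subscales into a 4-D personality space, then
# compare personality-judgment correlations across groups (synthetic data).
import numpy as np
from sklearn.decomposition import FactorAnalysis
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_subj, n_subscales = 396, 33
subscales = rng.normal(size=(n_subj, n_subscales))  # 33 questionnaire subscales
is_asd = np.zeros(n_subj, dtype=bool)
is_asd[:89] = True                                  # 89 ASD, 307 controls
trust_rating = rng.normal(size=n_subj)              # mean facial-trustworthiness judgment

scores = FactorAnalysis(n_components=4).fit_transform(subscales)

# Association between a (hypothetically prosocial) factor and judgments,
# computed separately per group.
for label, mask in [("ASD", is_asd), ("control", ~is_asd)]:
    r, p = pearsonr(scores[mask, 0], trust_rating[mask])
    print(f"{label:>7}: r = {r:+.2f} (p = {p:.3f})")
```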

  2. Recent neuroimaging evidence challenges the classical view that face identity and facial expression are processed by segregated neural pathways, showing that information about identity and expression is encoded within common brain regions. This article tests the hypothesis that integrated representations of identity and expression arise spontaneously within deep neural networks. A subset of the CelebA dataset is used to train a deep convolutional neural network (DCNN) to label face identity (chance = 0.06%, accuracy = 26.5%), and the FER2013 dataset is used to train a DCNN to label facial expression (chance = 14.2%, accuracy = 63.5%). The identity-trained and expression-trained networks each successfully transfer to labeling both face identity and facial expression on the Karolinska Directed Emotional Faces dataset. This study demonstrates that DCNNs trained to recognize face identity and DCNNs trained to recognize facial expression spontaneously develop representations of facial expression and face identity, respectively. Furthermore, a congruence coefficient analysis reveals that features distinguishing between identities and features distinguishing between expressions become increasingly orthogonal from layer to layer, suggesting that deep neural networks disentangle representational subspaces corresponding to different sources.
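The congruence coefficient analysis mentioned above can be sketched as follows. The two direction vectors are placeholders for, say, an identity-discriminating axis and an expression-discriminating axis extracted from the same layer; values near zero indicate near-orthogonal subspaces, the pattern the article reports emerging in deeper layers.

```python
# Sketch of Tucker's congruence coefficient between two feature directions
# from one DCNN layer (placeholder vectors, not the article's features).
import numpy as np

def congruence(x: np.ndarray, y: np.ndarray) -> float:
    """Tucker's congruence coefficient: an uncentered cosine similarity."""
    return float(np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y)))

rng = np.random.default_rng(2)
identity_axis = rng.normal(size=512)    # placeholder identity-discriminating direction
expression_axis = rng.normal(size=512)  # placeholder expression-discriminating direction
print(f"phi = {congruence(identity_axis, expression_axis):+.3f}")
```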
  3. First impressions make up an integral part of our interactions with other humans by providing an instantaneous judgment of the trustworthiness, dominance, and attractiveness of an individual prior to engaging in any other form of interaction. Unfortunately, this can lead to unintentional bias in situations with serious consequences, whether in judicial proceedings, career advancement, or politics. The ability to automatically recognize social traits presents a number of highly useful applications, from minimizing bias in social interactions to providing insight into how our own facial attributes are interpreted by others. However, while first impressions are well studied in the field of psychology, automated methods for predicting social traits are largely non-existent. In this work, we demonstrate the feasibility of two automated approaches, multi-label classification (MLC) and multi-output regression (MOR), for first impression recognition from faces. We show that both approaches predict social traits with better-than-chance accuracy, though significant room for improvement remains. We evaluate ethical concerns and detail application areas for future work in this direction.
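A minimal sketch contrasting the two approaches follows, using scikit-learn on placeholder face features: MLC treats binarized trait judgments as multiple labels, while MOR predicts continuous trait ratings jointly. Feature dimensions and trait names are illustrative assumptions, not the paper's setup.

```python
# Sketch: multi-label classification vs. multi-output regression for
# first-impression traits (synthetic features and ratings).
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 64))       # placeholder facial features
ratings = rng.normal(size=(500, 3))  # trustworthiness, dominance, attractiveness
labels = (ratings > 0).astype(int)   # binarized trait judgments for MLC

mlc = RandomForestClassifier(random_state=0).fit(X[:400], labels[:400])
mor = MultiOutputRegressor(RandomForestRegressor(random_state=0)).fit(X[:400], ratings[:400])
print("MLC subset accuracy:", round(mlc.score(X[400:], labels[400:]), 3))
print("MOR mean R^2:", round(mor.score(X[400:], ratings[400:]), 3))
```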
  4. Deep convolutional neural networks (DCNNs) trained for face identification can rival and even exceed human-level performance. The ways in which the internal face representations in DCNNs relate to human cognitive representations and brain activity are not well understood. Nearly all previous studies focused on static face image processing with rapid display times and ignored the processing of naturalistic, dynamic information. To address this gap, we developed the largest naturalistic dynamic face stimulus set in human neuroimaging research (700+ naturalistic video clips of unfamiliar faces). We used this naturalistic dataset to compare representational geometries estimated from DCNNs, behavioral responses, and brain responses. We found that DCNN representational geometries were consistent across architectures, cognitive representational geometries were consistent across raters in a behavioral arrangement task, and neural representational geometries in face areas were consistent across brains. Representational geometries in late, fully connected DCNN layers, which are optimized for individuation, were much more weakly correlated with cognitive and neural geometries than were geometries in late-intermediate layers. The late-intermediate face-DCNN layers successfully matched cognitive representational geometries, as measured with a behavioral arrangement task that primarily reflected categorical attributes, and correlated with neural representational geometries in known face-selective topographies. Our study suggests that current DCNNs successfully capture neural cognitive processes for categorical attributes of faces but less accurately capture individuation and dynamic features.
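The representational-geometry comparison described above is, at its core, representational similarity analysis: build a representational dissimilarity matrix (RDM) for each system over the same stimuli, then correlate the RDMs. The sketch below uses synthetic placeholders for the DCNN activations and brain responses.

```python
# Sketch of a representational-geometry comparison (RSA) between a DCNN
# layer and brain responses over shared stimuli (synthetic placeholders).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(4)
n_stimuli = 100
layer_acts = rng.normal(size=(n_stimuli, 256))  # placeholder DCNN layer activations
brain_resp = rng.normal(size=(n_stimuli, 80))   # placeholder voxel responses

layer_rdm = pdist(layer_acts, metric="correlation")  # 1 - r per stimulus pair
brain_rdm = pdist(brain_resp, metric="correlation")
rho, p = spearmanr(layer_rdm, brain_rdm)
print(f"geometry match: Spearman rho = {rho:+.3f} (p = {p:.3f})")
```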

  5. Agents must continuously monitor their partners' affective states to understand and engage in social interactions. However, methods for evaluating affect recognition do not account for changes in classification performance that may occur during occlusions or transitions between affective states. This paper addresses temporal patterns in affect classification performance in the context of an infant-robot interaction, where infants’ affective states contribute to their ability to participate in a therapeutic leg movement activity. To support robustness to facial occlusions in video recordings, we trained infant affect recognition classifiers using both facial and body features. Next, we conducted an in-depth analysis of our best-performing models to evaluate how performance changed over time as the models encountered missing data and changing infant affect. During time windows when features were extracted with high confidence, a unimodal model trained on facial features achieved the same optimal performance as multimodal models trained on both facial and body features. However, multimodal models outperformed unimodal models when evaluated on the entire dataset. Additionally, model performance was weakest when predicting an affective state transition and improved after multiple predictions of the same affective state. These findings emphasize the benefits of incorporating body features in continuous affect recognition for infants. Our work highlights the importance of evaluating variability in model performance both over time and in the presence of missing data when applying affect recognition to social interactions.
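The temporal evaluation idea can be sketched as follows: score frame-level predictions as a function of frames elapsed since the last affective-state transition. The label sequence and error process below are synthetic placeholders, not the paper's data.

```python
# Sketch: classification accuracy as a function of time since an
# affective-state transition (synthetic label and prediction sequences).
import numpy as np

rng = np.random.default_rng(5)
true_states = np.repeat(rng.integers(0, 3, size=20), 30)  # blocks of 3 affect classes
preds = true_states.copy()
noisy = rng.random(true_states.size) < 0.2                # relabel ~20% of frames
preds[noisy] = rng.integers(0, 3, size=noisy.sum())

# Count frames elapsed since the last state transition.
since = np.zeros(true_states.size, dtype=int)
for t in range(1, true_states.size):
    since[t] = 0 if true_states[t] != true_states[t - 1] else since[t - 1] + 1

for k in range(5):  # accuracy immediately after vs. later than a transition
    mask = since == k
    acc = (preds[mask] == true_states[mask]).mean()
    print(f"{k} frames after transition: acc = {acc:.2f}")
```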