Title: Coupled Systems for Modeling Rapport Between Interlocutors
This work explores machine learning techniques for recognizing rapport between two people engaged in a conversation, based on their facial expressions. First, using artificially generated pairs of correlated signals, a coupled gated recurrent unit (cGRU) neural network is developed to measure the similarity between the temporal evolution of paired time-series signals. Pairs of coupled sequences are generated with pre-selected covariance values (between 0.1 and 1.0), and the cGRU architecture successfully recovers this covariance. Using this and several other coupled architectures, tests for rapport (measured by the extent of mirroring and mimicking of behaviors) are conducted on real-life datasets. On fifty-nine (N = 59) pairs of interactants in an interview setting, a transformer-based coupled architecture performs best in determining the existence of rapport. To test generalization, the models were applied to previously unseen data collected 14 years earlier, again to predict the existence of rapport. The coupled transformer model again performed best on this transfer learning task, determining which pairs of interactants had rapport and which did not. The experiments and results demonstrate the advantages of coupled architectures for predicting an interactional process such as rapport, even with limited data.
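As a rough illustration of the synthetic-data step, the sketch below generates one pair of sequences with a pre-selected covariance and then estimates that covariance from the samples. The abstract does not describe how the authors generated their signals, so everything here is an assumption: the function names `make_coupled_pair` and `sample_covariance` are hypothetical, the signals are unit-variance Gaussian white noise (so covariance equals correlation), and the direct sample estimate stands in for what the cGRU learns to recover.

```python
import math
import random


def make_coupled_pair(cov, length, seed=None):
    """Return two unit-variance Gaussian sequences with expected covariance `cov`.

    Assumption: white-noise signals, not the generator used in the paper.
    """
    if not 0.0 <= cov <= 1.0:
        raise ValueError("cov must lie in [0, 1] for unit-variance signals")
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(length):
        z1 = rng.gauss(0.0, 1.0)  # shared source
        z2 = rng.gauss(0.0, 1.0)  # independent noise
        x.append(z1)
        # Mixing weights keep Var(y) = 1 and give Cov(x, y) = cov.
        y.append(cov * z1 + math.sqrt(1.0 - cov * cov) * z2)
    return x, y


def sample_covariance(x, y):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


if __name__ == "__main__":
    # Sweep the covariance range quoted in the abstract (0.1 to 1.0).
    for step in range(1, 11):
        target = step / 10.0
        x, y = make_coupled_pair(target, length=5000, seed=step)
        print(f"target={target:.1f} estimated={sample_covariance(x, y):.3f}")
```

With long sequences the estimate should land close to the target; shorter sequences give noisier estimates, which is one reason a learned model might be evaluated on many generated pairs rather than one.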
Journal Name:
2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021)
Sponsoring Org:
National Science Foundation
More Like this
  1. Background

    Oxford Nanopore long‐read sequencing technology addresses current limitations for DNA methylation detection that are inherent in short‐read bisulfite sequencing or methylation microarrays. A number of analytical tools, such as Nanopolish, Guppy/Tombo and DeepMod, have been developed to detect DNA methylation on Nanopore data. However, additional improvements can be made in computational efficiency, prediction accuracy, and contextual interpretation in complex genomic regions (such as repetitive regions and low GC density regions).

    In the current study, we apply the Transformer architecture to detect DNA methylation from the ionic signals in Oxford Nanopore sequencing data. The Transformer is a neural network architecture built on self‐attention and has been widely used in natural language processing.

    Compared to traditional deep‐learning methods such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), the Transformer may have specific advantages in DNA methylation detection, because the self‐attention mechanism can help detect relationships between bases that are far from each other and pay more attention to important bases that carry characteristic methylation‐specific signals within a specific sequence context.

    We demonstrated the ability of Transformers to detect methylation on ionic signal data.

  2. Abstract

    Spread‐F (SF) is a feature that can be visually observed on ionograms when the ionosonde signals are significantly impacted by plasma irregularities in the ionosphere. Depending on the scale of the plasma irregularities, radio waves of different frequencies are impacted differently when the signals pass through the ionosphere. An automated method for detecting SF in ionograms is presented in this study. Through detecting the existence of SF in ionograms, we can help identify instances of plasma irregularities that are potentially affecting the high‐frequency radio‐wave systems. The ionogram images from Jicamarca observatory in Peru, during the years 2008–2019, are used in this study. Three machine learning approaches have been carried out: supervised learning using Support Vector Machines, and two neural network‐based learning methods: autoencoder and transfer learning. Of these three methods, the transfer learning approach, which uses convolutional neural network architectures, demonstrates the best performance. The best existing architecture that is suitable for this problem appears to be the ResNet50. With respect to the training epoch number, the ResNet50 showed the greatest change in the metric values for the key metrics that we were tracking. Furthermore, on a test set of 2050 ionograms, the model based on the ResNet50 architecture provides an accuracy of 89%, recall of 87%, precision of 95%, as well as Area Under the Curve of 96%. The work also provides a labeled data set of around 28,000 ionograms, which is extremely useful for the community for future machine learning studies.

  3. Abstract

    We compare different neural network architectures for machine learning algorithms designed to identify the neutrino interaction vertex position in the MINERvA detector. The architectures developed and optimized by hand are compared with architectures developed in an automated way using the package “Multi-node Evolutionary Neural Networks for Deep Learning” (MENNDL), developed at Oak Ridge National Laboratory. While the domain-expert hand-tuned network was the best performer, the differences were negligible and the auto-generated networks performed as well. There is always a trade-off between human and computer resources for network optimization, and this work suggests that automated optimization, assuming resources are available, provides a compelling way to save significant expert time.
  4. Abstract

    Sleep staging is a key challenge in diagnosing and treating sleep-related diseases because it is labor-intensive, time-consuming, costly, and error-prone. With the availability of large-scale sleep signal data, many deep learning methods have been proposed for automatic sleep staging. However, these existing methods face several challenges, including the heterogeneity of patients’ underlying health conditions and the difficulty of modeling complex interactions between sleep stages. In this paper, we propose a neural network architecture named DREAM to tackle these issues for automatic sleep staging. DREAM consists of (i) a feature representation network that generates robust representations for sleep signals via the variational auto-encoder framework and contrastive learning and (ii) a sleep stage classification network that explicitly models the interactions between sleep stages in the sequential context at both feature representation and label classification levels via Transformer and conditional random field architectures. Our experimental results indicate that DREAM significantly outperforms existing methods for automatic sleep staging on three sleep signal datasets.
  5. Aliannejadi, M ; Faggioli, G ; Ferro, N ; Vlachos, M. (Ed.)
    This work discusses the participation of CS_Morgan in the Concept Detection and Caption Prediction tasks of the ImageCLEFmedical 2023 Caption benchmark evaluation campaign. The goal of these tasks is to automatically identify relevant concepts and their locations in images, as well as generate coherent captions for the images. The dataset used is a subset of the extended Radiology Objects in Context (ROCO) dataset. Our implementation used pre-trained Convolutional Neural Networks (CNNs), Vision Transformer (ViT), and Text-to-Text Transfer Transformer (T5) architectures to handle the different aspects of the tasks, such as concept detection and caption generation. In the Concept Detection task, the objective was to classify multiple concepts associated with each image. We used several deep learning architectures with ‘sigmoid’ activation to enable multilabel classification using the Keras framework. We submitted a total of five (5) runs for this task, and the best run achieved an F1 score of 0.4834. For the Caption Prediction task, we successfully submitted eight (8) runs. Our approach combined the ViT and T5 models to generate captions for the images. The ranking for this task is based on BERTScore, and our best run achieved a score of 0.5819 by generating captions with the fine-tuned T5 model from keywords produced by the pretrained ViT used as the encoder.
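Item 5 mentions ‘sigmoid’ activation for multilabel classification and reports an F1 score. The sketch below shows the general idea in plain Python rather than Keras: each concept gets an independent sigmoid probability, a threshold turns it into a yes/no label, and F1 compares the predicted label set with the true one. The function names, the 0.5 threshold, and the choice to score a single image are assumptions made for illustration; the abstract does not state the threshold used or how the benchmark averages F1 across images.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_labels(logits, threshold=0.5):
    """Turn per-concept logits into independent 0/1 decisions.

    Assumption: a fixed 0.5 threshold; a real system may tune this.
    """
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def f1_score(true_labels, pred_labels):
    """F1 between two 0/1 label vectors for one image."""
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention chosen here: score 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    logits = [2.0, -1.0, 0.3, -4.0]  # made-up scores for four concepts
    truth = [1, 0, 0, 1]
    predicted = predict_labels(logits)
    print(predicted, f1_score(truth, predicted))
```

Unlike softmax, the sigmoid outputs do not compete with each other, so any number of concepts can be switched on for the same image.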