NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Disentangling audio content and emotion with adaptive instance normalization for expressive facial animation synthesis

https://doi.org/10.1002/cav.2076

Chang, Che‐Jui; Zhao, Long; Zhang, Sen; Kapadia, Mubbasir (July 2022, Computer Animation and Virtual Worlds)

Abstract 3D facial animation synthesis from audio has been a focus in recent years. However, most existing literature works are designed to map audio and visual content, providing limited knowledge regarding the relationship between emotion in audio and expressive facial animation. This work generates audio‐matching facial animations with the specified emotion label. In such a task, we argue that separating the content from audio is indispensable—the proposed model must learn to generate facial content from audio content while expressions from the specified emotion. We achieve it by an adaptive instance normalization module that isolates the content in the audio and combines the emotion embedding from the specified label. The joint content‐emotion embedding is then used to generate 3D facial vertices and texture maps. We compare our method with state‐of‐the‐art baselines, including the facial segmentation‐based and voice conversion‐based disentanglement approaches. We also conduct a user study to evaluate the performance of emotion conditioning. The results indicate that our proposed method outperforms the baselines in animation quality and expression categorization accuracy.
more » « less
Procedure-Aware Pretraining for Instructional Video Understanding

Honglu Zhou, Roberto Martín-Martín (April 2023, Computer Vision and Pattern Recognition)

Full Text Available
The Importance of Multimodal Emotion Conditioning and Affect Consistency for Embodied Conversational Agents

https://doi.org/10.1145/3581641.3584045

Chang, Che-Jui; Sohn, Samuel S; Zhang, Sen; Jayashankar, Rajath; Usman, Muhammad; Kapadia, Mubbasir (March 2023, Intelligent User Interfaces 2023)

Full Text Available
The IVI Lab entry to the GENEA Challenge 2022 – A Tacotron2 Based Method for Co-Speech Gesture Generation With Locality-Constraint Attention Mechanism

https://doi.org/10.1145/3536221.3558060

Chang, Che-Jui; Zhang, Sen; Kapadia, Mubbasir (November 2022, GENEA Challenge 2022)

Full Text Available
HM: Hybrid Masking for Few-Shot Segmentation

Seonghyeon Moon, Samuel S. (November 2022, ECCV 2022)

Full Text Available
D-HYPR: Harnessing Neighborhood Modeling and Asymmetry Preservation for Digraph Representation Learning

https://doi.org/10.1145/3511808.3557344

Zhou, Honglu; Chegu, Advith; Sohn, Samuel S.; Fu, Zuohui; de Melo, Gerard; Kapadia, Mubbasir (October 2022, Proceedings of the 31st ACM International Conference on Information & Knowledge Management)

Full Text Available
COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality.

Honglu Zhou, Asim Kadav (October 2022, European Conference on Computer Vision 2022)

Full Text Available
A2X: An end-to-end framework for assessing agent and environment interactions in multimodal human trajectory prediction

https://doi.org/10.1016/j.cag.2022.05.010

Sohn, Samuel S.; Lee, Mihee; Moon, Seonghyeon; Qiao, Gang; Usman, Muhammad; Yoon, Sejong; Pavlovic, Vladimir; Kapadia, Mubbasir (August 2022, Computers & Graphics)

Full Text Available
A Computational Method for the Classification of Mental Representations of Objects in 3D Space

Sam Sohn, Panagiotis Mavros (August 2022, 15th International Conference on Spatial Information Theory (COSIT 2022))

The structure mapping task is a simple method to test people’s mental representations of spatial relationships, and has recently been particularly useful in the study of volumetric spatial cognition such as the spatial memory for locations in multilevel buildings. However, there does not exist a standardised method to analyse such data and structure mapping tasks are typically analysed by human raters, based on criteria defined by the researchers. In this article, we introduce a computational method to assess spatial relationships of objects in the vertical and horizontal domains, which are realized through the structure mapping task. Here, we reanalyse participants’ digitised structure maps from an earlier study (N=41) using the proposed computational methodology. Our results show that the new method successfully distinguishes between different types of structure map representations, and is sensitive to learning order effects. This method can be useful to advance the study of volumetric spatial cognition.
more » « less
Full Text Available
Harnessing Fourier Isovists and Geodesic Interaction for Long-Term Crowd Flow Prediction

https://doi.org/10.24963/ijcai.2022/185

Sohn, Samuel S.; Moon, Seonghyeon; Zhou, Honglu; Lee, Mihee; Yoon, Sejong; Pavlovic, Vladimir; Kapadia, Mubbasir (January 2022, Thirty-First International Joint Conference on Artificial Intelligence (IJCAI))

Full Text Available

« Prev Next »

Search for: All records