NSF PAR Search | NSF Public Access Repository

Electro-activated indigos intensify ampere-level CO2 reduction to CO on silver catalysts

https://doi.org/10.1038/s41467-025-58593-w

Li, Zhengyuan; Li, Xing; Wang, Ruoyu; Campos Mata, Astrid; Gerke, Carter S.; Xiang, Shuting; Mathur, Anmol; Zhang, Lingyu; Lin, Dian-Zhao; Li, Tianchen; et al. (April 2025, Nature Communications)
Temporal Attention and Consistency Measuring for Video Question Answering

https://doi.org/10.1145/3382507.3418886

Zhang, Lingyu; Radke, Richard J. (October 2020, ICMI '20: Proceedings of the 2020 International Conference on Multimodal Interaction)

Social signal processing algorithms have steadily improved at solving well-defined prediction and estimation problems in audiovisual recordings of group discussions. However, much human behavior and communication is less structured and more subtle. In this paper, we address the problem of generic question answering from diverse audiovisual recordings of human interaction. The goal is to select the correct free-text answer to a free-text question about human interaction in a video. We propose an RNN-based model with two novel ideas: a temporal attention module that highlights key words and phrases in the question and candidate answers, and a consistency measurement module that scores the similarity between the multimodal data, the question, and the candidate answers. This small set of consistency scores forms the input to the final question-answering stage, resulting in a lightweight model. We demonstrate that our model achieves state-of-the-art accuracy on the Social-IQ dataset, which contains hundreds of videos and question/answer pairs.
Full Text Available
A Multi-Stream Recurrent Neural Network for Social Role Detection in Multiparty Interactions

https://doi.org/10.1109/JSTSP.2020.2992394

Zhang, Lingyu; Radke, Richard J. (March 2020, IEEE Journal of Selected Topics in Signal Processing)

Full Text Available
Predicting Individual Treatment Effects of Large-scale Team Competitions in a Ride-sharing Economy

https://doi.org/10.1145/3394486.3403286

Ye, Teng; Ai, Wei; Zhang, Lingyu; Luo, Ning; Zhang, Lulu; Ye, Jieping; Mei, Qiaozhu (July 2020, Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining)
Full Text Available
Keep Meeting Summaries on Topic: Abstractive Multi-Modal Meeting Summarization

Li, Manling; Zhang, Lingyu; Radke, Richard J.; Ji, Heng (July 2019, 57th Conference of the Association for Computational Linguistics)

Transcripts of natural, multi-person meetings differ significantly from documents like news articles, which can cause Natural Language Generation models to produce unfocused summaries. We develop an abstractive meeting summarizer that draws on both the video and audio of meeting recordings. Specifically, we propose a multi-modal hierarchical attention mechanism across three levels: topic segment, utterance, and word. To narrow the focus to topically relevant segments, we jointly model topic segmentation and summarization. In addition to traditional textual features, we introduce new multi-modal features derived from visual focus of attention, based on the assumption that an utterance is more important if its speaker receives more attention. Experiments show that our model significantly outperforms the state of the art on both BLEU and ROUGE measures.
Full Text Available
Multiparty Visual Co-Occurrences for Estimating Personality Traits in Group Meetings

https://doi.org/10.1109/WACV45572.2020.9093642

Zhang, Lingyu; Bhattacharya, Indrani; Morgan, Mallory; Foley, Michael; Riedl, Christoph; Foucault Welles, Brooke; Radke, Richard J. (March 2020, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV))

Full Text Available
Improved Visual Focus of Attention Estimation and Prosodic Features for Analyzing Group Interactions

https://doi.org/10.1145/3340555.3353761

Zhang, Lingyu; Morgan, Mallory; Bhattacharya, Indrani; Foley, Michael; Braasch, Jonas; Riedl, Christoph; Foucault Welles, Brooke; Radke, Richard J. (October 2019, ICMI '19: 2019 International Conference on Multimodal Interaction)

Full Text Available
