The increased prevalence of online meetings has significantly enhanced the practicality of a model that can automatically generate the summary of a given meeting. This paper introduces a novel and effective approach to automate the generation of meeting summaries. Current approaches to this problem generate general and basic summaries, considering the meeting simply as a long dialogue. However, our novel algorithms can generate abstractive meeting summaries that are driven by the action items contained in the meeting transcript. This is done by recursively generating summaries and employing our action-item extraction algorithm for each section of the meeting in parallel. All of these sectional summaries are then combined and summarized together to create a coherent and action-item-driven summary. In addition, this paper introduces three novel methods for dividing up long transcripts into topic-based sections to improve the time efficiency of our algorithm, as well as to resolve the issue of large language models (LLMs) forgetting long-term dependencies. Our pipeline achieved a BERTScore of 64.98 across the AMI corpus, which is an approximately 4.98% increase from the current state-of-the-art result produced by a fine-tuned BART (Bidirectional and Auto-Regressive Transformers) model.
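As a rough illustration, the divide-and-conquer pipeline described in the abstract above can be sketched as follows. The `summarize` and `extract_action_items` functions here are toy placeholders, not the paper's methods (the paper uses LLM-based summarization and a dedicated extraction algorithm), and the commitment-cue list is an assumption made for this example:

```python
# Hedged sketch of a sectional, action-item-driven summarization pipeline.
# Both helper functions are illustrative stand-ins for the paper's models.
from concurrent.futures import ThreadPoolExecutor

def summarize(text):
    # Placeholder: return the first sentence; a real pipeline calls an LLM.
    return text.split(".")[0].strip() + "."

def extract_action_items(text):
    # Placeholder heuristic: keep sentences containing commitment cues.
    cues = ("will ", "need to ", "should ")
    return [s.strip() + "." for s in text.split(".")
            if any(c in s.lower() for c in cues)]

def summarize_meeting(sections):
    # Summarize each topic-based section, and mine its action items,
    # in parallel; then merge the partial results into one summary.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(summarize, sections))
        actions = list(pool.map(extract_action_items, sections))
    overall = summarize(" ".join(partials))
    flat = [item for sec in actions for item in sec]
    return overall + (" Action items: " + " ".join(flat) if flat else "")
```

Each section is processed independently, which is what allows the parallelism (and keeps each LLM call well under the context-length limit in the real pipeline).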
Keep Meeting Summaries on Topic: Abstractive Multi-Modal Meeting Summarization
Transcripts of natural, multi-person meetings differ significantly from documents like news articles, which can lead Natural Language Generation models to produce unfocused summaries. We develop an abstractive meeting summarizer from both the video and audio of meeting recordings. Specifically, we propose a multi-modal hierarchical attention mechanism across three levels: topic segment, utterance, and word. To narrow the focus down to topically relevant segments, we jointly model topic segmentation and summarization. In addition to traditional textual features, we introduce new multi-modal features derived from visual focus of attention, based on the assumption that an utterance is more important if its speaker receives more attention. Experiments show that our model significantly outperforms the state-of-the-art on both BLEU and ROUGE measures.
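For contrast with the learned, jointly modeled segmentation described above, a purely lexical baseline for topic segmentation can be sketched as below. This is a TextTiling-style heuristic, not the paper's multi-modal segmenter; the window size and Jaccard threshold are arbitrary assumptions for illustration:

```python
# Minimal text-only topic segmentation: place a boundary wherever the
# lexical overlap between adjacent utterance windows drops below a
# threshold. A stand-in for learned segmentation, not the paper's model.

def _words(utterances):
    # Bag of lowercased word types in a window of utterances.
    return {w.lower().strip(".,") for u in utterances for w in u.split()}

def segment_topics(utterances, window=2, threshold=0.2):
    """Split a list of utterances into topically coherent segments."""
    boundaries = []
    for i in range(window, len(utterances) - window + 1):
        left = _words(utterances[i - window:i])
        right = _words(utterances[i:i + window])
        overlap = len(left & right) / max(len(left | right), 1)  # Jaccard
        if overlap < threshold:
            boundaries.append(i)
    segments, start = [], 0
    for b in boundaries:
        if b > start:  # skip boundaries that would create empty segments
            segments.append(utterances[start:b])
            start = b
    segments.append(utterances[start:])
    return segments
```

A learned segmenter such as the one in the paper can exploit discourse and visual cues that this word-overlap heuristic cannot, but the heuristic makes the underlying idea (low cohesion across a boundary) concrete.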
- Award ID(s):
- 1631674
- PAR ID:
- 10107386
- Date Published:
- Journal Name:
- 57th Conference of the Association for Computational Linguistics
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Group meetings can suffer from serious problems that undermine performance, including bias, "groupthink", fear of speaking, and unfocused discussion. To better understand these issues, propose interventions, and thus improve team performance, we need to study human dynamics in group meetings. However, this process currently heavily depends on manual coding and video cameras. Manual coding is tedious, inaccurate, and subjective, while active video cameras can affect the natural behavior of meeting participants. Here, we present a smart meeting room that combines microphones and unobtrusive ceiling-mounted Time-of-Flight (ToF) sensors to understand group dynamics in team meetings. We automatically process the multimodal sensor outputs with signal, image, and natural language processing algorithms to estimate participant head pose, visual focus of attention (VFOA), non-verbal speech patterns, and discussion content. We derive metrics from these automatic estimates and correlate them with user-reported rankings of emergent group leaders and major contributors to produce accurate predictors. We validate our algorithms and report results on a new dataset of lunar survival tasks of 36 individuals across 10 groups collected in the multimodal-sensor-enabled smart room.
-
Manual examination of chest x-rays is a time-consuming process that involves significant effort by expert radiologists. Recent work attempts to alleviate this problem by developing learning-based automated chest x-ray analysis systems that map images to multi-label diagnoses using deep neural networks. These methods are often treated as black boxes, or they output attention maps but don't explain why the attended areas are important. Given data consisting of a frontal-view x-ray, a set of natural language findings, and one or more diagnostic impressions, we propose a deep neural network model that during training simultaneously 1) constructs a topic model which clusters key terms from the findings into meaningful groups, 2) predicts the presence of each topic for a given input image based on learned visual features, and 3) uses an image's predicted topic encoding as features to predict one or more diagnoses. Since the net learns the topic model jointly with the classifier, it gives us a powerful tool for understanding which semantic concepts the net might be exploiting when making diagnoses, and since we constrain the net to predict topics based on expert-annotated reports, the net automatically encodes some higher-level expert knowledge about how to make diagnoses.
-
We compare two approaches for training statistical parametric voices that make use of acoustic and prosodic features at the utterance level with the aim of improving naturalness of the resultant voices -- subset adaptation, and adding new acoustic and prosodic features at the frontend. We have found that the approach of labeling high, middle, or low values for a given feature at the frontend and then choosing which setting to use at synthesis time can produce voices rated as significantly more natural than a baseline voice that uses only the standard contextual frontend features, for both HMM-based and neural network-based synthesis.

