Action-Item-Driven Summarization of Long Meeting Transcripts
The increased prevalence of online meetings has significantly enhanced the practicality of a model that can automatically generate the summary of a given meeting. This paper introduces a novel and effective approach to automate the generation of meeting summaries. Current approaches to this problem generate general and basic summaries, considering the meeting simply as a long dialogue. However, our novel algorithms can generate abstractive meeting summaries that are driven by the action items contained in the meeting transcript. This is done by recursively generating summaries and employing our action-item extraction algorithm for each section of the meeting in parallel. All of these sectional summaries are then combined and summarized together to create a coherent and action-item-driven summary. In addition, this paper introduces three novel methods for dividing up long transcripts into topic-based sections to improve the time efficiency of our algorithm, as well as to resolve the issue of large language models (LLMs) forgetting long-term dependencies. Our pipeline achieved a BERTScore of 64.98 across the AMI corpus, which is an approximately 4.98% increase from the current state-of-the-art result produced by a fine-tuned BART (Bidirectional and Auto-Regressive Transformers) model.
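The overall shape of the pipeline described in the abstract (split the transcript into sections, process each section in parallel, then combine and re-summarize) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: every function name, the cue-word list, and the parameters are hypothetical; the topic-based segmentation is replaced by a plain fixed-size split; and the abstractive LLM summarizer and the action-item extraction algorithm are replaced by trivial stand-ins.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cue phrases; a stand-in for a real action-item extraction model.
ACTION_CUES = ("will", "need to", "should", "have to", "let's")


def split_into_sections(lines, size=4):
    """Fixed-size split. Assumption: stands in for topic-based segmentation."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def summarize(lines, max_lines=2):
    """Toy stand-in summarizer: keep the first few lines of a section.

    A real pipeline would call an abstractive summarization model here.
    """
    return lines[:max_lines]


def extract_action_items(lines):
    """Keep utterances that contain any cue phrase as a whole word or phrase."""
    patterns = [re.compile(rf"\b{re.escape(cue)}\b") for cue in ACTION_CUES]
    return [u for u in lines if any(p.search(u.lower()) for p in patterns)]


def process_section(section):
    """Summarize one section and extract its action items."""
    return summarize(section), extract_action_items(section)


def condense(lines, size=4, limit=4):
    """Recursively re-summarize until the summary fits within `limit` lines."""
    if len(lines) <= limit:
        return lines
    merged = [l for part in map(summarize, split_into_sections(lines, size))
              for l in part]
    if len(merged) >= len(lines):  # guard: stop if no progress is made
        return merged
    return condense(merged, size, limit)


def summarize_meeting(utterances, size=4, limit=4):
    """Process sections in parallel, then combine the sectional results."""
    sections = split_into_sections(utterances, size)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(process_section, sections))
    section_summaries = [l for summary, _ in results for l in summary]
    action_items = [item for _, items in results for item in items]
    return {
        "summary": condense(section_summaries, size, limit),
        "action_items": action_items,
    }
```

Calling `summarize_meeting` on a list of utterance strings returns a dictionary with a condensed `summary` and the collected `action_items`. Because each section is processed independently, no single call has to handle the whole transcript, which is the property the abstract relies on to avoid long-range forgetting.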
- Award ID(s): 2050919
- PAR ID: 10519433
- Publisher / Repository: ACM Digital Library
- Date Published:
- ISBN: 979-8-4007-0922-7
- Format(s): Medium: X
- Location: Seoul, South Korea
- Sponsoring Org: National Science Foundation
More Like this
- Social media's explosive growth has resulted in a massive influx of electronic documents influencing various facets of daily life. However, the enormous and complex nature of this content makes extracting valuable insights challenging. Long document summarization emerges as a pivotal technique in this context, serving to distill extensive texts into concise and comprehensible summaries. This paper presents a novel three-stage pipeline for effective long document summarization. The proposed approach combines unsupervised and supervised learning techniques, efficiently handling large document sets while requiring minimal computational resources. Our methodology introduces a unique process for forming semantic chunks through spectral dynamic segmentation, effectively reducing redundancy and repetitiveness in the summarization process. Contrary to previous methods, our approach aligns each semantic chunk with the entire summary paragraph, allowing the abstractive summarization model to process documents without truncation and enabling the summarization model to deduce missing information from other chunks. To enhance the summary generation, we utilize a sophisticated rewrite model based on Bidirectional and Auto-Regressive Transformers (BART), rearranging and reformulating summary constructs to improve their fluidity and coherence. Empirical studies conducted on the long documents from the Webis-TLDR-17 dataset demonstrate that our approach significantly enhances the efficiency of abstractive summarization transformers. The contributions of this paper thus offer significant advancements in the field of long document summarization, providing a novel and effective methodology for summarizing extensive texts in the context of social media.
- Progress in summarizing long texts is inhibited by the lack of appropriate evaluation frameworks. A long summary that appropriately covers the facets of that text must also present a coherent narrative, but current automatic and human evaluation methods fail to identify gaps in coherence. In this work, we introduce SNaC, a narrative coherence evaluation framework for fine-grained annotations of long summaries. We develop a taxonomy of coherence errors in generated narrative summaries and collect span-level annotations for 6.6k sentences across 150 book and movie summaries. Our work provides the first characterization of coherence errors generated by state-of-the-art summarization models and a protocol for eliciting coherence judgments from crowdworkers. Furthermore, we show that the collected annotations allow us to benchmark past work in coherence modeling and train a strong classifier for automatically localizing coherence errors in generated summaries. Finally, our SNaC framework can support future work in long document summarization and coherence evaluation, including improved summarization modeling and post-hoc summary correction.
- Although exploratory play is considered a hallmark of cognitive development and learning, relatively few studies have been able to quantitatively characterize the shifts that may occur in children's approach to exploration. One reason for this gap is due to challenges coding and analyzing children's exploratory play behavior. In our paper, we employ a novel computational modeling approach to understand whether and how children's exploratory play patterns shift in early childhood (3- to 11-years-old). We analyze data from children (N = 432) across five different experiments that varied in the type of exploration task (including novel toys, novel topics, and novel envelopes). Children's behaviors were coded action-by-action according to whether children repeated an action on the same type of target, switched to a novel target, or terminated play. Our computational Markov model searches over the space of possible "stay," "switch," and "end" parameters to quantify child-specific transition probabilities. We find that overall, older children are less likely to perseverate, more likely to switch, and more likely to end the task earlier. Our approach provides a demonstration of how Markov models can be used to map the process of play, providing insight into theories of developmental changes in exploration. Summary: We use Markov models to quantify developmental shifts in children's exploratory play across five naturalistic tasks. Older children showed increased exploratory variability and decreased perseveration during play. Developmental effects were most consistent in novel toy tasks, but varied across contexts. Our findings help reconcile conflicting prior research by highlighting the role of task structure and developmental changes in exploratory strategy.
- Long document summarization systems are critical for domains with lengthy and jargon-laden text, yet they present significant challenges to researchers and developers with limited computing resources. Existing solutions mainly focus on efficient attentions or divide-and-conquer strategies. The former reduces theoretical time complexity, but is still memory-heavy. The latter methods sacrifice global context, leading to uninformative and incoherent summaries. This work aims to leverage the memory-efficient nature of divide-and-conquer methods while preserving global context. Concretely, our framework AWESOME uses two novel mechanisms: (1) External memory mechanisms track previously encoded document segments and their corresponding summaries, to enhance global document understanding and summary coherence. (2) Global salient content is further identified beforehand to augment each document segment to support its summarization. Extensive experiments on diverse genres of text, including government reports, meeting transcripts, screenplays, scientific papers, and novels, show that AWESOME produces summaries with better informativeness, faithfulness, and coherence than competitive baselines on longer documents, while having a smaller GPU memory footprint.