Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors

Tang, Liyan; Goyal, Tanya; Fabbri, Alex; Laban, Philippe; Xu, Jiacheng; Yavuz, Semih; Kryscinski, Wojciech; Rousseau, Justin; Durrett, Greg

doi:10.18653/v1/2023.acl-long.650

The propensity of abstractive summarization models to make factual errors has been studied extensively, including design of metrics to detect factual errors and annotation of errors in current systems’ outputs. However, the ever-evolving nature of summarization systems, metrics, and annotated benchmarks makes factuality evaluation a moving target, and drawing clear comparisons among metrics has become increasingly difficult. In this work, we aggregate factuality error annotations from nine existing datasets and stratify them according to the underlying summarization model. We compare performance of state-of-the-art factuality metrics, including recent ChatGPT-based metrics, on this stratified benchmark and show that their performance varies significantly across different types of summarization models. Critically, our analysis shows that much of the recent improvement in the factuality detection space has been on summaries from older (pre-Transformer) models instead of more relevant recent summarization models. We further perform a finer-grained analysis per error-type and find similar performance variance across error types for different factuality metrics. Our results show that no one metric is superior in all settings or for all error types, and we provide recommendations for best practices given these insights.

Award ID(s): 2145280

PAR ID: 10516569

Author(s) / Creator(s): Tang, Liyan; Goyal, Tanya; Fabbri, Alex; Laban, Philippe; Xu, Jiacheng; Yavuz, Semih; Kryscinski, Wojciech; Rousseau, Justin; Durrett, Greg

Publisher / Repository: Association for Computational Linguistics

Date Published: 2023-01-01

Journal Name: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Page Range / eLocation ID: 11626 to 11644

Format(s): Medium: X

Location: Toronto, Canada

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.18653/v1/2023.acl-long.650