

Title: The Removal of Irrelevant Human Factors in a Multi-Review Corpus through Text Filtering

Generating a high-quality explainable summary of a multi-review corpus can help people save time in reading the reviews. With natural language processing and text clustering, one can generate both abstractive and extractive summaries of a corpus containing up to 967 product reviews (Moody et al. 2022). However, the overall quality of the summaries needs further improvement. Noticing that the online reviews in the corpus come from a diverse population, we take the approach of removing irrelevant human factors through pre-processing. Applying available pre-trained models together with reference-based and reference-free metrics, we automatically filter out the noise in each review prior to summary generation. Our computational experiments show that one can generate an explainable summary of significantly higher overall quality from such a pre-processed corpus than from the original one. We suggest applying available high-quality pre-trained tools to filter the noise rather than starting from scratch. Although this work is on a specific multi-review corpus, the methods and conclusions should be helpful for generating summaries for other multi-review corpora.
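The abstract does not name the specific pre-trained models or metrics used, so the following is only a minimal sketch of the kind of pre-processing it describes: scoring each review sentence against the product topic with a pre-trained sentence encoder and discarding low-relevance sentences before summarization. The model name, topic string, and threshold are illustrative assumptions.

```python
# Hedged sketch: filter noise from reviews before summary generation by
# keeping only sentences semantically related to the product topic.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder pre-trained encoder

def filter_review(sentences, topic, threshold=0.25):
    """Return the sentences whose cosine similarity to the topic clears the bar."""
    topic_emb = model.encode(topic, convert_to_tensor=True)
    sent_embs = model.encode(sentences, convert_to_tensor=True)
    scores = util.cos_sim(sent_embs, topic_emb).squeeze(-1)
    return [s for s, sc in zip(sentences, scores) if sc >= threshold]

print(filter_review(
    ["The blender crushes ice easily.", "My cousin visited us last week."],
    topic="kitchen blender",
))  # the off-topic sentence is filtered out
```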

 
Award ID(s):
1946391
PAR ID:
10498137
Author(s) / Creator(s):
; ;
Publisher / Repository:
AHFE International
Date Published:
Journal Name:
Accelerating Open Access Science in Human Factors Engineering and Human-Centered Computing
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We investigate pre-training techniques for abstractive multi-document summarization (MDS), which is much less studied than summarizing single documents. Though recent work has demonstrated the effectiveness of highlighting information salience for pre-training strategy design, such models struggle to generate abstractive and reflective summaries, which are critical properties for MDS. To this end, we present PELMS, a pre-trained model that uses pre-training objectives based on semantic coherence heuristics and faithfulness constraints together with unlabeled multi-document inputs, to promote the generation of concise, fluent, and faithful summaries. To support the training of PELMS, we compile MultiPT, a multi-document pre-training corpus containing over 93 million documents that form more than 3 million unlabeled topic-centric document clusters, covering diverse genres such as product reviews, news, and general knowledge. We perform extensive evaluation of PELMS in low-shot settings on a wide range of MDS datasets. Our approach consistently outperforms competitive comparisons with respect to overall informativeness, abstractiveness, coherence, and faithfulness, and with minimal fine-tuning can match the performance of language models at a much larger scale (e.g., GPT-4).
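The clustering procedure behind MultiPT is not spelled out in this abstract; a small sketch of one common way to form unlabeled topic-centric document clusters is to embed the documents and group them with k-means. The encoder, cluster count, and toy corpus below are assumptions, not the paper's actual setup.

```python
# Hedged sketch: group unlabeled documents into topic-centric clusters,
# each of which could serve as one multi-document pre-training input.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

docs = [
    "The battery drains within hours.",
    "Battery life is disappointing on this phone.",
    "The local team won the championship last night.",
]
embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(docs)
labels = KMeans(n_clusters=2, n_init=10).fit_predict(embeddings)

clusters = {}
for doc, label in zip(docs, labels):
    clusters.setdefault(int(label), []).append(doc)
print(clusters)  # one cluster per (rough) topic
```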
  2. Medical systematic reviews play a vital role in healthcare decision making and policy. However, their production is time-consuming, limiting the availability of high-quality and up-to-date evidence summaries. Recent advancements in LLMs offer the potential to generate literature reviews automatically on demand, addressing this issue. However, LLMs sometimes generate inaccurate (and potentially misleading) texts through hallucination or omission. In healthcare, this can make LLMs unusable at best and dangerous at worst. We conducted 16 interviews with international systematic review experts to characterize the perceived utility and risks of LLMs in the specific context of medical evidence reviews. Experts indicated that LLMs can assist in the writing process by drafting summaries, generating templates, distilling information, and cross-checking information. They also raised concerns regarding confidently composed but inaccurate LLM outputs and other potential downstream harms, including decreased accountability and proliferation of low-quality reviews. Informed by this qualitative analysis, we identify criteria for rigorous evaluation of biomedical LLMs aligned with domain expert views.
  3. Recent work has shown that large language models (LLMs) are capable of generating summaries zero-shot—i.e., without explicit supervision—that, under human assessment, are often comparable or even preferred to manually composed reference summaries. However, this prior work has focused almost exclusively on evaluating news article summarization. How do zero-shot summarizers perform in other (potentially more specialized) domains? In this work we evaluate zero-shot generated summaries across specialized domains including biomedical articles and legal bills (in addition to standard news benchmarks for reference). We focus especially on the factuality of outputs. We acquire annotations from domain experts to identify inconsistencies in summaries and systematically categorize these errors. We analyze whether the prevalence of a given domain in the pretraining corpus affects the extractiveness and faithfulness of generated summaries of articles in this domain. We release all collected annotations to facilitate additional research toward measuring and realizing factually accurate summarization, beyond news articles (the dataset can be downloaded from https://anonymous.4open.science/r/zero_shot_faceval_domains-9B83).
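This abstract does not define its extractiveness measure; one standard choice an analysis like this could use is extractive fragment coverage (Grusky et al., 2018), the fraction of summary tokens that fall inside fragments copied verbatim from the source. A self-contained sketch under that assumption:

```python
# Hedged sketch of extractive fragment coverage: greedily match the longest
# verbatim source fragment at each summary position, then report the share
# of summary tokens covered by such fragments.
def extractive_fragments(source, summary):
    fragments, i = [], 0
    while i < len(summary):
        best = 0
        for j in range(len(source)):
            k = 0
            while (i + k < len(summary) and j + k < len(source)
                   and summary[i + k] == source[j + k]):
                k += 1
            best = max(best, k)
        if best > 0:
            fragments.append(summary[i:i + best])
            i += best
        else:
            i += 1
    return fragments

def coverage(source_text, summary_text):
    src, summ = source_text.lower().split(), summary_text.lower().split()
    frags = extractive_fragments(src, summ)
    return sum(len(f) for f in frags) / max(len(summ), 1)

print(coverage("the bill cuts taxes for small businesses",
               "the bill cuts taxes sharply"))  # 0.8: 4 of 5 tokens copied
```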
  4. The increased prevalence of online meetings has significantly enhanced the practicality of a model that can automatically generate the summary of a given meeting. This paper introduces a novel and effective approach to automate the generation of meeting summaries. Current approaches to this problem generate general and basic summaries, considering the meeting simply as a long dialogue. However, our novel algorithms can generate abstractive meeting summaries that are driven by the action items contained in the meeting transcript. This is done by recursively generating summaries and employing our action-item extraction algorithm for each section of the meeting in parallel. All of these sectional summaries are then combined and summarized together to create a coherent and action-item-driven summary. In addition, this paper introduces three novel methods for dividing up long transcripts into topic-based sections to improve the time efficiency of our algorithm, as well as to resolve the issue of large language models (LLMs) forgetting long-term dependencies. Our pipeline achieved a BERTScore of 64.98 across the AMI corpus, which is an approximately 4.98% increase from the current state-of-the-art result produced by a fine-tuned BART (Bidirectional and Auto-Regressive Transformers) model.
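As a rough illustration only (the paper's actual sectioning methods, action-item extraction algorithm, and models are not given here), the recursive, section-wise scheme could be sketched as follows; the summarization model, fixed section size, and keyword-based action-item stub are all assumptions.

```python
# Hedged sketch: summarize each transcript section, attach extracted action
# items, then summarize the concatenated sectional summaries once more.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def split_sections(transcript, max_words=400):
    # Fixed-size word windows; the paper uses topic-based sectioning instead.
    words = transcript.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def extract_action_items(section):
    # Hypothetical keyword stub standing in for the paper's extraction algorithm.
    cues = ("will ", "need to", "by friday", "action item")
    return [s for s in section.split(". ") if any(c in s.lower() for c in cues)]

def summarize_meeting(transcript):
    sectional = []
    for section in split_sections(transcript):
        summary = summarizer(section, max_length=80, min_length=15)[0]["summary_text"]
        actions = "; ".join(extract_action_items(section))
        sectional.append(summary + (" Action items: " + actions if actions else ""))
    combined = " ".join(sectional)
    return summarizer(combined, max_length=120, min_length=30)[0]["summary_text"]
```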
  5. Eliciting informative user opinions from online reviews is a key success factor for innovative product design and development. The unstructured, noisy, and verbose nature of user reviews, however, often complicates large-scale need finding in a format useful for designers without losing important information. Recent advances in abstractive text summarization have created the opportunity to systematically generate opinion summaries from online reviews to inform the early stages of product design and development. However, two knowledge gaps hinder the applicability of opinion summarization methods in practice. First, there is a lack of formal mechanisms to guide the generative process with respect to different categories of product attributes and user sentiments. Second, the annotated training datasets needed for supervised training of abstractive summarization models are often difficult and costly to create. This article addresses these gaps by (1) devising an efficient computational framework for abstractive opinion summarization guided by specific product attributes and sentiment polarities, and (2) automatically generating a synthetic training dataset that captures various degrees of granularity and polarity. A hierarchical multi-instance attribute-sentiment inference model is developed for assembling a high-quality synthetic dataset, which is utilized to fine-tune a pretrained language model for abstractive summary generation. Numerical experiments conducted on a large dataset scraped from three major e-commerce retail stores for apparel and footwear products indicate the performance, feasibility, and potential of the developed framework. Several directions are provided for future exploration in the area of automated opinion summarization for user-centered design.
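The abstract does not specify how attribute and sentiment guidance is encoded; one common realization, shown purely as a hypothetical sketch, is to prepend control tokens to the concatenated reviews when building synthetic training pairs for a seq2seq model. The token scheme and field names below are assumptions, not the article's format.

```python
# Hedged sketch: build an attribute- and sentiment-conditioned training
# example using control tokens (illustrative format, not the article's).
def build_training_example(reviews, attribute, polarity, summary):
    prefix = f"<attr={attribute}> <sent={polarity}> "
    source = prefix + " </s> ".join(reviews)  # </s> as a review separator
    return {"source": source, "target": summary}

example = build_training_example(
    reviews=["Runs small but the fabric is great.", "Sizing was way off for me."],
    attribute="fit",
    polarity="negative",
    summary="Several reviewers found that the sizing runs small.",
)
print(example["source"])
```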