PIEClass: Weakly-Supervised Text Classification with Prompting and Noise-Robust Iterative Ensemble Training

Zhang, Yunyi; Jiang, Minhao; Meng, Yu; Zhang, Yu; Han, Jiawei

doi:10.18653/v1/2023.emnlp-main.780

Citation Details

PIEClass: Weakly-Supervised Text Classification with Prompting and Noise-Robust Iterative Ensemble Training

Weakly-supervised text classification trains a classifier using the label name of each target class as the only supervision, which largely reduces human annotation efforts. Most existing methods first use the label names as static keyword-based features to generate pseudo labels, which are then used for final classifier training. While reasonable, such a commonly adopted framework suffers from two limitations: (1) keywords can have different meanings in different contexts and some text may not have any keyword, so keyword matching can induce noisy and inadequate pseudo labels; (2) the errors made in the pseudo label generation stage will directly propagate to the classifier training stage without a chance of being corrected. In this paper, we propose a new method, PIEClass, consisting of two modules: (1) a pseudo label acquisition module that uses zero-shot prompting of pre-trained language models (PLM) to get pseudo labels based on contextualized text understanding beyond static keyword matching, and (2) a noise-robust iterative ensemble training module that iteratively trains classifiers and updates pseudo labels by utilizing two PLM fine-tuning methods that regularize each other. Extensive experiments show that PIEClass achieves overall better performance than existing strong baselines on seven benchmark datasets and even achieves similar performance to fully-supervised classifiers on sentiment classification tasks. more »

Award ID(s):: 2118329 2019897

PAR ID:: 10543141

Author(s) / Creator(s):: Zhang, Yunyi; Jiang, Minhao; Meng, Yu; Zhang, Yu; Han, Jiawei

Publisher / Repository:: Association for Computational Linguistics

Date Published:: 2023-01-01

Page Range / eLocation ID:: 12655 to 12670

Format(s):: Medium: X

Location:: Singapore

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
https://doi.org/10.18653/v1/2023.emnlp-main.780

More Like this