Cross-Document Event Coreference Resolution: Instruct Humans or Instruct GPT?

Zhao, Jin; Xue, Nianwen; Min, Bonan

doi:10.18653/v1/2023.conll-1.38

Citation Details

This paper explores using Large Language Models (LLMs) to perform Cross-Document Event Coreference Resolution (CDEC) annotation and evaluates how they fare against human annotators with different levels of training. Specifically, we formulate CDEC as a multi-category classification problem on pairs of events that are represented as decontextualized sentences, and compare the predictions of GPT-4 with the judgments of fully trained annotators and crowdworkers on the same data set. Our study indicates that GPT-4 with zero-shot learning outperforms crowdworkers by a large margin and performs at a level comparable to trained annotators. Upon closer analysis, GPT-4 also tends to be overly confident, forcing annotation decisions even when the available information is insufficient to warrant them. Our results have implications for how to perform complicated annotations such as CDEC in the age of LLMs. They suggest that the best way to acquire such annotations may be to combine the strengths of LLMs and trained human annotators in the annotation process, and that using untrained or undertrained crowdworkers is no longer a viable option for acquiring high-quality data to advance the state of the art on such problems.
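As a rough illustration of the pairwise formulation described in the abstract, the sketch below enumerates event pairs and scores one annotation source against another. It is not the authors' code. The label names, the function names, and the choice of accuracy and Cohen's kappa as measures are all assumptions made for illustration; the abstract says only that the task is multi-category classification over pairs of events.

```python
from collections import Counter
from itertools import combinations

# Hypothetical label set: the abstract does not list the actual categories.
LABELS = ("identical", "related", "unrelated", "cannot_decide")


def event_pairs(event_ids):
    """Enumerate the unordered pairs of events that would each receive one label."""
    return list(combinations(event_ids, 2))


def accuracy(pred, gold):
    """Fraction of pairs on which the predicted label matches the reference label."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def cohen_kappa(a, b):
    """Chance-corrected agreement between two label sequences of equal length."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Expected agreement if both sources labeled independently at their own rates.
    expected = sum(count_a[l] * count_b[l] for l in set(count_a) | set(count_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources used a single, identical label throughout
    return (observed - expected) / (1 - expected)
```

With such helpers, the labels from a model, from crowdworkers, and from trained annotators over the same list of pairs could each be compared against a reference labeling or against one another.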

Award ID(s): 2213804 2213805

PAR ID: 10527033

Author(s) / Creator(s): Zhao, Jin; Xue, Nianwen; Min, Bonan

Editor(s): Jiang, Jing; Reitter, David; Deng, Shumin

Publisher / Repository: Association for Computational Linguistics

Date Published: 2023-12-06

Page Range / eLocation ID: 561 to 574

Format(s): Medium: X

Location: Singapore

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.18653/v1/2023.conll-1.38