Open (Clinical) LLMs are Sensitive to Instruction Phrasings

Ceballos-Arroyo, Alberto Mario; Munnangi, Monica; Sun, Jiuding; Zhang, Karen; McInerney, Jered; Wallace, Byron C; Amir, Silvio

doi:10.18653/v1/2024.bionlp-1.5

Instruction-tuned Large Language Models (LLMs) can perform a wide range of tasks given natural language instructions to do so, but they are sensitive to how such instructions are phrased. This issue is especially concerning in healthcare, as clinicians are unlikely to be experienced prompt engineers and the potential consequences of inaccurate outputs are heightened in this domain. This raises a practical question: How robust are instruction-tuned LLMs to natural variations in the instructions provided for clinical NLP tasks? We collect prompts from medical doctors across a range of tasks and quantify the sensitivity of seven LLMs (some general, others specialized) to natural (i.e., non-adversarial) instruction phrasings. We find that performance varies substantially across all models, and that, perhaps surprisingly, domain-specific models explicitly trained on clinical data are especially brittle compared to their general-domain counterparts. Further, arbitrary phrasing differences can affect fairness: for example, valid but distinct instructions for mortality prediction yield a range both in overall performance and in differences between demographic groups.

Award ID(s): 1901117

PAR ID: 10617076

Author(s) / Creator(s): Ceballos-Arroyo, Alberto Mario; Munnangi, Monica; Sun, Jiuding; Zhang, Karen; McInerney, Jered; Wallace, Byron C; Amir, Silvio

Publisher / Repository: Association for Computational Linguistics

Date Published: 2024-01-01

Page Range / eLocation ID: 50 to 71

Format(s): Medium: X

Location: Bangkok, Thailand

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.18653/v1/2024.bionlp-1.5
