Low-rank finetuning for LLMs: A fairness perspective

Das, Saswat; Romanelli, Marco; Tran, Cuong; Reza, Zarreen; Kailkhura, Bhavya; Fioretto, Ferdinando

Citation Details

Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models (LLMs) due to their reduced computational and memory requirements. This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution. Our findings reveal that there are cases in which low-rank fine-tuning falls short in learning such shifts. This, in turn, produces non-negligible side effects, especially when fine-tuning is adopted for toxicity mitigation in pre-trained models, or in scenarios where it is important to provide fair models. Through comprehensive empirical evidence on several models, datasets, and tasks, we show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors. We also show that this extends to sequential decision-making tasks, emphasizing the need for careful evaluation to promote responsible LLM development.

Award ID(s): 2345483, 2401285, 2334936

PAR ID: 10540649

Author(s) / Creator(s): Das, Saswat; Romanelli, Marco; Tran, Cuong; Reza, Zarreen; Kailkhura, Bhavya; Fioretto, Ferdinando

Publisher / Repository: arXiv

Date Published: 2024-05-28

Journal Name: arXiv.org

ISSN: 2331-8422

Sponsoring Org: National Science Foundation

The DOI is not currently available.