Surgical Fine-Tuning Improves Adaptation to Distribution Shifts

Lee, Yoonho; Chen, Annie S.; Tajwar, Fahim; Kumar, Ananya; Yao, Huaxiu; Liang, Percy; Finn, Chelsea

Citation Details

A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model, preserving learned features while also adapting to the new task. This paper shows that in such settings, selectively fine-tuning a subset of layers (which we term surgical fine-tuning) matches or outperforms commonly used fine-tuning approaches. Moreover, the type of distribution shift influences which subset is more effective to tune: for example, for image corruptions, fine-tuning only the first few layers works best. We validate our findings systematically across seven real-world data tasks spanning three types of distribution shifts. Theoretically, we prove that for two-layer neural networks in an idealized setting, first-layer tuning can outperform fine-tuning all layers. Intuitively, fine-tuning more parameters on a small target dataset can cause information learned during pre-training to be forgotten, and the relevant information depends on the type of shift.
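The idea of tuning only a chosen subset of layers can be illustrated with a minimal sketch. The code below is not the authors' implementation: it is a toy two-layer scalar model in plain Python, and every name in it (`predict`, `grads`, `surgical_fine_tune`, `layer1`, `layer2`) as well as the data and hyperparameters are assumptions made for illustration. It freezes the second layer and updates only the first layer on a small target set whose inputs have been rescaled, a stand-in for an input-level shift.

```python
# Toy sketch of surgical fine-tuning: update only a named subset of layers.
# All names, data, and hyperparameters are hypothetical, for illustration only.

def predict(params, x):
    # Two-layer linear model: y = w2 * (w1 * x)
    return params["layer2"] * (params["layer1"] * x)

def grads(params, x, y):
    # Gradients of the squared error 0.5 * (pred - y)**2 w.r.t. each weight
    err = predict(params, x) - y
    return {
        "layer1": err * params["layer2"] * x,
        "layer2": err * params["layer1"] * x,
    }

def surgical_fine_tune(params, data, trainable, lr=0.05, epochs=200):
    # Plain SGD that touches only the layers listed in `trainable`;
    # every other layer keeps its pre-trained value.
    params = dict(params)  # copy so the pre-trained weights are not mutated
    for _ in range(epochs):
        for x, y in data:
            g = grads(params, x, y)
            for name in trainable:
                params[name] -= lr * g[name]
    return params

# "Pre-trained" weights that fit a source task y = 2 * x (w1 * w2 = 2).
pretrained = {"layer1": 1.0, "layer2": 2.0}

# Assumed input-level shift: target inputs are half their source scale,
# so the same labels now correspond to y = 4 * x_target.
target_data = [(0.5, 2.0), (1.0, 4.0), (1.5, 6.0)]

# Tune only the first layer; the second layer stays frozen.
tuned = surgical_fine_tune(pretrained, target_data, trainable={"layer1"})
```

In this sketch the first layer absorbs the change in input scale while the frozen second layer keeps its pre-trained weight. In a deep-learning framework the same selection is usually expressed by disabling gradients for the frozen parameters rather than by skipping them in a hand-written update loop.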

Award ID(s): 2343611

PAR ID: 10472128

Author(s) / Creator(s): Lee, Yoonho; Chen, Annie S.; Tajwar, Fahim; Kumar, Ananya; Yao, Huaxiu; Liang, Percy; Finn, Chelsea

Publisher / Repository: ICLR 2023

Date Published: 2022-06-06

Subject(s) / Keyword(s): Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Format(s): Medium: X

Location: ICLR 2023

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
