Abstract Direct inverse analysis of faults in machinery systems such as gears from first principles is intrinsically difficult, owing to the multiple time and length scales involved in vibration modeling. As such, data-driven approaches have become the mainstream, with supervised training deemed particularly effective. Nevertheless, existing techniques often fall short in their ability to generalize from discrete data labels to the continuous spectrum of possible faults, a shortcoming further compounded by various uncertainties. This research proposes an interpretability-enhanced deep learning framework that incorporates Bayesian principles, effectively transforming convolutional neural networks (CNNs) into dynamic predictive models, significantly amplifying their generalizability, and offering more accessible insight into the model's reasoning process. Our approach is distinguished by a novel implementation of Bayesian inference that enables navigation of the probabilistic nuances of gear fault severities. By integrating variational inference into the deep learning architecture, we present a methodology that excels at leveraging limited data labels to reveal insights into both observed and unobserved fault conditions. This approach improves the model's capacity for uncertainty estimation and probabilistic generalization. Experimental validation on a lab-scale gear setup demonstrated the framework's superior performance, achieving nearly 100% accuracy in classifying known fault conditions, even in the presence of significant noise, and maintaining 96.15% accuracy when dealing with unseen fault severities. These results underscore the method's capability to discover implicit relations between known and unseen faults, facilitate extended fault diagnosis, and effectively manage large degrees of measurement uncertainty.
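The abstract does not specify the architecture, so the following is only a minimal sketch of the variational-inference ingredient it describes: a CNN whose convolution weights carry a factorized Gaussian posterior trained with an ELBO-style objective (Bayes-by-Backprop style). The names `VariationalConv1d` and `BayesianFaultNet`, all layer sizes, and the KL weighting are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalConv1d(nn.Module):
    """1-D convolution with a factorized Gaussian posterior over weights."""
    def __init__(self, in_ch, out_ch, kernel):
        super().__init__()
        self.mu = nn.Parameter(0.05 * torch.randn(out_ch, in_ch, kernel))
        self.rho = nn.Parameter(torch.full((out_ch, in_ch, kernel), -4.0))

    def forward(self, x):
        sigma = F.softplus(self.rho)                    # ensures sigma > 0
        w = self.mu + sigma * torch.randn_like(sigma)   # reparameterization trick
        # closed-form KL(q(w) || N(0, I)) for a factorized Gaussian posterior
        self.kl = (0.5 * (sigma ** 2 + self.mu ** 2 - 1.0) - torch.log(sigma)).sum()
        return F.conv1d(x, w)

class BayesianFaultNet(nn.Module):
    """Toy CNN classifier over 1-D vibration segments (sizes are illustrative)."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.conv = VariationalConv1d(1, 8, 16)
        self.head = nn.Linear(8, n_classes)

    def forward(self, x):
        h = F.relu(self.conv(x)).mean(dim=-1)           # global average pooling
        return self.head(h)

model = BayesianFaultNet()
x = torch.randn(32, 1, 256)                             # stand-in vibration segments
y = torch.randint(0, 4, (32,))
loss = F.cross_entropy(model(x), y) + model.conv.kl / 1e4  # ELBO-style objective
loss.backward()

# At test time, multiple stochastic forward passes give a predictive
# distribution whose spread reflects uncertainty, e.g. for fault severities
# that were never labeled during training.
with torch.no_grad():
    probs = torch.stack([model(x).softmax(-1) for _ in range(20)]).mean(0)
```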
ZERO-SHOT GENERALIZATION ACROSS ARCHITECTURES FOR VISUAL CLASSIFICATION
Generalization to unseen data is a key desideratum for deep networks, but its relation to classification accuracy is unclear. Using a minimalist vision dataset and a measure of generalizability, we show that popular networks, from deep convolutional networks (CNNs) to transformers, vary in their power to extrapolate to unseen classes both across layers and across architectures. Accuracy is not a good predictor of generalizability, and generalization varies non-monotonically with layer depth. Our code is available at github.com/dyballa/generalization
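The paper's actual generalizability measure is not reproduced here. As a rough stand-in, the sketch below probes the residual stages of a (randomly initialized, to stay self-contained) torchvision ResNet-18 with a leave-one-out nearest-neighbor score on stand-in "unseen-class" data; the function `nn_generalizability` and the probe itself are illustrative assumptions, not the paper's method.

```python
import torch
import torchvision.models as models

def nn_generalizability(feats, labels):
    """Leave-one-out nearest-neighbor accuracy: a crude proxy for how well a
    layer's representation separates classes the network was never trained on."""
    d = torch.cdist(feats, feats)
    d.fill_diagonal_(float("inf"))       # exclude each point as its own neighbor
    return (labels[d.argmin(dim=1)] == labels).float().mean().item()

net = models.resnet18(weights=None).eval()   # stand-in network, no download
x = torch.randn(64, 3, 224, 224)             # stand-in for unseen-class images
labels = torch.randint(0, 8, (64,))          # stand-in class labels

acts = {}
hooks = [m.register_forward_hook(lambda mod, inp, out, n=name: acts.update({n: out}))
         for name, m in net.named_children() if name.startswith("layer")]
with torch.no_grad():
    net(x)
for h in hooks:
    h.remove()

# As the paper reports, such scores can vary non-monotonically with depth.
for name, a in acts.items():
    print(name, nn_generalizability(a.flatten(1), labels))
```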
- Award ID(s): 1822650
- PAR ID: 10559196
- Publisher / Repository: https://openreview.net/forum?id=orYMrUv7eu
- Date Published:
- Format(s): Medium: X
- Location: Vienna
- Sponsoring Org: National Science Foundation
More Like this
-
With the recent demand for deploying neural network models on mobile and edge devices, it is desirable to improve a model's generalizability on unseen testing data, as well as enhance its robustness under fixed-point quantization for efficient deployment. Minimizing the training loss, however, provides few guarantees on generalization and quantization performance. In this work, we fulfill the need to improve generalization and quantization performance simultaneously by theoretically unifying them under the framework of improving the model's robustness against bounded weight perturbation and minimizing the eigenvalues of the Hessian matrix with respect to the model weights. We therefore propose HERO, a Hessian-enhanced robust optimization method, to minimize the Hessian eigenvalues through a gradient-based training process, simultaneously improving the generalization and quantization performance. HERO enables up to a 3.8% gain in test accuracy, up to 30% higher accuracy under 80% training label perturbation, and the best post-training quantization accuracy across a wide range of precisions, including a > 10% accuracy improvement over SGD-trained models for common model architectures on various datasets.
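HERO's full algorithm is not spelled out in this abstract; the sketch below only illustrates the ingredient it names, penalizing Hessian eigenvalues during training, via a differentiable top-eigenvalue estimate obtained by power iteration on Hessian-vector products. The toy model and the 0.01 penalty weight are assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def top_hessian_eigenvalue(loss, params, iters=5):
    """Differentiable estimate of the largest Hessian eigenvalue: power
    iteration on Hessian-vector products, then the Rayleigh quotient v^T H v."""
    g = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    for _ in range(iters):
        hv = torch.autograd.grad(g, params, grad_outputs=v, retain_graph=True)
        norm = torch.sqrt(sum((h * h).sum() for h in hv)) + 1e-12
        v = [(h / norm).detach() for h in hv]
    hv = torch.autograd.grad(g, params, grad_outputs=v, create_graph=True)
    return sum((h * u).sum() for h, u in zip(hv, v))

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))

for _ in range(3):
    opt.zero_grad()
    loss = F.cross_entropy(model(x), y)
    lam = top_hessian_eigenvalue(loss, list(model.parameters()))
    (loss + 0.01 * lam).backward()   # 0.01 is an assumed penalty weight
    opt.step()
```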
-
While recent years have witnessed a steady trend of applying Deep Learning (DL) to networking systems, most of the underlying Deep Neural Networks (DNNs) suffer two major limitations. First, they fail to generalize to topologies unseen during training. This lack of generalizability hampers the ability of the DNNs to make good decisions every time the topology of the networking system changes. Second, existing DNNs commonly operate as "blackboxes" that are difficult to interpret by network operators, and hinder their deployment in practice. In this paper, we propose to rely on a recently developed family of graph-based DNNs to address the aforementioned limitations. More specifically, we focus on a network congestion prediction application and apply Graph Attention (GAT) models to make congestion predictions per link using the graph topology and time series of link loads as inputs. Evaluations on three real backbone networks demonstrate the benefits of our proposed approach in terms of prediction accuracy, generalizability, and interpretability.
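The paper's exact GAT configuration and inputs are not given in this abstract. The sketch below implements a minimal single-head graph attention layer and applies it to a hypothetical line-graph view in which each node is a link and its features are recent load samples; all sizes and the adjacency construction are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Minimal single-head graph attention layer (Velickovic et al., 2018)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h, adj):
        z = self.W(h)                                        # (N, out_dim)
        n = z.size(0)
        pairs = torch.cat([z.unsqueeze(1).expand(n, n, -1),
                           z.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1), 0.2)     # attention logits
        alpha = torch.softmax(e.masked_fill(adj == 0, float("-inf")), dim=-1)
        return alpha @ z                                     # aggregate neighbors

# Hypothetical line graph: nodes are links, features are recent load samples,
# edges connect links that share a router.
n_links, window = 12, 8
loads = torch.randn(n_links, window)                  # stand-in load time series
adj = (torch.rand(n_links, n_links) < 0.3).float()
adj.fill_diagonal_(1.0)                               # each link attends to itself

l1, l2 = GATLayer(window, 16), GATLayer(16, 1)
congestion = torch.sigmoid(l2(F.elu(l1(loads, adj)), adj)).squeeze(-1)
print(congestion.shape)                               # per-link prediction: (12,)
```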
-
The capacity to generalize to future unseen data stands as one of the most crucial attributes of deep neural networks. Sharpness-Aware Minimization (SAM) aims to enhance generalizability by minimizing the worst-case loss using one-step gradient ascent as an approximation. However, as training progresses, the non-linearity of the loss landscape increases, rendering one-step gradient ascent less effective. On the other hand, multi-step gradient ascent incurs higher training cost. In this paper, we introduce a normalized Hessian trace to accurately measure the curvature of the loss landscape on both training and test sets. In particular, to counter excessive non-linearity of the loss landscape, we propose Curvature Regularized SAM (CR-SAM), integrating the normalized Hessian trace as a SAM regularizer. Additionally, we present an efficient way to compute the trace via finite differences with parallelism. Our theoretical analysis based on PAC-Bayes bounds establishes the regularizer's efficacy in reducing generalization error. Empirical evaluation on CIFAR and ImageNet datasets shows that CR-SAM consistently enhances classification performance for ResNet and Vision Transformer (ViT) models across various datasets. Our code is available at https://github.com/TrustAIoT/CR-SAM.
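CR-SAM's normalization and its integration of the trace into the training objective are not reproduced here. The hedged sketch below shows its two ingredients separately: a standard one-step-ascent SAM update, and a Hutchinson estimate of the Hessian trace computed via finite differences of gradients, in the spirit the abstract describes. The hyperparameters (`rho`, the probe step `h`, the probe count) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def sam_step(model, opt, x, y, rho=0.05):
    """One SAM update: one-step gradient ascent to a worst-case neighbor,
    then descend using the gradient measured there."""
    loss = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(loss, model.parameters())
    scale = rho / (torch.sqrt(sum((g * g).sum() for g in grads)) + 1e-12)
    eps = [scale * g for g in grads]
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.add_(e)                                # ascend
    opt.zero_grad()
    F.cross_entropy(model(x), y).backward()          # gradient at perturbed point
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.sub_(e)                                # restore original weights
    opt.step()

def hutchinson_trace_fd(model, x, y, h=1e-3, n_samples=2):
    """Finite-difference Hutchinson estimate of tr(H):
    E_v[v . (g(w + h*v) - g(w)) / h] with Rademacher probes v."""
    params = list(model.parameters())
    def flat_grad():
        g = torch.autograd.grad(F.cross_entropy(model(x), y), params)
        return torch.cat([gi.flatten() for gi in g])
    g0, est = flat_grad(), 0.0
    for _ in range(n_samples):
        v = [torch.randint_like(p, 0, 2) * 2.0 - 1.0 for p in params]
        with torch.no_grad():
            for p, vi in zip(params, v):
                p.add_(h * vi)
        g1 = flat_grad()
        with torch.no_grad():
            for p, vi in zip(params, v):
                p.sub_(h * vi)
        est = est + torch.cat([vi.flatten() for vi in v]) @ (g1 - g0) / h
    return est / n_samples

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))
sam_step(model, opt, x, y)
print("tr(H) estimate:", hutchinson_trace_fd(model, x, y).item())
```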
-
The prosperity of deep learning and automated machine learning (AutoML) is largely rooted in the development of novel neural networks -- but what defines and controls the "goodness" of networks in an architecture space? Test accuracy, a golden standard in AutoML, is closely related to three aspects: (1) expressivity (how complicated functions a network can approximate over the training data); (2) convergence (how fast the network can reach low training error under gradient descent); (3) generalization (whether a trained network can be generalized from the training data to unseen samples with low test error). However, most previous theory papers focus on fixed model structures, largely ignoring sophisticated networks used in practice. To facilitate the interpretation and understanding of the architecture design by AutoML, we target connecting a bigger picture: how does the architecture jointly impact its expressivity, convergence, and generalization? We demonstrate the "no free lunch" behavior in networks from an architecture space: given a fixed budget on the number of parameters, there does not exist a single architecture that is optimal in all three aspects. In other words, separately optimizing expressivity, convergence, and generalization will achieve different networks in the architecture space. Our analysis can explain a wide range of observations in AutoML. Experiments on popular benchmarks confirm our theoretical analysis. Our codes are attached in the supplement.