Cycloparaphenylenes (CPPs) show promise as novel fluorescent materials, but shifting their fluorescence beyond 510 nm is difficult. Herein, we computationally explore the effect of incorporating electron-accepting and electron-donating units on CPP photophysical properties at the CAM-B3LYP/6-311G** level. We demonstrate that incorporation of donor and acceptor units may shift the CPP fluorescence as far as 1193 nm. This computational work directs the synthesis of bright red-emitting CPPs. Furthermore, the nanohoop architecture allows for interrogation of strain effects on common conjugated polymer donor and acceptor units. Strain results in a bathochromic shift versus linear variants, demonstrating the value of using strain to push the limits of low-band-gap materials.
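As context for the level of theory, a minimal TD-DFT excitation calculation at CAM-B3LYP/6-311G** can be sketched with PySCF; benzene stands in for the much larger CPP macrocycles, and the geometry, state count, and unit conversion below are illustrative assumptions rather than the paper's actual workflow.

```python
# Hypothetical sketch: vertical excitation energies at CAM-B3LYP/6-311G**
# using PySCF. Benzene stands in for the (much larger) CPP macrocycles.
from pyscf import gto, dft, tddft

mol = gto.M(
    atom="""
    C  0.0000  1.3970  0.0000
    C  1.2098  0.6985  0.0000
    C  1.2098 -0.6985  0.0000
    C  0.0000 -1.3970  0.0000
    C -1.2098 -0.6985  0.0000
    C -1.2098  0.6985  0.0000
    H  0.0000  2.4810  0.0000
    H  2.1486  1.2405  0.0000
    H  2.1486 -1.2405  0.0000
    H  0.0000 -2.4810  0.0000
    H -2.1486 -1.2405  0.0000
    H -2.1486  1.2405  0.0000
    """,
    basis="6-311g**",
)

mf = dft.RKS(mol)
mf.xc = "camb3lyp"   # range-separated hybrid used in the paper
mf.kernel()

td = tddft.TDDFT(mf)
td.nstates = 5       # number of excited states (illustrative choice)
td.kernel()

HARTREE_TO_EV = 27.2114
for i, e in enumerate(td.e, start=1):
    ev = e * HARTREE_TO_EV
    print(f"S{i}: {ev:.2f} eV  ({1239.84 / ev:.0f} nm)")
```

Note that fluorescence wavelengths, as reported in the abstract, would require excited-state geometry optimization on top of vertical excitations like these.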
A Homogeneous Transformer Architecture
While the Transformer architecture has made a substantial impact in the field of machine learning, it is unclear what purpose each component serves in the overall architecture: heterogeneous nonlinear circuits, such as multi-layer ReLU networks, are interleaved with layers of softmax units. We introduce here a homogeneous architecture based on Hyper Radial Basis Function (HyperBF) units. Evaluations on CIFAR10, CIFAR100, and Tiny ImageNet demonstrate performance comparable to standard vision transformers.
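The abstract does not spell out the unit's functional form. One standard definition of a HyperBF unit, following the radial basis function literature, is a Gaussian over a learned weighted distance to a trainable center; the minimal PyTorch sketch below assumes a diagonal metric and is an illustration, not the paper's implementation.

```python
# Hypothetical sketch of a HyperBF layer: Gaussian units over a learned
# (diagonal) Mahalanobis distance. Centers and per-dimension scales are
# trainable; this is an illustration, not the paper's implementation.
import torch
import torch.nn as nn

class HyperBF(nn.Module):
    def __init__(self, in_dim: int, n_units: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_units, in_dim))
        # log-scales keep the learned metric positive
        self.log_scales = nn.Parameter(torch.zeros(n_units, in_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_dim) -> activations: (batch, n_units)
        diff = x.unsqueeze(1) - self.centers       # (batch, n_units, in_dim)
        w = torch.exp(self.log_scales)             # per-dimension metric
        dist2 = ((diff * w) ** 2).sum(dim=-1)      # weighted squared distance
        return torch.exp(-dist2)                   # Gaussian radial response

layer = HyperBF(in_dim=64, n_units=128)
out = layer(torch.randn(8, 64))   # -> shape (8, 128)
```

Every unit in such a network computes the same kind of radial response, which is the sense in which the architecture is homogeneous.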
- Award ID(s):
- 2134108
- PAR ID:
- 10565443
- Publisher / Repository:
- Center for Brains, Minds and Machines (CBMM)
- Date Published:
- Format(s):
- Medium: X
- Institution:
- Massachusetts Institute of Technology
- Sponsoring Org:
- National Science Foundation
More Like this
-
Solving computationally hard problems using conventional computing architectures is often slow and energetically inefficient. Quantum computing may help with these challenges, but it is still in the early stages of development. A quantum-inspired alternative is to build domain-specific architectures with classical hardware. Here we report a sparse Ising machine that achieves massive parallelism: the number of flips per second, the key figure of merit, scales linearly with the number of probabilistic bits. Our sparse Ising machine architecture, prototyped on a field-programmable gate array, is up to six orders of magnitude faster than standard Gibbs sampling on a central processing unit, and offers 5–18 times improvements in sampling speed compared with approaches based on tensor processing units and graphics processing units. Our sparse Ising machine can reliably factor semi-primes up to 32 bits, and it outperforms competition-winning Boolean satisfiability solvers in approximate optimization. Moreover, our architecture can find the correct ground state even when inexact sampling is performed with faster clocks. Our problem encoding and sparsification techniques could be applied to other classical and quantum Ising machines, and our architecture could potentially be scaled to 1,000,000 or more p-bits using analogue silicon or nanodevice technologies.
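For intuition about what each probabilistic bit (p-bit) computes, a serial Gibbs sweep over a sparse Ising model can be sketched as follows; the coupling density, temperature, and update order are assumptions, and the real machine's speed comes from updating non-adjacent p-bits in parallel rather than one at a time as here.

```python
# Toy Gibbs sampler for a sparse Ising model: each probabilistic bit (p-bit)
# flips to +1 with probability sigmoid(2 * beta * local_field). Illustrative
# only; the FPGA machine updates many non-adjacent p-bits simultaneously.
import numpy as np
from scipy.sparse import random as sparse_random

rng = np.random.default_rng(0)
n = 64
J = sparse_random(n, n, density=0.05, random_state=0)
J = ((J + J.T) * 0.5).tolil()
J.setdiag(0.0)               # no self-coupling
J = J.tocsr()                # symmetric sparse couplings
h = rng.normal(size=n)       # local biases
s = rng.choice([-1.0, 1.0], size=n)
beta = 1.0

def sweep(s):
    """One serial Gibbs sweep over all p-bits in random order."""
    for i in rng.permutation(n):
        field = (J[i] @ s).item() + h[i]             # sparse local field
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * field))
        s[i] = 1.0 if rng.random() < p_up else -1.0
    return s

for _ in range(100):
    s = sweep(s)
print("energy:", -0.5 * s @ (J @ s) - h @ s)
```

Sparsity is what makes parallelism possible: p-bits with no coupling between them can be updated in the same clock cycle without biasing the sampler.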
-
This work presents a novel deep learning architecture called BNU-Net for cardiac segmentation based on short-axis MRI images. Its name derives from the Batch Normalized (BN) U-Net architecture for medical image segmentation. Convolutional neural networks (CNNs) such as U-Net have been widely used for image classification tasks; they are supervised models trained to learn hierarchies of features automatically and to perform classification robustly. Our architecture consists of an encoding path for feature extraction and a decoding path that enables precise localization. We compare this approach with the parallel U-Net approach. Both BNU-Net and U-Net are cardiac segmentation approaches: BNU-Net applies batch normalization to the output of each convolutional layer and uses the exponential linear unit (ELU) as its activation function, whereas U-Net does not apply batch normalization and is based on rectified linear units (ReLU). The presented work (i) applies various image preprocessing techniques, including affine transformations and elastic deformations, and (ii) segments the preprocessed images using the new deep learning architecture. We evaluate our approach on a dataset containing 805 MRI images from 45 patients. The experimental results reveal that our approach achieves performance comparable to or better than other state-of-the-art approaches in terms of the Dice coefficient and the average perpendicular distance. Index Terms: Magnetic Resonance Imaging; Batch Normalization; Exponential Linear Units
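The block-level difference the abstract describes (batch normalization plus ELU in place of plain ReLU) is small; a minimal PyTorch sketch of the two block styles, with placeholder layer sizes rather than the paper's configuration, might look like:

```python
# Minimal sketch of the two convolutional block styles the abstract compares.
# Kernel size and channel counts are placeholders, not the paper's settings.
import torch.nn as nn

def bnu_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """BNU-Net style: convolution -> batch normalization -> ELU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ELU(inplace=True),
    )

def unet_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Plain U-Net style: convolution -> ReLU, no batch normalization."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )
```

Both blocks slot into the same encoder-decoder skeleton; only the normalization and activation differ.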
-
Vision Transformers (ViTs) have reshaped computer vision by shifting from traditional Convolutional Neural Networks (CNNs) to attention-based architectures that process input images as sequences of patches. ViTs achieve enhanced performance in many tasks, such as image classification and object detection, due to their ability to capture global dependencies within input data. While their software implementations are widely adopted, deploying ViTs on hardware introduces several challenges, including fault tolerance in the presence of hardware failures, real-time reliability, and high computational requirements. Permanent faults in processing elements, interconnections, or memory subsystems lead to incorrect computations and degraded system performance. This paper proposes a fault-tolerant hardware implementation of ViTs that integrates real-time fault detection and recovery mechanisms. The architecture includes four primary units: patch embedding, encoder, decoder, and a Multi-Layer Perceptron (MLP), supported by fault-tolerant components such as lightweight recompute units, a centralized Built-In Self-Test (BIST), and a learning-based decision-making system built on a decision-tree model. These units are interconnected through a centralized global buffer for efficient data transfer, ensuring seamless operation even under fault conditions.
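The recompute units and BIST logic are hardware blocks, but the underlying detect-and-recover pattern can be sketched in software; the function below is a hypothetical illustration of redundant execution with fallback, not the paper's design.

```python
# Illustrative detect-and-recover pattern (not the paper's hardware design):
# run a unit, re-run it on a spare "recompute" unit, and fall back to the
# redundant result when the outputs disagree beyond a tolerance.
import torch

def fault_tolerant_forward(primary, recompute, x, atol=1e-5):
    y = primary(x)
    y_check = recompute(x)              # lightweight redundant execution
    if torch.allclose(y, y_check, atol=atol):
        return y                        # outputs agree: accept primary result
    # mismatch signals a (possibly permanent) fault in the primary unit
    return y_check
```

In hardware, the comparison would be done by the BIST logic, and the decision of whether to retry, remap, or accept the redundant result is what the decision-tree model arbitrates.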
-
Big data computing applications such as deep learning and graph analytics usually incur a large amount of data movement. Deploying such applications on a conventional von Neumann architecture, which separates the processing units from the memory components, likely leads to performance bottlenecks due to the limited memory bandwidth. A common approach is to develop architecture and memory co-design methodologies to overcome this challenge. Our research follows the same strategy by leveraging resistive memory (ReRAM) to further enhance performance and energy efficiency. Specifically, we employ the general principles behind processing-in-memory to design efficient ReRAM-based accelerators that support both testing and training operations. Related circuit and architecture optimizations are discussed as well.
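The core idea behind ReRAM-based processing-in-memory is that a crossbar of programmable conductances computes a matrix-vector product in the analog domain, with currents summing along each bit line; the toy numerical model below uses assumed conductance ranges and noise levels for illustration.

```python
# Toy model of a ReRAM crossbar doing an analog matrix-vector multiply.
# Weights map to conductances, inputs to word-line voltages, and each
# bit-line current is a dot product (Ohm's and Kirchhoff's laws). Device
# parameters and the noise level are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)
G_MIN, G_MAX = 1e-6, 1e-4          # conductance range in siemens (assumed)

def weights_to_conductance(W):
    """Linearly map weights in [0, 1] onto the device conductance range."""
    return G_MIN + W * (G_MAX - G_MIN)

def crossbar_mvm(G, v, noise=0.02):
    """Column currents I = G.T @ v, with multiplicative device variation."""
    G_actual = G * (1.0 + noise * rng.standard_normal(G.shape))
    return G_actual.T @ v

W = rng.random((16, 8))             # 16 word lines (inputs) x 8 bit lines
v = rng.random(16)                  # word-line voltages
I = crossbar_mvm(weights_to_conductance(W), v)
print(I.shape)                      # (8,) column currents ~ dot products
```

Because the multiply-accumulate happens inside the memory array itself, the data movement that dominates von Neumann deployments is largely eliminated.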