This content will become publicly available on April 24, 2026
Transformers provably learn two-mixture of linear classification via gradient flow
An official website of the United States government
This content will become publicly available on April 24, 2026