Title: Bounds for the smallest eigenvalue of the NTK for arbitrary spherical data of arbitrary dimension
Award ID(s):
2212520 2145630
PAR ID:
10640644
Author(s) / Creator(s):
; ;
Publisher / Repository:
Advances in Neural Information Processing Systems (NeurIPS) 2024
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Bounds on the smallest eigenvalue of the neural tangent kernel (NTK) are a key ingredient in the analysis of neural network optimization and memorization. However, existing results require distributional assumptions on the data and are limited to a high-dimensional setting, where the input dimension $d_0$ scales at least logarithmically in the number of samples $n$. In this work we remove both of these requirements and instead provide bounds in terms of a measure of distance between data points: notably these bounds hold with high probability even when $d_0$ is held constant versus $n$. We prove our results through a novel application of the hemisphere transform.
  3. Let $\Gamma$ be a countable abelian group. An (abstract) $\Gamma$-system $\mathrm{X}$ - that is, an (abstract) probability space equipped with an (abstract) probability-preserving action of $\Gamma$ - is said to be a Conze–Lesigne system if it is equal to its second Host–Kra–Ziegler factor $\mathrm{Z}^2(\mathrm{X})$. The main result of this paper is a structural description of such Conze–Lesigne systems for arbitrary countable abelian $\Gamma$, namely that they are the inverse limit of translational systems $G_n/\Lambda_n$ arising from locally compact nilpotent groups $G_n$ of nilpotency class $2$, quotiented by a lattice $\Lambda_n$. Results of this type were previously known when $\Gamma$ was finitely generated, or the product of cyclic groups of prime order. In a companion paper, two of us will apply this structure theorem to obtain an inverse theorem for the Gowers $U^3(G)$ norm for arbitrary finite abelian groups $G$.
  4. Mixed-space cluster expansion (MSCE), a first-principles method to simultaneously model the configuration-dependent short-ranged chemical and long-ranged strain interactions in alloy thermodynamics, has been successfully applied to binary FCC and BCC alloys. However, the previously reported MSCE method is limited to binary alloys with cubic crystal symmetry on a single sublattice. In the current work, MSCE is generalized to systems with multiple sublattices by formulating compatible reciprocal space interactions and combined with a crystal-symmetry-agnostic algorithm for the calculation of constituent strain energy. This generalized approach is then demonstrated in a hypothetical HCP system and Mg-Zn alloys. The current MSCE can significantly improve the accuracy of the energy parameterization and account for all the fully relaxed structures regardless of lattice distortion. The generalized MSCE method makes it possible to simultaneously analyze the short- and long-ranged configuration-dependent interactions in crystalline materials with arbitrary lattices with the accuracy of typical first-principles methods.