Analyzing Generalization of Neural Networks through Loss Path Kernels

Chen, Yilan; Huang, Wei; Wang, Hao; Loh, Charlotte; Srivastava, Akash; Nguyen, Lam; Weng, Tsui-Wei

Deep neural networks have been increasingly used in real-world applications, making it critical to ensure their ability to adapt to new, unseen data. In this paper, we study the generalization capability of neural networks trained with (stochastic) gradient flow. We establish a new connection between the loss dynamics of gradient flow and general kernel machines by proposing a new kernel, called loss path kernel. This kernel measures the similarity between two data points by evaluating the agreement between loss gradients along the path determined by the gradient flow. Based on this connection, we derive a new generalization upper bound that applies to general neural network architectures. This new bound is tight and strongly correlated with the true generalization error. We apply our results to guide the design of neural architecture search (NAS) and demonstrate favorable performance compared with state-of-the-art NAS algorithms through numerical experiments.
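The kernel described in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' implementation. It assumes a linear model with squared loss, replaces continuous gradient flow with small-step full-batch gradient descent, and approximates the path integral of gradient inner products by a Riemann sum. The names `loss_path_kernel`, `lr`, and `steps` are hypothetical and chosen only for illustration.

```python
import numpy as np

def loss_path_kernel(X, Y, w0, lr=0.01, steps=500):
    """Approximate a loss-path-style kernel on the training points.

    Assumptions (not from the source): linear model f(x) = w.x,
    per-example loss 0.5 * (w.x - y)^2, and gradient flow discretized
    as full-batch gradient descent with step size `lr`.
    """
    n = X.shape[0]
    w = w0.copy()
    K = np.zeros((n, n))
    for _ in range(steps):
        residual = X @ w - Y                # shape (n,)
        G = residual[:, None] * X           # per-example loss gradients, shape (n, d)
        K += lr * (G @ G.T)                 # Riemann-sum term: <grad_i, grad_j> * dt
        w -= lr * G.mean(axis=0)            # one step along the (discretized) flow
    return K, w

# Toy data: 8 points in 3 dimensions with noiseless linear targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
Y = X @ np.array([1.0, -2.0, 0.5])

K, w_T = loss_path_kernel(X, Y, np.zeros(3))
```

Each entry `K[i, j]` accumulates how strongly the loss gradients at points `i` and `j` agree along the training path. Because every summand is a Gram matrix scaled by a positive step size, the result is symmetric and positive semidefinite, as a kernel matrix should be. In the continuous-time limit, the chain rule implies that the change in loss at a point over training equals the negative average of its kernel values against the training set, which is the kind of link between loss dynamics and kernel machines the abstract refers to.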

Award ID(s): 2107189

PAR ID: 10518466

Author(s) / Creator(s): Chen, Yilan; Huang, Wei; Wang, Hao; Loh, Charlotte; Srivastava, Akash; Nguyen, Lam; Weng, Tsui-Wei

Publisher / Repository: NeurIPS 2023

Date Published: 2023-12-10

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Proceeding:
The DOI is not currently available.
