Title: k-Mixup Regularization for Deep Learning via Optimal Transport
Mixup is a popular regularization technique for training deep neural networks that improves generalization and increases robustness to certain distribution shifts. It perturbs input training data in the direction of other randomly-chosen instances in the training set. To better leverage the structure of the data, we extend mixup in a simple, broadly applicable way to k-mixup, which perturbs k-batches of training points in the direction of other k-batches. The perturbation is done with displacement interpolation, i.e. interpolation under the Wasserstein metric. We demonstrate theoretically and in simulations that k-mixup preserves cluster and manifold structures, and we extend theory studying the efficacy of standard mixup to the k-mixup case. Our empirical results show that training with k-mixup further improves generalization and robustness across several network architectures and benchmark datasets of differing modalities. For the wide variety of real datasets considered, the performance gains of k-mixup over standard mixup are similar to or larger than the gains of mixup itself over standard ERM after hyperparameter optimization. In several instances, in fact, k-mixup achieves gains in settings where standard mixup has negligible to zero improvement over ERM.
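To make the construction concrete, below is a minimal sketch of the k-mixup step described above (not the authors' implementation; the function name and the Beta-distributed mixing coefficient are illustrative). Two k-batches are matched by solving an optimal assignment under squared Euclidean cost, which for two uniform empirical measures of equal size realizes displacement interpolation, and the matched pairs are then convexly combined. Setting k = 1 recovers standard mixup with a random pairing.

```python
# Minimal k-mixup sketch (illustrative; not the authors' code).
import numpy as np
from scipy.optimize import linear_sum_assignment

def k_mixup(x_a, y_a, x_b, y_b, alpha=1.0, rng=None):
    """Mix one k-batch toward another via displacement interpolation.

    x_a, x_b: (k, d) inputs; y_a, y_b: (k, c) one-hot labels.
    """
    rng = rng or np.random.default_rng()
    # Squared Euclidean transport cost between the two k-batches.
    cost = ((x_a[:, None, :] - x_b[None, :, :]) ** 2).sum(axis=-1)
    # With uniform weights and equal batch sizes, optimal transport under
    # this cost reduces to an optimal assignment (a permutation).
    rows, cols = linear_sum_assignment(cost)
    lam = rng.beta(alpha, alpha)
    # Displacement interpolation: move each point part-way toward its match.
    x_mix = lam * x_a[rows] + (1.0 - lam) * x_b[cols]
    y_mix = lam * y_a[rows] + (1.0 - lam) * y_b[cols]
    return x_mix, y_mix
```

In a training loop, each minibatch would be split into k-point sub-batches and mixed against other sub-batches this way before the usual forward and backward pass.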
Award ID(s):
1838071
NSF-PAR ID:
10483956
Publisher / Repository:
OpenReview
Journal Name:
Transactions on Machine Learning Research
ISSN:
2835-8856
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In the Mixup training paradigm, a model is trained using convex combinations of data points and their associated labels. Despite seeing very few true data points during training, models trained using Mixup seem to still minimize the original empirical risk and exhibit better generalization and robustness on various tasks when compared to standard training. In this paper, we investigate how these benefits of Mixup training rely on properties of the data in the context of classification. For minimizing the original empirical risk, we compute a closed form for the Mixup-optimal classification, which allows us to construct a simple dataset on which minimizing the Mixup loss can provably lead to learning a classifier that does not minimize the empirical loss on the data. On the other hand, we also give sufficient conditions under which Mixup training minimizes the original empirical risk. For generalization, we characterize the margin of a Mixup classifier, and use this to understand why the decision boundary of a Mixup classifier can adapt better to the full structure of the training data when compared to standard training. In contrast, we also show that, for a large class of linear models and linearly separable datasets, Mixup training leads to learning the same classifier as standard training.
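For reference, here is a minimal sketch of the convex-combination construction this abstract refers to (names are illustrative; one common instantiation draws the mixing coefficient from a Beta distribution and pairs each example with a randomly permuted partner):

```python
# Minimal mixup sketch: convex combinations of inputs and one-hot labels.
import numpy as np

def mixup_batch(x, y, alpha=1.0, rng=None):
    """x: (n, d) inputs; y: (n, c) one-hot labels. Returns a mixed batch."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)            # mixing coefficient
    perm = rng.permutation(len(x))          # random partner for each example
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix
```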
  2. Mixup is a data augmentation technique that relies on training using random convex combinations of data points and their labels. In recent years, Mixup has become a standard primitive used in the training of state-of-the-art image classification models due to its demonstrated benefits over empirical risk minimization with regard to generalization and robustness. In this work, we try to explain some of this success from a feature learning perspective. We focus our attention on classification problems in which each class may have multiple associated features (or views) that can be used to predict the class correctly. Our main theoretical results demonstrate that, for a non-trivial class of data distributions with two features per class, training a 2-layer convolutional network using empirical risk minimization can lead to learning only one feature for almost all classes while training with a specific instantiation of Mixup succeeds in learning both features for every class. We also show empirically that these theoretical insights extend to the practical settings of image benchmarks modified to have additional synthetic features.
  3. Mixup is a popular data augmentation technique based on taking convex combinations of pairs of examples and their labels. This simple technique has been shown to substantially improve both the robustness and the generalization of the trained model. However, it is not well understood why such improvement occurs. In this paper, we provide theoretical analysis to demonstrate how using Mixup in training helps model robustness and generalization. For robustness, we show that minimizing the Mixup loss corresponds to approximately minimizing an upper bound of the adversarial loss. This explains why models obtained by Mixup training exhibit robustness to several kinds of adversarial attacks such as the Fast Gradient Sign Method (FGSM). For generalization, we prove that Mixup augmentation corresponds to a specific type of data-adaptive regularization which reduces overfitting. Our analysis provides new insights and a framework to understand Mixup.
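For context, here is a brief PyTorch sketch of the FGSM attack referenced above (the function name and the epsilon value are illustrative): the input is perturbed by epsilon times the sign of the input gradient of the loss.

```python
# Minimal FGSM sketch (illustrative): one-step sign-gradient perturbation.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    """Return adversarial examples x + epsilon * sign(grad_x loss)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()
```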
  4. Data augmentation techniques, such as simple image transformations and combinations, are highly effective at improving the generalization of computer vision models, especially when training data is limited. However, such techniques are fundamentally incompatible with differentially private learning approaches, due to the latter's built-in assumption that each training image's contribution to the learned model is bounded. In this paper, we investigate why naive applications of multi-sample data augmentation techniques, such as mixup, fail to achieve good performance and propose two novel data augmentation techniques specifically designed for the constraints of differentially private learning. Our first technique, DP-Mix_Self, achieves SoTA classification performance across a range of datasets and settings by performing mixup on self-augmented data. Our second technique, DP-Mix_Diff, further improves performance by incorporating synthetic data from a pre-trained diffusion model into the mixup process. 
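As one way to read "mixup on self-augmented data" (an assumption-laden illustration, not the DP-Mix_Self algorithm itself): mixed inputs can be formed from augmented copies of a single training example, so each mixed sample still depends on only one record, which is the kind of bounded per-example contribution that differentially private training assumes.

```python
# Assumption-laden sketch of mixing self-augmented copies of ONE example,
# so every mixed input is a function of a single training record.
# This is NOT the DP-Mix_Self algorithm, only an illustration of the idea.
import numpy as np

def self_mix(example, augment, n_copies=4, alpha=1.0, rng=None):
    """example: (d,) input; augment: callable producing a random augmentation."""
    rng = rng or np.random.default_rng()
    copies = np.stack([augment(example) for _ in range(n_copies)])
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(n_copies)
    # Mix each augmented copy with another copy of the same example;
    # the label is unchanged because all copies share one label.
    return lam * copies + (1.0 - lam) * copies[perm]
```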
  5. Purpose

    To develop a scan‐specific model that estimates and corrects k‐space errors made when reconstructing accelerated MRI data.

    Methods

    Scan‐specific artifact reduction in k‐space (SPARK) trains a convolutional‐neural‐network to estimate and correct k‐space errors made by an input reconstruction technique by back‐propagating from the mean‐squared‐error loss between an auto‐calibration signal (ACS) and the input technique’s reconstructed ACS. First, SPARK is applied to generalized autocalibrating partially parallel acquisitions (GRAPPA) and demonstrates improved robustness over other scan‐specific models, such as robust artificial‐neural‐networks for k‐space interpolation (RAKI) and residual‐RAKI. Subsequent experiments demonstrate that SPARK synergizes with residual‐RAKI to improve reconstruction performance. SPARK also improves reconstruction quality when applied to advanced acquisition and reconstruction techniques like 2D virtual coil (VC‐) GRAPPA, 2D LORAKS, 3D GRAPPA without an integrated ACS region, and 2D/3D wave‐encoded imaging. A minimal code sketch of this scan‐specific error‐correction idea is given after the Conclusion below.

    Results

    SPARK yields SSIM improvement and a 1.5–2× root mean squared error (RMSE) reduction when applied to GRAPPA, and improves robustness to ACS size across acceleration rates in comparison to other scan‐specific techniques. When applied to advanced reconstruction techniques such as residual‐RAKI, 2D VC‐GRAPPA, and LORAKS, SPARK achieves up to 20% RMSE improvement. Applied to 3D GRAPPA without a fully sampled ACS region, SPARK likewise reduces RMSE by ~2× and improves SSIM and perceived image quality. Finally, SPARK synergizes with non‐Cartesian 2D and 3D wave‐encoded imaging, reducing RMSE by 20–25% and providing qualitative improvements.

    Conclusion

    SPARK synergizes with physics‐based acquisition and reconstruction techniques to improve accelerated MRI by training scan‐specific models to estimate and correct reconstruction errors in k‐space.

     
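Below is a minimal, illustrative sketch of the scan‐specific training loop described in the Methods above. It is not the SPARK implementation: the tiny network, the real/imaginary channel layout, and the final correction step are assumptions made for illustration. A small CNN is fit to reproduce the k‐space error on the ACS region of an input reconstruction, and the learned correction is then added back to the full reconstructed k‐space.

```python
# Illustrative scan-specific k-space error-correction sketch (not SPARK itself).
import torch
import torch.nn as nn
import torch.nn.functional as F

class KSpaceErrorNet(nn.Module):
    """Tiny CNN mapping reconstructed k-space (real/imag channels) to an error estimate."""
    def __init__(self, channels: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, kernel_size=3, padding=1),
        )

    def forward(self, k):
        return self.net(k)

def fit_scan_specific(recon_acs, measured_acs, steps=500, lr=1e-3):
    """recon_acs, measured_acs: (1, 2, h, w) tensors holding the ACS region of
    the input reconstruction and the acquired calibration data, respectively."""
    model = KSpaceErrorNet()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    target_error = measured_acs - recon_acs   # the error the CNN should reproduce
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(model(recon_acs), target_error)
        loss.backward()
        opt.step()
    return model

# Correction step: add the predicted error to the full reconstructed k-space.
# model = fit_scan_specific(recon_acs, measured_acs)
# corrected_k = recon_k + model(recon_k)
```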