Wasserstein Coreset via Sinkhorn Loss

Yin, H; Qiu, Y; Wang, X

This content will become publicly available on February 1, 2026

Coreset selection, a technique for compressing large datasets while preserving performance, is crucial for modern machine learning. This paper presents a novel method for generating high-quality Wasserstein coresets using the Sinkhorn loss, a powerful tool with computational advantages. Existing approaches, however, suffer from numerical instability in Sinkhorn's algorithm. We address this by proposing stable algorithms for the computation and differentiation of the Sinkhorn optimization problem, including an analytical formula for the derivative of the Sinkhorn loss and a rigorous stability analysis of our method. Extensive experiments demonstrate that our approach significantly outperforms existing methods in sample selection quality and computational efficiency, and that it achieves a smaller Wasserstein distance.
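To make the instability mentioned in the abstract concrete, the following is a minimal sketch of the commonly used log-domain form of Sinkhorn's iterations, which avoids the overflow and underflow that the plain multiplicative updates can hit when the regularization parameter is small. It is a generic textbook-style stabilization, not the algorithm proposed in the paper; the function names `logsumexp` and `sinkhorn_log`, the parameter `eps`, and the default iteration count are illustrative assumptions.

```python
import numpy as np


def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along the given axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))


def sinkhorn_log(a, b, C, eps=0.1, n_iter=500):
    """Entropy-regularized optimal transport solved in the log domain.

    a: (n,) positive source weights summing to 1
    b: (m,) positive target weights summing to 1
    C: (n, m) cost matrix
    eps: regularization strength (hypothetical default)
    Returns the transport plan P and the transport cost sum(P * C).
    """
    log_a, log_b = np.log(a), np.log(b)
    f = np.zeros_like(a, dtype=float)  # dual potential for the rows
    g = np.zeros_like(b, dtype=float)  # dual potential for the columns
    for _ in range(n_iter):
        # Each update enforces one marginal constraint exactly.
        f = eps * (log_a - logsumexp((g[None, :] - C) / eps, axis=1))
        g = eps * (log_b - logsumexp((f[:, None] - C) / eps, axis=0))
    # Recover the plan from the potentials; no kernel exp(-C / eps) is formed.
    P = np.exp((f[:, None] + g[None, :] - C) / eps)
    return P, float(np.sum(P * C))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = rng.random((5, 2)), rng.random((4, 2))
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    a, b = np.full(5, 1 / 5), np.full(4, 1 / 4)
    P, cost = sinkhorn_log(a, b, C, eps=0.5, n_iter=1000)
    print(P.sum(axis=1), P.sum(axis=0), cost)
```

Working with the potentials `f` and `g` rather than the scaling vectors keeps every exponent bounded by a log-sum-exp shift, so the iteration stays finite even for small `eps`, at the price of more floating-point work per step.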

Award ID(s): 2316428

PAR ID: 10618856

Author(s) / Creator(s): Yin, H; Qiu, Y; Wang, X

Publisher / Repository: OpenReview.net

Date Published: 2025-02-01

Journal Name: Transactions on Machine Learning Research

ISSN: 2835-8856

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article: The DOI is not currently available.