NSF PAR Search | NSF Public Access Repository

On the existence of solutions to adversarial training in multiclass classification

https://doi.org/10.1017/S0956792524000822

García Trillos, Nicolás; Jacobs, Matt; Kim, Jakwang (December 2024, European Journal of Applied Mathematics)

Abstract Adversarial training is a min-max optimization problem that is designed to construct robust classifiers against adversarial perturbations of data. We study three models of adversarial training in the multiclass agnostic-classifier setting. We prove the existence of Borel measurable robust classifiers in each model and provide a unified perspective of the adversarial training problem, expanding the connections with optimal transport initiated by the authors in their previous work [21]. In addition, we develop new connections between adversarial training in the multiclass setting and total variation regularization. As a corollary of our results, we provide an alternative proof of the existence of Borel measurable solutions to the agnostic adversarial training problem in the binary classification setting.
Free, publicly-accessible full text available December 3, 2025
On adversarial robustness and the use of Wasserstein ascent-descent dynamics to enforce it

https://doi.org/10.1093/imaiai/iaae018

García Trillos, Camilo Andrés; García Trillos, Nicolás (July 2024, Information and Inference: A Journal of the IMA)

Abstract We propose iterative algorithms to solve adversarial training problems in a variety of supervised learning settings of interest. Our algorithms, which can be interpreted as suitable ascent-descent dynamics in Wasserstein spaces, take the form of a system of interacting particles. These interacting particle dynamics are shown to converge toward appropriate mean-field limit equations in certain large number of particles regimes. In turn, we prove that, under certain regularity assumptions, these mean-field equations converge, in the large time limit, toward approximate Nash equilibria of the original adversarial learning problems. We present results for non-convex non-concave settings, as well as for non-convex concave ones. Numerical experiments illustrate our results.
Full Text Available
A new perspective on denoising based on optimal transport

https://doi.org/10.1093/imaiai/iaae029

García Trillos, Nicolás; Sen, Bodhisattva (September 2024, Information and Inference: A Journal of the IMA)

Abstract In the standard formulation of the classical denoising problem, one is given a probabilistic model relating a latent variable $$\varTheta \in \varOmega \subset{\mathbb{R}}^{m} \; (m\ge 1)$$ and an observation $$Z \in{\mathbb{R}}^{d}$$ according to $$Z \mid \varTheta \sim p(\cdot \mid \varTheta )$$ and $$\varTheta \sim G^{*}$$, and the goal is to construct a map to recover the latent variable from the observation. The posterior mean, a natural candidate for estimating $$\varTheta $$ from $$Z$$, attains the minimum Bayes risk (under the squared error loss) but at the expense of over-shrinking the $$Z$$, and in general may fail to capture the geometric features of the prior distribution $$G^{*}$$ (e.g. low dimensionality, discreteness, sparsity). To rectify these drawbacks, in this paper we take a new perspective on this denoising problem that is inspired by optimal transport (OT) theory and use it to study a different, OT-based, denoiser at the population level setting. We rigorously prove that, under general assumptions on the model, this OT-based denoiser is mathematically well-defined and unique, and is closely connected to the solution to a Monge OT problem. We then prove that, under appropriate identifiability assumptions on the model, the OT-based denoiser can be recovered solely from information of the marginal distribution of $$Z$$ and the posterior mean of the model, after solving a linear relaxation problem over a suitable space of couplings that is reminiscent of standard multimarginal OT problems. In particular, due to Tweedie’s formula, when the likelihood model $$\{ p(\cdot \mid \theta ) \}_{\theta \in \varOmega }$$ is an exponential family of distributions, the OT-based denoiser can be recovered solely from the marginal distribution of $$Z$$. In general, our family of OT-like relaxations is of interest in its own right and for the denoising problem suggests alternative numerical methods inspired by the rich literature on computational OT.
Full Text Available
Rates of convergence for regression with the graph poly-Laplacian

https://doi.org/10.1007/s43670-023-00075-5

Trillos, Nicolás García; Murray, Ryan; Thorpe, Matthew (November 2023, Sampling Theory, Signal Processing, and Data Analysis)

Abstract In the (special) smoothing spline problem one considers a variational problem with a quadratic data fidelity penalty and Laplacian regularization. Higher order regularity can be obtained via replacing the Laplacian regulariser with a poly-Laplacian regulariser. The methodology is readily adapted to graphs and here we consider graph poly-Laplacian regularization in a fully supervised, non-parametric, noise corrupted, regression problem. In particular, given a dataset $$\{x_i\}_{i=1}^n$$ and a set of noisy labels $$\{y_i\}_{i=1}^n\subset \mathbb{R}$$ we let $$u_n{:}\{x_i\}_{i=1}^n\rightarrow \mathbb{R}$$ be the minimizer of an energy which consists of a data fidelity term and an appropriately scaled graph poly-Laplacian term. When $$y_i = g(x_i)+\xi_i$$, for iid noise $$\xi_i$$, and using the geometric random graph, we identify (with high probability) the rate of convergence of $$u_n$$ to $$g$$ in the large data limit $$n\rightarrow \infty$$. Furthermore, our rate is close to the known rate of convergence in the usual smoothing spline model.
Defending against diverse attacks in federated learning through consensus-based bi-level optimization

https://doi.org/10.1098/rsta.2024.0235

García Trillos, Nicolás; Kumar Akash, Aditya; Li, Sixu; Riedl, Konstantin; Zhu, Yuhua (June 2025, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences)

Adversarial attacks pose significant challenges in many machine learning applications, particularly in the setting of distributed training and federated learning, where malicious agents seek to corrupt the training process with the goal of jeopardizing and compromising the performance and reliability of the final models. In this paper, we address the problem of robust federated learning in the presence of such attacks by formulating the training task as a bi-level optimization problem. We conduct a theoretical analysis of the resilience of consensus-based bi-level optimization (CB²O), an interacting multi-particle metaheuristic optimization method, in adversarial settings. Specifically, we provide a global convergence analysis of CB²O in mean-field law in the presence of malicious agents, demonstrating the robustness of CB²O against a diverse range of attacks. Thereby, we offer insights into how specific hyperparameter choices make it possible to mitigate adversarial effects. On the practical side, we extend CB²O to the clustered federated learning setting by proposing FedCB²O, a novel interacting multi-particle system, and design a practical algorithm that addresses the demands of real-world applications. Extensive experiments demonstrate the robustness of the FedCB²O algorithm against label-flipping attacks in decentralized clustered federated learning scenarios, showcasing its effectiveness in practical contexts. This article is part of the theme issue ‘Partial differential equations in data science’.
Free, publicly-accessible full text available June 5, 2026
An Optimal Transport Approach for Computing Adversarial Training Lower Bounds in Multiclass Classification

García Trillos, N; Jacobs, M; Kim, J; Werenski, M (December 2024, Journal of Machine Learning Research)

Free, publicly-accessible full text available December 1, 2025
FedCBO: Reaching Group Consensus in Clustered Federated Learning through Consensus-based Optimization

Carrillo, Jose A; Garcia Trillos, Nicolas; Li, Sixu; Zhu, Yuhua (July 2024, Journal of Machine Learning Research)

Full Text Available
An Analytical and Geometric Perspective on Adversarial Robustness

https://doi.org/10.1090/noti2758

García Trillos, Nicolas; Jacobs, Matt (September 2023, Notices of the American Mathematical Society)

Full Text Available
