NSF PAR Search | NSF Public Access Repository

On the existence of solutions to adversarial training in multiclass classification

https://doi.org/10.1017/S0956792524000822

García Trillos, Nicolás; Jacobs, Matt; Kim, Jakwang (December 2024, European Journal of Applied Mathematics)

Abstract Adversarial training is a min-max optimization problem that is designed to construct robust classifiers against adversarial perturbations of data. We study three models of adversarial training in the multiclass agnostic-classifier setting. We prove the existence of Borel measurable robust classifiers in each model and provide a unified perspective of the adversarial training problem, expanding the connections with optimal transport initiated by the authors in their previous work [21]. In addition, we develop new connections between adversarial training in the multiclass setting and total variation regularization. As a corollary of our results, we provide an alternative proof of the existence of Borel measurable solutions to the agnostic adversarial training problem in the binary classification setting.
Free, publicly-accessible full text available December 3, 2025
Scaling limit of soliton lengths in a multicolor box-ball system

https://doi.org/10.1017/fms.2024.74

Lewis, Joel; Lyu, Hanbaek; Pylyavskyy, Pavlo; Sen, Arnab (December 2024, Forum of Mathematics, Sigma)

Abstract The box-ball systems are integrable cellular automata whose long-time behavior is characterized by soliton solutions, with rich connections to other integrable systems such as the Korteweg-de Vries equation. In this paper, we consider a multicolor box-ball system with two types of random initial configurations and obtain sharp scaling limits of the soliton lengths as the system size tends to infinity. We obtain a sharp scaling limit of soliton lengths that turns out to be more delicate than that in the single color case established in [LLP20]. A large part of our analysis is devoted to studying the associated carrier process, which is a multidimensional Markov chain on the orthant, whose excursions and running maxima are closely related to soliton lengths. We establish the sharp scaling of its ruin probabilities, Skorokhod decomposition, strong law of large numbers and weak diffusive scaling limit to a semimartingale reflecting Brownian motion with explicit parameters. We also establish and utilize complementary descriptions of the soliton lengths and numbers in terms of modified Greene-Kleitman invariants for the box-ball systems and associated circular exclusion processes.
Free, publicly-accessible full text available December 10, 2025
Maximum Likelihood Estimation for Unrooted 3-Leaf Trees: An Analytic Solution for the CFN Model

https://doi.org/10.1007/s11538-024-01340-x

Hill, Max; Roch, Sebastien; Rodriguez, Jose Israel (September 2024, Bulletin of Mathematical Biology)

Abstract Maximum likelihood estimation is among the most widely-used methods for inferring phylogenetic trees from sequence data. This paper solves the problem of computing solutions to the maximum likelihood problem for 3-leaf trees under the 2-state symmetric mutation model (CFN model). Our main result is a closed-form solution to the maximum likelihood problem for unrooted 3-leaf trees, given generic data; this result characterizes all of the ways that a maximum likelihood estimate can fail to exist for generic data and provides theoretical validation for predictions made in Parks and Goldman (Syst Biol 63(5):798–811, 2014). Our proof makes use of both classical tools for studying group-based phylogenetic models such as Hadamard conjugation and reparameterization in terms of Fourier coordinates, as well as more recent results concerning the semi-algebraic constraints of the CFN model. To be able to put these into practice, we also give a complete characterization to test genericity.
Full Text Available
On adversarial robustness and the use of Wasserstein ascent-descent dynamics to enforce it

https://doi.org/10.1093/imaiai/iaae018

García Trillos, Camilo Andrés; García Trillos, Nicolás (July 2024, Information and Inference: A Journal of the IMA)

Abstract We propose iterative algorithms to solve adversarial training problems in a variety of supervised learning settings of interest. Our algorithms, which can be interpreted as suitable ascent-descent dynamics in Wasserstein spaces, take the form of a system of interacting particles. These interacting particle dynamics are shown to converge toward appropriate mean-field limit equations in certain large number of particles regimes. In turn, we prove that, under certain regularity assumptions, these mean-field equations converge, in the large time limit, toward approximate Nash equilibria of the original adversarial learning problems. We present results for non-convex non-concave settings, as well as for non-convex concave ones. Numerical experiments illustrate our results.
Full Text Available
Solving, tracking and stopping streaming linear inverse problems

https://doi.org/10.1088/1361-6420/ad5583

Pritchard, Nathaniel; Patel, Vivak (June 2024, Inverse Problems)

Abstract In large-scale applications including medical imaging, collocation differential equation solvers, and estimation with differential privacy, the underlying linear inverse problem can be reformulated as a streaming problem. In theory, the streaming problem can be effectively solved using memory-efficient, exponentially-converging streaming solvers. In special cases when the underlying linear inverse problem is finite-dimensional, streaming solvers can periodically evaluate the residual norm at a substantial computational cost. When the underlying system is infinite-dimensional, streaming solvers can only access noisy estimates of the residual. While such noisy estimates are computationally efficient, they are useful only when their accuracy is known. In this work, we rigorously develop a general family of computationally-practical residual estimators and their uncertainty sets for streaming solvers, and we demonstrate the accuracy of our methods on a number of large-scale linear problems. Thus, we further enable the practical use of streaming solvers for important classes of linear inverse problems.
A new perspective on denoising based on optimal transport

https://doi.org/10.1093/imaiai/iaae029

García Trillos, Nicolás; Sen, Bodhisattva (September 2024, Information and Inference: A Journal of the IMA)

Abstract In the standard formulation of the classical denoising problem, one is given a probabilistic model relating a latent variable $$\varTheta \in \varOmega \subset{\mathbb{R}}^{m} \; (m\ge 1)$$ and an observation $$Z \in{\mathbb{R}}^{d}$$ according to $$Z \mid \varTheta \sim p(\cdot \mid \varTheta )$$ and $$\varTheta \sim G^{*}$$, and the goal is to construct a map to recover the latent variable from the observation. The posterior mean, a natural candidate for estimating $$\varTheta $$ from $$Z$$, attains the minimum Bayes risk (under the squared error loss) but at the expense of over-shrinking the $$Z$$, and in general may fail to capture the geometric features of the prior distribution $$G^{*}$$ (e.g. low dimensionality, discreteness, sparsity). To rectify these drawbacks, in this paper we take a new perspective on this denoising problem that is inspired by optimal transport (OT) theory and use it to study a different, OT-based, denoiser at the population level setting. We rigorously prove that, under general assumptions on the model, this OT-based denoiser is mathematically well-defined and unique, and is closely connected to the solution to a Monge OT problem. We then prove that, under appropriate identifiability assumptions on the model, the OT-based denoiser can be recovered solely from information of the marginal distribution of $$Z$$ and the posterior mean of the model, after solving a linear relaxation problem over a suitable space of couplings that is reminiscent of standard multimarginal OT problems. In particular, due to Tweedie’s formula, when the likelihood model $$\{ p(\cdot \mid \theta ) \}_{\theta \in \varOmega }$$ is an exponential family of distributions, the OT-based denoiser can be recovered solely from the marginal distribution of $$Z$$. In general, our family of OT-like relaxations is of interest in its own right and for the denoising problem suggests alternative numerical methods inspired by the rich literature on computational OT.
Full Text Available
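Note on the denoising entry above: the role of Tweedie's formula can be illustrated in the Gaussian special case. This case is an assumption chosen here for illustration and is not a setting stated in the abstract. Suppose $$m = d$$, $$Z \mid \varTheta \sim N(\varTheta , \sigma ^{2} I_{d})$$ with known $$\sigma > 0$$, and let $$f_{Z}$$ denote the marginal density of $$Z$$ (notation introduced here). Then the posterior mean is

$$\mathbb{E}[\varTheta \mid Z = z] = z + \sigma ^{2} \, \nabla _{z} \log f_{Z}(z).$$

The right-hand side depends only on the marginal distribution of $$Z$$, which is consistent with the abstract's statement that, for exponential-family likelihoods, the OT-based denoiser can be recovered from that marginal alone.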
Learning low-rank latent mesoscale structures in networks

https://doi.org/10.1038/s41467-023-42859-2

Lyu, Hanbaek; Kureh, Yacoub H.; Vendrow, Joshua; Porter, Mason A. (January 2024, Nature Communications)

Abstract Researchers in many fields use networks to represent interactions between entities in complex systems. To study the large-scale behavior of complex systems, it is useful to examine mesoscale structures in networks as building blocks that influence such behavior. In this paper, we present an approach to describe low-rank mesoscale structures in networks. We find that many real-world networks possess a small set of latent motifs that effectively approximate most subgraphs at a fixed mesoscale. Such low-rank mesoscale structures allow one to reconstruct networks by approximating subgraphs of a network using combinations of latent motifs. Employing subgraph sampling and nonnegative matrix factorization enables the discovery of these latent motifs. The ability to encode and reconstruct networks using a small set of latent motifs has many applications in network analysis, including network comparison, network denoising, and edge inference.
Summary Tests of Introgression Are Highly Sensitive to Rate Variation Across Lineages

https://doi.org/10.1093/sysbio/syad056

Frankel, Lauren E; Ané, Cécile; Folk, Ryan, ed. (September 2023, Systematic Biology)

Abstract The evolutionary implications and frequency of hybridization and introgression are increasingly being recognized across the tree of life. To detect hybridization from multi-locus and genome-wide sequence data, a popular class of methods is based on summary statistics from subsets of 3 or 4 taxa. However, these methods often carry the assumption of a constant substitution rate across lineages and genes, which is commonly violated in many groups. In this work, we quantify the effects of rate variation on the D test (also known as ABBA–BABA test), the D3 test, and HyDe. All 3 tests are used widely across a range of taxonomic groups, in part because they are very fast to compute. We consider rate variation across species lineages, across genes, their lineage-by-gene interaction, and rate variation across gene-tree edges. We simulated species networks according to a birth–death-hybridization process, so as to capture a range of realistic species phylogenies. For all 3 methods tested, we found a marked increase in the false discovery of reticulation (type-1 error rate) when there is rate variation across species lineages. The D3 test was the most sensitive, with around 80% type-1 error, such that D3 appears to be more sensitive to a departure from the clock than to the presence of reticulation. For all 3 tests, the power to detect hybridization events decreased as the number of hybridization events increased, indicating that multiple hybridization events can obscure one another if they occur within a small subset of taxa. Our study highlights the need to consider rate variation when using site-based summary statistics, and points to the advantages of methods that do not require assumptions on evolutionary rates across lineages or across genes.
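Note on the entry above: the D statistic behind the ABBA–BABA test is commonly defined from the counts of two discordant site patterns as D = (n_ABBA - n_BABA) / (n_ABBA + n_BABA). The following is a minimal sketch of that textbook definition only. The function name and the example counts are hypothetical and are not taken from the paper, and the sketch omits the significance test (for example, a block jackknife) that a real analysis would need.

```python
def d_statistic(n_abba: int, n_baba: int) -> float:
    """Return the D statistic from counts of ABBA and BABA site patterns.

    D is near 0 when the two discordant patterns are equally frequent and
    moves toward +1 or -1 as one pattern dominates.
    """
    if n_abba < 0 or n_baba < 0:
        raise ValueError("site-pattern counts must be non-negative")
    total = n_abba + n_baba
    if total == 0:
        # D is undefined when there are no informative sites.
        raise ValueError("no ABBA or BABA sites observed")
    return (n_abba - n_baba) / total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(d_statistic(60, 40))
```

Because the statistic only compares two pattern counts, anything other than gene flow that makes the two patterns unequally frequent can also shift D away from zero, which is the kind of sensitivity the abstract reports for rate variation across lineages.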
PhyloCoalSimulations: A Simulator for Network Multispecies Coalescent Models, Including a New Extension for the Inheritance of Gene Flow

https://doi.org/10.1093/sysbio/syad030

Fogg, John; Allman, Elizabeth S; Ané, Cécile; Thomson, Robert C, ed. (May 2023, Systematic Biology)

Abstract We consider the evolution of phylogenetic gene trees along phylogenetic species networks, according to the network multispecies coalescent process, and introduce a new network coalescent model with correlated inheritance of gene flow. This model generalizes two traditional versions of the network coalescent: with independent or common inheritance. At each reticulation, multiple lineages of a given locus are inherited from parental populations chosen at random, either independently across lineages or with positive correlation according to a Dirichlet process. This process may account for locus-specific probabilities of inheritance, for example. We implemented the simulation of gene trees under these network coalescent models in the Julia package PhyloCoalSimulations, which depends on PhyloNetworks and its powerful network manipulation tools. Input species phylogenies can be read in extended Newick format, either in numbers of generations or in coalescent units. Simulated gene trees can be written in Newick format, and in a way that preserves information about their embedding within the species network. This embedding can be used for downstream purposes, such as to simulate species-specific processes like rate variation across species, or for other scenarios as illustrated in this note. This package should be useful for simulation studies and simulation-based inference methods. The software is available open source with documentation and a tutorial at https://github.com/cecileane/PhyloCoalSimulations.jl.
Adversarial classification via distributional robustness with Wasserstein ambiguity

https://doi.org/10.1007/s10107-022-01796-6

Ho-Nguyen, Nam; Wright, Stephen J. (April 2022, Mathematical Programming)

Abstract We study a model for adversarial classification based on distributionally robust chance constraints. We show that under Wasserstein ambiguity, the model aims to minimize the conditional value-at-risk of the distance to misclassification, and we explore links to adversarial classification models proposed earlier and to maximum-margin classifiers. We also provide a reformulation of the distributionally robust model for linear classification, and show it is equivalent to minimizing a regularized ramp loss objective. Numerical experiments show that, despite the nonconvexity of this formulation, standard descent methods appear to converge to the global minimizer for this problem. Inspired by this observation, we show that, for a certain class of distributions, the only stationary point of the regularized ramp loss minimization problem is the global minimizer.