NSF PAR Search | NSF Public Access Repository


Large sample spectral analysis of graph-based multi-manifold clustering

Garcia Trillos, Nicolas; He, Pengfei; Li, Chenghui (May 2023, Journal of machine learning research)
Carreira-Perpinan, Miguel (Ed.)
In this work we study statistical properties of graph-based algorithms for multi-manifold clustering (MMC). In MMC the goal is to retrieve the multi-manifold structure underlying a given Euclidean data set, which is assumed to be sampled from a distribution on a union of manifolds M = M_1 ∪ ... ∪ M_N that may intersect each other and may have different dimensions. We investigate sufficient conditions that similarity graphs on data sets must satisfy in order for their corresponding graph Laplacians to capture the right geometric information to solve the MMC problem. Specifically, we provide high probability error bounds for the spectral approximation of a tensorized Laplacian on M by a suitable graph Laplacian built from the observations; the recovered tensorized Laplacian contains the geometric information of all the individual underlying manifolds. We provide an example of a family of similarity graphs, which we call annular proximity graphs with angle constraints, satisfying these sufficient conditions. We contrast our family of graphs with other constructions in the literature based on the alignment of tangent planes. Extensive numerical experiments expand the insights that our theory provides on the MMC problem.
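The role of the graph Laplacian spectrum in this abstract can be illustrated with a minimal sketch. It is not the paper's annular proximity graph with angle constraints: it assumes a plain ε-proximity graph, an unnormalized Laplacian, and two circles that do not intersect, so the harder intersecting case is not covered. The helper names `circle` and `epsilon_graph_laplacian` and all parameter values are hypothetical choices for illustration.

```python
import numpy as np

def circle(center, n, radius=1.0):
    """Return n evenly spaced points on a circle (a 1-D manifold in R^2)."""
    theta = 2.0 * np.pi * np.arange(n) / n
    return np.asarray(center) + radius * np.column_stack([np.cos(theta), np.sin(theta)])

def epsilon_graph_laplacian(points, eps):
    """Unnormalized Laplacian L = D - W of the eps-proximity graph (assumed construction)."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    weights = ((dists < eps) & (dists > 0)).astype(float)  # no self-loops
    degree = np.diag(weights.sum(axis=1))
    return degree - weights

# Two well-separated circles: the graph has exactly two connected components.
points = np.vstack([circle((0.0, 0.0), 50), circle((5.0, 0.0), 50)])
laplacian = epsilon_graph_laplacian(points, eps=0.3)
eigenvalues, eigenvectors = np.linalg.eigh(laplacian)

# The multiplicity of the zero eigenvalue equals the number of components.
num_components = int(np.sum(eigenvalues < 1e-8))

# Null-space eigenvectors are constant on each component, so rows of the
# spectral embedding coincide within a component and differ across components.
embedding = eigenvectors[:, :num_components]
# The 0.1 threshold is tuned to this two-cluster toy example only.
labels = (np.linalg.norm(embedding - embedding[0], axis=1) > 0.1).astype(int)
```

When the manifolds intersect, an ε-graph of this kind connects points across the intersection and the zero eigenvalue no longer separates the pieces, which is the situation the graph constructions discussed in the abstract are designed to address.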
Full Text Available
The multimarginal optimal transport formulation of adversarial multiclass classification

Garcia Trillos, Nicolas; Jacobs, Matt; Kim, Jakwang (January 2023, Journal of machine learning research)
Bubeck, Sebastien (Ed.)
We study a family of adversarial multiclass classification problems and provide equivalent reformulations in terms of: 1) a family of generalized barycenter problems introduced in the paper and 2) a family of multimarginal optimal transport problems where the number of marginals is equal to the number of classes in the original classification problem. These new theoretical results reveal a rich geometric structure of adversarial learning problems in multiclass classification and extend recent results restricted to the binary classification setting. A direct computational implication of our results is that by solving either the barycenter problem and its dual, or the MOT problem and its dual, we can recover the optimal robust classification rule and the optimal adversarial strategy for the original adversarial problem. Examples with synthetic and real data illustrate our results.
Full Text Available
Adversarial Classification: Necessary Conditions and Geometric Flows

Garcia Trillos, Nicolas; Murray, Ryan (July 2022, Journal of machine learning research)

Full Text Available
Clustering Dynamics on Graphs: From Spectral Clustering to Mean Shift Through Fokker–Planck Interpolation

https://doi.org/10.1007/978-3-030-93302-9_4

Craig, Katy; Garcia Trillos, Nicolas; Slepcev, Dejan (December 2021, Modeling and simulation in science engineering technology)
Bellomo, N.; Carrillo, J.A.; Tadmor, E. (Ed.)
In this work, we build a unifying framework to interpolate between density-driven and geometry-based algorithms for data clustering and, specifically, to connect the mean shift algorithm with spectral clustering at discrete and continuum levels. We seek this connection through the introduction of Fokker–Planck equations on data graphs. Besides introducing new forms of mean shift algorithms on graphs, we provide new theoretical insights into the behavior of the family of diffusion maps in the large sample limit, as well as new connections between diffusion maps and mean shift dynamics on a fixed graph. Several numerical examples illustrate our theoretical findings and highlight the benefits of interpolating density-driven and geometry-based clustering algorithms.
Full Text Available
Traditional and Accelerated Gradient Descent for Neural Architecture Search

https://doi.org/10.1007/978-3-030-80209-7_55

Garcia Trillos, Nicolas; Morales, Felix; Morales, Javier (July 2021, 978-3-030-80208-0)

Full Text Available
Geometric structure of graph Laplacian embeddings

Garcia Trillos, Nicolas; Hoffmann, Franca; Hosseini, Bamdad (March 2021, Journal of machine learning research)

Full Text Available
On the consistency of graph-based Bayesian semi-supervised learning and the scalability of sampling algorithms

Garcia Trillos, Nicolas; Kaplan, Zachary; Samakhoana, Thabo; Sanz-Alonso, Daniel (March 2020, Journal of machine learning research)

This paper considers a Bayesian approach to graph-based semi-supervised learning. We show that if the graph parameters are suitably scaled, the graph-posteriors converge to a continuum limit as the size of the unlabeled data set grows. This consistency result has profound algorithmic implications: we prove that when consistency holds, carefully designed Markov chain Monte Carlo algorithms have a uniform spectral gap, independent of the number of unlabeled inputs. Numerical experiments illustrate and complement the theory.
Full Text Available
Local Regularization of Noisy Point Clouds: Improved Global Geometric Estimates and Data Analysis

Garcia Trillos, Nicolas; Sanz-Alonso, Daniel; Yang, Ruiyi (August 2019, Journal of machine learning research)

Several data analysis techniques employ similarity relationships between data points to uncover the intrinsic dimension and geometric structure of the underlying data-generating mechanism. In this paper we work under the model assumption that the data is made of random perturbations of feature vectors lying on a low-dimensional manifold. We study two questions: how to define the similarity relationships over noisy data points, and what is the resulting impact of the choice of similarity in the extraction of global geometric information from the underlying manifold. We provide concrete mathematical evidence that using a local regularization of the noisy data to define the similarity improves the approximation of the hidden Euclidean distance between unperturbed points. Furthermore, graph-based objects constructed with the locally regularized similarity function satisfy better error bounds in their recovery of global geometric ones. Our theory is supported by numerical experiments that demonstrate that the gain in geometric understanding facilitated by local regularization translates into a gain in classification accuracy in simulated and real data.
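The effect described in this abstract can be sketched numerically. The abstract does not specify the regularization, so the example below assumes the simplest candidate: each noisy point is replaced by the average of the noisy points within a fixed radius. The manifold (a unit circle), the noise level, the radius, and the helper names `pairwise_distances` and `local_average` are all hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: clean points on the unit circle plus Gaussian noise.
n, sigma, radius = 400, 0.1, 0.3
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
clean = np.column_stack([np.cos(theta), np.sin(theta)])
noisy = clean + sigma * rng.normal(size=clean.shape)

def pairwise_distances(points):
    """Matrix of Euclidean distances between all pairs of rows."""
    return np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

def local_average(points, radius):
    """Replace each point by the mean of all points within `radius` of it
    (assumed form of local regularization; each point counts itself)."""
    neighbors = (pairwise_distances(points) < radius).astype(float)
    return neighbors @ points / neighbors.sum(axis=1, keepdims=True)

regularized = local_average(noisy, radius)

# Compare how well each data set reproduces the hidden clean distances.
true_dists = pairwise_distances(clean)
error_noisy = np.abs(pairwise_distances(noisy) - true_dists).mean()
error_regularized = np.abs(pairwise_distances(regularized) - true_dists).mean()
```

In this toy setting the averaged points should give a smaller mean distance error than the raw noisy points, because averaging damps the noise at the cost of a small bias from the curvature of the circle. The radius controls that trade-off.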
Full Text Available
