NSF PAR Search | NSF Public Access Repository


Two-sample statistics based on anisotropic kernels

https://doi.org/10.1093/imaiai/iaz018

Cheng, Xiuyuan; Cloninger, Alexander; Coifman, Ronald R. (December 2019, Information and Inference: A Journal of the IMA)

Abstract The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely many multivariate samples. When the distributions are locally low-dimensional, the proposed test can be made more powerful to distinguish certain alternatives by incorporating local covariance matrices and constructing an anisotropic kernel. The kernel matrix is asymmetric; it computes the affinity between $n$ data points and a set of $n_R$ reference points, where $n_R$ can be drastically smaller than $n$. While the proposed statistic can be viewed as a special class of Reproducing Kernel Hilbert Space MMD, the consistency of the test is proved, under mild assumptions of the kernel, as long as $\|p-q\| \sqrt{n} \to \infty$, and a finite-sample lower bound of the testing power is obtained. Applications to flow cytometry and diffusion MRI datasets are demonstrated, which motivate the proposed approach to compare distributions.
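The reference-point construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian form of the anisotropic kernel, the use of the mean squared difference of kernel averages as the statistic, and the names `anisotropic_affinity` and `reference_mmd` are all assumptions made for illustration.

```python
import numpy as np

def anisotropic_affinity(points, refs, covs):
    """Affinity between n data points and n_R reference points.

    points: (n, d) samples; refs: (n_R, d) reference points;
    covs: (n_R, d, d) local covariance matrix attached to each reference.
    Returns an (n_R, n) matrix with entries
    exp(-0.5 * (x - r)^T inv(cov_r) (x - r)).
    The Gaussian form is an assumption, not taken from the abstract.
    """
    diffs = points[None, :, :] - refs[:, None, :]      # (n_R, n, d)
    precisions = np.linalg.inv(covs)                   # (n_R, d, d)
    mahal = np.einsum("rnd,rde,rne->rn", diffs, precisions, diffs)
    return np.exp(-0.5 * mahal)

def reference_mmd(x, y, refs, covs):
    """Mean squared gap between the two samples' average affinities.

    Each sample is summarized by its mean affinity to every reference
    point, so the cost scales with n * n_R rather than n^2.
    """
    mean_x = anisotropic_affinity(x, refs, covs).mean(axis=1)
    mean_y = anisotropic_affinity(y, refs, covs).mean(axis=1)
    return float(np.mean((mean_x - mean_y) ** 2))
```

With identity covariances the kernel reduces to an isotropic Gaussian; passing locally estimated covariances stretches the kernel along the directions in which the data vary near each reference point.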
Classification Logit Two-Sample Testing by Neural Networks for Differentiating Near Manifold Densities

https://doi.org/10.1109/TIT.2022.3175691

Cheng, Xiuyuan; Cloninger, Alexander (October 2022, IEEE Transactions on Information Theory)

Full Text Available
Linear optimal transport embedding: provable Wasserstein classification for certain rigid transformations and perturbations

https://doi.org/10.1093/imaiai/iaac023

Moosmüller, Caroline; Cloninger, Alexander (September 2022, Information and Inference: A Journal of the IMA)

Abstract Discriminating between distributions is an important problem in a number of scientific fields. This motivated the introduction of Linear Optimal Transportation (LOT), which embeds the space of distributions into an $L^2$-space. The transform is defined by computing the optimal transport of each distribution to a fixed reference distribution and has a number of benefits when it comes to speed of computation and to determining classification boundaries. In this paper, we characterize a number of settings in which LOT embeds families of distributions into a space in which they are linearly separable. This is true in arbitrary dimension, and for families of distributions generated through perturbations of shifts and scalings of a fixed distribution. We also prove conditions under which the $L^2$ distance of the LOT embedding between two distributions in arbitrary dimension is nearly isometric to Wasserstein-2 distance between those distributions. This is of significant computational benefit, as one must only compute $N$ optimal transport maps to define the $N^2$ pairwise distances between $N$ distributions. We demonstrate the benefits of LOT on a number of distribution classification problems.
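The idea of embedding each distribution once and then reading off all pairwise distances can be sketched in the one-dimensional special case, where the optimal transport map from a uniform reference on $[0, 1]$ is simply the quantile function. This is an illustrative assumption, not the paper's general construction; the function names and the choice of 200 quantile levels are hypothetical.

```python
import numpy as np

def lot_embed_1d(samples, n_levels=200):
    """Embed a 1D sample as its quantile function on a fixed grid.

    In 1D the optimal transport map from the uniform reference on [0, 1]
    to a distribution is that distribution's quantile function, so the
    embedding is the map evaluated at midpoint quantile levels.
    """
    levels = (np.arange(n_levels) + 0.5) / n_levels
    return np.quantile(samples, levels)

def lot_distance(emb_a, emb_b):
    """Discretized L^2 distance between two embeddings."""
    return float(np.sqrt(np.mean((emb_a - emb_b) ** 2)))

def pairwise_lot_distances(embeddings):
    """All pairwise distances from a stacked (N, n_levels) array.

    Only N embeddings (N transport maps) are computed up front; the
    N^2 distances then need no further transport computations.
    """
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    return np.sqrt(np.mean(diffs ** 2, axis=-1))
```

In this 1D setting a shift of the samples by a constant moves every quantile by that constant, so the embedding distance between a sample and its shifted copy equals the size of the shift, which matches the Wasserstein-2 distance between the two empirical distributions.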
Full Text Available
StreaMRAK a streaming multi-resolution adaptive kernel algorithm

https://doi.org/10.1016/j.amc.2022.127112

Oslandsbotn, Andreas; Kereta, Željko; Naumova, Valeriya; Freund, Yoav; Cloninger, Alexander (August 2022, Applied Mathematics and Computation)

Full Text Available
Kernel Distance Measures for Time Series, Random Fields and Other Structured Data

https://doi.org/10.3389/fams.2021.787455

Das, Srinjoy; Mhaskar, Hrushikesh N.; Cloninger, Alexander (December 2021, Frontiers in Applied Mathematics and Statistics)

This paper introduces kdiff, a novel kernel-based measure for estimating distances between instances of time series, random fields and other forms of structured data. This measure is based on the idea of matching distributions that only overlap over a portion of their region of support. Our proposed measure is inspired by MPdist which has been previously proposed for such datasets and is constructed using Euclidean metrics, whereas kdiff is constructed using non-linear kernel distances. Also, kdiff accounts for both self and cross similarities across the instances and is defined using a lower quantile of the distance distribution. Comparing the cross similarity to self similarity allows for measures of similarity that are more robust to noise and partial occlusions of the relevant signals. Our proposed measure kdiff is a more general form of the well known kernel-based Maximum Mean Discrepancy distance estimated over the embeddings. Some theoretical results are provided for separability conditions using kdiff as a distance measure for clustering and classification problems where the embedding distributions can be modeled as two component mixtures. Applications are demonstrated for clustering of synthetic and real-life time series and image data, and the performance of kdiff is compared to competing distance measures for clustering.
Full Text Available
A deep network construction that adapts to intrinsic dimensionality beyond the domain

https://doi.org/10.1016/j.neunet.2021.06.004

Cloninger, Alexander; Klock, Timo (September 2021, Neural Networks)

Full Text Available
Cautious active clustering

https://doi.org/10.1016/j.acha.2021.02.002

Cloninger, A.; Mhaskar, H.N. (September 2021, Applied and Computational Harmonic Analysis)
Full Text Available
Natural Graph Wavelet Packet Dictionaries

https://doi.org/10.1007/s00041-021-09832-3

Cloninger, Alexander; Li, Haotian; Saito, Naoki (June 2021, Journal of Fourier Analysis and Applications)
Abstract We introduce a set of novel multiscale basis transforms for signals on graphs that utilize their “dual” domains by incorporating the “natural” distances between graph Laplacian eigenvectors, rather than simply using the eigenvalue ordering. These basis dictionaries can be seen as generalizations of the classical Shannon wavelet packet dictionary to arbitrary graphs, and do not rely on the frequency interpretation of Laplacian eigenvalues. We describe the algorithms (involving either vector rotations or orthogonalizations) to construct these basis dictionaries, use them to efficiently approximate graph signals through the best basis search, and demonstrate the strengths of these basis dictionaries for graph signals measured on sunflower graphs and street networks.
Full Text Available
Nonclosedness of sets of neural networks in Sobolev spaces

https://doi.org/10.1016/j.neunet.2021.01.007

Mahan, Scott; King, Emily J.; Cloninger, Alex (May 2021, Neural Networks)
Full Text Available
LDLE: Low Distortion Local Eigenmaps

Kohli, Dhruv; Cloninger, Alexander; Mishne, Gal (January 2021, Journal of Machine Learning Research)

Full Text Available
