NSF PAR Search | NSF Public Access Repository

Bi-stochastically normalized graph Laplacian: convergence to manifold Laplacian and robustness to outlier noise

https://doi.org/10.1093/imaiai/iaae026

Cheng, Xiuyuan; Landa, Boris (September 2024, Information and Inference: A Journal of the IMA)

Abstract Bi-stochastic normalization provides an alternative normalization of graph Laplacians in graph-based data analysis and can be computed efficiently by Sinkhorn–Knopp (SK) iterations. This paper proves the convergence of the bi-stochastically normalized graph Laplacian to the manifold (weighted-)Laplacian with rates, when $$n$$ data points are sampled i.i.d. from a general $$d$$-dimensional manifold embedded in a possibly high-dimensional space. Under a certain joint limit of $$n \to \infty$$ and kernel bandwidth $$\epsilon \to 0$$, the point-wise convergence rate of the graph Laplacian operator (under the 2-norm) is proved to be $$O(n^{-1/(d/2+3)})$$ at finite large $$n$$ up to log factors, achieved at the scaling $$\epsilon \sim n^{-1/(d/2+3)}$$. When the manifold data are corrupted by outlier noise, we prove point-wise consistency of the graph Laplacian at a rate that matches the rate for clean manifold data plus an additional term proportional to a bound on the inner products of the noise vectors among themselves and with the data vectors. Motivated by our analysis, which suggests that an approximate rather than exact bi-stochastic normalization achieves the same consistency rate, we propose an approximate and constrained matrix scaling problem that can be solved by SK iterations with early termination. Numerical experiments support our theoretical results and show the robustness of the bi-stochastically normalized graph Laplacian to high-dimensional outlier noise.
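As a rough illustration of the bi-stochastic scaling with early termination described in this abstract, the sketch below scales a symmetric Gaussian kernel matrix $$K$$ by a positive vector $$\eta$$ so that $$W = \mathrm{diag}(\eta) K \mathrm{diag}(\eta)$$ has row sums close to 1. The function names, the Gaussian kernel form $$\exp(-\|x_i - x_j\|^2/\epsilon)$$, the damped symmetric update, and the stopping tolerance are all assumptions made for illustration; the paper's constrained scaling problem and its exact kernel normalization are not reproduced here.

```python
import numpy as np

def gaussian_kernel(X, eps):
    """Symmetric kernel matrix K_ij = exp(-||x_i - x_j||^2 / eps) (assumed form)."""
    sq = np.sum(X**2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    D2 = np.maximum(D2, 0.0)  # clip tiny negative values caused by rounding
    return np.exp(-D2 / eps)

def symmetric_sinkhorn(K, tol=1e-3, max_iter=1000):
    """Find eta > 0 such that diag(eta) K diag(eta) has row sums near 1.

    Uses the damped symmetric update eta <- sqrt(eta / (K eta)) and stops
    early once the largest row-sum deviation drops below `tol`.
    Returns eta and the number of updates performed.
    """
    eta = np.ones(K.shape[0])
    for it in range(max_iter):
        K_eta = K @ eta
        # Row sums of diag(eta) K diag(eta) are eta * (K eta).
        if np.max(np.abs(eta * K_eta - 1.0)) < tol:
            return eta, it
        eta = np.sqrt(eta / K_eta)
    return eta, max_iter

# Hypothetical usage on random points in the plane.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
K_demo = gaussian_kernel(X_demo, eps=0.5)
eta_demo, n_updates = symmetric_sinkhorn(K_demo, tol=1e-3)
W_demo = eta_demo[:, None] * K_demo * eta_demo[None, :]
```

A loose `tol` plays the role of early termination in this sketch: the scaled matrix is only approximately bi-stochastic, which is the regime the abstract argues is sufficient for the consistency rate.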
Statistical inference using GLEaM model with spatial heterogeneity and correlation between regions

https://doi.org/10.1038/s41598-022-18775-8

Tan, Yixuan; Zhang, Yuan; Cheng, Xiuyuan; Zhou, Xiao-Hua (October 2022, Scientific Reports)

Abstract A better understanding of various patterns in the coronavirus disease 2019 (COVID-19) spread in different parts of the world is crucial to its prevention and control. Motivated by the previously developed Global Epidemic and Mobility (GLEaM) model, this paper proposes a new stochastic dynamic model to depict the evolution of COVID-19. The model allows spatial and temporal heterogeneity of transmission parameters and involves transportation between regions. Based on the proposed model, this paper also designs a two-step procedure for parameter inference, which utilizes the correlation between regions through a prior distribution that imposes graph Laplacian regularization on transmission parameters. Experiments on simulated data and real-world data in China and Europe indicate that the proposed model achieves higher accuracy in predicting the newly confirmed cases than baseline models.
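To make the phrase "graph Laplacian regularization" concrete, the sketch below computes the quadratic penalty $$\lambda \beta^\top L \beta$$ for a vector $$\beta$$ of per-region parameters, where $$L = D - A$$ is the combinatorial Laplacian of a region-adjacency matrix $$A$$. This is a generic form of such a penalty, assumed here for illustration; the adjacency weights, the parameter vector, and the names `graph_laplacian` and `laplacian_penalty` are hypothetical and are not taken from the paper's prior.

```python
import numpy as np

def graph_laplacian(A):
    """Combinatorial Laplacian L = D - A of a symmetric weight matrix A."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def laplacian_penalty(beta, A, lam=1.0):
    """Penalty lam * beta^T L beta.

    For symmetric A this equals lam * sum over edges i<j of
    A_ij * (beta_i - beta_j)^2, so strongly connected regions are
    pushed toward similar parameter values.
    """
    beta = np.asarray(beta, dtype=float)
    return lam * float(beta @ graph_laplacian(A) @ beta)

# Hypothetical three-region example: region 1 is linked to regions 0 and 2.
A_demo = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 2.0],
                   [0.0, 2.0, 0.0]])
beta_demo = np.array([1.0, 2.0, 4.0])
penalty_demo = laplacian_penalty(beta_demo, A_demo)
```

Under a Gaussian-type prior assumption, a penalty of this form would appear as the negative log prior up to an additive constant.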
The G-invariant graph Laplacian part II: Diffusion maps

https://doi.org/10.1016/j.acha.2024.101695

Rosen, Eitan; Cheng, Xiuyuan; Shkolnisky, Yoel (November 2024, Applied and Computational Harmonic Analysis)

Full Text Available
The G-invariant graph Laplacian Part I: Convergence rate and eigendecomposition

https://doi.org/10.1016/j.acha.2024.101637

Rosen, Eitan; Hoyos, Paulina; Cheng, Xiuyuan; Kileel, Joe; Shkolnisky, Yoel (July 2024, Applied and Computational Harmonic Analysis)

Full Text Available
Robust Inference of Manifold Density and Geometry by Doubly Stochastic Scaling

https://doi.org/10.1137/22M1516968

Landa, Boris; Cheng, Xiuyuan (September 2023, SIAM Journal on Mathematics of Data Science)

Full Text Available
Eigen-convergence of Gaussian kernelized graph Laplacian by manifold heat interpolation

https://doi.org/10.1016/j.acha.2022.06.003

Cheng, Xiuyuan; Wu, Nan (November 2022, Applied and Computational Harmonic Analysis)

Full Text Available
SpecNet2: Orthogonalization-free spectral embedding by neural networks

Chen, Ziyu; Li, Yingzhou; Cheng, Xiuyuan (August 2022, Proceedings of Mathematical and Scientific Machine Learning)

Full Text Available
Scaling positive random matrices: concentration and asymptotic convergence

https://doi.org/10.1214/22-ECP502

Landa, Boris (January 2022, Electronic Communications in Probability)

Full Text Available
Convergence of graph Laplacian with kNN self-tuned kernels

https://doi.org/10.1093/imaiai/iaab019

Cheng, Xiuyuan; Wu, Hau-Tieng (September 2021, Information and Inference: A Journal of the IMA)

Abstract The kernelized Gram matrix $$W$$ constructed from data points $$\{x_i\}_{i=1}^N$$ as $$W_{ij} = k_0( \frac{ \| x_i - x_j \|^2}{\sigma^2} )$$ is widely used in graph-based geometric data analysis and unsupervised learning. An important question is how to choose the kernel bandwidth $$\sigma$$, and a common practice, called the self-tuned kernel, adaptively sets a $$\sigma_i$$ at each point $$x_i$$ by the $$k$$-nearest neighbor (kNN) distance. When the $$x_i$$ are sampled from a $$d$$-dimensional manifold embedded in a possibly high-dimensional space, theoretical results on graph Laplacian convergence with self-tuned kernels, unlike those for fixed-bandwidth kernels, have been incomplete. This paper proves the convergence of the graph Laplacian operator $$L_N$$ to the manifold (weighted-)Laplacian for a new family of kNN self-tuned kernels $$W^{(\alpha)}_{ij} = k_0( \frac{ \| x_i - x_j \|^2}{ \epsilon \hat{\rho}(x_i) \hat{\rho}(x_j)}) / (\hat{\rho}(x_i)^\alpha \hat{\rho}(x_j)^\alpha)$$, where $$\hat{\rho}$$ is the bandwidth function estimated by kNN and the limiting operator is also parametrized by $$\alpha$$. When $$\alpha = 1$$, the limiting operator is the weighted manifold Laplacian $$\varDelta_p$$. Specifically, we prove the point-wise convergence of $$L_N f$$ and the convergence of the graph Dirichlet form, both with rates. Our analysis is based on first establishing a $$C^0$$ consistency for $$\hat{\rho}$$, which bounds the relative estimation error $$|\hat{\rho} - \bar{\rho}|/\bar{\rho}$$ uniformly with high probability, where $$\bar{\rho} = p^{-1/d}$$ and $$p$$ is the data density function. Our theoretical results reveal the advantage of the self-tuned kernel over the fixed-bandwidth kernel via smaller variance error in low-density regions. The algorithm requires no prior knowledge of $$d$$ or of the data density. The theoretical results are supported by numerical experiments on simulated data and hand-written digit image data.
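The kernel family $$W^{(\alpha)}$$ in this abstract can be sketched in a few lines. The code below assumes $$k_0(r) = e^{-r}$$ and takes $$\hat{\rho}(x_i)$$ to be the raw distance from $$x_i$$ to its $$k$$-th nearest neighbor; both choices, along with the function names, are illustrative assumptions, and any normalizing constant the paper applies to $$\hat{\rho}$$ is omitted.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Matrix of squared Euclidean distances ||x_i - x_j||^2."""
    sq = np.sum(X**2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    D2 = np.maximum(D2, 0.0)   # clip tiny negative values caused by rounding
    np.fill_diagonal(D2, 0.0)  # a point is at distance zero from itself
    return D2

def knn_bandwidth(D2, k):
    """rho_hat(x_i): distance from x_i to its k-th nearest neighbor.

    After partitioning each row, position 0 holds the point itself
    (distance zero), so position k holds the k-th nearest other point.
    """
    return np.sqrt(np.partition(D2, k, axis=1)[:, k])

def self_tuned_kernel(X, k, eps, alpha):
    """W_ij = k0(||x_i - x_j||^2 / (eps rho_i rho_j)) / (rho_i^alpha rho_j^alpha),
    with the assumed profile k0(r) = exp(-r)."""
    D2 = pairwise_sq_dists(X)
    rho = knn_bandwidth(D2, k)
    scale = eps * np.outer(rho, rho)
    W = np.exp(-D2 / scale) / np.outer(rho**alpha, rho**alpha)
    return W, rho

# Hypothetical usage: 300 points in 3 dimensions, k = 10 neighbors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 3))
W_demo, rho_demo = self_tuned_kernel(X_demo, k=10, eps=1.0, alpha=1.0)
```

Note that neither the intrinsic dimension $$d$$ nor the density $$p$$ appears as an input, consistent with the abstract's remark that no prior knowledge of either is needed.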
Full Text Available
