Abstract Kernelized Gram matrix $$W$$ constructed from data points $$\{x_i\}_{i=1}^N$$ as $$W_{ij}= k_0( \frac{ \| x_i - x_j \|^2} {\sigma ^2} ) $$ is widely used in graph-based geometric data analysis and unsupervised learning. An important question is how to choose the kernel bandwidth $$\sigma $$, and a common practice called self-tuned kernel adaptively sets a $$\sigma _i$$ at each point $$x_i$$ by the $$k$$-nearest neighbor (kNN) distance. When $$x_i$$s are sampled from a $$d$$-dimensional manifold embedded in a possibly high-dimensional space, unlike with fixed-bandwidth kernels, theoretical results of graph Laplacian convergence with self-tuned kernels have been incomplete. This paper proves the convergence of graph Laplacian operator $$L_N$$ to manifold (weighted-)Laplacian for a new family of kNN self-tuned kernels $$W^{(\alpha )}_{ij} = k_0( \frac{ \| x_i - x_j \|^2}{ \epsilon \hat{\rho }(x_i) \hat{\rho }(x_j)})/\hat{\rho }(x_i)^\alpha \hat{\rho }(x_j)^\alpha $$, where $$\hat{\rho }$$ is the estimated bandwidth function by kNN and the limiting operator is also parametrized by $$\alpha $$. When $$\alpha = 1$$, the limiting operator is the weighted manifold Laplacian $$\varDelta _p$$. Specifically, we prove the point-wise convergence of $$L_N f $$ and convergence of the graph Dirichlet form with rates. Our analysis is based on first establishing a $C^0$ consistency for $$\hat{\rho }$$ which bounds the relative estimation error $$|\hat{\rho } - \bar{\rho }|/\bar{\rho }$$ uniformly with high probability, where $$\bar{\rho } = p^{-1/d}$$ and $$p$$ is the data density function. Our theoretical results reveal the advantage of the self-tuned kernel over the fixed-bandwidth kernel via smaller variance error in low-density regions. In the algorithm, no prior knowledge of $$d$$ or data density is needed. The theoretical results are supported by numerical experiments on simulated data and hand-written digit image data.
more »
« less
Optimal orthogonal group synchronization and rotation group synchronization
Abstract We study the statistical estimation problem of orthogonal group synchronization and rotation group synchronization. The model is $$Y_{ij} = Z_i^* Z_j^{*T} + \sigma W_{ij}\in{\mathbb{R}}^{d\times d}$$ where $$W_{ij}$$ is a Gaussian random matrix and $$Z_i^*$$ is either an orthogonal matrix or a rotation matrix, and each $$Y_{ij}$$ is observed independently with probability $$p$$. We analyze an iterative polar decomposition algorithm for the estimation of $Z^*$ and show it has an error of $$(1+o(1))\frac{\sigma ^2 d(d-1)}{2np}$$ when initialized by spectral methods. A matching minimax lower bound is further established that leads to the optimality of the proposed algorithm as it achieves the exact minimax risk.
more »
« less
- Award ID(s):
- 1847590
- PAR ID:
- 10398202
- Publisher / Repository:
- Oxford University Press
- Date Published:
- Journal Name:
- Information and Inference: A Journal of the IMA
- Volume:
- 12
- Issue:
- 2
- ISSN:
- 2049-8772
- Page Range / eLocation ID:
- p. 591-632
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract In synchronization problems, the goal is to estimate elements of a group from noisy measurements of their ratios. A popular estimation method for synchronization is the spectral method. It extracts the group elements from eigenvectors of a block matrix formed from the measurements. The eigenvectors must be projected, or ‘rounded’, onto the group. The rounding procedures are constructed ad hoc and increasingly so when applied to synchronization problems over non-compact groups. In this paper, we develop a spectral approach to synchronization over the non-compact group $$\mathrm{SE}(3)$$, the group of rigid motions of $$\mathbb{R}^{3}$$. We based our method on embedding $$\mathrm{SE}(3)$$ into the algebra of dual quaternions, which has deep algebraic connections with the group $$\mathrm{SE}(3)$$. These connections suggest a natural rounding procedure considerably more straightforward than the current state of the art for spectral $$\mathrm{SE}(3)$$ synchronization, which uses a matrix embedding of $$\mathrm{SE}(3)$$. We show by numerical experiments that our approach yields comparable results with the current state of the art in $$\mathrm{SE}(3)$$ synchronization via the spectral method. Thus, our approach reaps the benefits of the dual quaternion embedding of $$\mathrm{SE}(3)$$ while yielding estimators of similar quality.more » « less
-
We consider the problem of estimating a $$p$$ -dimensional vector $$\beta$$ from $$n$$ observations $$Y=X\beta+W$$ , where $$\beta_{j}\mathop{\sim}^{\mathrm{i.i.d}.}\pi$$ for a real-valued distribution $$\pi$$ with zero mean and unit variance’ $$X_{ij}\mathop{\sim}^{\mathrm{i.i.d}.}\mathcal{N}(0,1)$$ , and $$W_{i}\mathop{\sim}^{\mathrm{i.i.d}.}\mathcal{N}(0,\ \sigma^{2})$$ . In the asymptotic regime where $$n/p\rightarrow\delta$$ and $$p/\sigma^{2}\rightarrow$$ snr for two fixed constants $$\delta,\ \mathsf{snr}\in(0,\ \infty)$$ as $$p\rightarrow\infty$$ , the limiting (normalized) minimum mean-squared error (MMSE) has been characterized by a single-letter (additive Gaussian scalar) channel. In this paper, we show that if the MMSE function of the single-letter channel converges to a step function, then the limiting MMSE of estimating $$\beta$$ converges to a step function which jumps from 1 to 0 at a critical threshold. Moreover, we establish that the limiting mean-squared error of the (MSE-optimal) approximate message passing algorithm also converges to a step function with a larger threshold, providing evidence for the presence of a computational-statistical gap between the two thresholds.more » « less
-
null (Ed.)Abstract We propose a general framework for solving the group synchronization problem, where we focus on the setting of adversarial or uniform corruption and sufficiently small noise. Specifically, we apply a novel message passing procedure that uses cycle consistency information in order to estimate the corruption levels of group ratios and consequently solve the synchronization problem in our setting. We first explain why the group cycle consistency information is essential for effectively solving group synchronization problems. We then establish exact recovery and linear convergence guarantees for the proposed message passing procedure under a deterministic setting with adversarial corruption. These guarantees hold as long as the ratio of corrupted cycles per edge is bounded by a reasonable constant. We also establish the stability of the proposed procedure to sub-Gaussian noise. We further establish exact recovery with high probability under a common uniform corruption model.more » « less
-
The algorithmic advancement of synchronizing maps is important in order to solve a wide range of practice problems with possible large-scale dataset. In this paper, we provide theoretical justifications for spectral techniques for the map synchronization problem, i.e., it takes as input a collection of objects and noisy maps estimated between pairs of objects, and outputs clean maps between all pairs of objects. We show that a simple normalized spectral method that projects the blocks of the top eigenvectors of a data matrix to the map space leads to surprisingly good results. As the noise is modelled naturally as random permutation matrix, this algorithm NormSpecSync leads to competing theoretical guarantees as state-of-the-art convex optimization techniques, yet it is much more efficient. We demonstrate the usefulness of our algorithm in a couple of applications, where it is optimal in both complexity and exactness among existing methods.more » « less