Free, publicly accessible full text available March 1, 2024

Abstract Kernelized Gram matrix $W$ constructed from data points $\{x_i\}_{i=1}^N$ as $W_{ij} = k_0( \frac{\| x_i - x_j \|^2}{\sigma^2} )$ is widely used in graph-based geometric data analysis and unsupervised learning. An important question is how to choose the kernel bandwidth $\sigma$, and a common practice called self-tuned kernel adaptively sets a $\sigma_i$ at each point $x_i$ by the $k$-nearest neighbor (kNN) distance. When $x_i$'s are sampled from a $d$-dimensional manifold embedded in a possibly high-dimensional space, unlike with fixed-bandwidth kernels, theoretical results of graph Laplacian convergence with self-tuned kernels have been incomplete. This paper proves the convergence of graph Laplacian operator $L_N$ to manifold (weighted-)Laplacian for a new family of kNN self-tuned kernels $W^{(\alpha)}_{ij} = k_0( \frac{\| x_i - x_j \|^2}{\epsilon \hat{\rho}(x_i) \hat{\rho}(x_j)} ) / \hat{\rho}(x_i)^\alpha \hat{\rho}(x_j)^\alpha$, where $\hat{\rho}$ is the estimated bandwidth function by kNN and the limiting operator is also parametrized by $\alpha$. When $\alpha = 1$, the limiting operator is the weighted manifold Laplacian $\varDelta_p$. Specifically, we prove the pointwise convergence of $L_N f$ and convergence of the graph Dirichlet form with rates. Our analysis is based on first establishing a $C^0$ consistency for $\hat{\rho}$ which bounds the relative estimation error $|\hat{\rho} - \bar{\rho}|/\bar{\rho}$ uniformly with high probability, where $\bar{\rho} = p^{-1/d}$ and $p$ is the data density function. Our theoretical results reveal the advantage of the self-tuned kernel over the fixed-bandwidth kernel via smaller variance error in low-density regions. In the algorithm, no prior knowledge of $d$ or data density is needed. The theoretical results are supported by numerical experiments on simulated data and handwritten digit image data.
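
The kernel family in the abstract above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, $k_0$ is taken to be $e^{-r}$, $\hat{\rho}(x_i)$ is taken to be the raw distance from $x_i$ to its $k$-th nearest neighbor (the paper may normalize it differently), and the random-walk Laplacian omits any scaling constants needed for convergence. It also assumes no duplicate points, so that every $\hat{\rho}(x_i)$ is positive.

```python
import numpy as np

def knn_bandwidth(X, k):
    """Return (rho, D2): k-th nearest neighbor distances and squared distances."""
    diff = X[:, None, :] - X[None, :, :]
    D2 = np.sum(diff**2, axis=-1)          # pairwise squared distances, zero diagonal
    # After sorting a row, column 0 is the point itself, so column k is the kNN.
    rho = np.sqrt(np.sort(D2, axis=1)[:, k])
    return rho, D2

def self_tuned_kernel(X, k, eps, alpha, k0=lambda r: np.exp(-r)):
    """W_ij = k0(|x_i - x_j|^2 / (eps rho_i rho_j)) / (rho_i rho_j)^alpha."""
    rho, D2 = knn_bandwidth(X, k)
    rr = np.outer(rho, rho)                # rho_i * rho_j
    W = k0(D2 / (eps * rr)) / rr**alpha
    return W, rho

def random_walk_laplacian(W):
    """Unscaled random-walk graph Laplacian D^{-1} W - I (an assumed normalization)."""
    return W / W.sum(axis=1, keepdims=True) - np.eye(W.shape[0])
```

For example, `W, rho = self_tuned_kernel(X, k=5, eps=1.0, alpha=1.0)` for an array `X` of shape `(N, D)` gives a symmetric `N × N` affinity matrix. No estimate of the intrinsic dimension $d$ or of the density $p$ enters the construction, which matches the abstract's remark that the algorithm needs neither.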

Abstract The $p$-Laplacian has attracted more and more attention in data analysis disciplines in the past decade. However, there is still a knowledge gap about its behavior, which limits its practical application. In this paper, we are interested in its iterative behavior in domains contained in two-dimensional Euclidean space. Given a connected set $\varOmega_0 \subset \mathbb{R}^2$, define a sequence of sets $(\varOmega_n)_{n=0}^{\infty}$ where $\varOmega_{n+1}$ is the subset of $\varOmega_n$ where the first eigenfunction of the (properly normalized) Neumann $p$-Laplacian $-\varDelta^{(p)} \phi = \lambda_1 |\phi|^{p-2} \phi$ is positive (or negative). For $p=1$, this is also referred to as the ratio cut of the domain. We conjecture that these sets converge to the set of rectangles with eccentricity bounded by 2 in the Gromov–Hausdorff distance as long as they have a certain distance to the boundary $\partial \varOmega_0$. We establish some aspects of this conjecture for $p=1$ where we prove that (1) the 1-Laplacian spectral cut of domains sufficiently close to rectangles is a circular arc that is closer to flat than the original domain (leading eventually to quadrilaterals) and (2) quadrilaterals close to a rectangle of aspect ratio $2$ stay close to quadrilaterals and move closer to rectangles in a suitable metric. We also discuss some numerical aspects and pose many open questions.

Summary Periodicity and trend are features describing an observed sequence, and extracting these features is an important issue in many scientific fields. However, it is not an easy task for existing methods to analyse simultaneously the trend and dynamics of the periodicity such as time varying frequency and amplitude, and the adaptivity of the analysis to such dynamics and robustness to heteroscedastic dependent errors are not guaranteed. These tasks become even more challenging when there are multiple periodic components. We propose a nonparametric model to describe the dynamics of multicomponent periodicity and investigate the recently developed synchrosqueezing transform in extracting these features in the presence of a trend and heteroscedastic dependent errors. The identifiability problem of the nonparametric periodicity model is studied, and the adaptivity and robustness properties of the synchrosqueezing transform are theoretically justified in both discrete and continuous time settings. Consequently we have a new technique for decoupling the trend, periodicity and heteroscedastic, dependent error process in a general nonparametric setup. Results of a series of simulations are provided, and the incidence time series of varicella and herpes zoster in Taiwan and respiratory signals observed from a sleep study are analysed.