Title: Convergence analysis of kernel conjugate gradient for functional linear regression
In this paper, we present a convergence analysis of a conjugate gradient-based algorithm for the functional linear model in the reproducing kernel Hilbert space framework, using early stopping as regularization against over-fitting. We establish convergence rates that depend on the regularity of the slope function and on the decay rate of the eigenvalues of the composition of the covariance and kernel operators. Our convergence rates match the minimax rate available in the literature.
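As a rough illustration of the estimator's structure (a minimal sketch under assumptions, not the paper's exact algorithm or notation): conjugate gradient is applied to an empirical kernel system $Kc = y$, where the matrix `K` below is a hypothetical stand-in for the sample version of the composed covariance-kernel operator, and regularization comes from truncating the iteration rather than from a penalty term.

```python
import numpy as np

def kernel_cg_iterates(K, y, m):
    """Run m conjugate-gradient steps on the SPD system K c = y and
    return all iterates c_1, ..., c_m; truncating at a small m plays
    the role of early-stopping regularization."""
    c = np.zeros_like(y)
    r = y - K @ c                      # current residual
    p = r.copy()                       # search direction
    iterates = []
    for _ in range(m):
        Kp = K @ p
        alpha = (r @ r) / (p @ Kp)     # exact line-search step size
        c = c + alpha * p
        r_new = r - alpha * Kp
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p           # K-conjugate direction update
        r = r_new
        iterates.append(c.copy())
        if np.linalg.norm(r) < 1e-12:  # system solved exactly; stop
            break
    return iterates
```

A data-driven stopping rule (e.g., the discrepancy principle or held-out prediction error) would then select one iterate instead of running the recursion to convergence.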
Award ID(s):
1945396
PAR ID:
10498839
Author(s) / Creator(s):
; ;
Publisher / Repository:
Journal of Applied and Numerical Analysis
Date Published:
Journal Name:
Journal of Applied and Numerical Analysis
Volume:
1
Issue:
1
ISSN:
2786-815X
Page Range / eLocation ID:
33 to 47
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The kernelized Gram matrix $W$ constructed from data points $\{x_i\}_{i=1}^N$ as $W_{ij} = k_0(\|x_i - x_j\|^2/\sigma^2)$ is widely used in graph-based geometric data analysis and unsupervised learning. An important question is how to choose the kernel bandwidth $\sigma$; a common practice called the self-tuned kernel adaptively sets a $\sigma_i$ at each point $x_i$ by the $k$-nearest neighbor (kNN) distance. When the $x_i$ are sampled from a $d$-dimensional manifold embedded in a possibly high-dimensional space, theoretical results on graph Laplacian convergence with self-tuned kernels have been incomplete, unlike with fixed-bandwidth kernels. This paper proves the convergence of the graph Laplacian operator $L_N$ to the manifold (weighted-)Laplacian for a new family of kNN self-tuned kernels $W^{(\alpha)}_{ij} = k_0\!\left(\frac{\|x_i - x_j\|^2}{\epsilon\,\hat{\rho}(x_i)\hat{\rho}(x_j)}\right)/\left(\hat{\rho}(x_i)^\alpha \hat{\rho}(x_j)^\alpha\right)$, where $\hat{\rho}$ is the bandwidth function estimated by kNN, and the limiting operator is also parametrized by $\alpha$. When $\alpha = 1$, the limiting operator is the weighted manifold Laplacian $\varDelta_p$. Specifically, we prove the point-wise convergence of $L_N f$ and the convergence of the graph Dirichlet form, with rates. Our analysis is based on first establishing a $C^0$ consistency for $\hat{\rho}$ which bounds the relative estimation error $|\hat{\rho} - \bar{\rho}|/\bar{\rho}$ uniformly with high probability, where $\bar{\rho} = p^{-1/d}$ and $p$ is the data density function. Our theoretical results reveal the advantage of the self-tuned kernel over the fixed-bandwidth kernel via a smaller variance error in low-density regions. The algorithm requires no prior knowledge of $d$ or of the data density. The theoretical results are supported by numerical experiments on simulated data and hand-written digit image data.
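For concreteness, a minimal sketch of the self-tuned construction under assumed choices (profile $k_0(t) = e^{-t}$ and a random-walk Laplacian convention, neither of which is dictated by the abstract):

```python
import numpy as np
from scipy.spatial.distance import cdist

def self_tuned_kernel(X, k=7, eps=1.0, alpha=1.0):
    """Self-tuned kernel W^(alpha): bandwidths rho_hat from the kNN
    distance, with k0(t) = exp(-t) as an illustrative profile."""
    D2 = cdist(X, X, 'sqeuclidean')
    # rho_hat(x_i) = distance from x_i to its k-th nearest neighbor
    # (column 0 of each sorted row is the zero self-distance)
    rho = np.sqrt(np.sort(D2, axis=1)[:, k])
    W = np.exp(-D2 / (eps * np.outer(rho, rho)))
    return W / np.outer(rho**alpha, rho**alpha)

def graph_laplacian(W):
    """One common convention: random-walk normalization D^{-1} W - I."""
    d = W.sum(axis=1)
    return W / d[:, None] - np.eye(len(d))
```

Note that neither the intrinsic dimension $d$ nor a density estimate enters the construction, consistent with the abstract's remark that no such prior knowledge is needed.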
  2. Radial basis functions (RBFs) are prominent examples of reproducing kernels with associated reproducing kernel Hilbert spaces (RKHSs). The convergence theory for kernel-based interpolation in such spaces is well understood, and optimal rates for the whole RKHS are often known. Schaback added the doubling trick [Math. Comp. 68 (1999), pp. 201–216], which shows that functions having double the smoothness required by the RKHS (along with specific, albeit complicated, boundary behavior) can be approximated with higher convergence rates than the optimal rates for the whole space. Other advances allowed interpolation of less smooth target functions and error measured in different norms. The current state of the art of error analysis for RBF interpolation treats target functions having smoothness up to twice that of the native space, but with error measured in norms weaker than the one required for membership in the RKHS. Motivated by the fact that the kernels and the approximants they generate are smoother than the native space requires, this article extends the doubling trick to error measured in norms of higher smoothness. This extension holds for a family of kernels satisfying easily checked hypotheses which we describe in this article, and includes many prominent RBFs. In the course of the proof, new convergence rates are obtained for the abstract operator considered by DeVore and Ron in [Trans. Amer. Math. Soc. 362 (2010), pp. 6205–6229], and new Bernstein estimates are obtained relating high-order smoothness norms to the native space norm.
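As a toy illustration of the basic objects involved (not the article's estimates), kernel interpolation amounts to solving a Gram system; the doubling trick concerns how fast such interpolants converge when the target is smoother than the native space requires:

```python
import numpy as np

def rbf_interpolate(x_train, y_train, x_eval, sigma=0.5):
    """1-D kernel interpolant s(x) = sum_j c_j k(x, x_j) with a Gaussian
    RBF; the coefficients solve the (positive definite) Gram system."""
    def gram(a, b):
        return np.exp(-(a[:, None] - b[None, :])**2 / (2 * sigma**2))
    c = np.linalg.solve(gram(x_train, x_train), y_train)
    return gram(x_eval, x_train) @ c

# Interpolate a smooth target; refining the nodes and tracking the error
# exhibits the convergence rates that doubling-trick arguments quantify.
x = np.linspace(0.0, 1.0, 20)
xx = np.linspace(0.0, 1.0, 200)
s = rbf_interpolate(x, np.sin(2 * np.pi * x), xx)
```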
  3. Bi-stochastic normalization provides an alternative normalization of graph Laplacians in graph-based data analysis and can be computed efficiently by Sinkhorn–Knopp (SK) iterations. This paper proves the convergence, with rates, of the bi-stochastically normalized graph Laplacian to the manifold (weighted-)Laplacian when $n$ data points are i.i.d. sampled from a general $d$-dimensional manifold embedded in a possibly high-dimensional space. Under a certain joint limit of $n \to \infty$ and kernel bandwidth $\epsilon \to 0$, the point-wise convergence rate of the graph Laplacian operator (in 2-norm) is proved to be $O(n^{-1/(d/2+3)})$ at finite large $n$ up to log factors, achieved at the scaling $\epsilon \sim n^{-1/(d/2+3)}$. When the manifold data are corrupted by outlier noise, we theoretically prove point-wise consistency of the graph Laplacian, matching the rate for clean manifold data up to an additional term proportional to the boundedness of the inner products of the noise vectors among themselves and with the data vectors. Motivated by our analysis, which suggests that an approximate rather than exact bi-stochastic normalization achieves the same consistency rate, we propose an approximate and constrained matrix scaling problem that can be solved by SK iterations with early termination. Numerical experiments support our theoretical results and show the robustness of the bi-stochastically normalized graph Laplacian to high-dimensional outlier noise.
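A minimal sketch of the symmetric SK scaling with early termination (the damped update and the tolerance are illustrative choices, not the paper's prescriptions):

```python
import numpy as np

def sinkhorn_knopp(K, tol=1e-4, max_iter=500):
    """Find eta > 0 so that W = diag(eta) K diag(eta) has unit row sums;
    stopping at a loose tol yields the kind of approximate scaling the
    analysis suggests already attains the same consistency rate."""
    eta = np.ones(K.shape[0])
    for _ in range(max_iter):
        row_sums = (K @ eta) * eta     # row sums of diag(eta) K diag(eta)
        if np.max(np.abs(row_sums - 1.0)) < tol:
            break                      # early termination
        eta = eta / np.sqrt(row_sums)  # damped symmetric SK update
    return eta[:, None] * K * eta[None, :]
```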
  4. This paper presents a convergence analysis of kernel-based quadrature rules in misspecified settings, focusing on deterministic quadrature in Sobolev spaces. In particular, we deal with misspecified settings where a test integrand is less smooth than the Sobolev RKHS on which a quadrature rule is based. We provide convergence guarantees under two different assumptions on a quadrature rule: one on the quadrature weights, and the other on the design points. More precisely, we show that convergence rates can be derived (i) if the sum of absolute weights remains constant (or does not increase quickly), or (ii) if the minimum distance between design points does not decrease very quickly. As a consequence of the latter result, we derive a rate of convergence for Bayesian quadrature in misspecified settings. We reveal a condition on the design points that makes Bayesian quadrature robust to misspecification, and show that, under this condition, it may adaptively achieve the optimal rate of convergence in the Sobolev space of lesser order (i.e., of the unknown smoothness of the test integrand), under a slightly stronger regularity condition on the integrand.
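For intuition, a hedged sketch of how such a rule is formed (the Matérn-1/2 kernel, whose RKHS is a first-order Sobolev space, and the Monte Carlo estimate of the kernel mean are illustrative shortcuts, not the paper's setup):

```python
import numpy as np

def bq_weights(x_nodes, kernel, mc_samples):
    """Kernel/Bayesian quadrature weights w = K^{-1} z, where
    z_i = E[k(X, x_i)] is the kernel mean embedding of the integration
    measure, estimated here by Monte Carlo for simplicity."""
    K = kernel(x_nodes[:, None], x_nodes[None, :])
    z = kernel(mc_samples[:, None], x_nodes[None, :]).mean(axis=0)
    return np.linalg.solve(K, z)

kernel = lambda a, b: np.exp(-np.abs(a - b))   # Matern-1/2 (Sobolev H^1)
x = np.linspace(0.0, 1.0, 15)                  # equispaced design points
w = bq_weights(x, kernel, np.random.default_rng(0).uniform(0.0, 1.0, 10_000))
print(np.abs(w).sum())   # condition (i) asks this sum to stay bounded
```

The rule is then $\int f \, d\mu \approx \sum_i w_i f(x_i)$; condition (ii) instead constrains how quickly the design points may cluster.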
  5. This article introduces a novel numerical approach, based on finite-volume techniques, for studying fully nonlinear coagulation–fragmentation models, where both the coagulation and fragmentation components of the collision operator are nonlinear. The models come from three-wave kinetic equations, a pivotal framework in wave turbulence theory. Despite the importance of wave turbulence theory in physics and mechanics, there have been very few numerical schemes for three-wave kinetic equations that do not manually impose additional assumptions on the evolution of the solutions, and the current manuscript provides one of the first such schemes. To the best of our knowledge, this is also the first numerical scheme capable of accurately capturing the long-term asymptotic behaviour of solutions to a fully nonlinear coagulation–fragmentation model. The scheme is implemented on some test problems, demonstrating strong alignment with the theoretical predictions of energy cascade rates rigorously obtained by Soffer & Tran [Commun. Math. Phys. 376 (2020), pp. 2229–2276]. We further introduce a weighted finite-volume variant to ensure energy conservation across varying degrees of kernel homogeneity. Convergence and first-order consistency are established through theoretical analysis and verified by experimental convergence orders in test cases.
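The conservation mechanism underlying such schemes is the flux-difference update of finite-volume methods. The generic template below (a plain first-order upwind scheme for a scalar conservation law, not the paper's discretization of the three-wave collision operator) shows why cell totals are conserved:

```python
import numpy as np

def fv_step(u, dx, dt, flux):
    """One explicit finite-volume step for du/dt + dF(u)/dx = 0 on a
    uniform grid with zero inflow. Cell averages change only through
    flux differences, so the total sum(u)*dx changes only via the
    boundary fluxes; the weighted variant exploits the same mechanism
    to conserve energy exactly."""
    F = flux(u)
    # upwind interface fluxes (assumes a nonnegative wave speed)
    F_half = np.concatenate(([0.0], F[:-1], [F[-1]]))
    return u - dt / dx * (F_half[1:] - F_half[:-1])

# Transport a bump with F(u) = u; interior mass is conserved each step.
x = np.linspace(0.0, 1.0, 200)
u = np.exp(-200.0 * (x - 0.3)**2)
for _ in range(100):
    u = fv_step(u, dx=x[1] - x[0], dt=0.002, flux=lambda v: v)
```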