

Search for: All records

Award ID contains: 2046393


  1. Free, publicly-accessible full text available May 1, 2026
  2. The kernel two-sample test based on the maximum mean discrepancy is one of the most popular methods for detecting differences between two distributions over general metric spaces. In this paper we propose a method to boost the power of the kernel test by combining maximum mean discrepancy estimates over multiple kernels using their Mahalanobis distance. We derive the asymptotic null distribution of the proposed test statistic and use a multiplier bootstrap approach to efficiently compute the rejection region. The resulting test is universally consistent and, since it is obtained by aggregating over a collection of kernels/bandwidths, is more powerful in detecting a wide range of alternatives in finite samples. We also derive the distribution of the test statistic for both fixed and local contiguous alternatives. The latter, in particular, implies that the proposed test is statistically efficient, that is, it has nontrivial asymptotic (Pitman) efficiency. The consistency properties of the Mahalanobis and other natural aggregation methods are also explored when the number of kernels is allowed to grow with the sample size. Extensive numerical experiments are performed on both synthetic and real-world datasets to illustrate the efficacy of the proposed method over single-kernel tests. The computational complexity of the proposed method is also studied, both theoretically and in simulations. Our asymptotic results rely on deriving the joint distribution of the maximum mean discrepancy estimates using the framework of multiple stochastic integrals, which is more broadly useful, specifically, in understanding the efficiency properties of recently proposed adaptive maximum mean discrepancy tests based on kernel aggregation and also in developing more computationally efficient, linear-time tests that combine multiple kernels. We conclude with an application of the Mahalanobis aggregation method for kernels with diverging scaling parameters. 
    Free, publicly-accessible full text available January 1, 2026
  3. Given a graphon $$W$$ and a finite simple graph $$H$$ with vertex set $$V(H)$$, denote by $$X_n(H, W)$$ the number of copies of $$H$$ in a $$W$$-random graph on $$n$$ vertices. The asymptotic distribution of $$X_n(H, W)$$ was recently obtained by Hladký, Pelekis, and Šileikis [17] in the case where $$H$$ is a clique. In this paper, we extend this result to any fixed graph $$H$$. To this end, we introduce a notion of $$H$$-regularity of graphons and show that if the graphon $$W$$ is not $$H$$-regular, then $$X_n(H, W)$$ has Gaussian fluctuations with scaling $$n^{|V(H)|-\frac{1}{2}}$$. On the other hand, if $$W$$ is $$H$$-regular, then the fluctuations are of order $$n^{|V(H)|-1}$$ and the limiting distribution of $$X_n(H, W)$$ can have both Gaussian and non-Gaussian components, where the non-Gaussian component is a (possibly) infinite weighted sum of centred chi-squared random variables with the weights determined by the spectral properties of a graphon derived from $$W$$. Our proofs use the asymptotic theory of generalised $$U$$-statistics developed by Janson and Nowicki [22]. We also investigate the structure of $$H$$-regular graphons for which either the Gaussian or the non-Gaussian component of the limiting distribution (but not both) is degenerate. Interestingly, there are also $$H$$-regular graphons $$W$$ for which both the Gaussian and the non-Gaussian components are degenerate, that is, $$X_n(H, W)$$ has a degenerate limit even under the scaling $$n^{|V(H)|-1}$$. We give an example of this degeneracy with $$H=K_{1, 3}$$ (the 3-star) and also establish non-degeneracy in a few examples. This naturally leads to interesting open questions on higher order degeneracies.
  4. The $$p$$-tensor Ising model is a one-parameter discrete exponential family for modeling dependent binary data, where the sufficient statistic is a multi-linear form of degree $$p \geqslant 2$$. This is a natural generalization of the matrix Ising model that provides a convenient mathematical framework for capturing, not just pairwise, but higher-order dependencies in complex relational data. In this paper, we consider the problem of estimating the natural parameter of the $$p$$-tensor Ising model given a single sample from the distribution on $$N$$ nodes. Our estimate is based on the maximum pseudolikelihood (MPL) method, which provides a computationally efficient algorithm for estimating the parameter that avoids computing the intractable partition function. We derive general conditions under which the MPL estimate is $$\sqrt N$$-consistent, that is, it converges to the true parameter at rate $$1/\sqrt N$$. Our conditions are robust enough to handle a variety of commonly used tensor Ising models, including spin glass models with random interactions and models where the rate of estimation undergoes a phase transition. In particular, this includes results on $$\sqrt N$$-consistency of the MPL estimate in the well-known $$p$$-spin Sherrington–Kirkpatrick model, spin systems on general $$p$$-uniform hypergraphs and Ising models on the hypergraph stochastic block model (HSBM). In fact, for the HSBM we pin down the exact location of the phase transition threshold, which is determined by the positivity of a certain mean-field variational problem, such that above this threshold the MPL estimate is $$\sqrt N$$-consistent, whereas below the threshold no estimator is consistent. Finally, we derive the precise fluctuations of the MPL estimate in the special case of the $$p$$-tensor Curie–Weiss model, which is the Ising model on the complete $$p$$-uniform hypergraph. An interesting consequence of our results is that the MPL estimate in the Curie–Weiss model saturates the Cramér–Rao lower bound at all points above the estimation threshold, that is, the MPL estimate incurs no loss in asymptotic statistical efficiency in the estimability regime, even though it is obtained by maximizing only an approximation of the true likelihood function for computational tractability.