Title: On relationships between Chatterjee's and Spearman's correlation coefficients
In his seminal work, Chatterjee (2021) introduced a novel correlation measure which is distribution-free, asymptotically normal, and consistent against all alternatives. In this paper, we study the probabilistic relationships between Chatterjee's correlation and the widely used Spearman's correlation. We show that, under independence, the two sample-based correlations are asymptotically jointly normal and asymptotically independent. Under dependence, the magnitudes of the two correlations can be substantially different. We establish some extremal cases featuring large differences between these two correlations. Motivated by these findings, a new independence test is proposed by combining Chatterjee's and Spearman's correlations into a maximal strength measure of variable association. Our simulation study and real data application show the good sensitivity of the new test to different correlation patterns.
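As a rough illustration of the idea in the abstract, the sketch below computes both sample correlations and combines them into a max-type statistic. It is not the paper's exact test. It assumes continuous data with no ties, uses the known null limits (sqrt(n-1) times Spearman's rho is approximately N(0, 1), and sqrt(n) times Chatterjee's xi is approximately N(0, 2/5)), and relies on the asymptotic independence stated above to get an approximate p-value. The function names and the particular standardization are assumptions made for this example.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks; assumes no ties (continuous data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r


def spearman_rho(x, y):
    """Spearman's rank correlation via the no-ties formula."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def chatterjee_xi(x, y):
    """Chatterjee's xi via the no-ties formula: sort by x, then sum
    absolute differences of consecutive y-ranks."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    ry = ranks(y)
    r = [ry[i] for i in order]
    s = sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1.0 - 3.0 * s / (n * n - 1)


def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def max_test(x, y):
    """Hypothetical max-type combination (an assumption, not the paper's
    definition): two-sided in rho, one-sided in xi."""
    n = len(x)
    rho, xi = spearman_rho(x, y), chatterjee_xi(x, y)
    z_rho = math.sqrt(n - 1) * abs(rho)   # approx |N(0, 1)| under independence
    z_xi = math.sqrt(n / 0.4) * xi        # approx N(0, 1) since Var is about 2/(5n)
    stat = max(z_rho, z_xi)
    # P(max(|Z1|, Z2) > t) for independent standard normals Z1, Z2, t >= 0
    p_value = 1.0 - (2.0 * normal_cdf(stat) - 1.0) * normal_cdf(stat)
    return rho, xi, stat, p_value


if __name__ == "__main__":
    random.seed(0)
    x = [random.uniform(-1.0, 1.0) for _ in range(200)]
    # Non-monotone signal: rho tends to be small while xi tends to be large.
    y = [xi_ ** 2 + random.gauss(0.0, 0.05) for xi_ in x]
    print(max_test(x, y))
```

Taking the maximum lets whichever coefficient responds more strongly to the observed pattern drive the decision: Spearman's rho for monotone trends, Chatterjee's xi for non-monotone functional dependence.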
Award ID(s):
2119968
PAR ID:
10422903
Journal Name:
arXiv.org
ISSN:
2331-8422
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In his seminal work, Chatterjee (2021) introduced a novel correlation measure that is distribution-free, asymptotically normal, and consistent against all alternatives. In this article, we study the probabilistic relationships between Chatterjee’s correlation and the widely used Spearman’s correlation. We show that, under independence, the two sample-based correlations are asymptotically jointly normal and asymptotically independent. Under dependence, the magnitudes of the two correlations can be substantially different. We establish some extreme cases featuring large differences between these two correlations. Motivated by these findings, a new independence test is proposed by combining Chatterjee’s and Spearman’s correlations into a maximal strength measure of variable association. Our simulation study and real-data application show the good sensitivity of the new test to different correlation patterns.
  2. Chatterjee (2021) introduced a simple new rank correlation coefficient that has attracted much attention recently. The coefficient has the unusual appeal that it not only estimates a population quantity first proposed by Dette et al. (2013) that is zero if and only if the underlying pair of random variables is independent, but also is asymptotically normal under independence. This paper compares Chatterjee’s new correlation coefficient with three established rank correlations that also facilitate consistent tests of independence, namely Hoeffding’s $$D$$, Blum–Kiefer–Rosenblatt’s $$R$$, and Bergsma–Dassios–Yanagimoto’s $$\tau^*$$. We compare the computational efficiency of these rank correlation coefficients in light of recent advances, and investigate their power against local rotation and mixture alternatives. Our main results show that Chatterjee’s coefficient is unfortunately rate-suboptimal compared to $$D$$, $$R$$ and $$\tau^*$$. The situation is more subtle for a related earlier estimator of Dette et al. (2013). These results favour $$D$$, $$R$$ and $$\tau^*$$ over Chatterjee’s new correlation coefficient for the purpose of testing independence.
  3. Given a random sample of size n from a p dimensional random vector, we are interested in testing whether the p components of the random vector are mutually independent. This is the so-called complete independence test. In the multivariate normal case, it is equivalent to testing whether the correlation matrix is an identity matrix. In this paper, we propose a one-sided empirical likelihood method for the complete independence test based on squared sample correlation coefficients. The limiting distribution for our one-sided empirical likelihood test statistic is proved to be Z^2I(Z > 0) when both n and p tend to infinity, where Z is a standard normal random variable. In order to improve the power of the empirical likelihood test statistic, we also introduce a rescaled empirical likelihood test statistic. We carry out an extensive simulation study to compare the performance of the rescaled empirical likelihood method and two other statistics. 
  4. While researchers commonly use the bootstrap to quantify the uncertainty of an estimator, it has been noticed that the standard bootstrap, in general, does not work for Chatterjee’s rank correlation. In this paper, we provide proof of this issue under an additional independence assumption, and complement our theory with simulation evidence for general settings. Chatterjee’s rank correlation thus falls into a category of statistics that are asymptotically normal, but bootstrap inconsistent. Valid inferential methods in this case are Chatterjee’s original proposal for testing independence and the analytic asymptotic variance estimator of Lin & Han (2022) for more general purposes. [Received on 5 April 2023. Editorial decision on 10 January 2024]
  5. Distance covariance is a popular dependence measure for two random vectors $$X$$ and $$Y$$ of possibly different dimensions and types. Recent years have witnessed concentrated efforts in the literature to understand the distributional properties of the sample distance covariance in a high-dimensional setting, with an exclusive emphasis on the null case that $$X$$ and $$Y$$ are independent. This paper derives the first non-null central limit theorem for the sample distance covariance, and the more general sample (Hilbert–Schmidt) kernel distance covariance in high dimensions, in the distributional class of $$(X,Y)$$ with a separable covariance structure. The new non-null central limit theorem yields an asymptotically exact first-order power formula for the widely used generalized kernel distance correlation test of independence between $$X$$ and $$Y$$. The power formula in particular unveils an interesting universality phenomenon: the power of the generalized kernel distance correlation test is completely determined by $$n\cdot \operatorname{dCor}^{2}(X,Y)/\sqrt{2}$$ in the high-dimensional limit, regardless of a wide range of choices of the kernels and bandwidth parameters. Furthermore, this separation rate is also shown to be optimal in a minimax sense. The key step in the proof of the non-null central limit theorem is a precise expansion of the mean and variance of the sample distance covariance in high dimensions, which shows, among other things, that the non-null Gaussian approximation of the sample distance covariance involves a rather subtle interplay between the dimension-to-sample ratio and the dependence between $$X$$ and $$Y$$.