A Statistically and Numerically Efficient Independence Test Based on Random Projections and Distance Covariance

Huang, Cheng; Huo, Xiaoming

doi:10.3389/fams.2021.779841

Testing for independence plays a fundamental role in many statistical techniques. Among nonparametric approaches, distance-based methods (such as hypothesis tests of independence based on distance correlation) have many advantages over the alternatives. A known limitation of distance-based methods is their computational cost: for sample size n, a method that computes all pairwise distances typically has O(n^2) computational complexity. Recent advances have shown that in the univariate case a fast method exists with O(n log n) computational complexity and O(n) memory requirement. In this paper, we introduce a test of independence based on random projections and distance correlation. It achieves nearly the same power as the state-of-the-art distance-based approach, works in the multivariate case, and has O(nK log n) computational complexity and O(max{n, K}) memory requirement, where K is the number of random projections. Savings over the O(n^2) approach are achieved when K < n / log n. We name our method Randomly Projected Distance Covariance (RPDC). The statistical analysis takes advantage of random projection techniques rooted in contemporary machine learning. Numerical experiments demonstrate the efficiency of the proposed method relative to numerous competitors.
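
To make the idea concrete, here is a minimal Python sketch of averaging univariate distance covariances over K random projections. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the univariate statistic uses the plain O(n^2) pairwise formula rather than the fast O(n log n) algorithm the abstract refers to, any scaling constants used in the paper's estimator are omitted, and the permutation calibration is an assumption, since the abstract does not say how the test is calibrated.

```python
import numpy as np

def dcov_sq_1d(x, y):
    """Squared sample distance covariance of two 1-D samples.

    Uses the direct O(n^2) double-centering formula for clarity;
    this is NOT the fast O(n log n) univariate algorithm.
    """
    a = np.abs(x[:, None] - x[None, :])  # pairwise distances in x
    b = np.abs(y[:, None] - y[None, :])  # pairwise distances in y
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    return (A * B).mean()

def random_unit_vectors(k, dim, rng):
    """Draw k directions uniformly on the unit sphere in R^dim."""
    v = rng.standard_normal((k, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def projected_dcov(X, Y, K=20, seed=0):
    """Average of univariate distance covariances over K random projections.

    Hypothetical helper: scaling constants from the paper are omitted.
    """
    rng = np.random.default_rng(seed)
    U = random_unit_vectors(K, X.shape[1], rng)
    V = random_unit_vectors(K, Y.shape[1], rng)
    vals = [dcov_sq_1d(X @ U[k], Y @ V[k]) for k in range(K)]
    return float(np.mean(vals))

def permutation_pvalue(X, Y, K=20, n_perm=200, seed=0):
    """Permutation p-value (assumed calibration, not taken from the paper).

    The same seed is reused so every permutation sees the same projections.
    """
    rng = np.random.default_rng(seed)
    observed = projected_dcov(X, Y, K, seed)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(Y))  # break the pairing between X and Y
        if projected_dcov(X, Y[perm], K, seed) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

With X and Y given as arrays of shape (n, p) and (n, q), `permutation_pvalue(X, Y)` returns a value in (0, 1]; small values suggest dependence. Replacing `dcov_sq_1d` with a fast univariate routine is what would bring the per-projection cost down from O(n^2) to O(n log n).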
