Title: On Distance and Kernel Measures of Conditional Dependence
Measuring conditional dependence is an important task in statistical inference and is fundamental to causal discovery, feature selection, dimensionality reduction, Bayesian network learning, and other applications. In this work, we explore the connection between conditional dependence measures induced by distances on a metric space and those induced by reproducing kernels associated with a reproducing kernel Hilbert space (RKHS). For certain distance and kernel pairs, we show that the distance-based conditional dependence measures are equivalent to the kernel-based ones. On the other hand, we also show that some popular kernel conditional dependence measures based on the Hilbert-Schmidt norm of a certain cross-conditional covariance operator do not have a simple distance representation, except in some limiting cases.
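The distance-kernel correspondence the abstract refers to can be illustrated numerically in the simpler unconditional setting: (squared) distance covariance coincides, up to a constant factor, with HSIC computed with a distance-induced kernel. The sketch below uses biased empirical estimators and the Euclidean distance; all function names are illustrative, not from the paper.

```python
import numpy as np

def center(G):
    # Double-center a square matrix: H G H with H = I - (1/n) * ones
    n = G.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ G @ H

def pdist(A):
    # Pairwise Euclidean distance matrix of the rows of A
    return np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)

def hsic(K, L):
    # Biased empirical HSIC: (1/n^2) tr( (HKH)(HLH) )
    n = K.shape[0]
    return np.trace(center(K) @ center(L)) / n**2

def dcov2(X, Y):
    # Biased empirical squared distance covariance
    n = X.shape[0]
    return np.sum(center(pdist(X)) * center(pdist(Y))) / n**2

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
Y = X + 0.2 * rng.normal(size=(60, 2))

# Distance-induced kernel: terms depending on only one argument vanish
# under double-centering, so k(x, x') = -d(x, x')/2 suffices here,
# giving dCov^2(X, Y) = 4 * HSIC(X, Y).
K, L = -pdist(X) / 2, -pdist(Y) / 2
```

The conditional measures studied in the paper require additionally conditioning out a third variable, which is where the equivalence becomes more delicate.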
Award ID(s):
1945396
PAR ID:
10417057
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Journal of machine learning research
Volume:
24
ISSN:
1532-4435
Page Range / eLocation ID:
1-16
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Abstract: We extend the Lebesgue decomposition of positive measures with respect to Lebesgue measure on the complex unit circle to the non-commutative (NC) multi-variable setting of (positive) NC measures. These are positive linear functionals on a certain self-adjoint subspace of the Cuntz–Toeplitz $C^{\ast}$-algebra, the $C^{\ast}$-algebra of the left creation operators on the full Fock space. This theory is fundamentally connected to the representation theory of the Cuntz and Cuntz–Toeplitz $C^{\ast}$-algebras; any $\ast$-representation of the Cuntz–Toeplitz $C^{\ast}$-algebra is obtained, up to unitary equivalence, by applying a Gelfand–Naimark–Segal construction to a positive NC measure. Our approach combines the theory of Lebesgue decomposition of sesquilinear forms in Hilbert space, Lebesgue decomposition of row isometries, free semigroup algebra theory, NC reproducing kernel Hilbert space theory, and NC Hardy space theory.
  2. Distributionally robust optimization (DRO) is a powerful tool for decision making under uncertainty. It is particularly appealing because of its ability to leverage existing data. However, many practical problems call for decision-making with some auxiliary information, and DRO in the context of conditional distributions is not straightforward. We propose a conditional kernel distributionally robust optimization (CKDRO) method that enables robust decision making under conditional distributions through kernel DRO and the conditional mean operator in the reproducing kernel Hilbert space (RKHS). In particular, we consider problems where there is a correlation between the unknown variable y and an auxiliary observable variable x. Given past data of the two variables and a queried auxiliary variable, CKDRO represents the conditional distribution P(y|x) as the conditional mean operator in the RKHS and quantifies the ambiguity set in the RKHS as well, which depends on the size of the dataset and on the query point. To justify the use of the RKHS, we demonstrate that the ambiguity set defined in the RKHS can be viewed as a ball under a metric similar to the Wasserstein metric. The DRO is then dualized and solved via a finite-dimensional convex program. The proposed CKDRO approach is applied to a generation scheduling problem, where it outperforms common benchmarks in terms of quality and robustness.
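The conditional mean operator that CKDRO builds on has a standard regularized empirical estimator: the embedding of P(y|x) is a weighted combination of the training outputs' feature maps, with weights obtained from a kernel ridge solve. A minimal sketch under assumed Gaussian kernels (function names and hyperparameters are illustrative, not from the paper):

```python
import numpy as np

def rbf(A, B, sigma=0.3):
    # Gaussian RBF Gram matrix between the rows of A and of B
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

def cme_weights(Xtr, x_query, lam=1e-3, sigma=0.3):
    # Weights alpha(x) such that the empirical conditional mean embedding
    # of P(y|x) is sum_i alpha_i(x) * phi(y_i):
    #   alpha(x) = (K_X + n*lam*I)^{-1} k_X(x)
    n = Xtr.shape[0]
    K = rbf(Xtr, Xtr, sigma)
    return np.linalg.solve(K + n * lam * np.eye(n), rbf(Xtr, x_query, sigma))

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 1))
Y = np.sin(3 * X) + 0.05 * rng.normal(size=(200, 1))

alpha = cme_weights(X, np.array([[0.5]]))
# The conditional mean E[Y | x = 0.5] is the same weighted average of the
# training y's, since the mean is the embedding paired with the identity
y_hat = (alpha.T @ Y).item()
```

The CKDRO ambiguity set is then a norm ball around this embedding in the RKHS; the sketch only shows the nominal estimate, not the robust optimization step.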
  3. This paper extends earlier results on the adaptive estimation of nonlinear terms in finite-dimensional systems utilizing a reproducing kernel Hilbert space to a class of positive-real infinite-dimensional systems. The simplest class of strictly positive-real infinite-dimensional systems has collocated input and output operators, with the state operator being the generator of an exponentially stable C0-semigroup on the state space X. The parametrization of the nonlinear term is considered in a reproducing kernel Hilbert space Q and, together with the adaptive observer, results in an evolution system considered in X × Q. Using Lyapunov-redesign methods, the adaptive laws for the parameter estimates are derived and the well-posedness of the resulting evolution error system is summarized. The adaptive estimate of the unknown nonlinearity is subsequently used to compensate for the nonlinearity. A special case of finite-dimensional systems with an embedded reproducing kernel Hilbert space to handle the nonlinear term is also considered and the convergence results are summarized. A numerical example on a one-dimensional diffusion equation is considered.
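A toy finite-dimensional analogue of this construction (the paper's special case) can be simulated in a few lines: an unknown scalar nonlinearity is parameterized in a span of Gaussian kernel features, and a Lyapunov-style adaptive law drives the observer error down. The plant, gains, and feature choices below are all hypothetical, chosen only to make the sketch run:

```python
import numpy as np

# Plant:    x'     = -x + f(x),  f unknown
# Observer: x_hat' = -x_hat + f_hat(x) + L*(x - x_hat)
# RKHS parameterization: f_hat(x) = alpha . phi(x), Gaussian features
# Adaptive law (Lyapunov redesign): alpha' = gamma * phi(x) * (x - x_hat)

centers = np.linspace(-2.0, 2.0, 15)                        # fixed kernel centers
phi = lambda x: np.exp(-(x - centers)**2 / (2 * 0.4**2))    # Gaussian features

f_true = np.sin                       # the "unknown" nonlinearity
alpha = np.zeros_like(centers)        # RKHS coefficient estimates

x, x_hat = 1.0, 0.0
dt, L, gamma = 0.01, 2.0, 5.0
for _ in range(2000):                 # forward-Euler simulation, 20 s
    e = x - x_hat
    p = phi(x)
    f_hat = alpha @ p
    x_new = x + dt * (-x + f_true(x))
    x_hat += dt * (-x_hat + f_hat + L * e)
    alpha += dt * gamma * p * e       # adaptive law
    x = x_new
```

In the paper the same structure lives on X × Q with X infinite-dimensional; here both spaces are truncated to finitely many coordinates.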
  4. Abstract

    The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely many multivariate samples. When the distributions are locally low-dimensional, the proposed test can be made more powerful to distinguish certain alternatives by incorporating local covariance matrices and constructing an anisotropic kernel. The kernel matrix is asymmetric; it computes the affinity between $n$ data points and a set of $n_R$ reference points, where $n_R$ can be drastically smaller than $n$. While the proposed statistic can be viewed as a special class of Reproducing Kernel Hilbert Space MMD, the consistency of the test is proved, under mild assumptions on the kernel, as long as $\|p-q\| \sqrt{n} \to \infty $, and a finite-sample lower bound of the testing power is obtained. Applications to flow cytometry and diffusion MRI datasets are demonstrated, which motivated the proposed approach to comparing distributions.
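The reference-point idea can be sketched in a simplified isotropic form: compare the two samples' mean affinity profiles over a small reference set. This omits the paper's local covariance matrices and anisotropic kernel; the isotropic Gaussian affinity and all names below are assumptions for illustration.

```python
import numpy as np

def ref_kernel(X, R, sigma=1.0):
    # Asymmetric n x n_R affinity matrix between data X and reference set R
    sq = np.sum(X**2, 1)[:, None] + np.sum(R**2, 1)[None, :] - 2 * X @ R.T
    return np.exp(-sq / (2 * sigma**2))

def ref_mmd2(X, Y, R, sigma=1.0):
    # Squared distance between the two samples' mean affinity profiles
    # over the reference set -- a reference-point variant of MMD^2
    return np.sum((ref_kernel(X, R, sigma).mean(0)
                   - ref_kernel(Y, R, sigma).mean(0))**2)

rng = np.random.default_rng(2)
R = rng.normal(size=(20, 2))             # n_R = 20 references, n = 500 samples
same = ref_mmd2(rng.normal(size=(500, 2)),
                rng.normal(size=(500, 2)), R)
diff = ref_mmd2(rng.normal(size=(500, 2)),
                1.0 + rng.normal(size=(500, 2)), R)
```

Because the affinity matrix is only $n \times n_R$, the statistic costs $O(n\, n_R)$ rather than $O(n^2)$, which is the computational motivation for keeping $n_R \ll n$.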

  5. Conditional mean embedding (CME) operators encode conditional probability densities within a Reproducing Kernel Hilbert Space (RKHS). In this paper, we present a decentralized algorithm for a collection of agents to cooperatively approximate the CME over a network. Communication constraints prevent the agents from sending all data to their neighbors; we only allow sparse representations of the covariance operators, compositions of which define the CME, to be exchanged among agents. Using a coherence-based compression scheme, we present a consensus-type algorithm that preserves the average of the approximations of the covariance operators across the network. We theoretically prove that the iterative dynamics in the RKHS are stable. We then empirically study our algorithm to estimate CMEs to learn spectra of Koopman operators for Markovian dynamical systems and to execute approximate value iteration for Markov decision processes (MDPs).
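The coherence-based compression mentioned here is a standard sparsification rule for kernel methods: a new sample is added to the dictionary only if its maximum kernel affinity (coherence) to the current dictionary stays below a threshold, so redundant feature-space directions are dropped. A minimal single-agent sketch, with the threshold and bandwidth chosen arbitrarily (not the paper's settings):

```python
import numpy as np

def gauss(x, y, sigma=0.5):
    # Gaussian kernel affinity between two points
    return np.exp(-np.sum((x - y)**2) / (2 * sigma**2))

def coherence_dictionary(stream, mu0=0.95, sigma=0.5):
    # Keep a sample only if its coherence with the current dictionary
    # is at most mu0; otherwise it is well represented already.
    D = []
    for x in stream:
        if not D or max(gauss(x, d, sigma) for d in D) <= mu0:
            D.append(x)
    return D

rng = np.random.default_rng(3)
stream = rng.normal(size=(1000, 2))
D = coherence_dictionary(stream)
# D is much smaller than the stream yet every stream point is within a
# kernel affinity of mu0 of some dictionary element
```

In the decentralized setting, it is these compressed dictionaries (rather than raw data) that agents exchange and average in the consensus step.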