skip to main content


Title: Identification of Linear Latent Variable Model with Arbitrary Distribution
An important problem across multiple disciplines is to infer and understand meaningful latent variables. One strategy commonly used is to model the measured variables in terms of the latent variables under suitable assumptions on the connectivity from the latents to the measured (known as measurement model). Furthermore, it might be even more interesting to discover the causal relations among the latent variables (known as structural model). Recently, some methods have been proposed to estimate the structural model by assuming that the noise terms in the measured and latent variables are non-Gaussian. However, they are not suitable when some of the noise terms become Gaussian. To bridge this gap, we investigate the problem of identification of the structural model with arbitrary noise distributions. We provide necessary and sufficient condition under which the structural model is identifiable: it is identifiable iff for each pair of adjacent latent variables Lx, Ly, (1) at least one of Lx and Ly has non-Gaussian noise, or (2) at least one of them has a non-Gaussian ancestor and is not d-separated from the non-Gaussian component of this ancestor by the common causes of Lx and Ly. This identifiability result relaxes the non-Gaussianity requirements to only a (hopefully small) subset of variables, and accordingly elegantly extends the application scope of the structural model. Based on the above identifiability result, we further propose a practical algorithm to learn the structural model. We verify the correctness of the identifiability result and the effectiveness of the proposed method through empirical studies.  more » « less
Award ID(s):
2134901
NSF-PAR ID:
10380985
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
36
Issue:
6
ISSN:
2159-5399
Page Range / eLocation ID:
6350 to 6357
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Identification of causal direction between a causal-effect pair from observed data has recently attracted much attention. Various methods based on functional causal models have been proposed to solve this problem, by assuming the causal process satisfies some (structural) constraints and showing that the reverse direction violates such constraints. The nonlinear additive noise model has been demonstrated to be effective for this purpose, but the model class is not transitive--even if each direct causal relation follows this model, indirect causal influences, which result from omitted intermediate causal variables and are frequently encountered in practice, do not necessarily follow the model constraints; as a consequence, the nonlinear additive noise model may fail to correctly discover causal direction. In this work, we propose a cascade nonlinear additive noise model to represent such causal influences--each direct causal relation follows the nonlinear additive noise model but we observe only the initial cause and final effect. We further propose a method to estimate the model, including the unmeasured intermediate variables, from data, under the variational auto-encoder framework. Our theoretical results show that with our model, causal direction is identifiable under suitable technical conditions on the data generation process. Simulation results illustrate the power of the proposed method in identifying indirect causal relations across various settings, and experimental results on real data suggest that the proposed model and method greatly extend the applicability of causal discovery based on functional causal models in nonlinear cases.

     
    more » « less
  2. S. Koyejo ; S. Mohamed ; A. Agarwal ; D. Belgrave ; K. Cho, A. Oh (Ed.)
    We focus on causal discovery in the presence of measurement error in linear systems where the mixing matrix, i.e., the matrix indicating the independent exogenous noise terms pertaining to the observed variables, is identified up to permutation and scaling of the columns. We demonstrate a somewhat surprising connection between this problem and causal discovery in the presence of unobserved parentless causes, in the sense that there is a mapping, given by the mixing matrix, between the underlying models to be inferred in these problems. Consequently, any identifiability result based on the mixing matrix for one model translates to an identifiability result for the other model. We characterize to what extent the causal models can be identified under a two-part faithfulness assumption. Under only the first part of the assumption (corresponding to the conventional definition of faithfulness), the structure can be learned up to the causal ordering among an ordered grouping of the variables but not all the edges across the groups can be identified. We further show that if both parts of the faithfulness assumption are imposed, the structure can be learned up to a more refined ordered grouping. As a result of this refinement, for the latent variable model with unobserved parentless causes, the structure can be identified. Based on our theoretical results, we propose causal structure learning methods for both models, and evaluate their performance on synthetic data. 
    more » « less
  3. Chaudhuri, Kamalika ; Jegelka, Stefanie ; Song, Le ; Szepesvari, Csaba ; Niu, Gang ; Sabato, Sivan (Ed.)
    Traditional causal discovery methods mainly focus on estimating causal relations among measured variables, but in many real-world problems, such as questionnaire-based psychometric studies, measured variables are generated by latent variables that are causally related. Accordingly, this paper investigates the problem of discovering the hidden causal variables and estimating the causal structure, including both the causal relations among latent variables and those between latent and measured variables. We relax the frequently-used measurement assumption and allow the children of latent variables to be latent as well, and hence deal with a specific type of latent hierarchical causal structure. In particular, we define a minimal latent hierarchical structure and show that for linear non-Gaussian models with the minimal latent hierarchical structure, the whole structure is identifiable from only the measured variables. Moreover, we develop a principled method to identify the structure by testing for Generalized Independent Noise (GIN) conditions in specific ways. Experimental results on both synthetic and real-world data show the effectiveness of the proposed approach. 
    more » « less
  4. Abstract

    This paper addresses the parameter estimation problem for lithium-ion battery pack models comprising cells in series. This valuable information can be exploited in fault diagnostics to estimate the number of cells that are exhibiting abnormal behaviour, e.g. large resistances or small capacities. In particular, we use a Bayesian approach to estimate the parameters of a two-cell arrangement modelled using equivalent circuits. Although our modeling framework has been extensively reported in the literature, its structural identifiability properties have not been reported yet to the best of the authors’ knowledge. Moreover, most contributions in the literature tackle the estimation problem through point-wise estimates assuming Gaussian noise using e.g. least-squares methods (maximum likelihood estimation) or Kalman filters (maximum a posteriori estimation). In contrast, we apply methods that are suitable for nonlinear and non-Gaussian estimation problems and estimate the full posterior probability distribution of the parameters. We study how the model structure, available measurements and prior knowledge of the model parameters impact the underlying posterior probability distribution that is recovered for the parameters. For two cells in series, a bimodal distribution is obtained whose modes are centered around the real values of the parameters for each cell. Therefore, bounds on the model parameters for a battery pack can be derived.

     
    more » « less
  5. Independent component analysis (ICA) decomposes multivariate data into mutually independent components (ICs). The ICA model is subject to a constraint that at most one of these components is Gaussian, which is required for model identifiability. Linear non‐Gaussian component analysis (LNGCA) generalizes the ICA model to a linear latent factor model with any number of both non‐Gaussian components (signals) and Gaussian components (noise), where observations are linear combinations of independent components. Although the individual Gaussian components are not identifiable, the Gaussian subspace is identifiable. We introduce an estimator along with its optimization approach in which non‐Gaussian and Gaussian components are estimated simultaneously, maximizing the discrepancy of each non‐Gaussian component from Gaussianity while minimizing the discrepancy of each Gaussian component from Gaussianity. When the number of non‐Gaussian components is unknown, we develop a statistical test to determine it based on resampling and the discrepancy of estimated components. Through a variety of simulation studies, we demonstrate the improvements of our estimator over competing estimators, and we illustrate the effectiveness of our test to determine the number of non‐Gaussian components. Further, we apply our method to real data examples and show its practical value.

     
    more » « less