Abstract High-dimensional categorical data are routinely collected in biomedical and social sciences. It is of great importance to build interpretable parsimonious models that perform dimension reduction and uncover meaningful latent structures from such discrete data. Identifiability is a fundamental requirement for valid modeling and inference in such scenarios, yet is challenging to address when there are complex latent structures. In this article, we propose a class of identifiable multilayer (potentially deep) discrete latent structure models for discrete data, termed Bayesian Pyramids. We establish the identifiability of Bayesian Pyramids by developing novel transparent conditions on the pyramid-shaped deep latent directed graph. The proposed identifiability conditions can ensure Bayesian posterior consistency under suitable priors. As an illustration, we consider the two-latent-layer model and propose a Bayesian shrinkage estimation approach. Simulation results for this model corroborate the identifiability and estimatability of model parameters. Applications of the methodology to DNA nucleotide sequence data uncover useful discrete latent features that are highly predictive of sequence types. The proposed framework provides a recipe for interpretable unsupervised learning of discrete data and can be a useful alternative to popular machine learning methods. 
                        more » 
                        « less   
                    
                            
                            Diagnostic Classification Models for Testlets: Methods and Theory
                        
                    
    
            Diagnostic classification models (DCMs) have seen wide applications in educational and psychological measurement, especially in formative assessment. DCMs in the presence of testlets have been studied in recent literature. A key ingredient in the statistical modeling and analysis of testlet-based DCMs is the superposition of two latent structures, the attribute profile and the testlet effect. This paper extends the standard testlet DINA (T-DINA) model to accommodate the potential correlation between the two latent structures. Model identifiability is studied and a set of sufficient conditions are proposed. As a byproduct, the identifiability of the standard T-DINA is also established. The proposed model is applied to a dataset from the 2015 Programme for International Student Assessment. Comparisons are made with DINA and T-DINA, showing that there is substantial improvement in terms of the goodness of fit. Simulations are conducted to assess the performance of the new method under various settings. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 2015417
- PAR ID:
- 10511273
- Editor(s):
- Sinharay, Sandip
- Publisher / Repository:
- psychometrika
- Date Published:
- Journal Name:
- Psychometrika
- Edition / Version:
- 1
- Volume:
- 9
- Issue:
- 2
- ISSN:
- 0033-3123
- Page Range / eLocation ID:
- 1-26
- Subject(s) / Keyword(s):
- diagnostic classification model, testlet DINA, identifiability, PISA, Q-matrix, interaction, hypothesis testing, model selection.
- Format(s):
- Medium: X Size: 1MB Other: pdf
- Size(s):
- 1MB
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            Identifying latent variables and causal structures from observational data is essential to many real-world applications involving biological data, medical data, and unstructured data such as images and languages. However, this task can be highly challenging, especially when observed variables are generated by causally related latent variables and the relationships are nonlinear. In this work, we investigate the identification problem for nonlinear latent hierarchical causal models in which observed variables are generated by a set of causally related latent variables, and some latent variables may not have observed children. We show that the identifiability of causal structures and latent variables (up to invertible transformations) can be achieved under mild assumptions: on causal structures, we allow for multiple paths between any pair of variables in the graph, which relaxes latent tree assumptions in prior work; on structural functions, we permit general nonlinearity and multi-dimensional continuous variables, alleviating existing work's parametric assumptions. Specifically, we first develop an identification criterion in the form of novel identifiability guarantees for an elementary latent variable model. Leveraging this criterion, we show that both causal structures and latent variables of the hierarchical model can be identified asymptotically by explicitly constructing an estimation procedure. To the best of our knowledge, our work is the first to establish identifiability guarantees for both causal structures and latent variables in nonlinear latent hierarchical models.more » « less
- 
            Identifying latent variables and causal structures from observational data is essential to many real-world applications involving biological data, medical data, and unstructured data such as images and languages. However, this task can be highly challenging, especially when observed variables are generated by causally related latent variables and the relationships are nonlinear. In this work, we investigate the identification problem for nonlinear latent hierarchical causal models in which observed variables are generated by a set of causally related latent variables, and some latent variables may not have observed children. We show that the identifiability of both causal structure and latent variables can be achieved under mild assumptions: on causal structures, we allow for the existence of multiple paths between any pair of variables in the graph, which relaxes latent tree assumptions in prior work; on structural functions, we do not make parametric assumptions, thus permitting general nonlinearity and multi-dimensional continuous variables. Specifically, we first develop a basic identification criterion in the form of novel identifiability guarantees for an elementary latent variable model. Leveraging this criterion, we show that both causal structures and latent variables of the hierarchical model can be identified asymptotically by explicitly constructing an estimation procedure. To the best of our knowledge, our work is the first to establish identifiability guarantees for both causal structures and latent variables in nonlinear latent hierarchical models.more » « less
- 
            An important problem across multiple disciplines is to infer and understand meaningful latent variables. One strategy commonly used is to model the measured variables in terms of the latent variables under suitable assumptions on the connectivity from the latents to the measured (known as measurement model). Furthermore, it might be even more interesting to discover the causal relations among the latent variables (known as structural model). Recently, some methods have been proposed to estimate the structural model by assuming that the noise terms in the measured and latent variables are non-Gaussian. However, they are not suitable when some of the noise terms become Gaussian. To bridge this gap, we investigate the problem of identification of the structural model with arbitrary noise distributions. We provide necessary and sufficient condition under which the structural model is identifiable: it is identifiable iff for each pair of adjacent latent variables Lx, Ly, (1) at least one of Lx and Ly has non-Gaussian noise, or (2) at least one of them has a non-Gaussian ancestor and is not d-separated from the non-Gaussian component of this ancestor by the common causes of Lx and Ly. This identifiability result relaxes the non-Gaussianity requirements to only a (hopefully small) subset of variables, and accordingly elegantly extends the application scope of the structural model. Based on the above identifiability result, we further propose a practical algorithm to learn the structural model. We verify the correctness of the identifiability result and the effectiveness of the proposed method through empirical studies.more » « less
- 
            We prove identifiability of a broad class of deep latent variable models that (a) have universal approximation capabilities and (b) are the decoders of variational auto-encoders that are commonly used in practice. Unlike existing work, our analysis does not require weak supervision, auxiliary information, or conditioning in the latent space. Specifically, we show that for a broad class of generative (i.e. unsupervised) models with universal approximation capabilities, the side information u is not necessary: We prove identifiability of the entire generative model where we do not observe u and only observe the data x. The models we consider match auto-encoder architectures used in practice that leverage mixture priors in the latent space and ReLU/leaky-ReLU activations in the encoder, such as VaDE and MFC-VAE. Our main result is an identifiability hierarchy that significantly generalizes previous work and exposes how different assumptions lead to different “strengths” of identifiability, and includes certain “vanilla” VAEs with isotropic Gaussian priors as a special case. For example, our weakest result establishes (unsupervised) identifiability up to an affine transformation, and thus partially resolves an open problem regarding model identifiability raised in prior work. These theoretical results are augmented with experiments on both simulated and real data.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    