Title: Nonlinear Multiview Analysis: Identifiability and Neural Network-based Implementation
Multiview analysis aims to extract common information from data entities across different domains (e.g., acoustic, visual, text). Canonical correlation analysis (CCA) is one of the classic tools for this problem; it estimates the shared latent information by linearly transforming the different views of the data. CCA has also been generalized to the nonlinear regime, where kernel methods and neural networks replace the linear transforms. While the theoretical aspects of linear CCA are relatively well understood, nonlinear multiview analysis is still largely intuition-driven. In this work, our interest lies in the identifiability of shared latent information under a nonlinear multiview analysis framework. We propose a model identification criterion for learning latent information from multiview data, under a reasonable data generating model. We show that minimizing this criterion leads to identification of the latent shared information up to certain indeterminacy. We also propose a neural network-based implementation and an efficient algorithm to realize the criterion. Our analysis is backed by experiments on both synthetic and real data.
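For orientation, the classical linear CCA baseline that the abstract refers to can be sketched in a few lines of NumPy. This is only an illustrative sketch of textbook linear CCA (whitening each view, then taking an SVD of the whitened cross-covariance); it is not the nonlinear criterion or the neural network implementation proposed in the paper. The function names `inv_sqrt` and `linear_cca`, the regularizer `reg`, the mixing matrices `M1` and `M2`, and the synthetic two-view data are all assumptions made for the example.

```python
import numpy as np


def inv_sqrt(C):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return (V / np.sqrt(w)) @ V.T


def linear_cca(X, Y, k, reg=1e-8):
    """Textbook linear CCA for two views X (n, dx) and Y (n, dy).

    Returns projection matrices A (dx, k), B (dy, k) and the top-k
    canonical correlations. `reg` is a small ridge term (an assumption
    of this sketch) that keeps the covariance estimates invertible.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    return Wx @ U[:, :k], Wy @ Vt.T[:, :k], s[:k]


# Hypothetical two-view data: a 2-D shared component z plus one
# private component per view, each view mixed linearly.
rng = np.random.default_rng(0)
n = 2000
z = rng.standard_normal((n, 2))
c1 = rng.standard_normal((n, 1))
c2 = rng.standard_normal((n, 1))
M1 = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.5, 0.0, 1.0]])
M2 = np.array([[1.0, -0.5, 0.2], [0.3, 1.0, 0.0], [0.0, 0.4, 1.0]])
X = np.hstack([z, c1]) @ M1
Y = np.hstack([z, c2]) @ M2

A, B, corrs = linear_cca(X, Y, k=2)
print("top canonical correlations:", corrs)
```

In this purely linear setting the two leading canonical directions recover the shared subspace. The paper's question is what can still be identified when the views are generated through unknown nonlinear functions, which this sketch does not address.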
Award ID(s):
1808159
NSF-PAR ID:
10183842
Journal Name:
2020 IEEE 11th Sensor Array and Multichannel Signal Processing Workshop (SAM)
Page Range / eLocation ID:
1 to 5
Sponsoring Org:
National Science Foundation
More Like this
  1. Canonical correlation analysis (CCA) has been essential in unsupervised multimodal/multiview latent representation learning and data fusion. Classic CCA extracts shared information from multiple modalities of data using linear transformations. In recent years, nonlinear feature extractors based on deep neural networks were combined with CCA to produce new variants, namely the "DeepCCA" line of work. These approaches were shown to have enhanced performance in many applications. However, theoretical support for DeepCCA is often lacking. To address this challenge, the recent work of Lyu and Fu (2020) showed that, under a reasonable post-nonlinear generative model, a carefully designed DeepCCA criterion provably removes unknown distortions in data generation and identifies the shared information across modalities. Nonetheless, a critical assumption used by Lyu and Fu (2020) for identifiability analysis was that unlimited data is available, which is unrealistic. This brief paper puts forth a finite-sample analysis of the DeepCCA method by Lyu and Fu (2020). The main result is that the finite-sample version of the method can still estimate the shared information with a guaranteed accuracy when the number of samples is sufficiently large. Our analytical approach is a nontrivial integration of statistical learning, numerical differentiation, and robust system identification, which may be of interest beyond the scope of DeepCCA and benefit other unsupervised learning paradigms.
  2. Abstract Background

    In Alzheimer’s disease (AD) research, multimodal imaging analysis can unveil complementary information from multiple imaging modalities and further our understanding of the disease. One application is to discover disease subtypes using unsupervised clustering. However, existing clustering methods are often applied to input features directly, and could suffer from the curse of dimensionality with high-dimensional multimodal data. The purpose of our study is to identify multimodal imaging-driven subtypes in Mild Cognitive Impairment (MCI) participants using a multiview learning framework based on Deep Generalized Canonical Correlation Analysis (DGCCA), which learns a low-dimensional shared latent representation from 3 neuroimaging modalities.

    Results

    DGCCA applies non-linear transformations to input views using neural networks and is able to learn correlated low-dimensional embeddings that capture more variance than its linear counterpart, generalized CCA (GCCA). We designed experiments to compare DGCCA embeddings with single-modality features and GCCA embeddings by generating 2 subtypes from each feature set using unsupervised clustering. In our validation studies, we found that amyloid PET imaging has the most discriminative features compared with structural MRI and FDG PET, and that DGCCA, unlike GCCA, learns from these features. DGCCA subtypes show differential measures in 5 cognitive assessments, 6 brain volume measures, and patterns of conversion to AD. In addition, DGCCA MCI subtypes confirmed AD genetic markers with strong signals that the existing late MCI group did not identify.

    Conclusion

    Overall, DGCCA is able to learn effective low-dimensional embeddings from multimodal data by learning non-linear projections. MCI subtypes generated from DGCCA embeddings are different from existing early and late MCI groups and are most similar to those identified by amyloid PET features. In our validation studies, DGCCA subtypes show distinct patterns in cognitive measures and brain volumes, and are able to identify AD genetic markers. These findings indicate the promise of the imaging-driven subtypes and their power in revealing disease structures beyond early and late stage MCI.

     
    more » « less
  3. Unsupervised mixture learning (UML) aims at identifying linearly or nonlinearly mixed latent components in a blind manner. UML is known to be challenging: Even learning linear mixtures requires highly nontrivial analytical tools, e.g., independent component analysis or nonnegative matrix factorization. In this work, the post-nonlinear (PNL) mixture model, where unknown element-wise nonlinear functions are imposed onto a linear mixture, is revisited. The PNL model is widely employed in different fields ranging from brain signal classification, speech separation, and remote sensing to causal discovery. To identify and remove the unknown nonlinear functions, existing works often assume different properties on the latent components (e.g., statistical independence or probability-simplex structures). This work shows that under a carefully designed UML criterion, the existence of a nontrivial null space associated with the underlying mixing system suffices to guarantee identification/removal of the unknown nonlinearity. Compared to prior works, our finding largely relaxes the conditions of attaining PNL identifiability, and thus may benefit applications where no strong structural information on the latent components is known. A finite-sample analysis is offered to characterize the performance of the proposed approach under realistic settings. To implement the proposed learning criterion, a block coordinate descent algorithm is proposed. A series of numerical experiments corroborate our theoretical claims.
  4. Abstract

    Soil moisture (SM) influences near‐surface air temperature by partitioning downwelling radiation into latent and sensible heat fluxes, through which dry soils generally lead to higher temperatures. The strength of this coupled soil moisture‐temperature (SM‐T) relationship is not spatially uniform, and numerous methods have been developed to assess SM‐T coupling strength across the globe. These methods tend to involve either idealized climate‐model experiments or linear statistical methods which cannot fully capture nonlinear SM‐T coupling. In this study, we propose a nonlinear machine‐learning (ML)‐based approach for analyzing SM‐T coupling and apply this method to various mid‐latitude regions using historical reanalysis datasets. We first train convolutional neural networks (CNNs) to predict daily maximum near‐surface air temperature (TMAX) given daily SM and geopotential height fields. We then use partial dependence analysis to isolate the average sensitivity of each CNN's TMAX prediction to the SM input under daily atmospheric conditions. The resulting SM‐T relationships broadly agree with previous assessments of SM‐T coupling strength. Over many regions, we find nonlinear relationships between the CNN's TMAX prediction and the SM input map. These nonlinearities suggest that the coupled interactions governing SM‐T relationships vary under different SM conditions, but these variations are regionally dependent. We also apply this method to test the influence of SM memory on SM‐T coupling and find that our results are consistent with previous studies. Although our study focuses specifically on local SM‐T coupling, our ML‐based method can be extended to investigate other coupled interactions within the climate system using observed or model‐derived datasets.

     
    more » « less
  5. This work studies the model identification problem of a class of post-nonlinear mixture models in the presence of dependent latent components. In particular, our interest lies in latent components that are nonnegative and sum to one. This problem is motivated by applications such as hyperspectral unmixing under nonlinear distortion effects. Many prior works tackled nonlinear mixture analysis using statistical independence among the latent components, which is not applicable in our case. A recent work by Yang et al. put forth a solution for this problem leveraging functional equations. However, the identifiability conditions derived there are somewhat restrictive. The associated implementation also has difficulties: the function approximator used in their work may not be able to represent general nonlinear distortions, and the formulated constrained neural network optimization problem may be challenging to handle. In this work, we advance both the theoretical and practical aspects of the problem of interest. On the theory side, we offer a new identifiability condition that circumvents a series of stringent assumptions in Yang et al.'s work. On the algorithm side, we propose an easy-to-implement unconstrained neural network-based algorithm, without sacrificing function approximation capabilities. Numerical experiments are employed to support our design.