NSF PAR Search | NSF Public Access Repository

Joint association and classification analysis of multi‐view data

https://doi.org/10.1111/biom.13536

Zhang, Yunfeng; Gaynanova, Irina (August 2021, Biometrics)

Abstract: Multi‐view data, that is, matched sets of measurements on the same subjects, have become increasingly common with advances in multi‐omics technology. Often, it is of interest to find associations between the views that are related to the intrinsic class memberships. Existing association methods cannot directly incorporate class information, while existing classification methods do not take into account between‐views associations. In this work, we propose a framework for Joint Association and Classification Analysis of multi‐view data (JACA). Our goal is not merely to improve the misclassification rates, but to provide a latent representation of high‐dimensional data that is both relevant for the subtype discrimination and coherent across the views. We motivate the methodology by establishing a connection between canonical correlation analysis and discriminant analysis. We also establish the estimation consistency of JACA in high‐dimensional settings. A distinct advantage of JACA is that it can be applied to multi‐view data with block‐missing structure, that is, to cases where a subset of views or class labels is missing for some subjects. The application of JACA to quantify the associations between RNAseq and miRNA views with respect to consensus molecular subtypes in colorectal cancer data from The Cancer Genome Atlas project leads to lower misclassification rates and stronger associations compared to existing methods.
Double-Matched Matrix Decomposition for Multi-View Data

https://doi.org/10.1080/10618600.2022.2067860

Yuan, Dongbang; Gaynanova, Irina (January 2022, Journal of Computational and Graphical Statistics)

Sparse semiparametric canonical correlation analysis for data of mixed types

https://doi.org/10.1093/biomet/asaa007

Yoon, Grace; Carroll, Raymond J; Gaynanova, Irina (April 2020, Biometrika)

Summary: Canonical correlation analysis investigates linear relationships between two sets of variables, but it often works poorly on modern datasets because of high dimensionality and mixed data types such as continuous, binary and zero-inflated. To overcome these challenges, we propose a semiparametric approach to sparse canonical correlation analysis based on the Gaussian copula. The main result of this paper is a truncated latent Gaussian copula model for data with excess zeros, which allows us to derive a rank-based estimator of the latent correlation matrix for mixed variable types without estimation of marginal transformation functions. The resulting canonical correlation analysis method works well in high-dimensional settings, as demonstrated via numerical studies, and when applied to the analysis of association between gene expression and microRNA data from breast cancer patients.
Prediction and estimation consistency of sparse multi-class penalized optimal scoring

https://doi.org/10.3150/19-BEJ1126

Gaynanova, Irina (February 2020, Bernoulli)

Microbial Networks in SPRING - Semi-parametric Rank-Based Correlation and Partial Correlation Estimation for Quantitative Microbiome Data

https://doi.org/10.3389/fgene.2019.00516

Yoon, Grace; Gaynanova, Irina; Müller, Christian L. (June 2019, Frontiers in Genetics)

Sparse feature selection in kernel discriminant analysis via optimal scoring

Lapanowski, Alexander F.; Gaynanova, Irina (April 2019, Proceedings of Machine Learning Research)

We consider the two-group classification problem and propose a kernel classifier based on the optimal scoring framework. Unlike previous approaches, we provide theoretical guarantees on the expected risk consistency of the method. We also allow for feature selection by imposing structured sparsity using weighted kernels. We propose fully-automated methods for selection of all tuning parameters, and in particular adapt kernel shrinkage ideas for ridge parameter selection. Numerical studies demonstrate the superior classification performance of the proposed approach compared to existing nonparametric classifiers.
Sparse quadratic classification rules via linear dimension reduction

https://doi.org/10.1016/j.jmva.2018.09.011

Gaynanova, Irina; Wang, Tianying (January 2019, Journal of Multivariate Analysis)

