NSF PAR Search | NSF Public Access Repository

Dissection of tumoral niches using spatial transcriptomics and deep learning

https://doi.org/10.1016/j.isci.2025.112214

Paniagua, Karla; Jin, Yu-Fang; Chen, Yidong; Gao, Shou-Jiang; Huang, Yufei; Flores, Mario (April 2025, iScience)

Free, publicly-accessible full text available April 1, 2026
Multi-omics Integrative Analysis for Incomplete Data Using Weighted p-Value Adjustment Approaches

https://doi.org/10.1007/s13253-024-00603-3

Zhang, Wenda; Ma, Zichen; Ho, Yen-Yi; Yang, Shuyi; Habiger, Joshua; Huang, Hsin-Hsiung; Huang, Yufei (February 2024, Journal of Agricultural, Biological and Environmental Statistics)

The advancements in high-throughput technologies provide exciting opportunities to obtain multi-omics data from the same individuals in a biomedical study, and joint analyses of data from multiple sources offer many benefits. However, the occurrence of missing values is an inevitable issue in multi-omics data because measurements such as mRNA gene expression levels often require invasive tissue sampling from patients. Common approaches for addressing missing measurements include analyses based on observations with complete data or multiple imputation methods. In this paper, we propose a novel integrative multi-omics analytical framework based on p-value weight adjustment in order to incorporate observations with incomplete data into the analysis. By splitting the data into a complete set with full information and an incomplete set with missing measurements, we introduce mechanisms to derive weights and weight-adjusted p-values from the two sets. Through simulation analyses, we demonstrate that the proposed framework achieves considerable statistical power gains compared to a complete case analysis or multiple imputation approaches. We illustrate the implementation of our proposed framework in a study of preterm infant birth weights by a joint analysis of DNA methylation, mRNA, and the phenotypic outcome. Supplementary materials accompanying this paper appear online.
Full Text Available
Characterizing Macrophages Diversity in COVID-19 Patients Using Deep Learning

https://doi.org/10.3390/genes13122264

Flores, Mario A.; Paniagua, Karla; Huang, Wenjian; Ramirez, Ricardo; Falcon, Leonardo; Liu, Andy; Chen, Yidong; Huang, Yufei; Jin, Yufang (December 2022, Genes)

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the etiological agent responsible for coronavirus disease 2019 (COVID-19), has affected the lives of billions and killed millions of infected people. This virus has been demonstrated to have different outcomes among individuals, with some of them presenting a mild infection, while others present severe symptoms or even death. The identification of the molecular states related to the severity of a COVID-19 infection has become of the utmost importance to understanding the differences in critical immune response. In this study, we computationally processed a set of publicly available single-cell RNA-Seq (scRNA-Seq) data of 12 Bronchoalveolar Lavage Fluid (BALF) samples diagnosed as having a mild, severe, or no infection, and generated a high-quality dataset that consists of 63,734 cells, each with 23,916 genes. We extended the cell-type and sub-type composition identification and our analysis showed significant differences in cell-type composition in mild and severe groups compared to the normal. Importantly, inflammatory responses were dramatically elevated in the severe group, which was evidenced by the significant increase in macrophages, from 10.56% in the normal group to 20.97% in the mild group and 34.15% in the severe group. As an indicator of immune defense, populations of T cells accounted for 24.76% in the mild group and decreased to 7.35% in the severe group. To verify these findings, we developed several artificial neural networks (ANNs) and graph convolutional neural network (GCNN) models. We showed that the GCNN models reach a prediction accuracy of the infection of 91.16% using data from subtypes of macrophages. Overall, our study indicates significant differences in the gene expression profiles of inflammatory response and immune cells of severely infected patients.
Full Text Available
Toward Deep Learning Based Access Control

https://doi.org/10.1145/3508398.3511497

Nobi, Mohammad Nur; Krishnan, Ram; Huang, Yufei; Shakarami, Mehrnoosh; Sandhu, Ravi (April 2022, Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy (CODASPY '22))

Full Text Available
Transformer for Gene Expression Modeling (T-GEM): An Interpretable Deep Learning Model for Gene Expression-Based Phenotype Predictions

https://doi.org/10.3390/cancers14194763

Zhang, Ting-He; Hasib, Md Musaddaqul; Chiu, Yu-Chiao; Han, Zhi-Feng; Jin, Yu-Fang; Flores, Mario; Chen, Yidong; Huang, Yufei (October 2022, Cancers)

Deep learning has been applied in precision oncology to address a variety of gene expression-based phenotype predictions. However, gene expression data’s unique characteristics challenge the computer vision-inspired design of popular Deep Learning (DL) models such as Convolutional Neural Network (CNN) and ask for the need to develop interpretable DL models tailored for transcriptomics study. To address the current challenges in developing an interpretable DL model for modeling gene expression data, we propose a novel interpretable deep learning architecture called T-GEM, or Transformer for Gene Expression Modeling. We provided the detailed T-GEM model for modeling gene–gene interactions and demonstrated its utility for gene expression-based predictions of cancer-related phenotypes, including cancer type prediction and immune cell type classification. We carefully analyzed the learning mechanism of T-GEM and showed that the first layer has broader attention while higher layers focus more on phenotype-related genes. We also showed that T-GEM’s self-attention could capture important biological functions associated with the predicted phenotypes. We further devised a method to extract the regulatory network that T-GEM learns by exploiting the attributions of self-attention weights for classifications and showed that the network hub genes were likely markers for the predicted phenotypes.
Full Text Available
Access Control Policy Generation from User Stories Using Machine Learning

https://doi.org/10.1007/978-3-030-81242-3_10

Heaps, John; Krishnan, Ram; Huang, Yufei; Niu, Jianwei; Sandhu, Ravi (July 2021, Annual IFIP WG 11.3 Working Conference on Data and Applications Security and Privacy (DBSec))
Full Text Available
Malware Detection in Cloud Infrastructures using Convolutional Neural Networks

Abdelsalam, Mahmoud; Krishnan, Ram; Huang, Yufei; Sandhu, Ravi (July 2018, 11th IEEE International Conference on Cloud Computing (CLOUD), San Francisco, CA, July 2-7, 2018)

A major challenge in Infrastructure as a Service (IaaS) clouds is its exposure to malware. Malware can spread rapidly within a datacenter and can cause major disruption to a cloud service provider and its clients. This paper introduces and discusses an effective malware detection approach in cloud infrastructure using Convolutional Neural Network (CNN), a deep learning approach. We initially employ a standard 2d CNN by training on metadata available for each of the processes in a virtual machine (VM) obtained by means of the hypervisor. We enhance the CNN classifier accuracy by using a novel 3d CNN (where an input is a collection of samples over a time interval), which greatly helps reduce mislabelled samples during data collection and training. Our experiments are performed on data collected by running various malware (mostly Trojans and Rootkits) on VMs. The malware used in our experiments are randomly selected. This reduces the selection bias of known-to-be highly active malware for easy detection. We demonstrate that our 2d CNN model reaches an accuracy of ≈ 79%, and our 3d CNN model significantly improves the accuracy to ≈ 90%.
Full Text Available