NSF PAR Search | NSF Public Access Repository

SPIN: sex-specific and pathway-based interpretable neural network for sexual dimorphism analysis

https://doi.org/10.1093/bib/bbae239

Ko, Euiseong; Kim, Youngsoon; Shokoohi, Farhad; Mersha, Tesfaye B; Kang, Mingon (May 2024, Briefings in Bioinformatics)

Sexual dimorphism in prevalence, severity and genetic susceptibility exists for most common diseases. However, most genetic and clinical outcome studies are designed in a sex-combined framework that treats sex as a covariate. A few sex-specific studies have analyzed males and females separately, but that design fails to identify gene-by-sex interactions. Here, we propose a novel unified, biologically interpretable deep learning-based framework (named SPIN) for sexual dimorphism analysis. We demonstrate that SPIN significantly improved the C-index by up to 23.6% in TCGA cancer datasets, and it was further validated using asthma datasets. In addition, SPIN identifies sex-specific and sex-shared risk loci that are often missed in previous sex-combined and sex-separate analyses. We also show that SPIN is interpretable for explaining how biological pathways contribute to sexual dimorphism and improve risk prediction at the individual level, which can support the development of precision medicine tailored to a specific individual’s characteristics.
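The C-index reported above is a rank-agreement measure for survival models. As a minimal sketch of how such a score is commonly computed (Harrell's formulation, not necessarily the exact evaluation code used for SPIN), the function below counts comparable patient pairs and checks whether the patient with the shorter observed time received the higher predicted risk. The function name, argument names, and the choice to skip tied times are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell-style C-index (illustrative sketch, hypothetical helper).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if the subject was censored
    risks  : predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four subjects, the third one censored.
score = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.1, 0.3, 0.7],
)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking; on the toy data above, three of the five comparable pairs are ordered correctly.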
Evidential deep learning for trustworthy prediction of enzyme commission number

https://doi.org/10.1093/bib/bbad401

Han, So-Ra; Park, Mingyu; Kosaraju, Sai; Lee, JeungMin; Lee, Hyun; Lee, Jun Hyuck; Oh, Tae-Jin; Kang, Mingon (November 2023, Briefings in Bioinformatics)

The rapid growth in the number of uncharacterized enzymes and their functional diversity calls for accurate and trustworthy computational tools for functional annotation. However, current state-of-the-art models lack trustworthiness in multilabel classification problems with thousands of classes. Here, we demonstrate that a novel evidential deep learning model (named ECPICK) makes trustworthy predictions of enzyme commission (EC) numbers with data-driven, domain-relevant evidence, which results in significantly enhanced predictive power and the capability to discover potential new motif sites. ECPICK learns complex sequential patterns of amino acids and their hierarchical structures from 20 million enzyme data. ECPICK identifies significant amino acids that contribute to the prediction without multiple sequence alignment. Our intensive assessment showed not only an outstanding enhancement of predictive performance on the largest databases, UniProt, Protein Data Bank (PDB) and Kyoto Encyclopedia of Genes and Genomes (KEGG), but also a capability to discover new motif sites in microorganisms. ECPICK is a reliable EC number prediction tool for identifying the protein functions of an increasing number of uncharacterized enzymes.
Enhancing Clinical Trial Summarization: Leveraging Large Language Models and Knowledge Graphs for Entity Preservation

Nahed, P; Kambar, M; Taghva, K (July 2024, Proceedings of Ninth International Congress on Information and Communication Technology)

ClinicalTrials.gov is an accessible online medical resource for researchers, healthcare professionals, and policy designers seeking detailed information on clinical trials. Summarizing these long clinical records can significantly reduce the time database users need, because the process transforms comprehensive information into concise synopses while preserving the essential meaning and facilitating understanding. In this paper, we employ the Bidirectional and Auto-Regressive Transformers (BART) model to generate the trials’ brief summaries. We contribute new preprocessing techniques for model training, which lead to a robust summarization model. The fine-tuned model significantly enhanced ROUGE-1, ROUGE-2, and ROUGE-L F1-scores by 14%, 23%, and 20%, respectively, compared to previous studies. Additionally, we present an innovative knowledge graph based on entity classes to assess the generated summaries. This graph not only quantifies the essential entities carried over from the original text to the summaries but also provides insights into their specific order and arrangement in sentences.
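For readers unfamiliar with the ROUGE-1 and ROUGE-2 F1-scores mentioned above, the sketch below shows the underlying n-gram overlap arithmetic. It is a simplified illustration and not the official ROUGE toolkit or the evaluation code from the paper: tokenization is assumed to be lowercase whitespace splitting, no stemming is applied, and the function names and example sentences are hypothetical.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """ROUGE-N F1 from clipped n-gram overlap (simplified sketch)."""
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference summary and generated summary.
reference = "the trial enrolled adult patients"
candidate = "the trial enrolled patients"
rouge1 = rouge_n_f1(reference, candidate, n=1)  # precision 4/4, recall 4/5
rouge2 = rouge_n_f1(reference, candidate, n=2)  # precision 2/3, recall 2/4
```

ROUGE-L, which the abstract also reports, is based on the longest common subsequence rather than fixed-length n-grams and is not covered by this sketch.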
Multi-layered self-attention mechanism for weakly supervised semantic segmentation

https://doi.org/10.1016/j.cviu.2023.103886

Yaganapu, Avinash; Kang, Mingon (February 2024, Computer Vision and Image Understanding)

Weakly Supervised Semantic Segmentation (WSSS) provides efficient solutions for semantic image segmentation using image-level annotations. Unlike Fully Supervised Semantic Segmentation (FSSS), WSSS requires no pixel-level labeling, which is time-consuming and labor-intensive. Most WSSS approaches have leveraged Class Activation Maps (CAM) or Self-Attention (SA) to generate pseudo pixel-level annotations and then perform semantic segmentation with fully supervised approaches (e.g., Fully Convolutional Network). However, those approaches often provide incomplete supervision that mainly includes discriminative regions from the last convolutional layer. They may fail to capture regions of low- or intermediate-level features that may not be present in the last convolutional layer. To address this issue, we proposed a novel Multi-layered Self-Attention (Multi-SA) method that applies a self-attention module to multiple convolutional layers and then stacks the feature maps from the self-attention layers to generate pseudo pixel-level annotations. Through intensive experiments on standard benchmark datasets, we demonstrated that integrated feature maps from multiple self-attention layers produce higher coverage in semantic segmentation than using only the last convolutional layer.
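The general idea described above, applying self-attention at several convolutional depths and stacking the results, can be sketched roughly as follows. This is not the Multi-SA implementation from the paper: the attention here is a parameter-free dot-product attention over spatial positions (no learned query, key, or value projections), upsampling is nearest-neighbour, and all function names and feature-map shapes are assumptions chosen for illustration.

```python
import numpy as np


def spatial_self_attention(feat):
    """Parameter-free self-attention over the H*W positions of a (C, H, W) map."""
    c, h, w = feat.shape
    x = feat.reshape(c, h * w).T                   # (N, C), one row per position
    scores = x @ x.T / np.sqrt(c)                  # (N, N) scaled dot products
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return (weights @ x).T.reshape(c, h, w)


def upsample_nearest(feat, size):
    """Nearest-neighbour upsampling of (C, H, W) to (C, size, size)."""
    _, h, w = feat.shape
    assert size % h == 0 and size % w == 0, "size must be a multiple of H and W"
    return feat.repeat(size // h, axis=1).repeat(size // w, axis=2)


def stack_multi_layer(feats, size):
    """Attend each layer's map, bring all to one resolution, stack on channels."""
    attended = [upsample_nearest(spatial_self_attention(f), size) for f in feats]
    return np.concatenate(attended, axis=0)


# Hypothetical feature maps from three convolutional depths.
rng = np.random.default_rng(0)
feats = [
    rng.normal(size=(4, 8, 8)),
    rng.normal(size=(8, 4, 4)),
    rng.normal(size=(16, 2, 2)),
]
stacked = stack_multi_layer(feats, size=8)  # channels: 4 + 8 + 16
```

In a real pipeline the stacked map would feed a further step that produces the pseudo pixel-level labels; that step is outside the scope of this sketch.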