NSF PAR Search | NSF Public Access Repository

UnCOT-AD: Unpaired Cross-Omics Translation Enables Multi-Omics Integration for Alzheimer’s Disease Prediction

https://doi.org/10.1093/bib/bbaf438

Abir, Abrar Rahman; Dip, Sajib Acharjee; Zhang, Liqing (August 2025, Briefings in Bioinformatics)

Abstract: Alzheimer’s Disease (AD) is a progressive neurodegenerative disorder that poses a growing public health challenge. Traditional machine learning models for AD prediction have relied on single-omics data or phenotypic assessments, limiting their ability to capture the disease’s molecular complexity and resulting in poor performance. Recent advances in high-throughput multi-omics have provided deeper biological insights. However, due to the scarcity of paired omics datasets, existing multi-omics AD prediction models rely on unpaired omics data, where different omics profiles are combined without being derived from the same biological sample, leading to biologically less meaningful pairings and less accurate predictions. To address these issues, we propose UnCOT-AD, a novel deep learning framework for Unpaired Cross-Omics Translation enabling effective multi-omics integration for AD prediction. Our method introduces the first-ever cross-omics translation model trained on unpaired omics datasets, using two coupled Variational Autoencoders and a novel cycle consistency mechanism to ensure accurate bidirectional translation between omics types. We integrate adversarial training to ensure that the generated omics profiles are biologically realistic. Moreover, we employ contrastive learning to capture disease-specific patterns in the latent space, making the cross-omics translation more accurate and biologically relevant. We rigorously validate UnCOT-AD on both cross-omics translation and AD prediction tasks. Results show that UnCOT-AD empowers multi-omics-based AD prediction by combining real omics profiles with corresponding omics profiles generated by our cross-omics translation module and achieves state-of-the-art performance in accuracy and robustness. Source code is available at https://github.com/abrarrahmanabir/UnCOT-AD
ProtAlign-ARG: antibiotic resistance gene characterization integrating protein language models and alignment-based scoring

https://doi.org/10.1038/s41598-025-14545-4

Ahmed, Shafayat; Emon, Muhit Islam; Moumi, Nazifa Ahmed; Huang, Lifu; Zhou, Dawei; Vikesland, Peter; Pruden, Amy; Zhang, Liqing (August 2025, Scientific Reports)
ARGContextProfiler: extracting and scoring the genomic contexts of antibiotic resistance genes using assembly graphs

https://doi.org/10.3389/fmicb.2025.1604461

Moumi, Nazifa Ahmed; Ahmed, Shafayat; Brown, Connor; Pruden, Amy; Zhang, Liqing (May 2025, Frontiers in Microbiology)

Antibiotic resistance (AR) presents a global health challenge, necessitating an improved understanding of the ecology, evolution, and dissemination of antibiotic resistance genes (ARGs). Several tools, databases, and algorithms are now available to facilitate the identification of ARGs in metagenomic sequencing data; however, direct annotation of short-read data provides limited contextual information. Knowledge of whether an ARG is carried in the chromosome or on a specific mobile genetic element (MGE) is critical to understanding mobility, persistence, and potential for co-selection. Here we developed ARGContextProfiler, a pipeline designed to extract and visualize ARG genomic contexts. By leveraging the assembly graph for genomic neighborhood extraction and validating contexts through read mapping, ARGContextProfiler minimizes chimeric errors that are a common artifact of assembly outputs. Testing on real, synthetic, and semi-synthetic data, including long-read sequencing data from environmental samples, demonstrated that ARGContextProfiler offers superior accuracy, precision, and sensitivity compared to conventional assembly-based methods. ARGContextProfiler thus provides a powerful tool for uncovering the genomic context of ARGs in metagenomic sequencing data, which can be of value to both fundamental and applied research aimed at understanding and stemming the spread of AR. The source code of ARGContextProfiler is publicly available at GitHub.
Free, publicly-accessible full text available May 21, 2026
CIWARS: A Web Server for Antibiotic Resistance Surveillance Using Longitudinal Metagenomic Data

https://doi.org/10.1016/j.jmb.2025.169159

Emon, Muhit Islam; Cheung, Yat Fei; Stoll, James; Rumi, Monjura Afrin; Brown, Connor; Choi, Joung Min; Moumi, Nazifa Ahmed; Ahmed, Shafayat; Song, Haoqiu; Sein, Justin; et al (August 2025, Journal of Molecular Biology)

Free, publicly-accessible full text available August 1, 2026
Virseqimprover: an integrated pipeline for viral contig error correction, extension, and annotation

https://doi.org/10.7717/peerj.18515

Song, Haoqiu; Tithi, Saima Sultana; Brown, Connor; Aylward, Frank O; Jensen, Roderick; Zhang, Liqing (January 2025, PeerJ)

Despite the recent surge of viral metagenomic studies, it remains a significant challenge to recover complete virus genomes from metagenomic data. The majority of viral contigs generated from de novo assembly programs are highly fragmented, presenting significant challenges to downstream analysis and inference. To address this issue, we have developed Virseqimprover, a computational pipeline that can extend assembled contigs to complete or nearly complete genomes while maintaining extension quality. Virseqimprover first examines whether there is any chimeric sequence based on read coverage; if so, it breaks the sequence into segments, then extends the longest segment with uniform depth of coverage, and repeats these procedures until the sequence can no longer be extended. Finally, Virseqimprover annotates the gene content of the resulting sequence. Results show that Virseqimprover performs well at correcting and extending viral contigs to their full lengths, and can therefore be a useful tool to improve the completeness and minimize the assembly errors of viral contigs. Both a web server and a conda package for Virseqimprover are provided to the research community free of charge.
Full Text Available
Global scale exploration of human faecal and sewage resistomes as a function of socio-economic status

https://doi.org/10.1038/s44221-024-00310-w

Gupta, Suraj; Wu, Xiaowei; Pruden, Amy; Zhang, Liqing; Vikesland, Peter (October 2024, Nature Water)

Full Text Available
A machine learning framework to predict PPCP removal through various wastewater and water reuse treatment trains

https://doi.org/10.1039/d4ew00892h

Choi, Joung Min; Manthapuri, Vineeth; Keenum, Ishi; Brown, Connor L; Xia, Kang; Chen, Chaoqi; Vikesland, Peter J; Blair, Matthew F; Bott, Charles; Pruden, Amy; et al (January 2025, Environmental Science: Water Research & Technology)

ML Framework for PPCPs fate in WWTPs.
Free, publicly-accessible full text available January 30, 2026
MetaCompare 2.0: differential ranking of ecological and human health resistome risks

https://doi.org/10.1093/femsec/fiae155

Rumi, Monjura Afrin; Oh, Min; Davis, Benjamin C; Brown, Connor L; Juvekar, Adheesh; Vikesland, Peter J; Pruden, Amy; Zhang, Liqing (November 2024, FEMS Microbiology Ecology)

Abstract: While numerous environmental factors contribute to the spread of antibiotic resistance genes (ARGs), quantifying their relative contributions remains a fundamental challenge. Similarly, it is important to differentiate acute human health risks arising from environmental exposure from the broader ecological risk of ARG evolution and spread across microbial taxa. Recent studies have proposed various methods for achieving such aims. Here, we introduce MetaCompare 2.0, which improves upon the original MetaCompare pipeline by differentiating indicators of human health resistome risk (potential for human pathogens of acute resistance concern to acquire ARGs) from ecological resistome risk (overall mobility of ARGs and potential for pathogen acquisition). The updated pipeline's sensitivity was demonstrated by analyzing diverse publicly available metagenomes from wastewater, surface water, soil, sediment, human gut, and synthetic microbial communities. MetaCompare 2.0 provided distinct rankings of the metagenomes according to both human health resistome risk and ecological resistome risk, with both scores trending higher when influenced by anthropogenic impact or other stress. We evaluated the robustness of the pipeline to sequence assembly methods, sequencing depth, contig count, and metagenomic library coverage bias. The risk scores were remarkably consistent despite variations in these technological aspects. We packaged the improved pipeline into a publicly available web service (http://metacompare.cs.vt.edu/) that provides an easy-to-use interface for computing resistome risk scores and visualizing results.
Mastering Long-Tail Complexity on Graphs: Characterization, Learning, and Generalization

https://doi.org/10.1145/3637528.3671880

Wang, Haohui; Jing, Baoyu; Ding, Kaize; Zhu, Yada; Cheng, Wei; Zhang, Si; Fan, Yonghui; Zhang, Liqing; Zhou, Dawei (August 2024, ACM)

Full Text Available
DeepMRG: a multi-label deep learning classifier for predicting bacterial metal resistance genes

https://doi.org/10.1101/2023.11.14.566903

Emon, Muhit Islam; Zhang, Liqing (November 2023, bioRxiv)

Abstract: The widespread misuse of antibiotics has escalated antibiotic resistance into a critical global public health concern. Beyond antibiotics, metals function as antibacterial agents. Metal resistance genes (MRGs) enable bacteria to tolerate metal-based antibacterials and may also foster antibiotic resistance within bacterial communities through co-selection. Thus, predicting bacterial MRGs is vital for elucidating their involvement in antibiotic resistance and metal tolerance mechanisms. The “best hit” approach is mainly utilized to identify and annotate MRGs. This method is sensitive to cutoff values and produces a high false negative rate. Other than the best hit approach, only a few antimicrobial resistance (AMR) detection tools exist for predicting MRGs. However, these tools lack comprehensive annotation for MRGs conferring resistance to multiple metals. To address such limitations, we introduce DeepMRG, a deep learning-based multi-label classifier, to predict bacterial MRGs. Because a bacterial MRG can confer resistance to multiple metals, DeepMRG is designed as a multi-label classifier capable of predicting multiple metal labels associated with an MRG. It leverages bit score-based similarity distribution of sequences with experimentally verified MRGs. To ensure unbiased model evaluation, we employed a clustering method to partition our dataset into six subsets, five for cross-validation and one for testing, with non-homologous sequences, mitigating the impact of sequence homology. DeepMRG consistently achieved high overall F1-scores and significantly reduced false negative rates across a wide range of datasets. It can be used to predict bacterial MRGs in metagenomic or isolate assemblies. The web server of DeepMRG can be accessed at https://deepmrg.cs.vt.edu/deepmrg and the source code is available at https://github.com/muhit-emon/DeepMRG under the MIT license.
Full Text Available
