

Search for: All records

Creators/Authors contains: "Liu, Xiao"


  1. Free, publicly-accessible full text available July 15, 2025
  2. Despite recent progress in Graph Neural Networks (GNNs), explaining predictions made by GNNs remains a challenging and nascent problem. The leading method mainly considers local explanations, i.e., important subgraph structures and node features, to interpret why a GNN model makes its prediction for a single instance, e.g., a node or a graph. As a result, each generated explanation is painstakingly customized at the instance level. Explanations that interpret each instance independently are not sufficient to provide a global understanding of the learned GNN model, which limits generalizability and hinders use in the inductive setting. Moreover, training a separate explanation model for each instance is time-consuming on large-scale real-life datasets. In this study, we address these key challenges and propose PGExplainer, a parameterized explainer for GNNs. PGExplainer adopts a deep neural network to parameterize the generation process of explanations, which makes PGExplainer a natural approach to multi-instance explanations. Compared to existing work, PGExplainer has better generalization ability and can be used in an inductive setting without retraining the model for new instances. PGExplainer is therefore much more efficient than the leading method, with significant speed-up. In addition, the explanation networks can be used as a regularizer to improve the generalization power of existing GNNs when jointly trained with downstream tasks. Experiments on both synthetic and real-life datasets show highly competitive performance, with up to 24.7% relative improvement in AUC on explaining graph classification over the leading baseline.
    Free, publicly-accessible full text available August 1, 2025
  3. Archaea produce unique membrane-spanning lipids (MSLs), termed glycerol dialkyl glycerol tetraethers (GDGTs), which aid in adaptive responses to various environmental challenges. GDGTs can be modified through cyclization, cross-linking, methylation, hydroxylation, and desaturation, resulting in structurally distinct GDGT lipids. Here, we report the identification of radical SAM proteins responsible for two of these modifications: a glycerol monoalkyl glycerol tetraether (GMGT) synthase (Gms), responsible for covalently cross-linking the two hydrocarbon tails of a GDGT to produce GMGTs, and a GMGT methylase (Gmm), capable of methylating the core hydrocarbon tail. Heterologous expression of Gms proteins from various archaea in Thermococcus kodakarensis results in the production of GMGTs in two isomeric forms. Further, coexpression of Gms and Gmm produces mono- and dimethylated GMGTs and minor amounts of trimethylated GMGTs with only trace GDGT methylation. Phylogenetic analyses reveal the presence of Gms homologs in diverse archaeal genomes spanning all four archaeal superphyla and in multiple bacterial phyla with the genetic potential to synthesize fatty acid–based MSLs, demonstrating that GMGT production may be more widespread than previously appreciated. We demonstrate GMGT production in three Gms-encoding archaea, identifying an increase in GMGTs in response to elevated temperature in two Archaeoglobus species and the production of GMGTs with up to six rings in Vulcanisaeta distributa. The occurrence of such highly cyclized GMGTs has been limited to environmental samples, and their detection in culture demonstrates the utility of combining genetic, bioinformatic, and lipid analyses to identify producers of distinct archaeal membrane lipids.

     
    Free, publicly-accessible full text available June 25, 2025
  4. Soil carbon loss is likely to increase due to climate warming, but microbiomes and microenvironments may dampen this effect. In a 30-year warming experiment, physical protection within soil aggregates affected the thermal responses of soil microbiomes and carbon dynamics. In this study, we combined metagenomic analysis with physical characterization of soil aggregates to explore mechanisms by which microbial communities respond to climate warming across different soil microenvironments. Long-term warming decreased the relative abundances of genes involved in degrading labile compounds (e.g., cellulose), but increased those of genes involved in degrading recalcitrant compounds (e.g., lignin) across aggregate sizes. These changes were observed in most bacterial phyla, especially Acidobacteria, Actinobacteria, Bacteroidetes, Chloroflexi, and Planctomycetes. Microbial community composition was considerably altered by warming, leading to decreased diversity for bacteria and fungi but not for archaea. Microbial functional genes, diversity, and community composition differed between macroaggregates and microaggregates, indicating the essential role of physical protection in controlling microbial community dynamics. Our findings suggest that microbes have the capacity to employ various strategies to acclimate or adapt to climate change (e.g., warming, heat stress) by shifting functional gene abundances and community structures in varying microenvironments, as regulated by soil physical protection.
    Free, publicly-accessible full text available April 6, 2025
  5. Free, publicly-accessible full text available April 1, 2025
  6. Free, publicly-accessible full text available January 1, 2025
  7. Free, publicly-accessible full text available March 1, 2025
  8. Type 1 polyketides are a major class of natural products used as antiviral, antibiotic, antifungal, antiparasitic, immunosuppressive, and antitumor drugs. Analysis of public microbial genomes leads to the discovery of over sixty thousand type 1 polyketide gene clusters. However, the molecular products of only about a hundred of these clusters are characterized, leaving most metabolites unknown. Characterizing polyketides relies on bioactivity-guided purification, which is expensive and time-consuming. To address this, we present Seq2PKS, a machine learning algorithm that predicts chemical structures derived from Type 1 polyketide synthases. Seq2PKS predicts numerous putative structures for each gene cluster to enhance accuracy. The correct structure is identified using a variable mass spectral database search. Benchmarks show that Seq2PKS outperforms existing methods. Applying Seq2PKS to Actinobacteria datasets, we discover biosynthetic gene clusters for monazomycin, oasomycin A, and 2-aminobenzamide-actiphenol.

     