

Search for: All records

Creators/Authors contains: "Guo, Wei"


  1. Abstract

    The escalating drug addiction crisis in the United States underscores the urgent need for innovative therapeutic strategies. This study pursues a rigorous strategy to identify potential drug repurposing candidates for opioid and cocaine addiction treatment, bridging the gap between transcriptomic data analysis and drug discovery. We begin by conducting differential gene expression analysis on addiction-related transcriptomic data to obtain differentially expressed genes (DEGs). We then propose a novel topological differentiation to identify key genes from a protein–protein interaction network derived from the DEGs. This method uses persistent Laplacians to single out pivotal nodes within the network, and it conducts the analysis in a multiscale manner to improve reliability. Through rigorous literature validation, pathway analysis and data-availability scrutiny, we identified three pivotal molecular targets, mTOR, mGluR5 and NMDAR, for drug repurposing from DrugBank. We built machine learning models employing two natural language processing (NLP)-based embeddings and a traditional 2D fingerprint, which demonstrated robust predictive ability in gauging binding affinities of DrugBank compounds to the selected targets. Furthermore, we elucidated the interactions of promising drugs with the targets and evaluated their drug-likeness. This study delineates a comprehensive analytical framework, combining bioinformatics, topological data analysis and machine learning, for drug repurposing in addiction treatment, setting the stage for subsequent experimental validation. The versatility of the methods we developed allows for applications across a range of diseases and transcriptomic datasets.

     
  2. Free, publicly-accessible full text available October 1, 2024
  3. Abstract

    Protein engineering is an emerging field in biotechnology that has the potential to revolutionize various areas, including antibody design, drug discovery, food security, and ecology. However, the mutational space involved is too vast to be handled through experimental means alone. By leveraging accumulated protein databases, machine learning (ML) models, particularly those based on natural language processing (NLP), have considerably expedited protein engineering. Moreover, advances in topological data analysis (TDA) and artificial intelligence-based protein structure prediction, such as AlphaFold2, have made more powerful structure-based ML-assisted protein engineering strategies possible. This review aims to offer a comprehensive, systematic, and indispensable set of methodological components, including TDA and NLP, for protein engineering and to facilitate their future development.

     
  4. Free, publicly-accessible full text available July 10, 2024
  5. ABSTRACT

    We conduct a systematic search for quasars with periodic variations in the archival photometric data of the Zwicky Transient Facility by cross-matching with the quasar catalogues of the Sloan Digital Sky Survey and Véron-Cetty and Véron. We first select 184 preliminary periodic candidates using the generalized Lomb–Scargle periodogram and the autocorrelation function, and then estimate the statistical significance of their periodicity based on two red-noise models, i.e. the damped random walk (DRW) and single power-law (SPL) models. In this way, we identify 106 (DRW) and 86 (SPL) candidates with the most significant periodic variations out of 143 700 quasars. We further compare the DRW and SPL models using Bayes factors, which indicate a relative preference for the SPL model in our preliminary sample. We thus adopt the candidates identified with SPL as the final sample and summarize its basic properties. We extend the light curves of the selected candidates with other archival survey data to verify their periodicity. However, only three candidates (with 6–8 cycles of periods) meet the selection criteria. This result clearly implies that, instead of being strictly periodic, the variability must be quasi-periodic or caused by stochastic red noise. This poses a challenge to the existing search approaches and calls for the development of new, effective methods.

     
  6. Free, publicly-accessible full text available June 7, 2024
  7. Free, publicly-accessible full text available June 1, 2024
  8. The pioneering work of William F. Vinen (also known as Joe Vinen) on thermal counterflow turbulence in superfluid helium-4 largely inaugurated the research on quantum turbulence. Despite decades of research on this topic, open questions remain. One such question concerns the anomalous increase in the vortex-line density L(t) during the decay of counterflow turbulence, which is often termed the “bump” on the L(t) curve. In 2016, Vinen and colleagues developed a theoretical model to explain this puzzling phenomenon (JETP Letters, 103, 648-652 (2016)). However, he realized in the last few years of his life that this theory must be, at the very least, inadequate. In remembrance of Joe, we discuss in this paper his latest thoughts on counterflow turbulence and its decay. We also briefly outline our recent experimental and numerical work on this topic.
  9. Abstract

    Virtual screening (VS) is a critical technique in understanding biomolecular interactions, particularly in drug design and discovery. However, the accuracy of current VS models relies heavily on three-dimensional (3D) structures obtained through molecular docking, which is often unreliable because of its low accuracy. To address this issue, we introduce sequence-based virtual screening (SVS), another generation of VS models that uses advanced natural language processing (NLP) algorithms and optimized deep K-embedding strategies to encode biomolecular interactions without relying on 3D structure-based docking. We demonstrate that SVS outperforms state-of-the-art methods on four regression datasets involving protein-ligand binding, protein-protein binding, protein-nucleic acid binding, and ligand inhibition of protein-protein interactions, and on five classification datasets for protein-protein interactions in five biological species. SVS has the potential to transform current practices in drug discovery and protein engineering.

     