Creators/Authors contains: "Yu, Lei"

  1. ABSTRACT

    The abundance of various cell types can vary significantly among patients with varying phenotypes and even those with the same phenotype. Recent scientific advancements provide mounting evidence that other clinical variables, such as age, gender, and lifestyle habits, can also influence the abundance of certain cell types. However, current methods for integrating single-cell-level omics data with clinical variables are inadequate. In this study, we propose a regularized Bayesian Dirichlet-multinomial regression framework to investigate the relationship between single-cell RNA sequencing data and patient-level clinical data. Additionally, the model employs a novel hierarchical tree structure to identify such relationships at different cell-type levels. Our model successfully uncovers significant associations between specific cell types and clinical variables across three distinct diseases: pulmonary fibrosis, COVID-19, and non-small cell lung cancer. This integrative analysis provides biological insights and could potentially inform clinical interventions for various diseases.

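The abstract above does not give the model's equations, so the following is only a minimal sketch of a generic Dirichlet-multinomial regression likelihood, assuming the common log-linear link between patient-level covariates and the concentration parameters. The function names `dm_log_likelihood` and `alphas_from_covariates` are hypothetical, and the regularizing prior and the hierarchical cell-type tree described in the abstract are not represented here.

```python
import math


def dm_log_likelihood(counts, alphas):
    """Log-probability of cell-type counts under a Dirichlet-multinomial.

    counts: observed number of cells per cell type for one patient.
    alphas: positive concentration parameters, one per cell type.
    """
    n = sum(counts)
    a0 = sum(alphas)
    # Terms that depend only on the totals.
    ll = math.lgamma(n + 1) + math.lgamma(a0) - math.lgamma(n + a0)
    # Per-cell-type terms.
    for y, a in zip(counts, alphas):
        ll += math.lgamma(y + a) - math.lgamma(a) - math.lgamma(y + 1)
    return ll


def alphas_from_covariates(x, intercepts, coefs):
    """Assumed log-linear link: alpha_k = exp(b0_k + sum_j x_j * b_kj)."""
    return [
        math.exp(b0 + sum(xj * bj for xj, bj in zip(x, row)))
        for b0, row in zip(intercepts, coefs)
    ]


# Hypothetical patient: one clinical covariate, two cell types.
alphas = alphas_from_covariates([0.0], [0.0, 0.0], [[1.0], [2.0]])
log_p = dm_log_likelihood([3, 2], alphas)
```

In a sketch like this, a coefficient `b_kj` far from zero would indicate that covariate `j` shifts the expected abundance of cell type `k`; a sparsity-inducing prior on those coefficients is one way such a model could be regularized.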
  2. Free, publicly-accessible full text available April 1, 2025
  3. Decision forests, including RandomForest, XGBoost, and LightGBM, dominate machine learning tasks over tabular data. Recently, several frameworks were developed for decision forest inference, such as ONNX, TreeLite from Amazon, TensorFlow Decision Forest from Google, HummingBird from Microsoft, Nvidia FIL, and lleaves. While these frameworks are fully optimized for inference computations, they are all decoupled from databases and general data management frameworks, which leads to cross-system performance overheads. We first provided a DICT model to understand the performance gaps between decoupled and in-database inference. We further identified that for in-database inference, in addition to the popular UDF-centric representation that encapsulates the ML model into one User Defined Function (UDF), there also exists a relation-centric representation that breaks down the decision forest inference into several fine-grained SQL operations. The relation-centric representation can achieve significantly better performance for large models. We optimized both implementations and conducted a comprehensive benchmark to compare them with the aforementioned decoupled inference pipelines and with existing in-database inference pipelines such as Spark-SQL and PostgresML. The evaluation results validated the DICT model and demonstrated the superior performance of our in-database inference design compared to the baselines.
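To make the idea of a relation-centric representation concrete, here is a minimal sketch of one way decision forest inference could be expressed as SQL over tables, using Python's built-in `sqlite3`. It is an illustration under assumptions, not the authors' implementation: the table layout (`nodes`, `features`), the toy two-tree forest, the `predict` helper, and the choice to average leaf values are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a toy forest and two samples."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # One row per tree node; leaves have feature = NULL and carry leaf_value.
    cur.execute(
        "CREATE TABLE nodes (tree_id INTEGER, node_id INTEGER, feature TEXT, "
        "threshold REAL, left_id INTEGER, right_id INTEGER, leaf_value REAL)"
    )
    cur.executemany(
        "INSERT INTO nodes VALUES (?,?,?,?,?,?,?)",
        [
            (0, 0, "f0", 0.5, 1, 2, None),
            (0, 1, None, None, None, None, 10.0),
            (0, 2, None, None, None, None, 20.0),
            (1, 0, "f1", 2.0, 1, 2, None),
            (1, 1, None, None, None, None, 1.0),
            (1, 2, None, None, None, None, 3.0),
        ],
    )
    # Features in long format so a node's feature name can be joined on.
    cur.execute("CREATE TABLE features (sample_id INTEGER, feature TEXT, value REAL)")
    cur.executemany(
        "INSERT INTO features VALUES (?,?,?)",
        [(1, "f0", 0.2), (1, "f1", 5.0), (2, "f0", 0.9), (2, "f1", 1.0)],
    )
    conn.commit()
    return conn


# A recursive CTE walks every (sample, tree) pair from the root (node 0).
# The walk stops at leaves because a NULL feature never matches the join.
PREDICT_SQL = """
WITH RECURSIVE walk(sample_id, tree_id, node_id) AS (
    SELECT DISTINCT f.sample_id, t.tree_id, 0
    FROM features f CROSS JOIN (SELECT DISTINCT tree_id FROM nodes) t
    UNION ALL
    SELECT w.sample_id, w.tree_id,
           CASE WHEN f.value <= n.threshold THEN n.left_id ELSE n.right_id END
    FROM walk w
    JOIN nodes n ON n.tree_id = w.tree_id AND n.node_id = w.node_id
    JOIN features f ON f.sample_id = w.sample_id AND f.feature = n.feature
)
SELECT w.sample_id, AVG(n.leaf_value)
FROM walk w
JOIN nodes n ON n.tree_id = w.tree_id AND n.node_id = w.node_id
WHERE n.feature IS NULL
GROUP BY w.sample_id
ORDER BY w.sample_id
"""


def predict(conn):
    """Return [(sample_id, mean leaf value across trees), ...]."""
    return conn.execute(PREDICT_SQL).fetchall()


predictions = predict(build_demo_db())
```

A UDF-centric design would instead hide the whole traversal inside a single function called once per row. Spelling it out as joins and aggregation, as sketched here, is what lets the database engine plan, partition, and spill the work like any other query.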
  4. de Groot, Bert L. (Ed.)
    Intrinsically disordered proteins (IDPs) are highly dynamic systems that play an important role in cell signaling processes and their misfunction often causes human disease. Proper understanding of IDP function not only requires the realistic characterization of their three-dimensional conformational ensembles at atomic-level resolution but also of the time scales of interconversion between their conformational substates. Large sets of experimental data are often used in combination with molecular modeling to restrain or bias models to improve agreement with experiment. It is shown here for the N-terminal transactivation domain of p53 (p53TAD) and Pup, which are two IDPs that fold upon binding to their targets, how the latest advancements in molecular dynamics (MD) simulation methodology produce native conformational ensembles by combining replica exchange with a series of microsecond MD simulations. They closely reproduce experimental data at the global conformational ensemble level, in terms of the distribution properties of the radius of gyration tensor, and at the local level, in terms of NMR properties including 15N spin relaxation, without the need for reweighting. Further inspection revealed that 10–20% of the individual MD trajectories display the formation of secondary structures not observed in the experimental NMR data. The IDP ensembles were analyzed by graph theory to identify dominant inter-residue contact clusters and characteristic amino-acid contact propensities. These findings indicate that modern MD force fields with residue-specific backbone potentials can produce highly realistic IDP ensembles sampling a hierarchy of nano- and picosecond time scales, providing new insights into their biological function.
  5. In an epoch dominated by escalating concerns over climate change and looming energy crises, the imperative to design highly efficient catalysts that can facilitate the sequestration and transformation of carbon dioxide (CO2) into beneficial chemicals is paramount. This research presents the successful synthesis of nanofiber catalysts, incorporating monometallic nickel (Ni) and cobalt (Co) and their bimetallic blend, NiCo, via a facile electrospinning technique, with precise control over the Ni/Co molar ratios. Application of an array of advanced analytical methods, including SEM, TGA–DSC, FTIR-ATR, XRD, Raman, XRF, and ICP-MS, validated the effective integration and homogeneous distribution of active Ni/Co catalysts within the nanofibers. The catalytic performance of these mono- and bimetallic Ni/Co nanofiber catalysts was systematically examined under ambient pressure conditions for CO2 hydrogenation reactions. The bimetallic NiCo nanofiber catalysts, specifically with a Ni/Co molar ratio of 1:2, and thermally treated at 1050 °C, demonstrated a high CO selectivity (98.5%) and a marked increase in CO2 conversion rate—up to 16.7 times that of monometallic Ni nanofiber catalyst and 10.8 times that of the monometallic Co nanofiber catalyst. This significant enhancement in catalytic performance is attributed to the improved accessibility of active sites, minimized particle size, and the strong Ni–Co–C interactions within these nanofiber structures. These nanofiber catalysts offer a unique model system that illuminates the fundamental aspects of supported catalysis and accentuates its crucial role in addressing pressing environmental challenges. 
  6. Background and Aim:

    Copper is an essential trace metal serving as a cofactor in innate immunity, metabolism, and iron transport. We hypothesize that copper deficiency may influence survival in patients with cirrhosis through these pathways.

    Methods:

    We performed a retrospective cohort study involving 183 consecutive patients with cirrhosis or portal hypertension. Copper from blood and liver tissues was measured using inductively coupled plasma mass spectrometry. Polar metabolites were measured using nuclear magnetic resonance spectroscopy. Copper deficiency was defined by serum or plasma copper below 80 µg/dL for women or 70 µg/dL for men.

    Results:

    The prevalence of copper deficiency was 17% (N=31). Copper deficiency was associated with younger age, race, zinc and selenium deficiency, and higher infection rates (42% vs. 20%, p=0.01). Serum copper correlated positively with albumin, ceruloplasmin, and hepatic copper, and negatively with IL-1β. Levels of polar metabolites involved in amino acid catabolism, mitochondrial transport of fatty acids, and gut microbial metabolism differed significantly according to copper deficiency status. During a median follow-up of 396 days, mortality was 22.6% in patients with copper deficiency compared with 10.5% in patients without. Liver transplantation rates were similar (32% vs. 30%). Cause-specific competing risk analysis showed that copper deficiency was associated with a significantly higher risk of death before transplantation after adjusting for age, sex, MELD-Na, and Karnofsky score (HR: 3.40, 95% CI: 1.18–9.82, p=0.023).

    Conclusions:

    In advanced cirrhosis, copper deficiency is relatively common and is associated with an increased infection risk, a distinctive metabolic profile, and an increased risk of death before transplantation.

  7. Serving deep learning models from relational databases brings significant benefits. First, features extracted from databases do not need to be transferred to any decoupled deep learning system for inference, and thus the system management overhead can be significantly reduced. Second, in a relational database, data management along the storage hierarchy is fully integrated with query processing, and thus it can continue model serving even if the working set size exceeds the available memory. Applying model deduplication can greatly reduce the storage space, memory footprint, cache misses, and inference latency. However, existing data deduplication techniques are not applicable to deep learning model serving applications in relational databases. They consider neither the impact on model inference accuracy nor the inconsistency between tensor blocks and database pages. This work proposed synergistic storage optimization techniques for duplication detection, page packing, and caching, to enhance database systems for model serving. Evaluation results show that our proposed techniques significantly improved the storage efficiency and the model inference latency, and outperformed existing deep learning frameworks in the targeted scenarios.
  8. A lithium-air battery based on lithium oxide (Li2O) formation can theoretically deliver an energy density that is comparable to that of gasoline. Lithium oxide formation involves a four-electron reaction that is more difficult to achieve than the one- and two-electron reaction processes that result in lithium superoxide (LiO2) and lithium peroxide (Li2O2), respectively. By using a composite polymer electrolyte based on Li10GeP2S12 nanoparticles embedded in a modified polyethylene oxide polymer matrix, we found that Li2O is the main product in a room temperature solid-state lithium-air battery. The battery is rechargeable for 1000 cycles with a low polarization gap and can operate at high rates. The four-electron reaction is enabled by a mixed ion–electron-conducting discharge product and its interface with air.