


Creators/Authors contains: "Fang, Hong"


  1. Free, publicly-accessible full text available July 1, 2024
  2. Abstract

    For medical and health applications of next-generation sequencing technologies such as RNA-seq, choosing proper analysis methods for biomarker identification remains a critical challenge for most users. The US Food and Drug Administration (FDA) has led the Sequencing Quality Control (SEQC) project to conduct a comprehensive investigation of 278 representative RNA-seq data analysis pipelines consisting of 13 sequence mapping, three quantification, and seven normalization methods. In this article, we focused on the joint effects of RNA-seq pipeline components on gene expression estimation and on the downstream prediction of disease outcomes. First, we developed and applied three metrics (accuracy, precision, and reliability) to quantitatively evaluate each pipeline's performance on gene expression estimation. We then investigated the correlation between the proposed metrics and downstream prediction performance using two real-world cancer datasets (the SEQC neuroblastoma dataset and the NIH/NCI TCGA lung adenocarcinoma dataset). We found that RNA-seq pipeline components jointly and significantly affected the accuracy of gene expression estimation, and that this effect extended to the downstream prediction of these cancer outcomes. Specifically, RNA-seq pipelines that produced more accurate, precise, and reliable gene expression estimates tended to perform better in predicting disease outcome. Finally, we provided scenarios as guidelines for using these three metrics to select sensible RNA-seq pipelines that improve the accuracy, precision, and reliability of gene expression estimation, which in turn improves downstream gene expression-based prediction of disease outcome.

     
  3. Abstract

    Reconstructing the chemical and structural characteristics of the plant cell wall represents a promising solution to overcoming lignocellulosic biomass recalcitrance to biochemical deconstruction. This study aims to leverage hydroxyproline (Hyp)‐O‐glycosylation, a process unique to plant cell wall glycoproteins, as an innovative technology for de novo design and engineering in planta of Hyp‐O‐glycosylated biopolymers (HypGP) that facilitate plant cell wall reconstruction. A HypGP consisting of 18 tandem repeats of the “Ser–Hyp–Hyp–Hyp–Hyp” motif, or (SP4)18, was designed and engineered into tobacco plants as a fusion peptide with either a reporter protein, enhanced green fluorescent protein, or the catalytic domain of a thermophilic E1 endoglucanase (E1cd) from Acidothermus cellulolyticus. The engineered (SP4)18 module was extensively Hyp‐O‐glycosylated with arabino‐oligosaccharides, which facilitated the deposition of the fused protein/enzyme in the cell wall matrix and improved the accumulation of the protein/enzyme in planta by 1.5–11‐fold. The enzyme activity of the recombinant E1cd was not affected by the fused (SP4)18 module, showing an optimal temperature of 80°C and an optimal pH between 5 and 8. Plant biomass engineered with the (SP4)18‐tagged protein/enzyme showed up to a 3.5‐fold increase in saccharification efficiency, without adverse impact on plant growth.

     