Title: Comparative Analysis of rRNA Removal Methods for RNA-Seq Differential Expression in Halophilic Archaea
Despite intense recent research interest in archaea, the scientific community has experienced a bottleneck in genome-scale gene expression studies by RNA-seq due to the lack of commercial, specifically designed rRNA depletion kits. The high rRNA:mRNA ratio (80–90%: ~10%) in prokaryotes hampers global transcriptomic analysis. Insufficient ribodepletion results in low sequence coverage of mRNA and therefore requires a substantially higher number of replicate samples and/or sequencing reads to achieve statistically reliable conclusions regarding the significance of differential gene expression between case and control samples. Here, we show that after the discontinuation of the previous version of RiboZero (Illumina, San Diego, CA, USA), which was useful in partially or completely depleting rRNA from archaea, archaeal transcriptomics studies have experienced a slowdown. To overcome this limitation, we analyze the efficiency of four different hybridization-based kits from three different commercial suppliers, each with two sets of sequence-specific probes, in removing rRNA from four different species of halophilic archaea. We conclude that the key to transcriptomic success with the currently available tools is the specificity of the probes for the rRNA sequence during hybridization. With this paper, we provide the archaeal community with insights for selecting certain reagents and strategies over others depending on the archaeal species of interest. These methods yield improved RNA-seq sensitivity and enhanced detection of low-abundance transcripts.
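The link between residual rRNA and required sequencing depth can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the function name, the target of 5 million non-rRNA reads, and the 10% residual rRNA level after depletion are hypothetical values chosen for illustration, and treating every non-rRNA read as mRNA is a simplifying assumption.

```python
def total_reads_needed(target_non_rrna_reads: int, rrna_percent: int) -> int:
    """Estimate the total reads to sequence so that the expected number of
    non-rRNA reads reaches the target.

    Assumes (for illustration only) that every read that is not rRNA is
    usable mRNA. Integer arithmetic is used to avoid floating-point rounding.
    """
    if not 0 <= rrna_percent < 100:
        raise ValueError("rrna_percent must be in the range 0..99")
    usable_percent = 100 - rrna_percent
    numerator = target_non_rrna_reads * 100
    # Ceiling division: round up to the next whole read.
    return (numerator + usable_percent - 1) // usable_percent


if __name__ == "__main__":
    target = 5_000_000  # hypothetical target of non-rRNA reads per sample
    # 90% rRNA corresponds to the undepleted ratio quoted in the abstract;
    # 10% is a hypothetical residual level after depletion.
    for pct in (90, 10):
        print(f"{pct}% rRNA -> {total_reads_needed(target, pct):,} total reads")
```

Under these assumptions, a library that is 90% rRNA needs roughly nine times as many total reads as one with 10% residual rRNA to reach the same number of non-rRNA reads, which is the cost the abstract attributes to insufficient ribodepletion.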
Award ID(s):
1651117 1936024
PAR ID:
10329346
Journal Name:
Biomolecules
Volume:
12
Issue:
5
ISSN:
2218-273X
Page Range / eLocation ID:
682
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background: RNA sequencing is a powerful approach to quantify the genome-wide distribution of mRNA molecules in a population to gain deeper understanding of cellular functions and phenotypes. However, unlike eukaryotic cells, mRNA sequencing of bacterial samples is more challenging due to the absence of a poly-A tail that typically enables efficient capture and enrichment of mRNA from the abundant rRNA molecules in a cell. Moreover, bacterial cells frequently contain 100-fold lower quantities of RNA compared to mammalian cells, which further complicates mRNA sequencing from non-cultivable and non-model bacterial species. To overcome these limitations, we report EMBR-seq (Enrichment of mRNA by Blocked rRNA), a method that efficiently depletes 5S, 16S and 23S rRNA using blocking primers to prevent their amplification. Results: EMBR-seq results in 90% of the sequenced RNA molecules from an E. coli culture deriving from mRNA. We demonstrate that this increased efficiency provides a deeper view of the transcriptome without introducing technical amplification-induced biases. Moreover, compared to recent methods that employ a large array of oligonucleotides to deplete rRNA, EMBR-seq uses a single or a few oligonucleotides per rRNA, thereby making this new technology significantly more cost-effective, especially when applied to varied bacterial species. Finally, compared to existing commercial kits for bacterial rRNA depletion, we show that EMBR-seq can be used to successfully quantify the transcriptome from more than 500-fold lower starting total RNA. Conclusions: EMBR-seq provides an efficient and cost-effective approach to quantify global gene expression profiles from low input bacterial samples.
  2. Abstract The gene expression landscape across different tissues and developmental stages reflects their biological functions and evolutionary patterns. Integrative and comprehensive analyses of all transcriptomic data in an organism are instrumental to obtaining a comprehensive picture of gene expression landscape. Such studies are still very limited in sorghum, which limits the discovery of the genetic basis underlying complex agricultural traits in sorghum. We characterized the genome‐wide expression landscape for sorghum using 873 RNA‐sequencing (RNA‐seq) datasets representing 19 tissues. Our integrative analysis of these RNA‐seq data provides the most comprehensive transcriptomic atlas for sorghum, which will be valuable for the sorghum research community for functional characterizations of sorghum genes. Based on the transcriptome atlas, we identified 595 housekeeping genes (HKGs) and 2080 tissue‐specific expression genes (TEGs) for the 19 tissues. We identified different gene features between HKGs and TEGs, and we found that HKGs have experienced stronger selective constraints than TEGs. Furthermore, we built a transcriptome‐wide co‐expression network (TW‐CEN) comprising 35 modules with each module enriched in specific Gene Ontology terms. High‐connectivity genes in TW‐CEN tend to express at high levels while undergoing intensive selective pressure. We also built global and seed‐preferential co‐expression networks of starch synthesis pathways, which indicated that photosynthesis and microtubule‐based movement play important roles in starch synthesis. The global transcriptome atlas of sorghum generated by this study provides an important functional genomics resource for trait discovery and insight into starch synthesis regulation in sorghum. 
  3. Corals from the northern Red Sea and Gulf of Aqaba exhibit extreme thermal tolerance. To examine the underlying gene expression dynamics, we exposed Stylophora pistillata from the Gulf of Aqaba to short-term (hours) and long-term (weeks) heat stress with peak seawater temperatures ranging from their maximum monthly mean of 27 °C (baseline) to 29.5 °C, 32 °C, and 34.5 °C. Corals were sampled at the end of the heat stress as well as after a recovery period at baseline temperature. Changes in coral host and symbiotic algal gene expression were determined via RNA-sequencing (RNA-Seq). Shifts in coral microbiome composition were detected by complementary DNA (cDNA)-based 16S ribosomal RNA (rRNA) gene sequencing. In all experiments up to 32 °C, RNA-Seq revealed fast and pervasive changes in gene expression, primarily in the coral host, followed by a return to baseline gene expression for the majority of coral (>94%) and algal (>71%) genes during recovery. At 34.5 °C, large differences in gene expression were observed with minimal recovery, high coral mortality, and a microbiome dominated by opportunistic bacteria (including Vibrio species), indicating that a lethal temperature threshold had been crossed. Our results show that the S. pistillata holobiont can mount a rapid and pervasive gene expression response contingent on the amplitude and duration of the thermal stress. We propose that the transcriptomic resilience and transcriptomic acclimation observed are key to the extraordinary thermal tolerance of this holobiont and, by inference, of other northern Red Sea coral holobionts, up to seawater temperatures of at least 32 °C, that is, 5 °C above their current maximum monthly mean.
  4. Abstract Background: Quantification of gene expression from RNA-seq data is a prerequisite for transcriptome analysis such as differential gene expression analysis and gene co-expression network construction. Individual RNA-seq experiments are growing larger, and combining multiple experiments from sequence repositories can result in datasets with thousands of samples. Processing hundreds to thousands of RNA-seq samples can result in challenges related to data management, access to sufficient computational resources, navigation of high-performance computing (HPC) systems, installation of required software dependencies, and reproducibility. Processing of larger and deeper RNA-seq experiments will become more common as sequencing technology matures. Results: GEMmaker is an nf-core-compliant Nextflow workflow that quantifies gene expression from small to massive RNA-seq datasets. GEMmaker ensures results are highly reproducible through the use of versioned containerized software that can be executed on a single workstation, institutional compute cluster, Kubernetes platform or the cloud. GEMmaker supports popular alignment and quantification tools, providing results in raw and normalized formats. GEMmaker is unique in that it can scale to process thousands of locally or remotely stored samples without exceeding available data storage. Conclusions: Workflows that quantify gene expression are not new, and many already address issues of portability, reusability, and scale in terms of access to CPUs. GEMmaker provides these benefits and adds the ability to scale despite low data storage infrastructure. This allows users to process hundreds to thousands of RNA-seq samples even when data storage resources are limited. GEMmaker is freely available and fully documented with step-by-step setup and execution instructions.
  5. Komeili, Arash (Ed.)
    ABSTRACT Histone proteins are found across diverse lineages of Archaea, many of which package DNA and form chromatin. However, previous research has led to the hypothesis that the histone-like proteins of high-salt-adapted archaea, or halophiles, function differently. The sole histone protein encoded by the model halophilic species Halobacterium salinarum, HpyA, is nonessential and expressed at levels too low to enable genome-wide DNA packaging. Instead, HpyA mediates the transcriptional response to salt stress. Here we compare the features of genome-wide binding of HpyA to those of HstA, the sole histone of another model halophile, Haloferax volcanii. hstA, like hpyA, is a nonessential gene. To better understand HpyA and HstA functions, protein-DNA binding data (chromatin immunoprecipitation sequencing [ChIP-seq]) of these halophilic histones are compared to publicly available ChIP-seq data from DNA binding proteins across all domains of life, including transcription factors (TFs), nucleoid-associated proteins (NAPs), and histones. These analyses demonstrate that HpyA and HstA bind the genome infrequently in discrete regions, which is similar to TFs but unlike NAPs, which bind a much larger genomic fraction. However, unlike TFs that typically bind in intergenic regions, HpyA and HstA binding sites are located in both coding and intergenic regions. The genome-wide dinucleotide periodicity known to facilitate histone binding was undetectable in the genomes of both species. Instead, TF-like and histone-like binding sequence preferences were detected for HstA and HpyA, respectively. Taken together, these data suggest that halophilic archaeal histones are unlikely to facilitate genome-wide chromatin formation and that their function defies categorization as a TF, NAP, or histone. IMPORTANCE Most cells in eukaryotic species, from yeast to humans, possess histone proteins that pack and unpack DNA in response to environmental cues. These essential proteins regulate genes necessary for important cellular processes, including development and stress protection. Although the histone fold domain originated in the domain of life Archaea, the function of archaeal histone-like proteins is not well understood relative to those of eukaryotes. We recently discovered that, unlike histones of eukaryotes, histones in hypersaline-adapted archaeal species do not package DNA and can act as transcription factors (TFs) to regulate stress response gene expression. However, the function of histones across species of hypersaline-adapted archaea still remains unclear. Here, we compare hypersaline histone function to a variety of DNA binding proteins across the tree of life, revealing histone-like behavior in some respects and specific transcriptional regulatory function in others.