
Title: Jumper enables discontinuous transcript assembly in coronaviruses

Genes in SARS-CoV-2 and other viruses in the order of Nidovirales are expressed by a process of discontinuous transcription which is distinct from alternative splicing in eukaryotes and is mediated by the viral RNA-dependent RNA polymerase. Here, we introduce the DISCONTINUOUS TRANSCRIPT ASSEMBLY problem of finding transcripts and their abundances given an alignment of paired-end short reads under a maximum likelihood model that accounts for varying transcript lengths. We show, using simulations, that our method, JUMPER, outperforms existing methods for classical transcript assembly. On short-read data of SARS-CoV-1, SARS-CoV-2 and MERS-CoV samples, we find that JUMPER not only identifies canonical transcripts that are part of the reference transcriptome, but also predicts expression of non-canonical transcripts that are supported by subsequent orthogonal analyses. Moreover, application of JUMPER on samples with and without treatment reveals viral drug response at the transcript level. As such, JUMPER enables detailed analyses of Nidovirales transcriptomes under varying conditions.
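To illustrate what a length-aware maximum likelihood model for transcript abundances can look like, the following is a minimal sketch of a generic expectation-maximization estimator. It is not JUMPER's algorithm, which also has to find the transcripts themselves. The sketch assumes the candidate transcripts are already known and that each read is compatible with a known set of them. The function name `estimate_abundances` and its inputs are hypothetical.

```python
def estimate_abundances(compat, lengths, n_iter=200):
    """Generic EM sketch (an assumption for illustration, not JUMPER itself).

    compat  -- list of non-empty sets; compat[r] holds the indices of the
               transcripts that read r could have originated from
    lengths -- list of transcript lengths (same indexing)
    Returns the estimated relative abundance of each transcript.
    """
    if any(len(ts) == 0 for ts in compat):
        raise ValueError("every read needs at least one compatible transcript")

    n_t = len(lengths)
    # alpha[t]: fraction of reads generated by transcript t.
    alpha = [1.0 / n_t] * n_t

    for _ in range(n_iter):
        counts = [0.0] * n_t
        # E-step: split each read among its compatible transcripts.
        # A read from transcript t lands at a given position with
        # probability proportional to alpha[t] / lengths[t].
        for ts in compat:
            weights = {t: alpha[t] / lengths[t] for t in ts}
            total = sum(weights.values())
            for t, w in weights.items():
                counts[t] += w / total
        # M-step: read fractions are the normalized expected counts.
        alpha = [c / len(compat) for c in counts]

    # Longer transcripts yield more reads per molecule, so divide by
    # length to convert read fractions into relative abundances.
    per_length = [a / l for a, l in zip(alpha, lengths)]
    z = sum(per_length)
    return [p / z for p in per_length]
```

In this sketch, 100 reads unique to a 1000-nucleotide transcript and 200 reads unique to a 2000-nucleotide transcript yield equal estimated abundances, because the longer transcript is expected to produce twice as many reads per molecule.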

Award ID(s):
2027669 2046488 1850502 1652815
Publication Date:
Journal Name:
Nature Communications
Nature Publishing Group
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background

    SARS-CoV-2 is an RNA virus responsible for the coronavirus disease 2019 (COVID-19) pandemic. Viruses exist in complex microbial environments, and recent studies have revealed both synergistic and antagonistic effects of specific bacterial taxa on viral prevalence and infectivity. We set out to test whether specific bacterial communities predict SARS-CoV-2 occurrence in a hospital setting.


    We collected 972 samples from hospitalized patients with COVID-19, their health care providers, and hospital surfaces before, during, and after admission. We screened for SARS-CoV-2 using RT-qPCR, characterized microbial communities using 16S rRNA gene amplicon sequencing, and used these bacterial profiles to classify SARS-CoV-2 RNA detection with a random forest model.


    Sixteen percent of surfaces from COVID-19 patient rooms had detectable SARS-CoV-2 RNA, although infectivity was not assessed. The highest prevalence was in floor samples next to patient beds (39%) and directly outside their rooms (29%). Although bed rail samples more closely resembled the patient microbiome compared to floor samples, SARS-CoV-2 RNA was detected less often in bed rail samples (11%). SARS-CoV-2 positive samples had higher bacterial phylogenetic diversity in both human and surface samples and higher biomass in floor samples. 16S microbial community profiles enabled high classifier accuracy for SARS-CoV-2 status in not only nares, but also forehead, stool, and floor samples. Across these distinct microbial profiles, a single amplicon sequence variant from the genus Rothia strongly predicted SARS-CoV-2 presence across sample types, with greater prevalence in positive surface and human samples, even when compared to samples from patients in other intensive care units prior to the COVID-19 pandemic.


    These results contextualize the vast diversity of microbial niches where SARS-CoV-2 RNA is detected and identify specific bacterial taxa that associate with the viral RNA prevalence both in the host and hospital environment.

  2. Abstract

    Wastewater surveillance has proven to be an effective tool to monitor the transmission and emergence of infectious agents at a community scale. Workflows for wastewater surveillance generally rely on concentration steps to increase the probability of detection of low-abundance targets, but preconcentration can substantially increase the time and cost of analyses while also introducing additional loss of target during processing. To address some of these issues, we conducted a longitudinal study implementing a simplified workflow for SARS-CoV-2 detection from wastewater, using a direct column-based extraction approach. Composite influent wastewater samples were collected weekly for 1 year between June 2020 and June 2021 in Athens-Clarke County, Georgia, USA. Bypassing any concentration step, low volumes (280 µl) of influent wastewater were extracted using a commercial kit, and immediately analyzed by RT-qPCR for the SARS-CoV-2 N1 and N2 gene targets. SARS-CoV-2 viral RNA was detected in 76% (193/254) of influent samples, and the recovery of the surrogate bovine coronavirus was 42% (IQR: 28%, 59%). N1 and N2 assay positivity, viral concentration, and flow-adjusted daily viral load correlated significantly with per-capita case reports of COVID-19 at the county-level (ρ = 0.69–0.82). To compensate for the method’s high limit of detection (approximately 10⁶–10⁷ copies l⁻¹ in wastewater), we extracted multiple small-volume replicates of each wastewater sample. With this approach, we detected as few as five cases of COVID-19 per 100 000 individuals. These results indicate that a direct-extraction-based workflow for SARS-CoV-2 wastewater surveillance can provide informative and actionable results.

  3. Abstract

    The ongoing COVID-19 pandemic highlights the necessity for a more fundamental understanding of the coronavirus life cycle. The causative agent of the disease, SARS-CoV-2, is being studied extensively from a structural standpoint in order to gain insight into key molecular mechanisms required for its survival. Contained within the untranslated regions of the SARS-CoV-2 genome are various conserved stem-loop elements that are believed to function in RNA replication, viral protein translation, and discontinuous transcription. While the majority of these regions are variable in sequence, a 41-nucleotide s2m element within the genome 3′ untranslated region is highly conserved among coronaviruses and three other viral families. In this study, we demonstrate that the SARS-CoV-2 s2m element dimerizes by forming an intermediate homodimeric kissing complex structure that is subsequently converted to a thermodynamically stable duplex conformation. This process is aided by the viral nucleocapsid protein, potentially indicating a role in mediating genome dimerization. Furthermore, we demonstrate that the s2m element interacts with multiple copies of host cellular microRNA (miRNA) 1307-3p. Taken together, our results highlight the potential significance of the dimer structures formed by the s2m element in key biological processes and implicate the motif as a possible therapeutic drug target for COVID-19 and other coronavirus-related diseases.

  4. Abstract

    Long-range ribonucleic acid (RNA)–RNA interactions (RRI) are prevalent in positive-strand RNA viruses, including Beta-coronaviruses, and these take part in regulatory roles, including the regulation of sub-genomic RNA production rates. Crosslinking of interacting RNAs and short read-based deep sequencing of resulting RNA–RNA hybrids have shown that these long-range structures exist in severe acute respiratory syndrome coronavirus (SARS-CoV)-2 on both genomic and sub-genomic levels and in dynamic topologies. Furthermore, co-evolution of coronaviruses with their hosts is navigated by genetic variations made possible by its large genome, high recombination frequency and a high mutation rate. SARS-CoV-2’s mutations are known to occur spontaneously during replication, and thousands of aggregate mutations have been reported since the emergence of the virus. Although many long-range RRIs have been experimentally identified using high-throughput methods for the wild-type SARS-CoV-2 strain, evolutionary trajectory of these RRIs across variants, impact of mutations on RRIs and interaction of SARS-CoV-2 RNAs with the host have been largely open questions in the field. In this review, we summarize recent computational tools and experimental methods that have been enabling the mapping of RRIs in viral genomes, with a specific focus on SARS-CoV-2. We also present available informatics resources to navigate the RRI maps and shed light on the impact of mutations on the RRI space in viral genomes. Investigating the evolution of long-range RNA interactions and that of virus–host interactions can contribute to the understanding of new and emerging variants as well as aid in developing improved RNA therapeutics critical for combating future outbreaks.

  5. Abstract Background

    Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Delta variant has caused a dramatic resurgence in infections in the United States, raising questions regarding potential transmissibility among vaccinated individuals.


    Between October 2020 and July 2021, we sequenced 4439 SARS-CoV-2 full genomes, 23% of all known infections in Alachua County, Florida, including 109 vaccine breakthrough cases. Univariate and multivariate regression analyses were conducted to evaluate associations between viral RNA burden and patient characteristics. Contact tracing and phylogenetic analysis were used to investigate direct transmissions involving vaccinated individuals.


    The majority of breakthrough sequences with lineage assignment were classified as Delta variants (74.6%) and occurred, on average, about 3 months (104 ± 57.5 days) after full vaccination, at the same time (June-July 2021) of Delta variant exponential spread within the county. Six Delta variant transmission pairs between fully vaccinated individuals were identified through contact tracing, 3 of which were confirmed by phylogenetic analysis. Delta breakthroughs exhibited broad viral RNA copy number values during acute infection (interquartile range, 1.2-8.64 Log copies/mL), on average 38% lower than matched unvaccinated patients (3.29-10.81 Log copies/mL, P < .00001). Nevertheless, 49% to 50% of all breakthroughs, and 56% to 60% of Delta-infected breakthroughs exhibited viral RNA levels above the transmissibility threshold (4 Log copies/mL) irrespective of time after vaccination.


    Delta infection transmissibility and general viral RNA quantification patterns in vaccinated individuals suggest limited levels of sterilizing immunity that need to be considered by public health policies. In particular, ongoing evaluation of vaccine boosters should specifically address whether extra vaccine doses curb breakthrough contribution to epidemic spread.
