Title: YeasTSS: an integrative web database of yeast transcription start sites
Abstract The transcription initiation landscape of eukaryotic genes is complex and highly dynamic. In eukaryotes, genes can generate multiple transcript variants that differ in their 5′ boundaries owing to the use of alternative transcription start sites (TSSs), and the abundance of transcript isoforms is highly variable. Because of the large number and complexity of TSSs, it is not feasible to depict the details of the transcription initiation landscape of all genes using text-format genome annotation files. Therefore, visualization of TSS data is needed to represent quantitative TSS maps and core promoters (CPs). In addition, the selection and activity of TSSs are influenced by various factors, such as transcription factors, chromatin remodeling and histone modifications. Thus, integration and visualization of functional genomic data related to these features could provide a better understanding of gene promoter architecture and the regulatory mechanisms of transcription initiation. Yeast species play important roles in research and human society, yet no database provides visualization and integration of functional genomic data in yeast. Here, we generated quantitative TSS maps for 12 important yeast species, inferred their CPs and built a public database, YeasTSS (www.yeastss.org). YeasTSS was designed as a central portal for visualization and integration of the TSS maps, CPs and functional genomic data related to transcription initiation in yeast. YeasTSS is expected to benefit the research community and public education by supporting improved genome annotation, studies of promoter structure and the regulated control of transcription initiation, and inference of gene regulatory networks.
Award ID(s):
1566292
NSF-PAR ID:
10107141
Author(s) / Creator(s):
Date Published:
Journal Name:
Database
Volume:
2019
ISSN:
1758-0463
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Transcription initiation is regulated in a highly organized fashion to ensure proper cellular function. Accurate identification of transcription start sites (TSSs) and quantitative characterization of transcription initiation activity are fundamental steps in studies of regulated transcription and core promoter structure. Several high-throughput techniques have been developed to sequence the very 5′ end of RNA transcripts (TSS sequencing) on a genome scale. Bioinformatics tools are essential for processing, analyzing, and visualizing TSS sequencing data. Here, we present TSSr, an R package that provides a rich set of functions for mapping TSSs and characterizing the structures and activities of core promoters based on all types of TSS sequencing data. Specifically, TSSr implements several newly developed algorithms for accurately identifying TSSs from mapped sequencing reads and for inferring core promoters, both of which are prerequisites for subsequent functional analyses of TSS data. Furthermore, TSSr enables users to export various types of TSS data that can be visualized in a genome browser to inspect promoter activities in association with other genomic features, and to generate publication-ready TSS graphs. These user-friendly features could greatly facilitate studies of transcription initiation based on TSS sequencing data. The source code and detailed documentation of TSSr can be freely accessed at https://github.com/Linlab-slu/TSSr.

     
  2. Regulation of gene expression starts with transcription initiation. Regulated transcription initiation is critical for generating correct transcripts with proper abundance. The impact of epigenetic control, such as histone modifications and chromatin remodelling, on gene regulation has been extensively investigated, but its specific role in regulating transcription initiation is far from well understood. Here we aimed to better understand the roles of genes involved in histone H3 methylation and chromatin remodelling in the regulation of transcription initiation at genome scale, using budding yeast as a study system. Using nAnT-iCAGE, we obtained maps of transcription start sites (TSSs) at single-nucleotide resolution for a strain depleted of MINC (Mot1-Ino80C-Nc2) by Mot1p and Ino80p anchor-away (Mot1&Ino80AA) and for a strain lacking histone methylation (set1Δset2Δdot1Δ), and compared them with those of their wild-type controls. Our study showed that the depletion of MINC stimulated transcription initiation from many new sites flanking the dominant TSS of genes, while the loss of histone methylation generated more TSSs in coding regions. Moreover, the depletion of MINC led to less confined boundaries of TSS clusters (TCs) and resulted in broader core promoters; such patterns were not present in the ssdΔ mutant. Our data also show that MINC has distinct impacts on TATA-containing and TATA-less promoters. In conclusion, our study shows that MINC is required for accurate identification of bona fide TSSs, particularly in TATA-containing promoters, and that histone methylation contributes to the repression of transcription initiation in coding regions.
  3. Abstract

    Promoters and the noncoding sequences that drive their function are fundamental aspects of genes that are critical to their regulation. The transcription preinitiation complex binds and assembles on promoters where it facilitates transcription. The transcription start site (TSS) is located downstream of the promoter sequence and is defined as the location in the genome where polymerase begins transcribing DNA into RNA. Knowing the location of TSSs is useful for annotation of genes, identification of non‐coding sequences important to gene regulation, detection of alternative TSSs, and understanding of 5′ UTR content. Several existing techniques make it possible to accurately identify TSSs, but they are often difficult to perform experimentally, require large amounts of input RNA, or are unable to identify a large number of TSSs from a single sample. Many of these protocols take advantage of template switching reverse transcriptases (TSRTs), which reliably place an adaptor at the 5′ end of a first strand synthesis of cDNA. Here, we introduce a protocol that exploits TSRT activity combined with rolling circle amplification to identify TSSs with several unique advantages over existing methods. Sequence adaptors are placed on the 5′ and 3′ ends of the full‐length cDNA copy of a transcript. A splint compatible with those adaptors is then used to circularize the full‐length cDNA. Linear DNA containing concatemers of the cDNA is generated using rolling circle amplification, and a sequencing library is formed by fragmenting the concatemers. This protocol is straightforward to execute, requiring limited bench time with relatively stable reagents. Using extremely low amounts of RNA input, this protocol produces large numbers of accurate, deduplicated TSSs genome wide. © 2023 The Authors. Current Protocols published by Wiley Periodicals LLC.

    Basic Protocol 1: Splint generation

    Basic Protocol 2: RNA extraction

    Basic Protocol 3: cDNA synthesis

    Basic Protocol 4: cDNA circularization and amplification

    Basic Protocol 5: Library generation

     
  4. Abstract Motivation: MicroRNAs (miRNAs) are small noncoding RNAs that play important roles in gene regulation and phenotype development. The identification of miRNA transcription start sites (TSSs) is critical to understanding the functional roles of miRNA genes and their transcriptional regulation. Unlike those of protein-coding genes, miRNA TSSs are not directly detectable from conventional RNA-Seq experiments because of the miRNA-specific biogenesis process. In the past decade, large-scale genome-wide TSS-Seq and transcription activation marker profiling data have become available, and many computational methods have been developed based on them. These methods have greatly advanced genome-wide miRNA TSS annotation. Results: In this study, we summarized recent computational methods and their results on miRNA TSS annotation. We collected and performed a comparative analysis of miRNA TSS annotations from 14 representative studies. We further compiled a robust set of miRNA TSSs (RSmirT) that are supported by multiple studies. Integrative genomic and epigenomic data analysis on RSmirT revealed the genomic and epigenomic features of miRNA TSSs as well as their relations to protein-coding and long non-coding genes. Contact: xiaoman@mail.ucf.edu, haihu@cs.ucf.edu
  5. Regulation of gene expression is a fundamental biological process that relies on transcription factors (TFs) recognizing specific cis motifs in the regulatory regions of the genes that they control. In most eukaryotic organisms, cis-regulatory elements are significantly enriched around the transcription start site (TSS). However, unlike other genic features, TSSs need to be determined experimentally, after which they become important components of genome annotations. One of the methods for experimentally determining TSSs at the genome-wide level is CAGE (cap analysis of gene expression). This chapter describes how to prepare a CAGE library for sequencing, starting with RNA extraction, library construction, and quality controls before proceeding to sequencing on the Illumina platform. We then describe how to use a computational pipeline to determine, from the alignment of CAGE tags, the genome-wide location of TSSs, followed by the statistical approaches required to cluster TSSs that operate as transcriptional units and to determine core promoter properties such as shape. The analyses described here focus on maize, since its large and yet deficiently annotated genome creates some unique challenges, but with some modifications they can be easily adapted for other organisms as well.