Title: Cis-regulatory sequences in plants: their importance, discovery, and future challenges
Abstract The identification and characterization of cis-regulatory DNA sequences and how they function to coordinate responses to developmental and environmental cues is of paramount importance to plant biology. Key to these regulatory processes are cis-regulatory modules (CRMs), which include enhancers and silencers. Despite the extraordinary advances in high-quality sequence assemblies and genome annotations, the identification and understanding of CRMs, and how they regulate gene expression, lag significantly behind. This is especially true for their distinguishing characteristics and activity states. Here, we review the current knowledge on CRMs and breakthrough technologies enabling identification, characterization, and validation of CRMs; we compare the genomic distributions of CRMs with respect to their target genes between different plant species, and discuss the role of transposable elements harboring CRMs in the evolution of gene expression. This is an exciting time to study cis-regulomes in plants; however, significant existing challenges need to be overcome to fully understand and appreciate the role of CRMs in plant biology and in crop improvement.
Award ID(s):
1733633 1856627 1856143 2026554
NSF-PAR ID:
10313991
Author(s) / Creator(s):
; ;
Publisher / Repository:
The Plant Cell
Date Published:
Journal Name:
The Plant Cell
ISSN:
1532-298X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. cis-Regulatory elements encode the genomic blueprints that ensure the proper spatiotemporal patterning of gene expression necessary for appropriate development and responses to the environment. Accumulating evidence implicates changes to gene expression as a major source of phenotypic novelty in eukaryotes, including acute phenotypes such as disease and cancer in mammals. Moreover, genetic and epigenetic variation affecting cis-regulatory sequences over longer evolutionary timescales has become a recurring theme in studies of morphological divergence and local adaptation. Here, we discuss the functions of and methods used to identify various classes of cis-regulatory elements, as well as their role in plant development and response to the environment. We highlight opportunities to exploit cis-regulatory variants underlying plant development and environmental responses for crop improvement efforts. Although a comprehensive understanding of cis-regulatory mechanisms in plants has lagged behind that in animals, we showcase several breakthrough findings that have profoundly influenced plant biology and shaped the overall understanding of transcriptional regulation in eukaryotes. 
  2. Abstract Background

    The mouse is probably the most important model organism for studying mammalian biology and human diseases. A better understanding of the mouse genome will help in understanding the human genome, biology, and diseases. However, despite recent progress, the characterization of regulatory sequences in the mouse genome is still far from complete, limiting its use for understanding regulatory sequences in the human genome.

    Results

    Here, by integrating binding peaks in ~9,000 transcription factor (TF) ChIP-seq datasets that cover 79.9% of the mouse mappable genome using an efficient pipeline, we were able to partition these binding peak-covered genome regions into a cis-regulatory module (CRM) candidate (CRMC) set and a non-CRMC set. The CRMCs contain 912,197 putative CRMs and 38,554,729 TF binding site (TFBS) islands, covering 55.5% and 24.4% of the mappable genome, respectively. The CRMCs tend to be under strong evolutionary constraints, indicating that they are likely cis-regulatory, while the non-CRMCs are largely selectively neutral, indicating that they are unlikely to be cis-regulatory. Based on evolutionary profiles of the genome positions, we further estimated that 63.8% and 27.4% of the mouse genome might code for CRMs and TFBSs, respectively.

    Conclusions

    Validation using experimental data suggests that at least most of the CRMCs are authentic. Thus, this unprecedentedly comprehensive map of CRMs and TFBSs can be a good resource to guide experimental studies of regulatory genomes in mice and humans.

     
  3. ABSTRACT The posterior end of the follicular epithelium is patterned by midline (MID) and its paralog H15, the Drosophila homologs of the mammalian Tbx20 transcription factor. We have previously identified two cis-regulatory modules (CRMs) that recapitulate the endogenous pattern of mid in the follicular epithelium. Here, using CRISPR/Cas9 genome editing, we demonstrate redundant activity of these mid CRMs. Although the deletion of either CRM alone generated only a marginal change in mid expression, the deletion of both CRMs reduced expression by 60%. Unexpectedly, the deletion of the 5′ proximal CRM of mid eliminated H15 expression. Interestingly, expression of these paralogs in other tissues remained unaffected in the CRM deletion backgrounds. These results suggest that the paralogs are regulated by a shared CRM that coordinates gene expression during posterior fate determination. The consistent overlapping expression of mid and H15 in various tissues may indicate that the paralogs are also under shared regulation by other CRMs in these tissues.
  4. SUMMARY

    Cis‐regulatory elements (CREs) are important sequences for gene expression and for plant biological processes such as development, evolution, domestication, and stress response. However, studying CREs in plant genomes has been challenging. The totipotent nature of plant cells, coupled with the inability to maintain plant cell types in culture and the inherent technical challenges posed by the cell wall, has limited our understanding of how plant cell types acquire and maintain their identities and respond to the environment via CRE usage. Advances in single‐cell epigenomics have revolutionized the identification of cell‐type‐specific CREs. These new technologies have the potential to significantly advance our understanding of plant CRE biology and shed light on how the regulatory genome gives rise to diverse plant phenomena. However, there are significant biological and computational challenges associated with analyzing single‐cell epigenomic datasets. In this review, we discuss the historical and foundational underpinnings of plant single‐cell research, describe challenges and common pitfalls in the analysis of plant single‐cell epigenomic data, and highlight biological challenges unique to plants. Additionally, we discuss how the application of single‐cell epigenomic data in various contexts stands to transform our understanding of the importance of CREs in plant genomes.

     
  5. Abstract Background

    The genetic information contained in the genome of an organism is organized in genes and regulatory elements that control gene expression. The genomes of multiple plant species have already been sequenced and their gene repertoires have been annotated; however, cis-regulatory elements remain less characterized, limiting our understanding of genome functionality. These elements act as open platforms for recruiting both positive- and negative-acting transcription factors, and as such, chromatin accessibility is an important signature for their identification.

    Results

    In this work, we developed a transgenic INTACT [isolation of nuclei tagged in specific cell types] system in tetraploid wheat for nuclei purification. We then combined the INTACT system with the assay for transposase-accessible chromatin with sequencing [ATAC-seq] to identify open chromatin regions in wheat root tip samples. Our ATAC-seq results showed a large enrichment of open chromatin regions in intergenic and promoter regions, which is expected for regulatory elements and is similar to ATAC-seq results obtained in other plant species. In addition, root ATAC-seq peaks showed a significant overlap with previously published ATAC-seq data from wheat leaf protoplasts, indicating high reproducibility between the two experiments and a large overlap between open chromatin regions in root and leaf tissues. Importantly, we observed overlap between ATAC-seq peaks and cis-regulatory elements that have been functionally validated in wheat, and a good correlation between normalized accessibility and gene expression levels.

    Conclusions

    We have developed and validated an INTACT system in tetraploid wheat that allows rapid and high-quality nuclei purification from root tips. Those nuclei were successfully used to perform ATAC-seq experiments that revealed open chromatin regions in the wheat genome, which will be useful for identifying cis-regulatory elements. The INTACT system presented here will facilitate the development of ATAC-seq datasets in other tissues, growth stages, and under different growing conditions to generate a more complete landscape of the accessible DNA regions in the wheat genome.

     