Title: Cis-regulatory sequences in plants: their importance, discovery, and future challenges
Abstract The identification and characterization of cis-regulatory DNA sequences and how they function to coordinate responses to developmental and environmental cues is of paramount importance to plant biology. Key to these regulatory processes are cis-regulatory modules (CRMs), which include enhancers and silencers. Despite the extraordinary advances in high-quality sequence assemblies and genome annotations, the identification and understanding of CRMs, and how they regulate gene expression, lag significantly behind. This is especially true for their distinguishing characteristics and activity states. Here, we review the current knowledge on CRMs and breakthrough technologies enabling identification, characterization, and validation of CRMs; we compare the genomic distributions of CRMs with respect to their target genes between different plant species, and discuss the role of transposable elements harboring CRMs in the evolution of gene expression. This is an exciting time to study cis-regulomes in plants; however, significant existing challenges need to be overcome to fully understand and appreciate the role of CRMs in plant biology and in crop improvement.
Award ID(s):
1733633 1856627 1856143 2026554
Publication Date:
Journal Name:
The Plant Cell
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background

    Mouse is probably the most important model organism to study mammal biology and human diseases. A better understanding of the mouse genome will help understand the human genome, biology and diseases. However, despite the recent progress, the characterization of the regulatory sequences in the mouse genome is still far from complete, limiting its use to understand the regulatory sequences in the human genome.


    Here, by integrating binding peaks in ~9,000 transcription factor (TF) ChIP-seq datasets that cover 79.9% of the mouse mappable genome using an efficient pipeline, we were able to partition these binding peak-covered genome regions into a cis-regulatory module (CRM) candidate (CRMC) set and a non-CRMC set. The CRMCs contain 912,197 putative CRMs and 38,554,729 TF binding site (TFBS) islands, covering 55.5% and 24.4% of the mappable genome, respectively. The CRMCs tend to be under strong evolutionary constraints, indicating that they are likely cis-regulatory, while the non-CRMCs are largely selectively neutral, indicating that they are unlikely cis-regulatory. Based on evolutionary profiles of the genome positions, we further estimated that 63.8% and 27.4% of the mouse genome might code for CRMs and TFBSs, respectively.


    Validation using experimental data suggests that at least most of the CRMCs are authentic. Thus, this unprecedentedly comprehensive map of CRMs and TFBSs can be a good resource to guide experimental studies of regulatory genomes in mice and humans.
  2. The evolution of transcriptional regulatory mechanisms is central to how stress response and tolerance differ between species. However, it remains largely unknown how divergence in cis-regulatory sites and, subsequently, transcription factor (TF) binding specificity contribute to stress-responsive expression divergence, particularly between wild and domesticated species. By profiling wound-responsive gene transcriptomes in wild Solanum pennellii and domesticated S. lycopersicum, we found extensive wound-response divergence and identified 493 S. lycopersicum and 278 S. pennellii putative cis-regulatory elements (pCREs) that were predictive of wound-responsive gene expression. Only 24-52% of these wound-response pCREs (depending on wound-response patterns) were consistently enriched in the putative promoter regions of wound-responsive genes across species. In addition, between these two species, their differences in pCRE site sequences were significantly and positively correlated with differences in wound-responsive gene expression. Furthermore, ~11-39% of pCREs were specific to only one of the species and likely bound by TFs from different families. These findings indicate substantial regulatory divergence in these two plant species that diverged ~3-7 million years ago. Our study provides insights into the mechanistic basis of how the transcriptional response to wounding is regulated and, importantly, the contribution of cis-regulatory components to variation in wound-responsive gene expression between a wild and a domesticated plant species.
  3. Purugganan, Michael (Ed.)
    Abstract The field of evolutionary developmental biology can help address how morphological novelties evolve, a key question in evolutionary biology. In Arabidopsis thaliana, APETALA2 (AP2) plays a role in the development of key plant innovations including seeds, flowers, and fruits. AP2 belongs to the AP2/ETHYLENE RESPONSIVE ELEMENT BINDING FACTOR family which has members in all viridiplantae, making it one of the oldest and most diverse gene lineages. One key subclade, present across vascular plants is the euAPETALA2 (euAP2) clade, whose founding member is AP2. We reconstructed the evolution of the euAP2 gene lineage in vascular plants to better understand its impact on the morphological evolution of plants, identifying seven major duplication events. We also performed spatiotemporal expression analyses of euAP2/TOE3 genes focusing on less explored vascular plant lineages, including ferns, gymnosperms, early diverging angiosperms and early diverging eudicots. Altogether, our data suggest that euAP2 genes originally contributed to spore and sporangium development, and were subsequently recruited to ovule, fruit and floral organ development. Finally, euAP2 protein sequences are highly conserved; therefore, changes in the role of euAP2 homologs during development are most likely due to changes in regulatory regions.
  4. Storz, Gisela (Ed.)
    ABSTRACT Mutations in regulatory mechanisms that control gene expression contribute to phenotypic diversity and thus facilitate the adaptation of microbes and other organisms to new niches. Comparative genomics can be used to infer rewiring of regulatory architecture based on large effect mutations like loss or acquisition of transcription factors but may be insufficient to identify small changes in noncoding, intergenic DNA sequence of regulatory elements that drive phenotypic divergence. In human-derived Vibrio cholerae, the response to distinct chemical cues triggers production of multiple transcription factors that can regulate the type VI secretion system (T6), a broadly distributed weapon for interbacterial competition. However, to date, the signaling network remains poorly understood because no regulatory element has been identified for the major T6 locus. Here we identify a conserved cis-acting single nucleotide polymorphism (SNP) controlling T6 transcription and activity. Sequence alignment of the T6 regulatory region from diverse V. cholerae strains revealed conservation of the SNP that we rewired to interconvert V. cholerae T6 activity between chitin-inducible and constitutive states. This study supports a model of pathogen evolution through a noncoding cis-regulatory mutation and preexisting, active transcription factors that confers a different fitness advantage to tightly regulated strains inside a human host and unfettered strains adapted to environmental niches. IMPORTANCE Organisms sense external cues with regulatory circuits that trigger the production of transcription factors, which bind specific DNA sequences at promoters (“cis” regulatory elements) to activate target genes. Mutations of transcription factors or their regulatory elements create phenotypic diversity, allowing exploitation of new niches. Waterborne pathogen Vibrio cholerae encodes the type VI secretion system “nanoweapon” to kill competitor cells when activated. Despite identification of several transcription factors, no regulatory element has been identified in the promoter of the major type VI locus, to date. Combining phenotypic, genetic, and genomic analysis of diverse V. cholerae strains, we discovered a single nucleotide polymorphism in the type VI promoter that switches its killing activity between a constitutive state beneficial outside hosts and an inducible state for constraint in a host. Our results support a role for noncoding DNA in adaptation of this pathogen.
  5. Abstract More accurate and more complete predictions of cis-regulatory modules (CRMs) and constituent transcription factor (TF) binding sites (TFBSs) in genomes can facilitate characterizing functions of regulatory sequences. Here, we developed a database predicted cis-regulatory modules (PCRMS) ( that stores highly accurate and unprecedentedly complete maps of predicted CRMs and TFBSs in the human and mouse genomes. The web interface allows the user to browse CRMs and TFBSs in an organism, find the closest CRMs to a gene, search CRMs around a gene and find all TFBSs of a TF. PCRMS can be a useful resource for the research community to characterize regulatory genomes. Database URL: