This content will become publicly available on January 10, 2023

Title: Comprehensive Genomic Discovery of Non-Coding Transcriptional Enhancers in the African Malaria Vector Anopheles coluzzii
Almost all regulation of gene expression in eukaryotic genomes is mediated by the action of distant non-coding transcriptional enhancers upon proximal gene promoters. Enhancer locations cannot be accurately predicted bioinformatically because of the absence of a defined sequence code, and thus functional assays are required for their direct detection. Here we used a massively parallel reporter assay, Self-Transcribing Active Regulatory Region sequencing (STARR-seq), to generate the first comprehensive genome-wide map of enhancers in Anopheles coluzzii, a major African malaria vector in the Gambiae species complex. The screen was carried out by transfecting reporter libraries created from the genomic DNA of 60 wild A. coluzzii from Burkina Faso into A. coluzzii 4a3A cells, in order to functionally query enhancer activity of the natural population within the homologous cellular context. We report a catalog of 3,288 active genomic enhancers that were significant across three biological replicates, 74% of them located in intergenic and intronic regions. The STARR-seq enhancer screen is chromatin-free and thus detects inherent activity of a comprehensive catalog of enhancers that may be restricted in vivo to specific cell types or developmental stages. Testing of a validation panel of enhancer candidates using manual luciferase assays confirmed enhancer function in 26 of 28 (93%) of the candidates over a wide dynamic range of activity from two to at least 16-fold activity above baseline. The enhancers occupy only 0.7% of the genome, and display distinct composition features. The enhancer compartment is significantly enriched for 15 transcription factor binding site signatures, and displays divergence for specific dinucleotide repeats, as compared to matched non-enhancer genomic controls. The genome-wide catalog of A. coluzzii enhancers is publicly available in a simple searchable graphic format.
This enhancer catalogue will be valuable in linking genetic and phenotypic variation, in identifying regulatory elements that could be employed in vector manipulation, and in better targeting of chromosome editing to minimize extraneous regulatory influences on the introduced sequences. Importance: Understanding the role of the non-coding regulatory genome in complex disease phenotypes is essential, but even in well-characterized model organisms, identification of regulatory regions within the vast non-coding genome remains a challenge. We used a large-scale assay to generate a genome-wide map of transcriptional enhancers. Such a catalogue for the important malaria vector, Anopheles coluzzii, will be an important research tool as the role of non-coding regulatory variation in differential susceptibility to malaria infection is explored and as a public resource for research on this important insect vector of disease.
Journal Name: Frontiers in Genetics
Sponsoring Org: National Science Foundation
More Like this
  1. Changes in cis-regulatory elements play important roles in adaptation and phenotypic evolution. However, their contribution to metabolic adaptation of organisms is less understood. Here we have utilized a unique vertebrate model, Astyanax mexicanus, different morphotypes of which survive in nutrient-rich surface and nutrient-deprived cave water to uncover gene regulatory networks in metabolic adaptation. We performed genome-wide epigenetic profiling in the liver tissue of one surface and two independently derived cave populations. We find that many cis-regulatory elements differ in their epigenetic status/chromatin accessibility between surface fish and cavefish, while the two independently derived cave populations have evolved remarkably similar regulatory signatures. These differentially accessible regions are associated with genes of key pathways related to lipid metabolism, circadian rhythm and immune system that are known to be altered in cavefish. Using in vitro and in vivo functional testing of the candidate cis-regulatory elements, we find that genetic changes within them cause quantitative expression differences. We characterized one cis-regulatory element in the hpdb gene and found a genomic deletion in cavefish that abolishes binding of the transcriptional repressor IRF2 in vitro and derepresses enhancer activity in reporter assays. Genetic experiments further validated a cis-mediated role of the enhancer and suggest a role of this deletion in the upregulation of hpdb in wild cavefish populations. Selection of this mutation in multiple independent cave populations supports its importance in the adaptation to the cave environment, providing novel molecular insights into the evolutionary trade-off between loss of pigmentation and adaptation to a food-deprived cave environment.
  2. The role of a neural crest developmental transcriptional program, which critically involves Sox10 upregulation, is a key conserved aspect of melanoma initiation in both humans and zebrafish, yet transcriptional regulation of sox10 expression is incompletely understood. Here we used ATAC-Seq analysis of multiple zebrafish melanoma tumors to identify recurrently open chromatin domains as putative melanoma-specific sox10 enhancers. Screening in vivo with EGFP reporter constructs revealed 9 of 11 putative sox10 enhancers with embryonic activity in zebrafish. Focusing on the most active enhancer region in melanoma, we identified a region 23 kilobases upstream of sox10, termed peak5, that drives EGFP reporter expression in a subset of neural crest cells, Kolmer-Agduhr neurons, and early melanoma patches and tumors with high specificity. A ~200 base pair region, conserved in Cyprinidae, within peak5 is required for transgenic reporter activity in neural crest and melanoma. This region contains dimeric SoxE/Sox10 binding sites essential for peak5 neural crest and melanoma activity. We show that deletion of the endogenous peak5 conserved genomic locus decreases embryonic sox10 expression and disrupts adult stripe patterning in our melanoma model background. Our work demonstrates the power of linking developmental and cancer models to better understand neural crest identity in melanoma.
  3. We used capped analysis of gene expression with sequencing (CAGE-seq) to profile eRNA expression and enhancer activity during embryogenesis of a model echinoderm: the sea urchin, Strongylocentrotus purpuratus. We identified more than 18,000 enhancers that were active in mature oocytes and developing embryos and documented a burst of enhancer activation during cleavage and early blastula stages. We found that a large fraction (73.8%) of all enhancers active during the first 48 h of embryogenesis were hyperaccessible no later than the 128-cell stage and possibly even earlier. Most enhancers were located near gene bodies, and temporal patterns of eRNA expression tended to parallel those of nearby genes. Furthermore, enhancers near lineage-specific genes contained signatures of inputs from developmental gene regulatory networks deployed in those lineages. A large fraction (60%) of sea urchin enhancers previously shown to be active in transgenic reporter assays was associated with eRNA expression. Moreover, a large fraction (50%) of a representative subset of enhancers identified by eRNA profiling drove tissue-specific gene expression in isolation when tested by reporter assays. Our findings provide an atlas of developmental enhancers in a model sea urchin and support the utility of eRNA profiling as a tool for enhancer discovery and regulatory biology. The data generated in this study are available at Echinobase, the public database of information related to echinoderm genomics.
  4. Cis-regulatory modules (CRMs) formed by clusters of transcription factor (TF) binding sites (TFBSs) are as important as coding sequences in specifying phenotypes of humans. It is essential to categorize all CRMs and constituent TFBSs in the genome. In contrast to most existing methods that predict CRMs in specific cell types using epigenetic marks, we predict a largely cell type agnostic but more comprehensive map of CRMs and constituent TFBSs in the genome by integrating all available TF ChIP-seq datasets. Our method is able to partition 77.47% of genome regions covered by the 6092 available datasets into a CRM candidate (CRMC) set (56.84%) and a non-CRMC set (43.16%). Intriguingly, the predicted CRMCs are under strong evolutionary constraints, while the non-CRMCs are largely selectively neutral, strongly suggesting that the CRMCs are likely cis-regulatory, while the non-CRMCs are not. Our predicted CRMs are under stronger evolutionary constraints than three state-of-the-art predictions (GeneHancer, EnhancerAtlas and ENCODE phase 3) and substantially outperform them for recalling VISTA enhancers and non-coding ClinVar variants. We estimated that the human genome might encode about 1.47M CRMs and 68M TFBSs, comprising about 55% and 22% of the genome, respectively; for both of which, we predicted 80%. Therefore, the cis-regulatory genome appears to be more prevalent than originally thought.
  5. The mechanism by which transcriptional machinery is recruited to enhancers and promoters to regulate gene expression is one of the most challenging and extensively studied questions in modern biology. Here, we ask if inter-allelic interactions between two homologous alleles can affect gene regulation. Using an MS2- and PP7-based, allele-specific live imaging assay, we visualized de novo transcription of a reporter gene in hemizygous and homozygous Drosophila embryos. Surprisingly, each homozygous allele produced fewer RNAs than the hemizygous allele, suggesting the possibility of allelic competition in homozygotes. Moreover, the MS2-yellow reporter gene showed reduced transcriptional activity when a partial transcription unit (enhancer or promoter only) was in the homologous position. We propose that the transcriptional machinery that binds to both the enhancer and promoter region, such as RNA Pol II or preinitiation complexes, may be responsible for the allelic competition. To support this idea, we showed that the homologous alleles did not interfere with each other in earlier nuclear cycles when Pol II is in excess, while the degree of interference gradually increased in nuclear cycle 14. Such allelic competition was observed for endogenous snail as well. Our study provides new insights into the role of 3D inter-allelic interactions in gene regulation.