Abstract Cis-regulatory elements (CREs) control gene expression, orchestrating tissue identity, developmental timing and stimulus responses, which collectively define the thousands of unique cell types in the body [1–3]. While there is great potential for strategically incorporating CREs in therapeutic or biotechnology applications that require tissue specificity, there is no guarantee that an optimal CRE for these intended purposes has arisen naturally. Here we present a platform to engineer and validate synthetic CREs capable of driving gene expression with programmed cell-type specificity. We take advantage of innovations in deep neural network modelling of CRE activity across three cell types, efficient in silico optimization and massively parallel reporter assays to design and empirically test thousands of CREs [4–8]. Through large-scale in vitro validation, we show that synthetic sequences are more effective at driving cell-type-specific expression in three cell lines compared with natural sequences from the human genome and achieve specificity in analogous tissues when tested in vivo. Synthetic sequences exhibit distinct motif vocabulary associated with activity in the on-target cell type and a simultaneous reduction in the activity of off-target cells. Together, we provide a generalizable framework to prospectively engineer CREs from massively parallel reporter assay models and demonstrate the required literacy to write fit-for-purpose regulatory code.
                    
                            
                            A novel role for trithorax in the gene regulatory network for a rapidly evolving fruit fly pigmentation trait
                        
                    
    
            Animal traits develop through the expression and action of numerous regulatory and realizator genes that comprise a gene regulatory network (GRN). For each GRN, its underlying patterns of gene expression are controlled by cis-regulatory elements (CREs) that bind activating and repressing transcription factors. These interactions drive cell-type and developmental stage-specific transcriptional activation or repression. Most GRNs remain incompletely mapped, and a major barrier to this daunting task is CRE identification. Here, we used an in silico method to identify predicted CREs (pCREs) that comprise the GRN that governs sex-specific pigmentation of Drosophila melanogaster. Through in vivo assays, we demonstrate that many pCREs activate expression in the correct cell-type and developmental stage. We employed genome editing to demonstrate that two CREs control the pupal abdomen expression of trithorax, whose function is required for the dimorphic phenotype. Surprisingly, trithorax had no detectable effect on this GRN’s key trans-regulators, but shapes the sex-specific expression of two realizator genes. Comparison of sequences orthologous to these CREs supports an evolutionary scenario where these trithorax CREs predated the origin of the dimorphic trait. Collectively, this study demonstrates how in silico approaches can provide novel insights into the GRN basis for a trait’s development and evolution.
        
    
- Award ID(s): 2211833
- PAR ID: 10421262
- Editor(s): Kopp, Artyom
- Date Published:
- Journal Name: PLOS Genetics
- Volume: 19
- Issue: 2
- ISSN: 1553-7404
- Page Range / eLocation ID: e1010653
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            Abstract Sex determination, the developmental process by which sexually dimorphic phenotypes are established, evolves fast. Evolutionary turnover in a sex determination pathway may occur via selection on alleles that are genetically linked to a new master sex determining locus on a newly formed proto‐sex chromosome. Species with polygenic sex determination, in which master regulatory genes are found on multiple different proto‐sex chromosomes, are informative models to study the evolution of sex determination and sex chromosomes. House flies are such a model system, with male determining loci possible on all six chromosomes and a female‐determiner on one of the chromosomes as well. The two most common male‐determining proto‐Y chromosomes form latitudinal clines on multiple continents, suggesting that temperature variation is an important selection pressure responsible for maintaining polygenic sex determination in this species. Temperature‐dependent fitness effects could be manifested through temperature‐dependent gene expression differences across proto‐Y chromosome genotypes. These gene expression differences may be the result of cis‐regulatory variants that affect the expression of genes on the proto‐sex chromosomes, or trans effects of the proto‐Y chromosomes on genes elsewhere in the genome. We used RNA‐seq to identify genes whose expression depends on proto‐Y chromosome genotype and temperature in adult male house flies. We found no evidence for ecologically meaningful temperature‐dependent expression differences of sex determining genes between male genotypes, but we were probably not sampling an appropriate developmental time‐point to identify such effects. In contrast, we identified many other genes whose expression depends on the interaction between proto‐Y chromosome genotype and temperature, including genes that encode proteins involved in reproduction, metabolism, lifespan, stress response, and immunity.
Notably, genes with genotype‐by‐temperature interactions on expression were not enriched on the proto‐sex chromosomes. Moreover, there was no evidence that temperature‐dependent expression is driven by chromosome‐wide cis‐regulatory divergence between the proto‐Y and proto‐X alleles. Therefore, if temperature‐dependent gene expression is responsible for differences in phenotypes and fitness of proto‐Y genotypes across house fly populations, these effects are driven by a small number of temperature‐dependent alleles on the proto‐Y chromosomes that may have trans effects on the expression of genes on other chromosomes.
- 
            Khila, Abderrahman (Ed.) The evolution of sexual secondary characteristics necessitates regulatory factors that confer sexual identity to differentiating tissues and cells. In Colias eurytheme butterflies, males exhibit two specialized wing scale types—ultraviolet-iridescent (UVI) and spatulate scales—which are absent in females and likely integral to male courtship behavior. This study investigates the regulatory mechanisms and single-nucleus transcriptomics underlying these two sexually dimorphic cell types during wing development. We show that Doublesex (Dsx) expression is itself dimorphic and required to repress the UVI cell state in females, while unexpectedly, UVI activation in males is independent from Dsx. In the melanic marginal band, Dsx is required in each sex to enforce the presence of spatulate scales in males, and their absence in females. Single-nucleus RNAseq reveals that UVI and spatulate scale cell precursors each show distinctive gene expression profiles at 40% of pupal development, with marker genes that include regulators of transcription, cell signaling, cytoskeletal patterning, and chitin secretion. Both male-specific cell types share a low expression of the Bric-a-brac (Bab) transcription factor, a key repressor of the UVI fate. Bab ChIP-seq profiling suggests that Bab binds the cis-regulatory regions of gene markers associated with UVI fate, including potential effector genes involved in the regulation of cytoskeletal processes and chitin secretion, and loci showing signatures of recent selective sweeps in a UVI-polymorphic population. These findings open new avenues for exploring wing patterning and scale development, shedding light on the mechanisms driving the specification of sex-specific cell states and the differentiation of specialized cell ultrastructures.
- 
            The gene regulatory network (GRN) that underlies echinoderm skeletogenesis is a prominent model of GRN architecture and evolution. KirrelL is an essential downstream effector gene in this network and encodes an Ig-superfamily protein required for the fusion of skeletogenic cells and the formation of the skeleton. In this study, we dissected the transcriptional control region of the kirrelL gene of the purple sea urchin, Strongylocentrotus purpuratus. Using plasmid- and bacterial artificial chromosome-based transgenic reporter assays, we identified key cis-regulatory elements (CREs) and transcription factor inputs that regulate Sp-kirrelL, including direct, positive inputs from two key transcription factors in the skeletogenic GRN, Alx1 and Ets1. We next identified kirrelL cis-regulatory regions from seven other echinoderm species that together represent all classes within the phylum. By introducing these heterologous regulatory regions into developing sea urchin embryos we provide evidence of their remarkable conservation across ~500 million years of evolution. We dissected in detail the kirrelL regulatory region of the sea star, Patiria miniata, and demonstrated that it also receives direct inputs from Alx1 and Ets1. Our findings identify kirrelL as a component of the ancestral echinoderm skeletogenic GRN. They support the view that GRN subcircuits, including specific transcription factor–CRE interactions, can remain stable over vast periods of evolutionary history. Lastly, our analysis of kirrelL establishes direct linkages between a developmental GRN and an effector gene that controls a key morphogenetic cell behavior, cell–cell fusion, providing a paradigm for extending the explanatory power of GRNs.
- 
            Inferring gene regulatory networks (GRNs) from single-cell gene expression datasets is a challenging task. Existing methods are often designed heuristically for specific datasets and lack the flexibility to incorporate additional information or compare against other algorithms. Further, current GRN inference methods do not provide uncertainty estimates with respect to the interactions that they predict, making inferred networks challenging to interpret. To overcome these challenges, we introduce Probabilistic Matrix Factorization for Gene Regulatory Network inference (PMF-GRN). PMF-GRN uses single-cell gene expression data to learn latent factors representing transcription factor activity as well as regulatory relationships between transcription factors and their target genes. This approach incorporates available experimental evidence into prior distributions over latent factors and scales well to single-cell gene expression datasets. By utilizing variational inference, we facilitate hyperparameter search for principled model selection and direct comparison to other generative models. To assess the accuracy of our method, we evaluate PMF-GRN using the model organisms Saccharomyces cerevisiae and Bacillus subtilis, benchmarking against database-derived gold standard interactions. We discover that, on average, PMF-GRN infers GRNs more accurately than current state-of-the-art single-cell GRN inference methods. Moreover, our PMF-GRN approach offers well-calibrated uncertainty estimates, as it performs gene regulatory network (GRN) inference in a probabilistic setting. These estimates are valuable for validation purposes, particularly when validated interactions are limited or a gold standard is incomplete.