Epistasis between genes is traditionally studied with mutations that eliminate protein activity, but most natural genetic variation is in cis-regulatory DNA and influences gene expression and function quantitatively. In this study, we used natural and engineered cis-regulatory alleles in a plant stem-cell circuit to systematically evaluate epistatic relationships controlling tomato fruit size. Combining a promoter allelic series with two other loci, we collected over 30,000 phenotypic data points from 46 genotypes to quantify how allele strength transforms epistasis. We revealed a saturating dose-dependent relationship but also allele-specific idiosyncratic interactions, including between alleles driving a step change in fruit size during domestication. Our approach and findings expose an underexplored dimension of epistasis, in which cis-regulatory allelic diversity within gene regulatory networks elicits nonlinear, unpredictable interactions that shape phenotypes. 
                    
                            
                            A mathematical model exhibiting the effect of DNA methylation on the stability boundary in cell-fate networks
                        
                    
    
            Cell-fate networks are traditionally studied within the framework of gene regulatory networks. This paradigm considers only interactions of genes through expressed transcription factors and does not incorporate chromatin modification processes. This paper introduces a mathematical model that seamlessly combines gene regulatory networks and DNA methylation (DNAm), with the goal of quantitatively characterizing the contribution of epigenetic regulation to gene silencing. The ‘Basin of Attraction percentage’ is introduced as a metric to quantify gene silencing abilities. As a case study, a computational and theoretical analysis is carried out for a model of the pluripotent stem cell circuit as well as a simplified self-activating gene model. The results confirm that the methodology quantitatively captures the key role that DNAm plays in enhancing the stability of the silenced gene state. 
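To make the 'Basin of Attraction percentage' idea concrete, here is a minimal sketch for a one-dimensional self-activating gene. It is not the paper's model: the Hill-type rate law, every parameter value, and the `methylation` factor (a fixed number that simply scales down the maximal activation rate) are assumptions chosen for illustration. The metric is approximated as the share of evenly spaced initial expression levels that settle into the low-expression (silenced) state.

```python
"""Toy estimate of a basin-of-attraction percentage for a self-activating gene.

All rate laws and parameter values are illustrative assumptions,
not the published model.
"""


def rate(x, methylation=0.0, basal=0.05, vmax=1.0, k=0.5, n=4, decay=1.0):
    """dx/dt for a gene that activates itself through a Hill function.

    `methylation` in [0, 1) is a hypothetical factor that scales down the
    maximal activation rate; it is held fixed, not modelled dynamically.
    """
    hill = x**n / (k**n + x**n)
    return basal + vmax * (1.0 - methylation) * hill - decay * x


def settle(x0, methylation=0.0, dt=0.02, steps=3000):
    """Forward-Euler integration from x0 up to time dt * steps."""
    x = x0
    for _ in range(steps):
        x += dt * rate(x, methylation)
    return x


def silenced_basin_percentage(methylation=0.0, x_max=1.5, samples=200,
                              off_below=0.2):
    """Percentage of evenly spaced initial levels in (0, x_max) that end silenced."""
    silenced = 0
    for i in range(samples):
        x0 = x_max * (i + 0.5) / samples  # midpoint of the i-th sub-interval
        if settle(x0, methylation) < off_below:
            silenced += 1
    return 100.0 * silenced / samples


if __name__ == "__main__":
    for m in (0.0, 0.15):
        pct = silenced_basin_percentage(m)
        print(f"methylation={m:.2f}: silenced basin = {pct:.1f}%")
```

With these made-up parameters the system is bistable, and raising the hypothetical `methylation` factor moves the unstable threshold upward, so a larger share of initial conditions ends in the silenced state. A higher-dimensional circuit would need sampling over a multi-dimensional box of initial conditions and a proper ODE solver instead of forward Euler.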
        
    
- Award ID(s): 1849588
- PAR ID: 10206365
- Date Published:
- Journal Name: Epigenetics
- ISSN: 1559-2294
- Page Range / eLocation ID: 1 to 22
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            INTRODUCTION Genome-wide association studies (GWASs) have identified thousands of human genetic variants associated with diverse diseases and traits, and most of these variants map to noncoding loci with unknown target genes and function. Current approaches to understand which GWAS loci harbor causal variants and to map these noncoding regulators to target genes suffer from low throughput. With newer multiancestry GWASs from individuals of diverse ancestries, there is a pressing and growing need to scale experimental assays to connect GWAS variants with molecular mechanisms. Here, we combined biobank-scale GWASs, massively parallel CRISPR screens, and single-cell sequencing to discover target genes of noncoding variants for blood trait loci with systematic targeting and inhibition of noncoding GWAS loci with single-cell sequencing (STING-seq). RATIONALE Blood traits are highly polygenic, and GWASs have identified thousands of noncoding loci that map to candidate cis-regulatory elements (CREs). By combining CRE-silencing CRISPR perturbations and single-cell readouts, we targeted hundreds of GWAS loci in a single assay, revealing target genes in cis and in trans. For select CREs that regulate target genes, we performed direct variant insertion. Although silencing the CRE can identify the target gene, direct variant insertion can identify magnitude and direction of effect on gene expression for the GWAS variant. In select cases in which the target gene was a transcription factor or microRNA, we also investigated the gene-regulatory networks altered upon CRE perturbation and how these networks differ across blood cell types. RESULTS We inhibited candidate CREs from fine-mapped blood trait GWAS variants (from ~750,000 individuals of diverse ancestries) in human erythroid progenitors.
In total, we targeted 543 variants (254 loci) mapping to candidate CREs, generating multimodal single-cell data including transcriptome, direct CRISPR gRNA capture, and cell surface proteins. We identified target genes in cis (within 500 kb) for 134 CREs. In most cases, we found that the target gene was the closest gene and that specific enhancer-associated biochemical hallmarks (H3K27ac and accessible chromatin) are essential for CRE function. Using multiple perturbations at the same locus, we were able to distinguish causal variants from noncausal variants in linkage disequilibrium. For a subset of validated CREs, we also inserted specific GWAS variants using base-editing STING-seq (beeSTING-seq) and quantified the effect size and direction of GWAS variants on gene expression. Given our transcriptome-wide data, we examined dosage effects in cis and trans in cases in which the cis target is a transcription factor or microRNA. We found that trans target genes are also enriched for GWAS loci, and identified gene clusters within trans gene networks with distinct biological functions and expression patterns in primary human blood cells. CONCLUSION In this work, we investigated noncoding GWAS variants at scale, identifying target genes in single cells. These methods can help to address the variant-to-function challenges that are a barrier for translation of GWAS findings (e.g., drug targets for diseases with a genetic basis) and greatly expand our ability to understand mechanisms underlying GWAS loci. Identifying causal variants and their target genes with STING-seq. Uncovering causal variants and their target genes or function is a major challenge for GWASs. STING-seq combines perturbation of noncoding loci with multimodal single-cell sequencing to profile hundreds of GWAS loci in parallel. This approach can identify target genes in cis and trans, measure dosage effects, and decipher gene-regulatory networks.
- 
            Inferring gene regulatory networks (GRNs) from single-cell gene expression datasets is a challenging task. Existing methods are often designed heuristically for specific datasets and lack the flexibility to incorporate additional information or compare against other algorithms. Further, current GRN inference methods do not provide uncertainty estimates with respect to the interactions that they predict, making inferred networks challenging to interpret. To overcome these challenges, we introduce Probabilistic Matrix Factorization for Gene Regulatory Network inference (PMF-GRN). PMF-GRN uses single-cell gene expression data to learn latent factors representing transcription factor activity as well as regulatory relationships between transcription factors and their target genes. This approach incorporates available experimental evidence into prior distributions over latent factors and scales well to single-cell gene expression datasets. By utilizing variational inference, we facilitate hyperparameter search for principled model selection and direct comparison to other generative models. To assess the accuracy of our method, we evaluate PMF-GRN using the model organisms Saccharomyces cerevisiae and Bacillus subtilis, benchmarking against database-derived gold standard interactions. We discover that, on average, PMF-GRN infers GRNs more accurately than current state-of-the-art single-cell GRN inference methods. Moreover, our PMF-GRN approach offers well-calibrated uncertainty estimates, as it performs gene regulatory network (GRN) inference in a probabilistic setting. These estimates are valuable for validation purposes, particularly when validated interactions are limited or a gold standard is incomplete.
- 
            Local chromatin context regulates the genetic requirements of the heterochromatin spreading reaction. van Steensel, Bas (Ed.) Heterochromatin spreading, the expansion of repressive chromatin structure from sequence-specific nucleation sites, is critical for stable gene silencing. Spreading re-establishes gene-poor constitutive heterochromatin across cell cycles but can also invade gene-rich euchromatin de novo to steer cell fate decisions. How chromatin context (i.e. euchromatic, heterochromatic) or different nucleation pathways influence heterochromatin spreading remains poorly understood. Previously, we developed a single-cell sensor in fission yeast that can separately record heterochromatic gene silencing at nucleation sequences and distal sites. Here we couple our quantitative assay to a genetic screen to identify genes encoding nuclear factors linked to the regulation of heterochromatin nucleation and the distal spreading of gene silencing. We find that mechanisms underlying gene silencing distal to a nucleation site differ by chromatin context. For example, Clr6 histone deacetylase complexes containing the Fkh2 transcription factor are specifically required for heterochromatin spreading at constitutive sites. Fkh2 recruits Clr6 to nucleation-distal chromatin sites in such contexts. In addition, we find that a number of chromatin remodeling complexes antagonize nucleation-distal gene silencing. Our results separate the regulation of heterochromatic gene silencing at nucleation versus distal sites and show that it is controlled by context-dependent mechanisms. The results of our genetic analysis constitute a broad community resource that will support further analysis of the mechanisms underlying the spread of epigenetic silencing along chromatin.
- 
            Background: Single-cell gene expression measurements offer opportunities in deriving mechanistic understanding of complex diseases, including cancer. However, due to the complex regulatory machinery of the cell, gene regulatory network (GRN) model inference based on such data still manifests significant uncertainty. Results: The goal of this paper is to develop optimal classification of single-cell trajectories accounting for potential model uncertainty. Partially-observed Boolean dynamical systems (POBDS) are used for modeling gene regulatory networks observed through noisy gene-expression data. We derive the exact optimal Bayesian classifier (OBC) for binary classification of single-cell trajectories. The application of the OBC becomes impractical for large GRNs, due to computational and memory requirements. To address this, we introduce a particle-based single-cell classification method that is highly scalable for large GRNs with much lower complexity than the optimal solution. Conclusion: The performance of the proposed particle-based method is demonstrated through numerical experiments using a POBDS model of the well-known T-cell large granular lymphocyte (T-LGL) leukemia network with noisy time-series gene-expression
 An official website of the United States government