

Title: Learning gene networks under SNP perturbation using SNP and allele-specific expression data
Abstract Allele-specific expression quantification from RNA-seq reads provides opportunities to study the control of gene regulatory networks by cis-acting and trans-acting genetic variants. Many existing methods performed a single-gene and single-SNP association analysis to identify expression quantitative trait loci (eQTLs), and placed the eQTLs against known gene networks for functional interpretation. Instead, we view eQTL data as a capture of the effects of perturbation of the gene regulatory system by a large number of genetic variants and reconstruct a gene network perturbed by eQTLs. We introduce a statistical framework called CiTruss for simultaneously learning a gene network and the cis-acting and trans-acting eQTLs that perturb this network, given population allele-specific expression and SNP data. CiTruss uses a multi-level conditional Gaussian graphical model to model trans-acting eQTLs perturbing the expression of both alleles in the gene network at the top level and cis-acting eQTLs perturbing the expression of each allele at the bottom level. We derive a transformation of this model that allows efficient learning for large-scale human data. Our analysis of the GTEx and LG×SM advanced intercross line mouse data for multiple tissue types with CiTruss provides new insights into the genetics of gene regulation. CiTruss revealed that gene networks consist of local subnetworks over proximally located genes and global subnetworks over genes scattered across the genome, and that several aspects of gene regulation by eQTLs, such as the impact of genetic diversity, pleiotropy, tissue-specific gene regulation, and local and long-range linkage disequilibrium among eQTLs, can be explained through these local and global subnetworks.
Award ID(s):
2505285 2154089
PAR ID:
10611873
Author(s) / Creator(s):
;
Publisher / Repository:
bioRxiv
Date Published:
Format(s):
Medium: X
Institution:
bioRxiv
Sponsoring Org:
National Science Foundation
More Like this
  1. Lasky, Jesse R. (Ed.)
    Gene expression can be influenced by genetic variants that are closely linked to the expressed gene (cis eQTLs) and variants in other parts of the genome (trans eQTLs). We created a multiparental mapping population by sampling genotypes from a single natural population of Mimulus guttatus and scored gene expression in the leaves of 1,588 plants. We find that nearly every measured gene exhibits cis regulatory variation (91% have FDR < 0.05). cis eQTLs are usually allelic series with three or more functionally distinct alleles. The cis locus explains about two thirds of the standing genetic variance (on average) but varies among genes and tends to be greatest when there is high indel variation in the upstream regulatory region and high nucleotide diversity in the coding sequence. Despite mapping over 10,000 trans eQTL / affected gene pairs, most of the genetic variance generated by trans acting loci remains unexplained. This implies a large reservoir of trans acting genes with subtle or diffuse effects. Mapped trans eQTLs show lower allelic diversity but much higher genetic dominance than cis eQTLs. Several analyses also indicate that trans eQTLs make a substantial contribution to the genetic correlations in expression among different genes. They may thus be essential determinants of “gene expression modules,” which has important implications for the evolution of gene expression and how it is studied by geneticists.
  2. Abstract Genome‐wide expression quantitative trait loci (eQTL) mapping explores the relationship between gene expression and DNA variants, such as single‐nucleotide polymorphisms (SNPs), to understand the genetic basis of human diseases. Due to the large number of genes and SNPs that need to be assessed, current methods for eQTL mapping often suffer from low detection power, especially for identifying trans‐eQTLs. In this paper, we propose the idea of performing SNP ranking based on the higher criticism (HC) statistic, a summary statistic developed in large‐scale signal detection. We illustrate how the HC‐based SNP ranking can effectively prioritize eQTL signals over noise, greatly reduce the burden of joint modeling, and improve the power for eQTL mapping. Numerical results in simulation studies demonstrate the superior performance of our method compared to existing methods. The proposed method is also evaluated in HapMap eQTL data analysis and the results are compared to a database of known eQTLs.
  3. Abstract Background: Genetic and epigenetic perturbation of cis-regulatory sequences can shift patterns of gene expression and result in novel phenotypes. Phased genome assemblies now enable the local dissection of linkages between cis-regulatory sequences, including their epigenetic state, and allele-specific gene expression to further characterize gene regulation and resulting phenotypes in heterozygous genomes. Results: We assembled a locally phased genome for a mandarin hybrid named ‘Fairchild’ to explore the molecular signatures of allele-specific gene expression. With local genome phasing, genes with allele-specific expression were paired with haplotype-specific chromatin states, including levels of chromatin accessibility, histone modifications, and DNA methylation. We found that 30% of variation in allele-specific expression could be attributed to haplotype associated factors, with allelic levels of chromatin accessibility and three histone modifications in gene bodies having the most influence. Structural variants in promoter regions were also associated with allele-specific expression, including specific enrichments of hAT and MULE-MuDR DNA transposon sequences. Integration of haplotype-resolved genetic and epigenetic landscapes with high-throughput phenotypic analysis of fruit traits in a panel of 154 accessions with mandarin and pummelo ancestry revealed that trait-associated variants were enriched in regions of open chromatin. Mining of trait-associated variants uncovered a Gypsy retrotransposon insertion in a gene that regulates potassium transport and may contribute to the reduction in fruit size that is observed in mandarins. Conclusions: Using a locally phased assembly of a heterozygous cultivar of citrus, we dissected the interplay between genetic variants and molecular phenotypes to reveal cis-regulatory sequences with potential functional effects on phenotypes relevant for genetic improvement.
  4. Interspecific hybridization is a common and effective strategy for producing disease resilient citrus cultivars, including those with tolerance to Huanglongbing (HLB) disease. Several HLB-tolerant cultivars have been developed through hybridization of mandarins (Citrus reticulata) with their wild relative Poncirus trifoliata. One such cultivar, ‘US-897’, exhibits robust tolerance to the bacteria causing HLB disease, Candidatus Liberibacter asiaticus (CLas). To explore the genetic architecture of the early transcriptional response to CLas infection in ‘US-897’, we performed transcriptomic analysis of the hybrid and its parents, ‘Cleopatra’ (C. reticulata) and ‘Flying Dragon’ (P. trifoliata). A haplotype-resolved genome for ‘US-897’ was generated using PacBio HiFi sequencing reads to support quantification of the expression of both the Citrus and Poncirus alleles. By profiling gene expression in this parent-offspring trio, we were able to determine the mode of inheritance for genes differentially expressed between parents (‘Cleopatra’ and ‘Flying Dragon’) and their interspecific hybrid (‘US-897’), with the majority of genes exhibiting non-additive patterns of gene expression inheritance. Additionally, analysis of allele-specific expression in the hybrid ‘US-897’ revealed the contribution of cis- versus trans-acting regulatory variants on genes with additive and non-additive modes of inheritance. A strong correlation between differential expression between parents and allele-specific expression in ‘US-897’ suggests that cis-regulatory variation is a significant source of expression divergence between species. Finally, genes responsive to infection with CLas were identified to explore how gene regulation associated with tolerance to HLB was rewired between Citrus and its relative Poncirus.
  5. Barbash, Daniel (Ed.)
    Abstract To understand the relative importance of cis and trans effects on regulation, we crossed multi-parent recombinant-inbred-lines (RILs) to a common tester and measured allele-specific gene expression in the offspring. Testing the difference in allelic imbalance between two RIL x Tester crosses is a test of cis or trans depending on the RIL alleles compared. The study design also makes it possible to separate two sources of trans variation, genetic and environmental, detected via interactions with cis effects. We demonstrate the effectiveness of this approach in a long-read RNA-seq experiment in female abdominal tissue at two time points in Drosophila melanogaster. Among the 40% of all loci that show evidence of genetic variation in cis, trans effects due to environment are detectable in 31% of loci and trans effects due to genetic background in 19%, with little overlap in sources of trans variation. The genes identified in this study are associated with genes previously reported to exhibit genetic variation in gene expression. Eleven genes in a QTL for thermotolerance, previously shown to differ in expression based on temperature, have evidence for regulation of gene expression regardless of the environment, including the cuticular protein Cpr67B, suggesting a functional role for standing variation in gene expression. This study provides a blueprint for identifying regulatory variation in gene expression, as the tester design maximizes cis variation and enables the efficient assessment of all pairs of RIL alleles relative to the tester, a much smaller study compared to the pairwise direct assessment.