Title: Using knowledge graphs to infer gene expression in plants
Introduction: Climate change is already affecting ecosystems around the world and forcing us to adapt to meet societal needs. The speed with which climate change is progressing necessitates a massive scaling up of the number of species with understood genotype-environment-phenotype (G×E×P) dynamics in order to increase ecosystem and agriculture resilience. An important part of predicting phenotype is understanding the complex gene regulatory networks present in organisms. Previous work has demonstrated that knowledge about one species can be applied to another using ontologically-supported knowledge bases that exploit homologous structures and homologous genes. These types of structures that can apply knowledge about one species to another have the potential to enable the massive scaling up that is needed through in silico experimentation.

Methods: We developed one such structure, a knowledge graph (KG), using information from Planteome and the EMBL-EBI Expression Atlas, that connects gene expression, molecular interactions, functions, and pathways to homology-based gene annotations. Our preliminary analysis uses data from gene expression studies in Arabidopsis thaliana and Populus trichocarpa plants exposed to drought conditions.

Results: A graph query identified 16 pairs of homologous genes in these two taxa, some of which show opposite patterns of gene expression in response to drought. As expected, analysis of the upstream cis-regulatory region of these genes revealed that homologs with similar expression behavior had conserved cis-regulatory regions and potential interaction with similar trans-elements, unlike homologs that changed their expression in opposite ways.

Discussion: This suggests that even though the homologous pairs share common ancestry and functional roles, predicting expression and phenotype through homology inference needs careful consideration of integrating cis and trans-regulatory components in the curated and inferred knowledge graph.
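The kind of graph query described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or schema: the triple layout, the predicate names (`in_taxon`, `homologous_to`, `drought_expression`), and the gene identifiers are all hypothetical placeholders, and the expression values are invented toy data. The sketch only shows the general idea of walking homology edges between two taxa and labeling each pair by whether the drought response goes in the same or the opposite direction.

```python
from collections import defaultdict

# Toy knowledge graph as (subject, predicate, object) triples.
# All gene names, predicates, and values are hypothetical placeholders,
# not data or identifiers from the study.
TRIPLES = [
    ("ath_gene_A", "in_taxon", "Arabidopsis thaliana"),
    ("ptr_gene_A", "in_taxon", "Populus trichocarpa"),
    ("ath_gene_B", "in_taxon", "Arabidopsis thaliana"),
    ("ptr_gene_B", "in_taxon", "Populus trichocarpa"),
    ("ath_gene_A", "homologous_to", "ptr_gene_A"),
    ("ath_gene_B", "homologous_to", "ptr_gene_B"),
    ("ath_gene_A", "drought_expression", "up"),
    ("ptr_gene_A", "drought_expression", "up"),
    ("ath_gene_B", "drought_expression", "up"),
    ("ptr_gene_B", "drought_expression", "down"),
]


def build_index(triples):
    """Index triples as predicate -> subject -> list of objects."""
    index = defaultdict(dict)
    for subject, predicate, obj in triples:
        index[predicate].setdefault(subject, []).append(obj)
    return index


def classify_homolog_pairs(triples):
    """Label each cross-taxon homolog pair as 'concordant' or 'opposite'."""
    index = build_index(triples)
    taxon = index["in_taxon"]
    expression = index["drought_expression"]
    results = []
    for gene, homologs in index["homologous_to"].items():
        for homolog in homologs:
            # Skip pairs with missing annotations or within the same taxon.
            if gene not in expression or homolog not in expression:
                continue
            if taxon.get(gene) == taxon.get(homolog):
                continue
            same = expression[gene][0] == expression[homolog][0]
            results.append((gene, homolog, "concordant" if same else "opposite"))
    return results


PAIRS = classify_homolog_pairs(TRIPLES)
for gene, homolog, label in PAIRS:
    print(gene, homolog, label)
```

In a sketch like this, the pairs labeled "opposite" would be the candidates for a follow-up comparison of upstream cis-regulatory regions, which is the step the abstract reports as distinguishing the two groups.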
Award ID(s):
1940330 1940062
PAR ID:
10448755
Journal Name:
Frontiers in Artificial Intelligence
Volume:
6
ISSN:
2624-8212
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Begun, D (Ed.)
    Changes in gene regulation at multiple levels may comprise an important share of the molecular changes underlying adaptive evolution in nature. However, few studies have assayed within- and between-population variation in gene regulatory traits at a transcriptomic scale, and therefore inferences about the characteristics of adaptive regulatory changes have been elusive. Here, we assess quantitative trait differentiation in gene expression levels and alternative splicing (intron usage) between three closely related pairs of natural populations of Drosophila melanogaster from contrasting thermal environments that reflect three separate instances of cold tolerance evolution. The cold-adapted populations were known to show population genetic evidence for parallel evolution at the SNP level, and here we find evidence for parallel expression evolution between them, with stronger parallelism at larval and adult stages than for pupae. We also implement a flexible method to estimate cis- vs trans-encoded contributions to expression or splicing differences at the adult stage. The apparent contributions of cis- vs trans-regulation to adaptive evolution vary substantially among population pairs. While two of three population pairs show a greater enrichment of cis-regulatory differences among adaptation candidates, trans-regulatory differences are more likely to be implicated in parallel expression changes between population pairs. Genes with significant cis-effects are enriched for signals of elevated genetic differentiation between cold- and warm-adapted populations, suggesting that they are potential targets of local adaptation. These findings expand our knowledge of adaptive gene regulatory evolution and our ability to make inferences about this important and widespread process.
  2. Lasky, Jesse R. (Ed.)
    Gene expression can be influenced by genetic variants that are closely linked to the expressed gene (cis eQTLs) and variants in other parts of the genome (trans eQTLs). We created a multiparental mapping population by sampling genotypes from a single natural population of Mimulus guttatus and scored gene expression in the leaves of 1,588 plants. We find that nearly every measured gene exhibits cis regulatory variation (91% have FDR < 0.05). cis eQTLs are usually allelic series with three or more functionally distinct alleles. The cis locus explains about two thirds of the standing genetic variance (on average) but varies among genes and tends to be greatest when there is high indel variation in the upstream regulatory region and high nucleotide diversity in the coding sequence. Despite mapping over 10,000 trans eQTL / affected gene pairs, most of the genetic variance generated by trans acting loci remains unexplained. This implies a large reservoir of trans acting genes with subtle or diffuse effects. Mapped trans eQTLs show lower allelic diversity but much higher genetic dominance than cis eQTLs. Several analyses also indicate that trans eQTLs make a substantial contribution to the genetic correlations in expression among different genes. They may thus be essential determinants of “gene expression modules,” which has important implications for the evolution of gene expression and how it is studied by geneticists.
  3. Gossmann, Toni (Ed.)
    Understanding and predicting the relationships between genotype and phenotype is often challenging, largely due to the complex nature of eukaryotic gene regulation. A step towards this goal is to map how phenotypic diversity evolves through genomic changes that modify gene regulatory interactions. Using the Prairie Rattlesnake (Crotalus viridis) and related species, we integrate mRNA-seq, proteomic, ATAC-seq and whole genome resequencing data to understand how specific evolutionary modifications to gene regulatory network components produce differences in venom gene expression. Through comparisons within and between species, we find a remarkably high degree of gene expression and regulatory network variation across even a shallow level of evolutionary divergence. We use these data to test hypotheses about the roles of specific trans-factors and cis-regulatory elements, how these roles may vary across venom genes and gene families, and how variation in regulatory systems drives diversity in venom phenotypes. Our results illustrate that differences in chromatin and genotype at regulatory elements play major roles in modulating expression. However, we also find that enhancer deletions, differences in transcription-factor expression, and variation in activity of the insulator protein CTCF likely impact venom phenotypes. Our findings provide insight into the diversity and gene-specificity of gene regulatory features and highlight the value of comparative studies to link gene regulatory network variation to phenotypic variation.
  4. Corresponding attributes of neural development and function suggest arthropod and vertebrate brains may have an evolutionarily conserved organization. However, the underlying mechanisms have remained elusive. Here, we identify a gene regulatory and character identity network defining the deutocerebral–tritocerebral boundary (DTB) in Drosophila. This network comprises genes homologous to those directing midbrain-hindbrain boundary (MHB) formation in vertebrates and their closest chordate relatives. Genetic tracing reveals that the embryonic DTB gives rise to adult midbrain circuits that in flies control auditory and vestibular information processing and motor coordination, as do MHB-derived circuits in vertebrates. DTB-specific gene expression and function are directed by cis-regulatory elements of developmental control genes that include homologs of mammalian Zinc finger of the cerebellum and Purkinje cell protein 4. Drosophila DTB-specific cis-regulatory elements correspond to regulatory sequences of human ENGRAILED-2, PAX-2, and DACHSHUND-1 that direct MHB-specific expression in the embryonic mouse brain. We show that cis-regulatory elements and the gene networks they regulate direct the formation and function of midbrain circuits for balance and motor coordination in insects and mammals. Regulatory mechanisms mediating the genetic specification of cephalic neural circuits in arthropods correspond to those in chordates, thereby implying their origin before the divergence of deuterostomes and ecdysozoans.
  5. Barbash, Daniel (Ed.)
    To understand the relative importance of cis and trans effects on regulation, we crossed multi-parent recombinant-inbred-lines (RILs) to a common tester and measured allele specific gene expression in the offspring. Testing the difference in allelic imbalance between two RIL x Tester crosses is a test of cis or trans depending on the RIL alleles compared. The study design also makes it possible to separate two sources of trans variation, genetic and environmental, detected via interactions with cis effects. We demonstrate the effectiveness of this approach in a long-read RNA-seq experiment in female abdominal tissue at two time points in Drosophila melanogaster. Among the 40% of all loci that show evidence of genetic variation in cis, trans effects due to environment are detectable in 31% of loci and trans effects due to genetic background in 19%, with little overlap in sources of trans variation. The genes identified in this study are associated with genes previously reported to exhibit genetic variation in gene expression. Eleven genes in a QTL for thermotolerance, previously shown to differ in expression based on temperature, have evidence for regulation of gene expression regardless of the environment, including the cuticular protein Cpr67B, suggesting a functional role for standing variation in gene expression. This study provides a blueprint for identifying regulatory variation in gene expression, as the tester design maximizes cis variation and enables the efficient assessment of all pairs of RIL alleles relative to the tester, a much smaller study compared to the pairwise direct assessment.