This content will become publicly available on June 13, 2024

Title: Using knowledge graphs to infer gene expression in plants
Introduction: Climate change is already affecting ecosystems around the world and forcing us to adapt to meet societal needs. The speed with which climate change is progressing necessitates a massive scaling up of the number of species with understood genotype-environment-phenotype (G×E×P) dynamics in order to increase ecosystem and agriculture resilience. An important part of predicting phenotype is understanding the complex gene regulatory networks present in organisms. Previous work has demonstrated that knowledge about one species can be applied to another using ontologically-supported knowledge bases that exploit homologous structures and homologous genes. These types of structures that can apply knowledge about one species to another have the potential to enable the massive scaling up that is needed through in silico experimentation.
Methods: We developed one such structure, a knowledge graph (KG) using information from Planteome and the EMBL-EBI Expression Atlas that connects gene expression, molecular interactions, functions, and pathways to homology-based gene annotations. Our preliminary analysis uses data from gene expression studies in Arabidopsis thaliana and Populus trichocarpa plants exposed to drought conditions.
Results: A graph query identified 16 pairs of homologous genes in these two taxa, some of which show opposite patterns of gene expression in response to drought. As expected, analysis of the upstream cis-regulatory region of these genes revealed that homologs with similar expression behavior had conserved cis-regulatory regions and potential interaction with similar trans-elements, unlike homologs that changed their expression in opposite ways.
Discussion: This suggests that even though the homologous pairs share common ancestry and functional roles, predicting expression and phenotype through homology inference needs careful consideration of integrating cis and trans-regulatory components in the curated and inferred knowledge graph.
Award ID(s):
1940330 1940062
NSF-PAR ID:
10448755
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
Frontiers in Artificial Intelligence
Volume:
6
ISSN:
2624-8212
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Begun, D (Ed.)
    Abstract Changes in gene regulation at multiple levels may comprise an important share of the molecular changes underlying adaptive evolution in nature. However, few studies have assayed within- and between-population variation in gene regulatory traits at a transcriptomic scale, and therefore inferences about the characteristics of adaptive regulatory changes have been elusive. Here, we assess quantitative trait differentiation in gene expression levels and alternative splicing (intron usage) between three closely related pairs of natural populations of Drosophila melanogaster from contrasting thermal environments that reflect three separate instances of cold tolerance evolution. The cold-adapted populations were known to show population genetic evidence for parallel evolution at the SNP level, and here we find evidence for parallel expression evolution between them, with stronger parallelism at larval and adult stages than for pupae. We also implement a flexible method to estimate cis- vs trans-encoded contributions to expression or splicing differences at the adult stage. The apparent contributions of cis- vs trans-regulation to adaptive evolution vary substantially among population pairs. While two of three population pairs show a greater enrichment of cis-regulatory differences among adaptation candidates, trans-regulatory differences are more likely to be implicated in parallel expression changes between population pairs. Genes with significant cis-effects are enriched for signals of elevated genetic differentiation between cold- and warm-adapted populations, suggesting that they are potential targets of local adaptation. These findings expand our knowledge of adaptive gene regulatory evolution and our ability to make inferences about this important and widespread process. 
  2. Abstract

    The evolution of gene expression via cis‐regulatory changes is well established as a major driver of phenotypic evolution. However, relatively little is known about the influence of enhancer architecture and intergenic interactions on regulatory evolution. We address this question by examining chemosensory system evolution in Drosophila. Drosophila prolongata males show a massively increased number of chemosensory bristles compared to females and males of sibling species. This increase is driven by sex‐specific transformation of ancestrally mechanosensory organs. Consistent with this phenotype, the Pox neuro transcription factor (Poxn), which specifies chemosensory bristle identity, shows expanded expression in D. prolongata males. Poxn expression is controlled by nonadditive interactions among widely dispersed enhancers. Although some D. prolongata Poxn enhancers show increased activity, the additive component of this increase is slight, suggesting that most changes in Poxn expression are due to epistatic interactions between Poxn enhancers and trans‐regulatory factors. Indeed, the expansion of D. prolongata Poxn enhancer activity is only observed in cells that express doublesex (dsx), the gene that controls sexual differentiation in Drosophila and also shows increased expression in D. prolongata males due to cis‐regulatory changes. Although expanded dsx expression may contribute to increased activity of D. prolongata Poxn enhancers, this interaction is not sufficient to explain the full expansion of Poxn expression, suggesting that cis‐trans interactions between Poxn, dsx, and additional unknown genes are necessary to produce the derived D. prolongata phenotype. Overall, our results demonstrate the importance of epistatic gene interactions for evolution, particularly when pivotal genes have complex regulatory architecture.

     
  3. Corresponding attributes of neural development and function suggest arthropod and vertebrate brains may have an evolutionarily conserved organization. However, the underlying mechanisms have remained elusive. Here, we identify a gene regulatory and character identity network defining the deutocerebral–tritocerebral boundary (DTB) in Drosophila. This network comprises genes homologous to those directing midbrain-hindbrain boundary (MHB) formation in vertebrates and their closest chordate relatives. Genetic tracing reveals that the embryonic DTB gives rise to adult midbrain circuits that in flies control auditory and vestibular information processing and motor coordination, as do MHB-derived circuits in vertebrates. DTB-specific gene expression and function are directed by cis-regulatory elements of developmental control genes that include homologs of mammalian Zinc finger of the cerebellum and Purkinje cell protein 4. Drosophila DTB-specific cis-regulatory elements correspond to regulatory sequences of human ENGRAILED-2, PAX-2, and DACHSHUND-1 that direct MHB-specific expression in the embryonic mouse brain. We show that cis-regulatory elements and the gene networks they regulate direct the formation and function of midbrain circuits for balance and motor coordination in insects and mammals. Regulatory mechanisms mediating the genetic specification of cephalic neural circuits in arthropods correspond to those in chordates, thereby implying their origin before the divergence of deuterostomes and ecdysozoans.
  4. Lasky, Jesse R. (Ed.)

    Gene expression can be influenced by genetic variants that are closely linked to the expressed gene (cis eQTLs) and variants in other parts of the genome (trans eQTLs). We created a multiparental mapping population by sampling genotypes from a single natural population of Mimulus guttatus and scored gene expression in the leaves of 1,588 plants. We find that nearly every measured gene exhibits cis regulatory variation (91% have FDR < 0.05). cis eQTLs are usually allelic series with three or more functionally distinct alleles. The cis locus explains about two thirds of the standing genetic variance (on average) but varies among genes and tends to be greatest when there is high indel variation in the upstream regulatory region and high nucleotide diversity in the coding sequence. Despite mapping over 10,000 trans eQTL / affected gene pairs, most of the genetic variance generated by trans acting loci remains unexplained. This implies a large reservoir of trans acting genes with subtle or diffuse effects. Mapped trans eQTLs show lower allelic diversity but much higher genetic dominance than cis eQTLs. Several analyses also indicate that trans eQTLs make a substantial contribution to the genetic correlations in expression among different genes. They may thus be essential determinants of “gene expression modules,” which has important implications for the evolution of gene expression and how it is studied by geneticists.

     
  5. Abstract

    Sex determination, the developmental process by which sexually dimorphic phenotypes are established, evolves fast. Evolutionary turnover in a sex determination pathway may occur via selection on alleles that are genetically linked to a new master sex determining locus on a newly formed proto‐sex chromosome. Species with polygenic sex determination, in which master regulatory genes are found on multiple different proto‐sex chromosomes, are informative models to study the evolution of sex determination and sex chromosomes. House flies are such a model system, with male determining loci possible on all six chromosomes and a female‐determiner on one of the chromosomes as well. The two most common male‐determining proto‐Y chromosomes form latitudinal clines on multiple continents, suggesting that temperature variation is an important selection pressure responsible for maintaining polygenic sex determination in this species. Temperature‐dependent fitness effects could be manifested through temperature‐dependent gene expression differences across proto‐Y chromosome genotypes. These gene expression differences may be the result of cis regulatory variants that affect the expression of genes on the proto‐sex chromosomes, or trans effects of the proto‐Y chromosomes on genes elsewhere in the genome. We used RNA‐seq to identify genes whose expression depends on proto‐Y chromosome genotype and temperature in adult male house flies. We found no evidence for ecologically meaningful temperature‐dependent expression differences of sex determining genes between male genotypes, but we were probably not sampling an appropriate developmental time‐point to identify such effects. In contrast, we identified many other genes whose expression depends on the interaction between proto‐Y chromosome genotype and temperature, including genes that encode proteins involved in reproduction, metabolism, lifespan, stress response, and immunity. Notably, genes with genotype‐by‐temperature interactions on expression were not enriched on the proto‐sex chromosomes. Moreover, there was no evidence that temperature‐dependent expression is driven by chromosome‐wide cis‐regulatory divergence between the proto‐Y and proto‐X alleles. Therefore, if temperature‐dependent gene expression is responsible for differences in phenotypes and fitness of proto‐Y genotypes across house fly populations, these effects are driven by a small number of temperature‐dependent alleles on the proto‐Y chromosomes that may have trans effects on the expression of genes on other chromosomes.

     