

Title: Identification of significant gene expression changes in multiple perturbation experiments using knockoffs
Abstract

Large-scale multiple perturbation experiments have the potential to reveal a more detailed understanding of the molecular pathways that respond to genetic and environmental changes. A key question in these studies is which gene expression changes are important for the response to the perturbation. This problem is challenging because (i) the functional form of the nonlinear relationship between gene expression and the perturbation is unknown and (ii) identification of the most important genes is a high-dimensional variable selection problem. To deal with these challenges, we present a method based on the model-X knockoffs framework and deep neural networks to identify significant gene expression changes in multiple perturbation experiments. This approach makes no assumptions about the functional form of the dependence between the responses and the perturbations, and it enjoys finite-sample false discovery rate control for the selected set of important gene expression responses. We apply this approach to data sets from the Library of Integrated Network-Based Cellular Signatures, a National Institutes of Health Common Fund program that catalogs how human cells globally respond to chemical, genetic and disease perturbations. We identified important genes whose expression is directly modulated in response to perturbation with anthracycline, vorinostat, trichostatin-a, geldanamycin and sirolimus. We compare the sets of important genes that respond to these small molecules to identify co-responsive pathways. Identifying which genes respond to specific perturbation stressors can provide a better understanding of the underlying mechanisms of disease and advance the identification of new drug targets.
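The selection step of a knockoff filter can be illustrated with a minimal sketch. It assumes that a vector of per-gene importance statistics W has already been computed, where a large positive W_j means the real gene looks more important than its knockoff copy. The abstract does not say how these statistics are derived from the deep neural network, so the function names and the toy values below are hypothetical. The threshold shown is the standard "knockoff+" rule, which is the rule associated with finite-sample false discovery rate control at a target level q.

```python
import numpy as np

def knockoff_plus_threshold(W, q):
    """Return the smallest t > 0 whose estimated false discovery proportion
    (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) is at most q, or inf if none."""
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf

def knockoff_select(W, q):
    """Indices of features whose statistic reaches the knockoff+ threshold."""
    W = np.asarray(W, dtype=float)
    tau = knockoff_plus_threshold(W, q)
    return np.flatnonzero(W >= tau)

# Toy statistics (hypothetical values, not taken from the study).
W_example = [5, 4, 3, 2.5, 2, -1, 0.5, 1.5, 6, 7]
selected = knockoff_select(W_example, q=0.2)
```

With these toy values the threshold is 1.5, so the features with statistics -1 and 0.5 are left out and the remaining eight are selected. The "+1" in the numerator makes the rule conservative: when few statistics are large and positive, nothing is selected.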

 
NSF-PAR ID:
10400977
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Briefings in Bioinformatics
Volume:
24
Issue:
2
ISSN:
1467-5463
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    In the past century, recently emerged infectious diseases have become major drivers of species decline and extinction. The fungal disease chytridiomycosis has devastated many amphibian populations and exacerbated the amphibian conservation crisis. Biologists are beginning to understand what host traits contribute to disease susceptibility, but more work is needed to determine why some species succumb to chytridiomycosis while others do not. We conducted an integrative laboratory experiment to examine how two toad species respond to infection with the pathogen Batrachochytrium dendrobatidis in a controlled environment. We selected two toad species thought to differ in susceptibility: Bufo marinus (an invasive and putatively resistant species) and Bufo boreas (an endangered and putatively susceptible species). We measured infection intensity, body weight, histological changes and genomewide gene expression using a custom assay developed from transcriptome sequencing. Our results confirmed that the two species differ in susceptibility, with the more susceptible species, B. boreas, showing higher infection intensities, loss in body weight, more dramatic histological changes and larger perturbations in gene expression. We found key differences in skin expression responses in multiple pathways, including upregulation of skin integrity-related genes in the resistant B. marinus. Together, our results show intrinsic differences in host response between related species, which are likely to be important in explaining variation in response to a deadly emerging pathogen in wild populations. Our study also underscores the importance of understanding differences among host species to better predict disease outcomes and reveal generalities about host response to emerging infectious diseases of wildlife.

     
  2. Abstract Background Titinopathies are inherited muscular diseases triggered by genetic mutations in the titin gene. Muscular dystrophy with myositis (mdm) is one such disease caused by a LINE repeat insertion, leading to exon skipping and an 83-amino acid residue deletion in the N2A-PEVK region of mouse titin. This region has been implicated in a number of titin-titin ligand interactions and is hence important for myocyte signaling and health. Mice with this mdm mutation develop a severe and progressive muscle degeneration. The range of phenotypic differences observed in mdm mice shows that the deletion of this region induces a cascade of transcriptional changes extending to numerous signaling pathways affected by the titin filament. Previous research has focused on correlating phenotypic differences with muscle function in mdm mice. These studies have provided understanding of the downstream physiological effects resulting from the mdm mutation but only provide insights on processes that can be physiologically observed and measured. We used differential gene expression (DGE) to compare the transcriptomes of extensor digitorum longus (EDL), psoas and soleus muscles from wild-type and mdm mice to develop a deeper understanding of these tissue-specific responses. Results The overall expression pattern observed shows a well-differentiated transcriptional signature in mdm muscles compared to wild type. Muscle-specific clusters observed within the mdm transcriptome highlight the level of variability in each muscle's response to the deletion. Differential gene expression and weighted gene co-expression network analysis showed a strong directional response in oxidative respiration-associated mitochondrial genes, which aligns with the poor shivering and non-shivering thermogenesis previously observed. Sln, which is a marker associated with shivering and non-shivering thermogenesis, showed the strongest expression change in fast-fibered muscles. 
No drastic changes in MYH expression levels were reported, which indicated an absence of major fiber-type switching events. Overall expression shifts in MYH isoforms, MARPs, and extracellular matrix-associated genes demonstrated the transcriptional complexity associated with the mdm mutation. The expression alterations in mitochondrial respiration and metabolism-related genes in the mdm muscle dominated over other transcriptomic changes, and likely account for the late-stage cellular responses in the mdm muscles. Conclusions We were able to demonstrate that the complex nature of the mdm mutation extends beyond a simple rearrangement in the titin gene. EDL, psoas and soleus exemplify unique response modes observed in skeletal muscles with the mdm mutation. Our data also raise the possibility that failure to maintain proper energy homeostasis in mdm muscles may contribute to the pathogenesis of the degenerative phenotype in mdm mice. Understanding the full disease-causing molecular cascade is difficult using bulk RNA sequencing techniques due to the intricate nature of the disease. The development of the mdm phenotype is temporally and spatially regulated, hence future studies should focus on single-fiber-level investigations. 
  3. Abstract Background

    The BIN1 locus contains the second-most significant genetic risk factor for late-onset Alzheimer’s disease. BIN1 undergoes alternate splicing to generate tissue- and cell-type-specific BIN1 isoforms, which regulate membrane dynamics in a range of crucial cellular processes. Whilst the expression of BIN1 in the brain has been characterized in neurons and oligodendrocytes in detail, information regarding microglial BIN1 expression is mainly limited to large-scale transcriptomic and proteomic data. Notably, BIN1 protein expression and its functional roles in microglia, a cell type most relevant to Alzheimer’s disease, have not been examined in depth.

    Methods

    Microglial BIN1 expression was analyzed by immunostaining mouse and human brain, as well as by immunoblot and RT-PCR assays of isolated microglia or human iPSC-derived microglial cells. Bin1 expression was ablated by siRNA knockdown in primary microglial cultures in vitro and Cre-lox mediated conditional deletion in adult mouse brain microglia in vivo. Regulation of neuroinflammatory microglial signatures by BIN1 in vitro and in vivo was characterized using NanoString gene panels and flow cytometry methods. The transcriptome data were explored by in silico pathway analysis and validated by complementary molecular approaches.

    Results

    Here, we characterized microglial BIN1 expression in vitro and in vivo and ascertained microglia-expressed BIN1 isoforms. By silencing Bin1 expression in primary microglial cultures, we demonstrate that BIN1 regulates the activation of proinflammatory and disease-associated responses in microglia as measured by gene expression and cytokine production. Our transcriptomic profiling revealed key homeostatic and lipopolysaccharide (LPS)-induced inflammatory response pathways, as well as transcription factors PU.1 and IRF1 that are regulated by BIN1. Microglia-specific Bin1 conditional knockout in vivo revealed novel roles of BIN1 in regulating the expression of disease-associated genes while counteracting CX3CR1 signaling. The consensus from in vitro and in vivo findings showed that loss of Bin1 impaired the ability of microglia to mount type 1 interferon responses to proinflammatory challenge, particularly the upregulation of a critical type 1 immune response gene, Ifitm3.

    Conclusions

    Our convergent findings provide novel insights into microglial BIN1 function and demonstrate an essential role of microglial BIN1 in regulating brain inflammatory response and microglial phenotypic changes. Moreover, for the first time, our study shows a regulatory relationship between Bin1 and Ifitm3, two Alzheimer’s disease-related genes in microglia. The requirement for BIN1 to regulate Ifitm3 upregulation during inflammation has important implications for inflammatory responses during the pathogenesis and progression of many neurodegenerative diseases.

  4. INTRODUCTION During the independent process of cereal evolution, many trait shifts appear to have been under convergent selection to meet the specific needs of humans. Identification of convergently selected genes across cereals could help to clarify the evolution of crop species and to accelerate breeding programs. In the past several decades, researchers have debated whether convergent phenotypic selection in distinct lineages is driven by conserved molecular changes or by diverse molecular pathways. Two of the most economically important crops, maize and rice, display some conserved phenotypic shifts (including loss of seed dispersal, decreased seed dormancy, and increased grain number during evolution), even though they experienced independent selection. Hence, maize and rice can serve as an excellent system for understanding the extent of convergent selection among cereals. RATIONALE Despite the identification of a few convergently selected genes, our understanding of the extent of molecular convergence on a genome-wide scale between maize and rice is very limited. To learn how often selection acts on orthologous genes, we investigated the functions and molecular evolution of the grain yield quantitative trait locus KRN2 in maize and its rice ortholog OsKRN2. We also identified convergently selected genes on a genome-wide scale in maize and rice, using two large datasets. RESULTS We identified a selected gene, KRN2 (kernel row number2), that differs between domesticated maize and its wild ancestor, teosinte. This gene underlies a major quantitative trait locus for kernel row number in maize. Selection in the noncoding upstream regions resulted in a reduction of KRN2 expression and an increased grain number through an increase in kernel rows. The rice ortholog, OsKRN2, also underwent selection and negatively regulates grain number via control of secondary panicle branches. 
These orthologs encode WD40 proteins and function synergistically with a gene of unknown function, DUF1644, which suggests that a conserved protein interaction controls grain number in maize and rice. Field tests show that knockout of KRN2 in maize or OsKRN2 in rice increased grain yield by ~10% and ~8%, respectively, with no apparent trade-off in other agronomic traits. This suggests potential applications of KRN2 and its orthologs for crop improvement. On a genome-wide scale, we identified a set of 490 orthologous genes that underwent convergent selection during maize and rice evolution, including KRN2/OsKRN2. We found that the convergently selected orthologous genes appear to be significantly enriched in two specific pathways in both maize and rice: starch and sucrose metabolism, and biosynthesis of cofactors. A deep analysis of convergently selected genes in the starch metabolic pathway indicates that the degree of genetic convergence via convergent selection is related to the conservation and complexity of the gene network for a given selection. CONCLUSION Our findings show that common phenotypic shifts during maize and rice evolution acting on conserved genes are driven at least in part by convergent selection, which in maize and rice likely occurred both during and after domestication. We provide evolutionary and functional evidence on the convergent selection of KRN2/OsKRN2 for grain number between maize and rice. We further found that a complete loss-of-function allele of KRN2/OsKRN2 increased grain yield without an apparent negative impact on other agronomic traits. Exploring the role of KRN2/OsKRN2 and other convergently selected genes across the cereals could provide new opportunities to enhance the production of other global crops. Shared selected orthologous genes in maize and rice for convergent phenotypic shifts during domestication and improvement. 
By comparing 3163 selected genes in maize and 18,755 selected genes in rice, we identified 490 orthologous gene pairs, including KRN2 and its rice ortholog OsKRN2, as having been convergently selected. Knockout of KRN2 in maize or OsKRN2 in rice increased grain yield by increasing kernel rows and secondary panicle branches, respectively. 
  5. Plasticity in multicellular organisms involves signaling pathways converting contexts (either natural environmental challenges or laboratory perturbations) into context-specific changes in gene expression. Congruently, the interactions between the signaling molecules and transcription factors (TF) regulating these responses are also context specific. However, when a target gene responds across contexts, the upstream TF identified in one context is often inferred to regulate it across contexts. Reconciling these stable TF–target gene pair inferences with the context-specific nature of homeostatic responses is therefore needed. The induction of the Caenorhabditis elegans genes lipl-3 and lipl-4 is observed in many genetic contexts and is essential to survival during fasting. We find DAF-16/FOXO mediating lipl-4 induction in all contexts tested; hence, lipl-4 regulation seems context independent and compatible with across-context inferences. In contrast, DAF-16–mediated regulation of lipl-3 is context specific. DAF-16 reduces the induction of lipl-3 during fasting, yet it promotes it during oxidative stress. Through discrete dynamic modeling and genetic epistasis, we determine that DAF-16 represses HLH-30/TFEB, the main TF activating lipl-3 during fasting. In contrast, DAF-16 activates the stress-responsive TF HSF-1 during oxidative stress, which promotes C. elegans survival through induction of lipl-3. Furthermore, the TF MXL-3 contributes to the dominance of HSF-1 at the expense of HLH-30 during oxidative stress but not during fasting. This study shows how context-specific diverting of functional interactions within a molecular network allows cells to specifically respond to a large number of contexts with a limited number of molecular players, a mode of transcriptional regulation we name “contextualized transcription.” 