
Title: The Arabidopsis gene co‐expression network

Identifying genes that interact to confer a biological function to an organism is one of the main goals of functional genomics. High‐throughput technologies for assessment and quantification of genome‐wide gene expression patterns have enabled systems‐level analyses to infer pathways or networks of genes involved in different functions under many different conditions. Here, we leveraged the publicly available, information‐rich RNA‐Seq datasets of the model plant Arabidopsis thaliana to construct a gene co‐expression network, which was partitioned into clusters or modules that harbor genes correlated by expression. Gene ontology and pathway enrichment analyses were performed to assess functional terms and pathways that were enriched within the different gene modules. By interrogating the co‐expression network for genes in different modules that associate with a gene of interest, diverse functional roles of the gene can be deciphered. By mapping genes differentially expressed under a certain condition in Arabidopsis onto the co‐expression network, we demonstrate the ability of the network to uncover novel genes that are likely transcriptionally active but prone to be missed by standard statistical approaches because they fall outside the confidence zone of detection. To our knowledge, this is the first A. thaliana co‐expression network constructed using the entire collection of mRNA‐Seq datasets (>20,000) available at the NCBI SRA database. The developed network can serve as a useful resource for the Arabidopsis research community to interrogate specific genes of interest within the network, retrieve the respective interactomes, decipher gene modules that are transcriptionally altered under a certain condition or stage, and gain understanding of gene functions.
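As a rough illustration of the general idea of a co‐expression network partitioned into modules, the sketch below builds a network from a small genes × samples expression matrix and groups genes into modules. It is a minimal sketch under stated assumptions, not the method used in the paper: the abstract does not specify the correlation measure, the edge cutoff, or the clustering algorithm. Pearson correlation, the 0.8 threshold, connected‐component modules, and the function names `coexpression_modules` and `neighbours` are all assumptions chosen for illustration.

```python
import numpy as np


def coexpression_modules(expr, threshold=0.8):
    """Build a toy co-expression network from a genes x samples matrix.

    Assumptions (not taken from the paper): edges are absolute Pearson
    correlations >= threshold, and modules are connected components.
    Returns (boolean adjacency matrix, module label per gene).
    """
    corr = np.corrcoef(expr)             # gene-by-gene Pearson correlation
    adj = np.abs(corr) >= threshold      # keep only strong correlations as edges
    np.fill_diagonal(adj, False)         # drop self-loops

    n_genes = adj.shape[0]
    labels = np.full(n_genes, -1)        # -1 marks "not yet assigned"
    module = 0
    for start in range(n_genes):
        if labels[start] != -1:
            continue
        # Depth-first traversal collects every gene reachable from `start`.
        labels[start] = module
        stack = [start]
        while stack:
            gene = stack.pop()
            for nb in np.flatnonzero(adj[gene]):
                if labels[nb] == -1:
                    labels[nb] = module
                    stack.append(nb)
        module += 1
    return adj, labels


def neighbours(adj, gene_index):
    """Indices of genes directly connected to the gene of interest."""
    return np.flatnonzero(adj[gene_index])


# Made-up expression values: genes 0 and 1 rise together, genes 2 and 3 share
# a different pattern.
toy_expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 1.0, 4.0, 2.0, 3.0],
    [10.0, 2.0, 8.0, 4.0, 6.1],
])
toy_adj, toy_labels = coexpression_modules(toy_expr, threshold=0.8)
print(toy_labels)               # module assignment for each gene
print(neighbours(toy_adj, 0))   # genes co-expressed with gene 0
```

Querying `neighbours` for a gene of interest corresponds loosely to retrieving its co‐expressed partners from the network. Mapping a list of differentially expressed genes onto `toy_labels` would show which modules they fall into.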

Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Plant Direct
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Plant responses to the environment are shaped by external stimuli and internal signaling pathways. In both the model plant Arabidopsis thaliana (Arabidopsis) and crop species, circadian clock factors are critical for growth, flowering, and circadian rhythms. Outside of Arabidopsis, however, little is known about the molecular function of clock gene products. Therefore, we sought to compare the function of Brachypodium distachyon (Brachypodium) and Setaria viridis (Setaria) orthologs of EARLY FLOWERING3, a key clock gene in Arabidopsis. To identify both cycling genes and putative ELF3 functional orthologs in Setaria, a circadian RNA‐seq dataset and online query tool (Diel Explorer) were generated to explore expression profiles of Setaria genes under circadian conditions. The function of ELF3 orthologs from Arabidopsis, Brachypodium, and Setaria was tested for complementation of an elf3 mutation in Arabidopsis. We find that both monocot orthologs were capable of rescuing hypocotyl elongation, flowering time, and arrhythmic clock phenotypes. Using affinity purification and mass spectrometry, our data indicate that BdELF3 and SvELF3 could be integrated into similar complexes in vivo as AtELF3. Thus, we find that, despite 180 million years of separation, BdELF3 and SvELF3 can functionally complement loss of ELF3 at the molecular and physiological level.

  2. Summary

    The flowering plant Arabidopsis thaliana is a dicot model organism for research in many aspects of plant biology. A comprehensive annotation of its genome paves the way for understanding the functions and activities of all types of transcripts, including mRNA, the various classes of non‐coding RNA, and small RNA. The TAIR10 annotation update had a profound impact on Arabidopsis research but was released more than 5 years ago. Maintaining the accuracy of the annotation continues to be a prerequisite for future progress. Using an integrative annotation pipeline, we assembled tissue‐specific RNA‐Seq libraries from 113 datasets and constructed 48 359 transcript models of protein‐coding genes in eleven tissues. In addition, we annotated various classes of non‐coding RNA including microRNA, long intergenic RNA, small nucleolar RNA, natural antisense transcript, small nuclear RNA, and small RNA using published datasets and in‐house analytic results. Altogether, we identified 635 novel protein‐coding genes, 508 novel transcribed regions, 5178 non‐coding RNAs, and 35 846 small RNA loci that were formerly unannotated. Analysis of the splicing events and RNA‐Seq based expression profiles revealed the landscapes of gene structures, untranslated regions, and splicing activities to be more intricate than previously appreciated. Furthermore, we present 692 uniformly expressed housekeeping genes, 43% of whose human orthologs are also housekeeping genes. This updated Arabidopsis genome annotation with a substantially increased resolution of gene models will not only further our understanding of the biological processes of this plant model but also of other species.

  3. Summary

    Cultivated cotton (Gossypium hirsutum) is the most important fibre crop in the world. Cotton leaf curl disease (CLCuD) is the major limiting factor and a threat to the textile industry in India and Pakistan. All the local cotton cultivars exhibit moderate to no resistance against CLCuD. In this study, we evaluated an exotic cotton accession, Mac7, as a resistance source to CLCuD by challenging it with viruliferous whiteflies and performing qPCR to evaluate the presence/absence and relative titre of CLCuD‐associated geminiviruses/betasatellites. The results indicated that replication of the pathogenicity determinant betasatellite is significantly attenuated in Mac7 and is probably responsible for the resistance phenotype. Afterwards, to decipher the genetic basis of CLCuD resistance in Mac7, we performed RNA sequencing on CLCuD‐infested Mac7 and validated the RNA‐Seq data with qPCR on 24 independent genes. We performed co‐expression network and pathway analysis for regulation of geminivirus/betasatellite‐interacting genes. We identified nine novel modules with 52 hubs of highly connected genes in network topology within the co‐expression network. Analysis of these hubs indicated the differential regulation of auxin stimulus and cellular localization pathways in response to CLCuD. We also analysed the differential regulation of geminivirus/betasatellite‐interacting genes in Mac7. We further performed the functional validation of selected candidate genes via virus‐induced gene silencing (VIGS). Finally, we evaluated the genomic context of resistance responsive genes and found that these genes are not specific to the A or D sub‐genomes of G. hirsutum. These results have important implications for understanding the CLCuD resistance mechanism and developing durable resistance in cultivated cotton.

  4. Summary

    Maize (Zea mays L.), a model species for genetic studies, is one of the two most important crop species worldwide. The genome sequence of the reference genotype, B73, representative of the stiff stalk heterotic group was recently updated (AGPv4) using long‐read sequencing and optical mapping technology. To facilitate the use of AGPv4 and to enable functional genomic studies and association of genotype with phenotype, we determined expression abundances for replicated mRNA‐sequencing datasets from 79 tissues and five abiotic/biotic stress treatments revealing 36 207 expressed genes. Characterization of the B73 transcriptome across six organs revealed 4154 organ‐specific and 7704 differentially expressed (DE) genes following stress treatment. Gene co‐expression network analyses revealed 12 modules associated with distinct biological processes containing 13 590 genes providing a resource for further association of gene function based on co‐expression patterns. Presence−absence variants (PAVs) previously identified using whole genome resequencing data from 61 additional inbred lines were enriched in organ‐specific and stress‐induced DE genes suggesting that PAVs may function in phenological variation and adaptation to environment. Relative to core genes conserved across the 62 profiled inbreds, PAVs have lower expression abundances which are correlated with their frequency of dispersion across inbreds and on average have significantly fewer co‐expression network connections suggesting that a subset of PAVs may be on an evolutionary path to pseudogenization. To facilitate use by the community, we developed the Maize Genomics Resource website ( for viewing and data‐mining these resources and deployed two new views on the maize electronic Fluorescent Pictograph Browser (

  5. Abstract

    With the high variability of natural growth environments, plants exhibit flexibility and resilience in regard to the strategies they employ to maintain overall fitness, including maximizing light use for photosynthesis, while simultaneously limiting light‐associated damage. We measured distinct parameters of photosynthetic performance of Arabidopsis thaliana plants under dynamic light regimes. Plants were grown to maturity then subjected to the following 5‐day (16 h light, 8 h dark) regime: Day 1 at constant light (CL) intensity during light period, representative of a common lab growth condition; Day 2 under sinusoidal variation in light intensity (SL) during the light period that is representative of changes occurring during a clear sunny day; Day 3 under fluctuating light (FL) intensity during the light period that simulates sudden changes that might occur with the movements of clouds in and out of the view of the sun; Day 4, repeat of CL; and Day 5, repeat of FL. We also examined the global transcriptome profile in these growth conditions based on obtaining RNA‐sequencing (RNA‐seq) data for whole plant rosettes. Our transcriptomic analyses indicated downregulation of photosystem I (PSI) and II (PSII) associated genes, which were correlated with elevated levels of photoinhibition as indicated by measurements of nonphotochemical quenching (NPQ), energy‐dependent quenching (qE), and inhibitory quenching (qI) under both SL and FL conditions. Furthermore, our transcriptomic results indicated downregulation of tetrapyrrole biosynthesis associated genes, coupled with reduced levels of chlorophyll under both SL and FL compared with CL, as well as downregulation of photorespiration‐associated genes under SL. We also noticed an enrichment of the stress response gene ontology (GO) terms for genes differentially regulated under FL when compared with SL. Collectively, our phenotypic and transcriptome analyses serve as useful resources for probing the underlying molecular mechanisms associated with plant acclimation to rapid light intensity changes in the natural environment.
