

Title: Patterns of pan‐genome occupancy and gene coexpression under water‐deficit in Brachypodium distachyon
Abstract

Natural populations are characterized by abundant genetic diversity driven by a range of different types of mutation. The tractability of sequencing complete genomes has allowed new insights into the variable composition of genomes, summarized as a species pan‐genome. These analyses demonstrate that many genes are absent from the first reference genomes, whose analysis dominated the initial years of the genomic era. Our field now turns towards understanding the functional consequence of these highly variable genomes. Here, we analysed weighted gene coexpression networks from leaf transcriptome data for drought response in the purple false brome Brachypodium distachyon and the differential expression of genes putatively involved in adaptation to this stressor. We specifically asked whether genes with variable “occupancy” in the pan‐genome – genes which are either present in all studied genotypes or missing in some genotypes – show different distributions among coexpression modules. Coexpression analysis united genes expressed in drought‐stressed plants into nine modules covering 72 hub genes (87 hub isoforms), and genes expressed under controlled water conditions into 13 modules, covering 190 hub genes (251 hub isoforms). We find that low occupancy pan‐genes are under‐represented among several modules, while other modules are over‐enriched for low‐occupancy pan‐genes. We also provide new insight into the regulation of drought response in B. distachyon, specifically identifying one module with an apparent role in primary metabolism that is strongly responsive to drought. Our work shows the power of integrating pan‐genomic analysis with transcriptomic data using factorial experiments to understand the functional genomics of environmental response.

 
PAR ID:
10373568
Publisher / Repository:
Wiley-Blackwell
Journal Name:
Molecular Ecology
Volume:
31
Issue:
20
ISSN:
0962-1083
Page Range / eLocation ID:
p. 5285-5306
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Pan-genome analyses of metagenome-assembled genomes (MAGs) may suffer from the known issues with MAGs: fragmentation, incompleteness and contamination. Here, we conducted a critical assessment of pan-genomics of MAGs, by comparing pan-genome analysis results of complete bacterial genomes and simulated MAGs. We found that incompleteness led to significant core gene (CG) loss. The CG loss remained when using different pan-genome analysis tools (Roary, BPGA, Anvi’o) and when using a mixture of MAGs and complete genomes. Contamination had little effect on core genome size (except for Roary, due to its gene clustering issue) but had a major influence on accessory genomes. Importantly, the CG loss was partially alleviated by lowering the CG threshold and using gene prediction algorithms that consider fragmented genes, but to a lesser degree when incompleteness was higher than 5%. The CG loss also led to incorrect pan-genome functional predictions and inaccurate phylogenetic trees. Our main findings were supported by a study of real MAG-isolate genome data. We conclude that lowering the CG threshold and predicting genes in metagenome mode (as Anvi’o does with Prodigal) are necessary in pan-genome analysis of MAGs. Development of new pan-genome analysis tools specifically for MAGs is needed in future studies.

     
  2. Cultivated peanut (Arachis hypogaea) is one of the most widely grown food legumes in the world, being valued for its high protein and unsaturated oil contents. Drought stress is one of the major constraints that limit peanut production. This study’s objective was to identify the drought-responsive genes preferentially expressed under drought stress in different peanut genotypes. To accomplish this, four genotypes (drought tolerant: C76-16 and 587; drought susceptible: Tifrunner and 506) subjected to drought stress in a rainout shelter experiment were examined. Transcriptome sequencing analysis identified that all four genotypes shared a total of 2,457 differentially expressed genes (DEGs). A total of 139 enriched gene ontology terms consisting of 86 biological processes and 53 molecular functions, with defense response, reproductive process, and signaling pathways, were significantly enriched in the common DEGs. In addition, 3,576 DEGs were identified only in drought-tolerant lines in which a total of 74 gene ontology terms were identified, including 55 biological processes and 19 molecular functions, mainly related to protein modification process, pollination, and metabolic process. These terms were also found in shared genes in four genotypes, indicating that tolerant lines adjusted more related genes to respond to drought. Forty-three significantly enriched Kyoto Encyclopedia of Genes and Genomes pathways were also identified, and the most enriched pathways were those processes involved in metabolic pathways, biosynthesis of secondary metabolites, plant circadian rhythm, phenylpropanoid biosynthesis, and starch and sucrose metabolism. This research expands our current understanding of the mechanisms that facilitate peanut drought tolerance and sheds light on breeding advanced peanut lines to combat drought stress.
  3. Abstract Accessible chromatin and unmethylated DNA are associated with many genes and cis-regulatory elements. Attempts to understand natural variation for accessible chromatin regions (ACRs) and unmethylated regions (UMRs) often rely upon alignments to a single reference genome. This limits the ability to assess regions that are absent in the reference genome assembly and monitor how nearby structural variants influence variation in chromatin state. In this study, de novo genome assemblies for four maize inbreds (B73, Mo17, Oh43, and W22) are utilized to assess chromatin accessibility and DNA methylation patterns in a pan-genome context. A more complete set of UMRs and ACRs can be identified when chromatin data are aligned to the matched genome rather than a single reference genome. While there are UMRs and ACRs present within genomic regions that are not shared between genotypes, these features are 6- to 12-fold enriched within regions between genomes. Characterization of UMRs present within shared genomic regions reveals that most UMRs maintain the unmethylated state in other genotypes with only ∼5% being polymorphic between genotypes. However, the majority (71%) of UMRs that are shared between genotypes only exhibit partial overlaps suggesting that the boundaries between methylated and unmethylated DNA are dynamic. This instability is not solely due to sequence variation as these partially overlapping UMRs are frequently found within genomic regions that lack sequence variation. The ability to compare chromatin properties among individuals with structural variation enables pan-epigenome analyses to study the sources of variation for accessible chromatin and unmethylated DNA. 
  4. Ruvinsky, Ilya (Ed.)
    Abstract Developmental polyphenism, the ability to switch between phenotypes in response to environmental variation, involves the alternating activation of environmentally sensitive genes. Consequently, to understand how a polyphenic response evolves requires a comparative analysis of the components that make up environmentally sensitive networks. Here, we inferred coexpression networks for a morphological polyphenism, the feeding-structure dimorphism of the nematode Pristionchus pacificus. In this species, individuals produce alternative forms of a novel trait—moveable teeth, which in one morph enable predatory feeding—in response to environmental cues. To identify the origins of polyphenism network components, we independently inferred coexpression modules for more conserved transcriptional responses, including in an ancestrally nonpolyphenic nematode species. Further, through genome-wide analyses of these components across the nematode family (Diplogastridae) in which the polyphenism arose, we reconstructed how network components have changed. To achieve this, we assembled and resolved the phylogenetic context for five genomes of species representing the breadth of Diplogastridae and a hypothesized outgroup. We found that gene networks instructing alternative forms arose from ancestral plastic responses to environment, specifically starvation-induced metabolism and the formation of a conserved diapause (dauer) stage. Moreover, loci from rapidly evolving gene families were integrated into these networks with higher connectivity than throughout the rest of the P. pacificus transcriptome. In summary, we show that the modular regulatory outputs of a polyphenic response evolved through the integration of conserved plastic responses into networks with genes of high evolutionary turnover. 
  5. Abstract

    This study delves into the genomic features of 10 Vibrio strains collected from deep-sea hydrothermal vents in the Pacific Ocean, providing insights into their evolutionary history and ecological adaptations. Through sequencing and pan-genome analysis involving 141 Vibrio species, we found that deep-sea strains exhibit larger genomes with unique gene distributions, suggesting adaptation to the vent environment. The phylogenomic reconstruction of the investigated isolates revealed the presence of 2 main clades: The first is monophyletic, consisting exclusively of Vibrio alginolyticus, while the second forms a monophyletic clade comprising both Vibrio antiquarius and Vibrio diabolicus species, which were previously isolated from deep-sea vents. All strains carry virulence and antibiotic resistance genes related to those found in human pathogenic Vibrio species, which may play a wider ecological role other than host infection in these environments. In addition, functional genomic analysis identified genes potentially related to deep-sea survival and stress response, alongside candidate genes encoding novel antimicrobial agents. Ultimately, the pan-genome we generated represents a valuable resource for future studies investigating the taxonomy, evolution, and ecology of Vibrio species.

     