Title: Genomic Prediction Informed by Biological Processes Expands Our Understanding of the Genetic Architecture Underlying Free Amino Acid Traits in Dry Arabidopsis Seeds
Plant growth, development, and nutritional quality depend upon amino acid homeostasis, especially in seeds. However, our understanding of the underlying genetics influencing amino acid content and composition remains limited, with only a few candidate genes and quantitative trait loci identified to date. Improved knowledge of the genetics and biological processes that determine amino acid levels will enable researchers to use this information for plant breeding and biological discovery. Toward this goal, we used genomic prediction to identify biological processes that are associated with, and therefore potentially influence, free amino acid (FAA) composition in seeds of the model plant Arabidopsis thaliana. Markers were split into categories based on metabolic pathway annotations and fit using a genomic partitioning model to evaluate the influence of each pathway on heritability explained, model fit, and predictive ability. Selected pathways included processes known to influence FAA composition, albeit to an unknown degree, and spanned four categories: amino acid, core, specialized, and protein metabolism. Using this approach, we identified associations for pathways containing known variants for FAA traits, in addition to finding new trait-pathway associations. Markers related to amino acid metabolism, which are directly involved in FAA regulation, improved predictive ability for branched chain amino acids and histidine. The use of genomic partitioning also revealed patterns across biochemical families, in which serine-derived FAAs were associated with protein related annotations and aromatic FAAs were associated with specialized metabolic pathways. Taken together, these findings provide evidence that genomic partitioning is a viable strategy to uncover the relative contributions of biological processes to FAA traits in seeds, offering a promising framework to guide hypothesis testing and narrow the search space for candidate genes.
Journal Name:
G3: Genes|Genomes|Genetics
Page Range or eLocation-ID:
4227 to 4239
Sponsoring Org:
National Science Foundation
More Like this
  1. Dunn, Anne K.; Ruby, Edward G. (Ed.)
    ABSTRACT Gluconeogenic carbon metabolism is not well understood, especially within the context of flux partitioning between energy generation and biomass production, despite the importance of gluconeogenic carbon substrates in natural and engineered carbon processing. Here, using multiple omics approaches, we elucidate the metabolic mechanisms that facilitate gluconeogenic fast-growth phenotypes in Pseudomonas putida and Comamonas testosteroni, two Proteobacteria species with distinct metabolic networks. In contrast to the genetic constraint of C. testosteroni, which lacks the enzymes required for both sugar uptake and a complete oxidative pentose phosphate (PP) pathway, sugar metabolism in P. putida is known to generate surplus NADPH by relying on the oxidative PP pathway within its characteristic cyclic connection between the Entner-Doudoroff (ED) and Embden-Meyerhof-Parnas (EMP) pathways. Remarkably, similar to the genome-based metabolic decoupling in C. testosteroni, our 13C-fluxomics reveals an inactive oxidative PP pathway and disconnected EMP and ED pathways in P. putida during gluconeogenic feeding, thus requiring transhydrogenase reactions to supply NADPH for anabolism in both species by leveraging the high tricarboxylic acid cycle flux during gluconeogenic growth. Furthermore, metabolomics and proteomics analyses of both species during gluconeogenic feeding, relative to glycolytic feeding, demonstrate a 5-fold depletion in phosphorylated metabolites and the absence of or up to a 17-fold decrease in proteins of the PP and ED pathways. Such metabolic remodeling, which is reportedly lacking in Escherichia coli exhibiting a gluconeogenic slow-growth phenotype, may serve to minimize futile carbon cycling while favoring the gluconeogenic metabolic regime in relevant proteobacterial species. 
IMPORTANCE Glycolytic metabolism of sugars is extensively studied in the Proteobacteria, but gluconeogenic carbon sources (e.g., organic acids, amino acids, aromatics) that feed into the tricarboxylic acid (TCA) cycle are widely reported to produce a fast-growth phenotype, particularly in species with biotechnological relevance. Much remains unknown about the importance of glycolysis-associated pathways in the metabolism of gluconeogenic carbon substrates. Here, we demonstrate that two distinct proteobacterial species, through genetic constraints or metabolic regulation at specific metabolic nodes, bypass the oxidative PP pathway during gluconeogenic growth and avoid unnecessary carbon fluxes by depleting protein investment into connected glycolysis pathways. Both species can leverage instead the high TCA cycle flux during gluconeogenic feeding to meet NADPH demand. Importantly, lack of a complete oxidative pentose phosphate pathway is a widespread metabolic trait in Proteobacteria with a gluconeogenic carbon preference, thus highlighting the important relevance of our findings toward elucidating the metabolic architecture in these bacteria.
  2. ABSTRACT Water bloom development due to eutrophication constitutes a case of niche specialization among planktonic cyanobacteria, but the genomic repertoire allowing bloom formation in only some species has not been fully characterized. We posited that the habitat relevance of a trait begets its underlying genomic complexity, so that traits within the repertoire would be differentially more complex in species successfully thriving in that habitat than in close species that cannot. To test this for the case of bloom-forming cyanobacteria, we curated 17 potentially relevant query metabolic pathways and five core pathways selected according to existing ecophysiological literature. The available 113 genomes were split into those of blooming (45) or nonblooming (68) strains, and an index of genomic complexity for each strain’s version of each pathway was derived. We show that strain versions of all query pathways were significantly more complex in bloomers, with complexity in fact correlating positively with strain blooming incidence in 14 of those pathways. Five core pathways, relevant everywhere, showed no differential complexity or correlations. Gas vesicle, toxin and fatty acid synthesis, amino acid uptake, and C, N, and S acquisition systems were most strikingly relevant in the blooming repertoire. Further, we validated our findings using metagenomic gene expression analyses of blooming and nonblooming cyanobacteria in natural settings, where pathways in the repertoire were differentially overexpressed according to their relative complexity in bloomers, but not in nonbloomers. We expect that this approach may find applications to other habitats and organismal groups. IMPORTANCE We pragmatically delineate the trait repertoire that enables organismal niche specialization. 
We based our approach on the tenet, derived from evolutionary and complex-system considerations, that genomic units that can significantly contribute to fitness in a certain habitat will be comparatively more complex in organisms specialized to that habitat than their genomic homologs found in organisms from other habitats. We tested this in cyanobacteria forming harmful water blooms, for which decades-long efforts in ecological physiology and genomics exist. Our results essentially confirm that genomics and ecology can be linked through comparative complexity analyses, providing a tool that should be of general applicability for any group of organisms and any habitat, and enabling the posing of grounded hypotheses regarding the ecogenomic basis for diversification.
  3. Claesen, Jan (Ed.)
    ABSTRACT Colorectal cancer (CRC) is the second leading cause of cancer mortality worldwide. The dysbiotic gut microbiota and its metabolite secretions play a significant role in CRC development and progression. In this study, we identified microbial and metabolic biomarkers applicable to CRC using a meta-analysis of metagenomic datasets from diverse geographical regions. We used LEfSe, random forest (RF), and co-occurrence network methods to identify microbial biomarkers. Geographic dataset-specific markers were identified and evaluated using area under the ROC curve (AUC) scores and random effect size. Co-occurrence networks analysis showed a reduction in the overall microbial associations and the presence of oral pathogenic microbial clusters in CRC networks. Analysis of predicted metabolites from CRC datasets showed the enrichment of amino acids, cadaverine, and creatine in CRC, which were positively correlated with CRC-associated microbes (Peptostreptococcus stomatis, Gemella morbillorum, Bacteroides fragilis, Parvimonas spp., Fusobacterium nucleatum, Solobacterium moorei, and Clostridium symbiosum), and negatively correlated with control-associated microbes. Conversely, butyrate, nicotinamide, choline, tryptophan, and 2-hydroxybutanoic acid showed positive correlations with control-associated microbes (P < 0.05). Overall, our study identified a set of global CRC biomarkers that are reproducible across geographic regions. We also reported significant differential metabolites and microbe-metabolite interactions associated with CRC. This study provided significant insights for further investigations leading to the development of noninvasive CRC diagnostic tools and therapeutic interventions. IMPORTANCE Several studies showed associations between gut dysbiosis and CRC. 
Yet, the results are not conclusive due to cohort-specific associations that are influenced by genomic, dietary, and environmental stimuli and associated reproducibility issues with various analysis approaches. Emerging evidence suggests the role of microbial metabolites in modulating host inflammation and DNA damage in CRC. However, the experimental validations have been hindered by cost, resources, and cumbersome technical expertise required for metabolomic investigations. In this study, we performed a meta-analysis of CRC microbiota data from diverse geographical regions using multiple methods to achieve reproducible results. We used a computational approach to predict the metabolomic profiles using existing CRC metagenomic datasets. We identified a reliable set of CRC-specific biomarkers from this analysis, including microbial and metabolite markers. In addition, we revealed significant microbe-metabolite associations through correlation analysis and microbial gene families associated with dysregulated metabolic pathways in CRC, which are essential in understanding the vastly sporadic nature of CRC development and progression.
  4. Synopsis The following article represents a mini-review of an intensive 10-year progression of genome-to-phenome (G2P) discovery guided by the adverse outcome pathway (AOP) concept. This example is presented as a means to stimulate crossover of this toxicological concept to enhance G2P discovery within the broader biological sciences community. The case study demonstrates the benefits of the AOP approach for establishing causal linkages across multiple levels of biological organization ultimately linking molecular initiation (often at the genomic scale) to organism-level phenotypes of interest. The case study summarizes a US military effort to identify the mechanism(s) underlying toxicological phenotypes of lethargy and weight loss in response to nitroaromatic munitions exposures, such as 2,4,6-trinitrotoluene. Initial key discoveries are described including the toxicogenomic results that nitrotoluene exposures inhibited expression within the peroxisome proliferator activated receptor α (PPARα) pathway. We channeled the AOP concept to test the hypothesis that inhibition of PPARα signaling in nitrotoluene exposures impacted lipid metabolic processes, thus affecting systemic energy budgets, ultimately resulting in body weight loss. Results from a series of transcriptomic, proteomic, lipidomic, in vitro PPARα nuclear signaling, and PPARα knock-out investigations ultimately supported various facets of this hypothesis. Given these results, we next proceeded to develop a formalized AOP description of PPARα antagonism leading to body weight loss. This AOP was refined through intensive literature review and polished through multiple rounds of peer-review leading to final international acceptance as an Organisation for Economic Cooperation and Development-approved AOP. 
Briefly, that AOP identifies PPARα antagonist binding as the molecular initiating event (MIE) leading to a series of key events including inhibition of nuclear transactivation for genes controlling lipid metabolism and ketogenesis, inhibition of fatty acid beta-oxidation and ketogenesis dynamics, negative energy budget, and ultimately the adverse outcome (AO) of body-weight loss. Given that the PPARα antagonism MIE represented a reliable indicator of AO progression within the pathway, a phylogenetic analysis was conducted which indicated that PPARα amino acid relatedness generally tracked species relatedness. Additionally, PPARα amino acid relatedness analysis using the Sequence Alignment to Predict Across Species Susceptibility predicted susceptibility to the MIE across vertebrates providing context for AOP extrapolation across species. Overall, we hope this illustrative example of how the AOP concept has benefited toxicology sows a seed within the broader biological sciences community to repurpose the concept to facilitate enhanced G2P discovery in biology.
  5. Wolbachia is a widespread endosymbiont of insects and filarial nematodes that profoundly influences host biology. Wolbachia has also been reported in rhizosphere hosts, where its diversity and function remain poorly characterized. The discovery that plant-parasitic nematodes (PPNs) host Wolbachia strains with unknown roles is of interest evolutionarily, ecologically, and for agriculture as a potential target for developing new biological controls. The goal of this study was to screen communities for PPN endosymbionts and analyze genes and genomic patterns that might indicate their role. Genome assemblies revealed 1 out of 16 sampled sites had nematode communities hosting a Wolbachia strain, designated wTex, that has highly diverged as one of the early supergroup L strains. Genome features, gene repertoires, and absence of known genes for cytoplasmic incompatibility, riboflavin, biotin, and other biosynthetic functions placed wTex between mutualist C + D strains and reproductive parasite A + B strains. Functional terms enriched in group L included protoporphyrinogen IX, thiamine, lysine, fatty acid, and cellular amino acid biosynthesis, while dN/dS analysis suggested the strongest purifying selection on arginine and lysine metabolism, and vitamin B6, heme, and zinc ion binding, suggesting these as candidate roles in PPN Wolbachia. Higher dN/dS pathways between group L, wPni from aphids, wFol from springtails, and wCfeT from cat fleas suggested distinct functional changes characterizing these early Wolbachia host transitions. PPN Wolbachia had several putative horizontally transferred genes, including a lysine biosynthesis operon like that of the mitochondrial symbiont Midichloria, a spirochete-like thiamine synthesis operon shared only with wCfeT, an ATP/ADP carrier important in Rickettsia, and a eukaryote-like gene that may mediate plant systemic acquired resistance through the lysine-to-pipecolic acid system. 
The discovery of group L-like variants from global rhizosphere databases suggests diverse PPN Wolbachia strains remain to be discovered. These findings support the hypothesis of plant-specialization as key to shaping early Wolbachia evolution and present new functional hypotheses, demonstrating promise for future genomics-based rhizosphere screens.