Title: The ModelSEED Biochemistry Database for the integration of metabolic annotations and the reconstruction, comparison and analysis of metabolic models for plants, fungi and microbes
Abstract For over 10 years, ModelSEED has been a primary resource for the construction of draft genome-scale metabolic models based on annotated microbial or plant genomes. Now being released, the biochemistry database serves as the foundation of biochemical data underlying ModelSEED and KBase. The biochemistry database embodies several properties that, taken together, distinguish it from other published biochemistry resources by: (i) including compartmentalization, transport reactions, charged molecules and proton balancing on reactions; (ii) being extensible by the user community, with all data stored in GitHub; and (iii) being designed as a biochemical ‘Rosetta Stone’ to facilitate comparison and integration of annotations from many different tools and databases. The database was constructed by combining chemical data from many resources, applying standard transformations, identifying redundancies and computing thermodynamic properties. The ModelSEED biochemistry is continually tested using flux balance analysis to ensure the biochemical network is modeling-ready and capable of simulating diverse phenotypes. Ontologies can be designed to aid in comparing and reconciling metabolic reconstructions that differ in how they represent various metabolic pathways. ModelSEED now includes 33,978 compounds and 36,645 reactions, available as a set of extensible files on GitHub, and available to search at https://modelseed.org/biochem and KBase.
Award ID(s):
1716285
PAR ID:
10319107
Date Published:
Journal Name:
Nucleic Acids Research
Volume:
49
Issue:
D1
ISSN:
0305-1048
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A major challenge to integrating public metabolic resources is the use of different nomenclatures by individual databases. This paper presents md_harmonize, an open-source Python package for harmonizing compounds and metabolic reactions across various metabolic databases. The md_harmonize package utilizes a neighborhood-specific graph coloring method for generating a unique identifier for each compound via atom identifiers based on a compound’s chemical structure. The resulting harmonized compounds and reactions can be used for various downstream analyses, including the construction of atom-resolved metabolic networks and models for metabolic flux analysis. Parts of the md_harmonize package have been optimized using a variety of computational techniques to allow certain NP-complete problems handled by the software to be tractable for these specific use-cases. The software is available on GitHub and through the Python Package Index, with end-user documentation hosted on GitHub Pages. 
  2. Abstract The modeling of rates of biochemical reactions—fluxes—in metabolic networks is widely used for both basic biological research and biotechnological applications. A number of different modeling methods have been developed to estimate and predict fluxes, including kinetic and constraint‐based (Metabolic Flux Analysis and flux balance analysis) approaches. Although different resources exist for teaching these methods individually, to date no resources have been developed to teach these approaches in an integrative way that equips learners with an understanding of each modeling paradigm, how they relate to one another, and the information that can be gleaned from each. We have developed a series of modeling simulations in Python to teach kinetic modeling, metabolic control analysis, 13C‐metabolic flux analysis, and flux balance analysis. These simulations are presented in a series of interactive notebooks with guided lesson plans and associated lecture notes. Learners assimilate key principles using models of simple metabolic networks by running simulations, generating and using data, and making and validating predictions about the effects of modifying model parameters. We used these simulations as the hands‐on computer laboratory component of a four‐day metabolic modeling workshop and participant survey results showed improvements in learners' self‐assessed competence and confidence in understanding and applying metabolic modeling techniques after having attended the workshop. The resources provided can be incorporated in their entirety or individually into courses and workshops on bioengineering and metabolic modeling at the undergraduate, graduate, or postgraduate level.
  3. Abstract The Plant Metabolic Network (PMN) is a free online database of plant metabolism available at https://plantcyc.org. The latest release, PMN 16, provides metabolic databases representing >1200 metabolic pathways, 1.3 million enzymes, >8000 metabolites, >10 000 reactions and >15 000 citations for 155 plant and green algal genomes, as well as a pan-plant reference database called PlantCyc. This release contains 29 additional genomes compared with PMN 15, including species listed by the African Orphan Crop Consortium and nonflowering plant species. Furthermore, 52 new enzymes with experimentally supported function information have been included in this release. The single-species databases contain a combination of experimental information from the literature and computationally predicted information obtained through PMN’s database generation pipeline for a single species, while PlantCyc contains only experimental information but for any species within Viridiplantae. PMN is a comprehensive resource for querying, visualizing, analyzing and interpreting omics data with metabolic knowledge. It also serves as a useful and interactive tool for teaching plant metabolism. 
  4. Summary: Polyphenols are diverse and abundant carbon sources across ecosystems, having important roles in host-associated and terrestrial systems alike. However, the microbial genes encoding polyphenol metabolic enzymes are poorly represented in commonly used annotation databases, limiting widespread surveying of this metabolism. Here we present CAMPER, a tool that combines custom annotation searches with database-derived searches to both annotate and summarize polyphenol metabolism genes for a wide audience. With CAMPER, users will identify potential polyphenol-active genes and genomes to more broadly understand microbial carbon cycling in their datasets. Availability and Implementation: CAMPER is implemented in Python and is published under the GNU General Public License Version 3. It is available as both a standalone tool and as a database in DRAM v.1.5+. The source code and full documentation are available on GitHub at https://github.com/WrightonLabCSU/CAMPER.
  5. Abstract FatPlants, an open-access, web-based database, consolidates data, annotations, analysis results, and visualizations of lipid-related genes, proteins, and metabolic pathways in plants. Serving as a minable resource, FatPlants offers a user-friendly interface for facilitating studies into the regulation of plant lipid metabolism and supporting breeding efforts aimed at increasing crop oil content. This web resource, developed using data derived from our own research, curated from public resources, and gleaned from academic literature, comprises information on known fatty-acid-related proteins, genes, and pathways in multiple plants, with an emphasis on Glycine max, Arabidopsis thaliana, and Camelina sativa. Furthermore, the platform includes machine-learning-based methods and navigation tools designed to aid in characterizing metabolic pathways and protein interactions. Comprehensive gene and protein information cards, a Basic Local Alignment Search Tool search function, similar structure search capacities from AlphaFold, and ChatGPT-based query for protein information are additional features. Database URL: https://www.fatplants.net/