Abstract 13C‐Metabolic Flux Analysis (13C‐MFA) and Flux Balance Analysis (FBA) are widely used to investigate the operation of biochemical networks in both biological and biotechnological research. Both methods use metabolic reaction network models of metabolism operating at steady state so that reaction rates (fluxes) and the levels of metabolic intermediates are constrained to be invariant. They provide estimated (MFA) or predicted (FBA) values of the fluxes through the network in vivo, which cannot be measured directly. These fluxes can shed light on basic biology and have been successfully used to inform metabolic engineering strategies. Several approaches have been taken to test the reliability of estimates and predictions from constraint‐based methods and to compare alternative model architectures. Despite advances in other areas of the statistical evaluation of metabolic models, such as the quantification of flux estimate uncertainty, validation and model selection methods have been underappreciated and underexplored. We review the history and state‐of‐the‐art in constraint‐based metabolic model validation and model selection. Applications and limitations of the χ2‐test of goodness‐of‐fit, the most widely used quantitative validation and selection approach in 13C‐MFA, are discussed, and complementary and alternative forms of validation and selection are proposed. A combined model validation and selection framework for 13C‐MFA incorporating metabolite pool size information that leverages new developments in the field is presented and advocated for. Finally, we discuss how adopting robust validation and selection procedures can enhance confidence in constraint‐based modeling as a whole and ultimately facilitate more widespread use of FBA in biotechnology. 
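As a rough illustration of the χ2-test of goodness-of-fit mentioned in the abstract above, the sketch below computes a variance-weighted sum of squared residuals (SSR) between measured and simulated labeling values and checks it against a χ2 acceptance range. The function names, the example data, and the choice of a two-sided 95% range with 3 degrees of freedom are assumptions made for illustration only and are not taken from the reviewed work.

```python
def weighted_ssr(measured, simulated, sigma):
    """Variance-weighted sum of squared residuals between data and model fit."""
    if not (len(measured) == len(simulated) == len(sigma)):
        raise ValueError("all inputs must have the same length")
    return sum(((m - s) / sd) ** 2 for m, s, sd in zip(measured, simulated, sigma))


def fit_is_acceptable(ssr, lower, upper):
    """True if the SSR lies inside the chosen chi-square acceptance range."""
    return lower <= ssr <= upper


# Hypothetical data: 4 labeling measurements and 1 fitted free flux,
# giving 4 - 1 = 3 degrees of freedom.
measured = [0.50, 0.30, 0.15, 0.05]
simulated = [0.52, 0.29, 0.14, 0.05]
sigma = [0.01, 0.01, 0.01, 0.01]

# Tabulated chi-square quantiles for 3 degrees of freedom
# (2.5% and 97.5%), entered by hand to avoid extra dependencies.
CHI2_LOWER_3DOF = 0.216
CHI2_UPPER_3DOF = 9.348

ssr = weighted_ssr(measured, simulated, sigma)
accepted = fit_is_acceptable(ssr, CHI2_LOWER_3DOF, CHI2_UPPER_3DOF)
print(f"SSR = {ssr:.2f}, fit accepted: {accepted}")
```

In practice the acceptance range would be computed from the actual number of measurements and fitted parameters rather than hard-coded.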
                            Accurate flux predictions using tissue-specific gene expression in plant metabolic modeling
                        
                    
    
            Abstract Motivation The accurate prediction of complex phenotypes such as metabolic fluxes in living systems is a grand challenge for systems biology and central to efficiently identifying biotechnological interventions that can address pressing industrial needs. The application of gene expression data to improve the accuracy of metabolic flux predictions using mechanistic modeling methods such as flux balance analysis (FBA) has not been previously demonstrated in multi-tissue systems, despite their biotechnological importance. We hypothesized that a method for generating metabolic flux predictions informed by relative expression levels between tissues would improve prediction accuracy. Results Relative gene expression levels derived from multiple transcriptomic and proteomic datasets were integrated into FBA predictions of a multi-tissue, diel model of Arabidopsis thaliana’s central metabolism. This integration dramatically improved the agreement of flux predictions with experimentally based flux maps from 13C metabolic flux analysis (MFA) compared with a standard parsimonious FBA approach. Disagreement between FBA predictions and MFA flux maps was measured using weighted average percent error values. For parsimonious FBA, these were 169%–180% for high light conditions and 94%–103% for low light conditions, depending on the gene expression dataset used. They fell to 10%–13% and 9%–11%, respectively, upon incorporating expression data into the modeling process, which also substantially altered the predicted carbon and energy economy of the plant. Availability and implementation Code and data generated as part of this study are available from https://github.com/Gibberella/ArabidopsisGeneExpressionWeights. 
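The abstract above reports disagreement between FBA predictions and MFA flux maps as a weighted average percent error but does not give the formula. The sketch below shows one plausible form, assuming that each flux's percent error is weighted by the magnitude of its MFA reference value unless other weights are supplied. The function name, the default weighting, and the example fluxes are hypothetical and may differ from what the study's published code does.

```python
def weighted_average_percent_error(predicted, reference, weights=None):
    """Weighted mean of per-flux percent errors relative to reference fluxes.

    Assumption: if no weights are given, each flux is weighted by the
    absolute value of its reference (MFA) flux.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if weights is None:
        weights = [abs(r) for r in reference]
    if len(weights) != len(reference):
        raise ValueError("weights must match the number of fluxes")

    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")

    weighted_sum = 0.0
    for p, r, w in zip(predicted, reference, weights):
        if r == 0:
            raise ValueError("percent error is undefined for a zero reference flux")
        percent_error = abs(p - r) / abs(r) * 100.0
        weighted_sum += w * percent_error
    return weighted_sum / total_weight


# Hypothetical fluxes (arbitrary units), not data from the study.
mfa_fluxes = [10.0, 5.0, 1.0]
fba_fluxes = [12.0, 4.0, 2.0]

print(weighted_average_percent_error(fba_fluxes, mfa_fluxes))
```

Weighting by reference magnitude keeps small fluxes with large relative errors from dominating the score; passing equal weights recovers a plain mean percent error.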
- Award ID(s):
- 1828149
- PAR ID:
- 10433469
- Editor(s):
- Birol, Inanc
- Date Published:
- Journal Name:
- Bioinformatics
- Volume:
- 39
- Issue:
- 5
- ISSN:
- 1367-4811
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            Abstract The regulation of gene expression is central to many biological processes. Gene regulatory networks (GRNs) link transcription factors (TFs) to their target genes and represent maps of potential transcriptional regulation. Here, we analyzed a large number of publicly available maize (Zea mays) transcriptome data sets including >6000 RNA sequencing samples to generate 45 coexpression-based GRNs that represent potential regulatory relationships between TFs and other genes in different populations of samples (cross-tissue, cross-genotype, and tissue-and-genotype samples). While these networks are all enriched for biologically relevant interactions, different networks capture distinct TF-target associations and biological processes. By examining the power of our coexpression-based GRNs to accurately predict covarying TF-target relationships in natural variation data sets, we found that presence/absence changes rather than quantitative changes in TF gene expression are more likely associated with changes in target gene expression. Integrating information from our TF-target predictions and previous expression quantitative trait loci (eQTL) mapping results provided support for 68 TFs underlying 74 previously identified trans-eQTL hotspots spanning a variety of metabolic pathways. This study highlights the utility of developing multiple GRNs within a species to detect putative regulators of important plant pathways and provides potential targets for breeding or biotechnological applications.
- 
            Enzymatic pathways have evolved uniquely preferred protein expression stoichiometry in living cells, but our ability to predict the optimal abundances from basic properties remains underdeveloped. Here, we report a biophysical, first-principles model of growth optimization for core mRNA translation, a multi-enzyme system that involves proteins with a broadly conserved stoichiometry spanning two orders of magnitude. We show that predictions from maximization of ribosome usage in a parsimonious flux model constrained by proteome allocation agree with the conserved ratios of translation factors. The analytical solutions, without free parameters, provide an interpretable framework for the observed hierarchy of expression levels based on simple biophysical properties, such as diffusion constants and protein sizes. Our results provide an intuitive and quantitative understanding for the construction of a central process of life, as well as a path toward rational design of pathway-specific enzyme expression stoichiometry.
- 
            Summary Plant metabolites from diverse pathways are important for plant survival, human nutrition and medicine. The pathway memberships of most plant enzyme genes are unknown. While co‐expression is useful for assigning genes to pathways, expression correlation may exist only under specific spatiotemporal and conditional contexts. Utilising > 600 tomato (Solanum lycopersicum) expression data combinations, three strategies for predicting memberships in 85 pathways were explored. Optimal predictions for different pathways require distinct data combinations indicative of pathway functions. Naive prediction (i.e. identifying pathways with the most similarly expressed genes) is error prone. In 52 pathways, unsupervised learning performed better than supervised approaches, possibly due to limited training data availability. Using gene‐to‐pathway expression similarities led to prediction models that outperformed those based simply on expression levels. Using 36 experimentally validated genes, the pathway‐best model prediction accuracy is 58.3%, significantly better than that for predicting annotated genes without experimental evidence (37.0%) or random guessing (1.2%), demonstrating the importance of data quality. Our study highlights the need to extensively explore expression‐based features and prediction strategies to maximise the accuracy of metabolic pathway membership assignment. The prediction framework outlined here can be applied to other species and serves as a baseline model for future comparisons.
- 
            Abstract The modeling of rates of biochemical reactions (fluxes) in metabolic networks is widely used for both basic biological research and biotechnological applications. A number of different modeling methods have been developed to estimate and predict fluxes, including kinetic and constraint‐based (Metabolic Flux Analysis and flux balance analysis) approaches. Although different resources exist for teaching these methods individually, to date no resources have been developed to teach these approaches in an integrative way that equips learners with an understanding of each modeling paradigm, how they relate to one another, and the information that can be gleaned from each. We have developed a series of modeling simulations in Python to teach kinetic modeling, metabolic control analysis, 13C‐metabolic flux analysis, and flux balance analysis. These simulations are presented in a series of interactive notebooks with guided lesson plans and associated lecture notes. Learners assimilate key principles using models of simple metabolic networks by running simulations, generating and using data, and making and validating predictions about the effects of modifying model parameters. We used these simulations as the hands‐on computer laboratory component of a four‐day metabolic modeling workshop, and participant survey results showed improvements in learners' self‐assessed competence and confidence in understanding and applying metabolic modeling techniques after having attended the workshop. The resources provided can be incorporated in their entirety or individually into courses and workshops on bioengineering and metabolic modeling at the undergraduate, graduate, or postgraduate level.
 An official website of the United States government