Title: Membrane Characterization with Model-Based Design of Experiments
Membrane characterization provides essential information for the scale-up, design, and optimization of new separation systems. We recently proposed the diafiltration apparatus for high-throughput analysis (DATA), which enables a five-fold reduction in the time, energy, and number of experiments necessary to characterize membrane transport properties. This paper applies formal model-based design of experiments (MBDoE) techniques to further analyse and optimize DATA. For example, the eigenvalues and eigenvectors of the Fisher Information Matrix (FIM) show that dynamic diafiltration experiments improve parameter identifiability by 3 orders of magnitude compared to traditional filtration experiments. Moreover, continuous retentate conductivity measurements in DATA improve A-, D-, E-, and ME-optimal MBDoE criteria by between 6 % and 32 %. Using these criteria, we identify pressure and initial concentration conditions that maximize parameter precision and remove correlations.
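As a rough illustration of the FIM-based quantities named in the abstract, the sketch below computes A-, D-, E-, and ME-type criteria from a small sensitivity matrix. It is not the authors' code: the function names, the sensitivity values, and the noise level are hypothetical. The criterion definitions follow one common convention (A = trace of the FIM, D = determinant, E = smallest eigenvalue, ME = condition number); other texts define A-optimality through the trace of the inverse FIM instead.

```python
import numpy as np


def fim_from_sensitivities(Q, sigma):
    """Build FIM = Q^T Q / sigma^2.

    Assumes independent measurement errors with a common standard
    deviation sigma. Q has one row per measurement and one column
    per model parameter.
    """
    Q = np.asarray(Q, dtype=float)
    return Q.T @ Q / sigma**2


def design_criteria(fim):
    """Return scalar MBDoE criteria under the convention stated above."""
    # eigh returns eigenvalues in ascending order for symmetric matrices
    eigvals, eigvecs = np.linalg.eigh(fim)
    return {
        "A": float(np.trace(fim)),              # trace of the FIM
        "D": float(np.linalg.det(fim)),         # determinant of the FIM
        "E": float(eigvals[0]),                 # smallest eigenvalue
        "ME": float(eigvals[-1] / eigvals[0]),  # condition number
        # eigenvector of the smallest eigenvalue: the parameter
        # combination the data constrain least
        "weakest_direction": eigvecs[:, 0],
    }


# Hypothetical sensitivities: 3 measurements, 2 parameters
Q = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 2.0]])
fim = fim_from_sensitivities(Q, sigma=1.0)
crit = design_criteria(fim)
```

Comparing `crit` across candidate experiments (for example, different pressures or initial concentrations) is one way to rank designs. A small `E` or a large `ME` suggests that some parameter combination is poorly identifiable.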
Award ID(s):
1941596
NSF-PAR ID:
10403695
Author(s) / Creator(s):
; ; ; ;
Editor(s):
Yamashita, Y.; Kano, M.
Date Published:
Journal Name:
Computer aided chemical engineering
Volume:
49
ISSN:
2543-1331
Page Range / eLocation ID:
859-864
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Gorodkin, Jan (Ed.)
    Motivation: When learning to subtype complex disease based on next-generation sequencing data, the amount of available data is often limited. Recent works have tried to leverage data from other domains to design better predictors in the target domain of interest, with varying degrees of success. But they are either limited to cases requiring outcome label correspondence across domains or cannot leverage the label information at all. Moreover, the existing methods usually cannot benefit from other information available a priori, such as gene interaction networks. Results: In this article, we develop a generative optimal Bayesian supervised domain adaptation (OBSDA) model that can integrate RNA sequencing (RNA-Seq) data from different domains along with their labels to improve prediction accuracy in the target domain. Our model can be applied in cases where different domains share the same labels or have different ones. OBSDA is based on a hierarchical Bayesian negative binomial model with parameter factorization, for which the optimal predictor can be derived by marginalization of the likelihood over the posterior of the parameters. We first provide an efficient Gibbs sampler for parameter inference in OBSDA. Then, we leverage the gene-gene network prior information and construct an informed and flexible variational family to infer the posterior distributions of model parameters. Comprehensive experiments on real-world RNA-Seq data demonstrate the superior performance of OBSDA, in terms of accuracy in identifying cancer subtypes, by utilizing data from different domains. Moreover, we show that by taking advantage of the prior network information we can further improve the performance. Availability and implementation: The source code for implementations of OBSDA and SI-OBSDA is available at https://github.com/SHBLK/BSDA. Supplementary information: Supplementary data are available at Bioinformatics online.
  2. Emerging lithium-ion battery systems require high-fidelity electrochemical models for advanced control, diagnostics, and design. Accordingly, battery parameter estimation is an active research domain where novel algorithms are being developed to calibrate complex models from input-output data. Amidst these efforts, little focus has been placed on the fundamental mechanisms governing estimation accuracy, spurring the question, why is an estimate accurate or inaccurate? In response, we derive a generalized estimation error equation under the commonly adopted least-squares objective function, which reveals that the error can be represented as a combination of system uncertainties (i.e., in model, measurement, and parameter) and uncertainty-propagating sensitivity structures in the data. We then relate the error equation to conventional error analysis criteria, such as the Fisher information matrix, Cramér-Rao bound, and parameter sensitivity, to assess the benefits and limitations of each. The error equation is validated through several uni- and bivariate estimations of lithium-ion battery electrochemical parameters using experimental data. These results are also analyzed with the error equation to study the error compositions and parameter identifiability under different data. Finally, we show that adding target parameters to the estimation without increasing the amount of data intrinsically reduces the robustness of the results to system uncertainties.

     
  3. Despite continued technological improvements, measurement errors always reduce or distort the information that any real experiment can provide to quantify cellular dynamics. This problem is particularly serious for cell signaling studies to quantify heterogeneity in single-cell gene regulation, where important RNA and protein copy numbers are themselves subject to the inherently random fluctuations of biochemical reactions. Until now, it has not been clear how measurement noise should be managed in addition to other experiment design variables (e.g., sampling size, measurement times, or perturbation levels) to ensure that collected data will provide useful insights on signaling or gene expression mechanisms of interest. We propose a computational framework that takes explicit consideration of measurement errors to analyze single-cell observations, and we derive Fisher Information Matrix (FIM)-based criteria to quantify the information value of distorted experiments. We apply this framework to analyze multiple models in the context of simulated and experimental single-cell data for a reporter gene controlled by an HIV promoter. We show that the proposed approach quantitatively predicts how different types of measurement distortions affect the accuracy and precision of model identification, and we demonstrate that the effects of these distortions can be mitigated through explicit consideration during model inference. We conclude that this reformulation of the FIM could be used effectively to design single-cell experiments to optimally harvest fluctuation information while mitigating the effects of image distortion. 
  4. Yamashita, Y. ; Kano, M. (Ed.)
    Patterned charged membranes with engendered useful characteristics can offer selective transport of electrolytes. Chemical patterning across the membrane surface via a physical inkjet deposition process requires precise control of the reactive-ink formulation, which enables the introduction of charged functionality to the membrane. This study develops a new dynamic mathematical model for the primary step of the batch reactive-ink formulation considering an ink mixture of copper sulphate and ascorbic acid. Nonlinear least squares parameter estimation is performed to infer three kinetic model parameters by analysing data from nine dynamic experiments simultaneously. Global sensitivity and Fisher information matrix (FIM) analyses reveal only one kinetic parameter is identifiable from time-series pH measurements. The fitted model can capture the overall nonlinear dynamics of the batch reaction and works best for initial Cu2+ concentrations between 30 and 50 mM. Time-series Cu2+ or Cu+ concentration measurements are recommended in future experiments to elucidate the kinetics of reactive-ink formulation.
  5. Kolodny, Rachel (Ed.)
    Crystallography and NMR system (CNS) is currently a widely used method for fragment-free ab initio protein folding from inter-residue distance or contact maps. Despite its widespread use in protein structure prediction, CNS is a decade-old macromolecular structure determination system that was originally developed for solving macromolecular geometry from experimental restraints as opposed to predictive modeling driven by interaction map data. As such, the adaptation of the CNS experimental structure determination protocol for ab initio protein folding is intrinsically anomalous, which may undermine the folding accuracy of computational protein structure prediction. In this paper, we propose a new CNS-free hierarchical structure modeling method called DConStruct for folding both soluble and membrane proteins driven by distance and contact information. Rigorous experimental validation shows that DConStruct attains much better reconstruction accuracy than CNS when tested with the same input contact map at varying contact thresholds. The hierarchical modeling with iterative self-correction employed in DConStruct scales at a much higher degree of folding accuracy than CNS with the increase in contact thresholds, ultimately approaching near-optimal reconstruction accuracy at higher-thresholded contact maps. The folding accuracy of DConStruct can be further improved by exploiting distance-based hybrid interaction maps at tri-level thresholding, as demonstrated by the better performance of our method in folding free modeling targets from the 12th and 13th rounds of the Critical Assessment of techniques for protein Structure Prediction (CASP) experiments compared to popular CNS- and fragment-based approaches and energy-minimization protocols, some of which even use much finer-grained distance maps than ours. Additional large-scale benchmarking shows that DConStruct can significantly improve the folding accuracy of membrane proteins compared to a CNS-based approach. These results collectively demonstrate the feasibility of greatly improving the accuracy of ab initio protein folding by optimally exploiting the information encoded in inter-residue interaction maps beyond what is possible by CNS.