skip to main content

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 11:00 PM ET on Thursday, February 13 until 2:00 AM ET on Friday, February 14 due to maintenance. We apologize for the inconvenience.


Search for: All records

Award ID contains: 1718924

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. null (Ed.)
    We present PALLAS, a practical method for gene regulatory network (GRN) inference from time series data, which employs penalized maximum likelihood and particle swarms for optimization. PALLAS is based on the Partially-Observable Boolean Dynamical System (POBDS) model and thus does not require ad-hoc binarization of the data. The penalty in the likelihood is a LASSO regularization term, which encourages the resulting network to be sparse. PALLAS is able to scale to networks of realistic size under no prior knowledge, by virtue of a novel continuous-discrete Fish School Search particle swarm algorithm for efficient simultaneous maximization of the penalized likelihood over the discrete space of networks and the continuous space of observational parameters. The performance of PALLAS is demonstrated by a comprehensive set of experiments using synthetic data generated from real and artificial networks, as well as real time series microarray and RNA-seq data, where it is compared to several other well-known methods for gene regulatory network inference. The results show that PALLAS can infer GRNs more accurately than other methods, while being capable of working directly on gene expression data, without need of ad-hoc binarization. PALLAS is a fully-fledged program, written in python, and available on GitHub (https://github.com/yukuntan92/PALLAS). 
    more » « less
  2. null (Ed.)
    We propose a new algorithm for inference of protein-protein interaction (PPI) networks from noisy time series of Liquid- Chromatography Mass-Spectrometry (LC-MS) proteomic expression data based on Approximate Bayesian Computation - Sequential Monte Carlo sampling (ABC-SMC). The algorithm is an extension of our previous framework PALLAS. The proposed algorithm can be easily modified to handle other complex models of expression data, such as LC-MS data, for which the likelihood function is intractable. Results based on synthetic time series of cytokine LC-MS measurements cor- responding to a prototype immunomic network demonstrate that our algorithm is capable of inferring the network topology accurately. 
    more » « less
  3. null (Ed.)
  4. Nonlinear state-space models are ubiquitous in modeling real-world dynamical systems. Sequential Monte Carlo (SMC) techniques, also known as particle methods, are a well-known class of parameter estimation methods for this general class of state-space models. Existing SMC-based techniques rely on excessive sampling of the parameter space, which makes their computation intractable for large systems or tall data sets. Bayesian optimization techniques have been used for fast inference in state-space models with intractable likelihoods. These techniques aim to find the maximum of the likelihood function by sequential sampling of the parameter space through a single SMC approximator. Various SMC approximators with different fidelities and computational costs are often available for sample- based likelihood approximation. In this paper, we propose a multi-fidelity Bayesian optimization algorithm for the inference of general nonlinear state-space models (MFBO-SSM), which enables simultaneous sequential selection of parameters and approximators. The accuracy and speed of the algorithm are demonstrated by numerical experiments using synthetic gene expression data from a gene regulatory network model and real data from the VIX stock price index. 
    more » « less
  5. Background: Single-cell gene expression measurements offer opportunities in deriving mechanistic understanding of complex diseases, including cancer. However, due to the complex regulatory machinery of the cell, gene regulatory network (GRN) model inference based on such data still manifests significant uncertainty. Results:The goal of this paper is to develop optimal classification of single-cell trajectories accounting for potential model uncertainty. Partially-observed Boolean dynamical systems (POBDS) are used for modeling gene regulatory networks observed through noisy gene-expression data. We derive the exact optimal Bayesian classifier (OBC) for binary classification of single-cell trajectories. The application of the OBC becomes impractical for large GRNs, due to computational and memory requirements. To address this, we introduce a particle-based single-cell classification method that is highly scalable for large GRNs with much lower complexity than the optimal solution. Conclusion:The performance of the proposed particle-based method is demonstrated through numerical experiments using a POBDS model of the well-known T-cell large granular lymphocyte (T-LGL) leukemia network with noisy time-series gene-expression 
    more » « less
  6. We propose a Bayesian decision making framework for control of Markov Decision Processes (MDPs) with unknown dynamics and large, possibly continuous, state, action, and parameter spaces in data-poor environments. Most of the existing adaptive controllers for MDPs with unknown dynamics are based on the reinforcement learning framework and rely on large data sets acquired by sustained direct interaction with the system or via a simulator. This is not feasible in many applications, due to ethical, economic, and physical constraints. The proposed framework addresses the data poverty issue by decomposing the problem into an offline planning stage that does not rely on sustained direct interaction with the system or simulator and an online execution stage. In the offline process, parallel Gaussian process temporal difference (GPTD) learning techniques are employed for near-optimal Bayesian approximation of the expected discounted reward over a sample drawn from the prior distribution of unknown parameters. In the online stage, the action with the maximum expected return with respect to the posterior distribution of the parameters is selected. This is achieved by an approximation of the posterior distribution using a Markov Chain Monte Carlo (MCMC) algorithm, followed by constructing multiple Gaussian processes over the parameter space for efficient prediction of the means of the expected return at the MCMC sample. The effectiveness of the proposed framework is demonstrated using a simple dynamical system model with continuous state and action spaces, as well as a more complex model for a metastatic melanoma gene regulatory network observed through noisy synthetic gene expression data. 
    more » « less
  7. We propose a novel methodology for fault detection and diagnosis in partially-observed Boolean dynamical systems (POBDS). These are stochastic, highly nonlinear, and derivative- less systems, rendering difficult the application of classical fault detection and diagnosis methods. The methodology comprises two main approaches. The first addresses the case when the normal mode of operation is known but not the fault modes. It applies an innovations filter (IF) to detect deviations from the nominal normal mode of operation. The second approach is applicable when the set of possible fault models is finite and known, in which case we employ a multiple model adaptive estimation (MMAE) approach based on a likelihood-ratio (LR) statistic. Unknown system parameters are estimated by an adaptive expectation- maximization (EM) algorithm. Particle filtering techniques are used to reduce the computational complexity in the case of systems with large state-spaces. The efficacy of the proposed methodology is demonstrated by numerical experiments with a large gene regulatory network (GRN) with stuck-at faults observed through a single noisy time series of RNA-seq gene expression measurements. 
    more » « less
  8. We propose a new algorithm for inference of gene regulatory networks (GRN) from noisy gene expression data based on maximum-likelihood (ML) adaptive filtering and the discrete fish school search algorithm (DFSS). The approach is based on the general partially-observed Boolean dynamical system (POBDS) model, and as such can be used for simultaneous state and parameter estimation for any Boolean dynamical system observed in noise. The proposed DFSS-ML-BKF algorithm combines the ML adaptive Boolean Kalman Filter (ML-BKF) with DFSS, a version of the Fish School Search algorithm tailored for discrete parameter spaces. Results based on synthetic gene expression time-series data using the well-known p53-MDM2 negative-feedback loop GRN demonstrate that DFSS-ML-BKF can infer the network topology accurately and efficiently. 
    more » « less