We introduce VISTA, a clustering approach for multivariate and irregularly sampled time series based on a parametric state space mixture model. VISTA is specifically designed for the unsupervised identification of groups in datasets originating from healthcare and psychology where such sampling issues are commonplace. Our approach adapts linear Gaussian state space models (LGSSMs) to provide a flexible parametric framework for fitting a wide range of time series dynamics. The clustering approach itself is based on the assumption that the population can be represented as a mixture of a fixed number of LGSSMs. VISTA’s model formulation allows for an explicit derivation of the log-likelihood function, from which we develop an expectation-maximization scheme for fitting model parameters to the observed data samples. Our algorithmic implementation is designed to handle populations of multivariate time series that can exhibit large changes in sampling rate as well as irregular sampling. We evaluate the versatility and accuracy of our approach on simulated and real-world datasets, including demographic trends, wearable sensor data, epidemiological time series, and ecological momentary assessments. Our results indicate that VISTA outperforms most comparable standard time series clustering methods. We provide an open-source implementation of VISTA in Python.
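To make the mixture-of-LGSSMs idea concrete, the sketch below computes E-step responsibilities from Kalman-filter log-likelihoods for a scalar model. It is a minimal illustration under simplifying assumptions, not VISTA's implementation; the function names (kalman_loglik, responsibilities) and the scalar parameterization are chosen here for brevity.

```python
# Minimal sketch (not VISTA's code): E-step responsibilities for a mixture of
# scalar linear Gaussian state space models (LGSSMs).
import numpy as np

def kalman_loglik(y, a, q, r, m0=0.0, p0=1.0):
    """Log-likelihood of y under x_t = a*x_{t-1} + N(0, q), y_t = x_t + N(0, r)."""
    m, p, ll = m0, p0, 0.0
    for yt in y:
        # Predict
        m, p = a * m, a * a * p + q
        # Update with scalar Kalman gain
        s = p + r                          # innovation variance
        ll += -0.5 * (np.log(2 * np.pi * s) + (yt - m) ** 2 / s)
        k = p / s
        m, p = m + k * (yt - m), (1 - k) * p
    return ll

def responsibilities(series_list, params, weights):
    """Soft cluster assignments p(cluster k | series i) for the E-step."""
    R = np.zeros((len(series_list), len(params)))
    for i, y in enumerate(series_list):
        logp = np.array([np.log(w) + kalman_loglik(y, *th)
                         for w, th in zip(weights, params)])
        logp -= logp.max()                 # stabilize before exponentiating
        R[i] = np.exp(logp) / np.exp(logp).sum()
    return R

# Example: two clusters with different transition coefficients (a, q, r)
rng = np.random.default_rng(0)
series = [np.cumsum(rng.normal(size=50)) * 0.1, rng.normal(size=50)]
params = [(0.95, 0.1, 0.1), (0.1, 1.0, 0.1)]
print(responsibilities(series, params, weights=[0.5, 0.5]))
```

A full M-step would then re-estimate each cluster's transition, noise, and observation parameters from these soft assignments (for example via Kalman smoothing) before repeating.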
Clustering time series with nonlinear dynamics: A Bayesian non-parametric and particle-based approach
We propose a statistical framework for clustering multiple time series that exhibit nonlinear dynamics into an a-priori-unknown number of sub-groups that each comprise time series with similar dynamics. Our motivation comes from neuroscience where an important problem is to identify, within a large assembly of neurons, sub-groups that respond similarly to a stimulus or contingency. In the neural setting, conditioned on cluster membership and the parameters governing the dynamics, time series within a cluster are assumed independent and generated according to a nonlinear binomial state-space model. We derive a Metropolis-within-Gibbs algorithm for full Bayesian inference that alternates between sampling of cluster membership and sampling of parameters of interest. The Metropolis step is a PMMH iteration that requires an unbiased, low-variance estimate of the likelihood function of a nonlinear state-space model. We leverage recent results on controlled sequential Monte Carlo to estimate likelihood functions more efficiently compared to the bootstrap particle filter. We apply the framework to time series acquired from the prefrontal cortex of mice in an experiment designed to characterize the neural underpinnings of fear.
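The cluster-membership step described above can be sketched as follows, with a finite-Dirichlet approximation standing in for the nonparametric prior and a placeholder array of particle-filter log-likelihood estimates; names such as gibbs_sweep_memberships and loglik_hat are illustrative, not from the paper.

```python
# Sketch of a Gibbs sweep over cluster memberships: each series' assignment is
# resampled from p(z_i = k | y_i) proportional to (cluster size prior) x (estimated
# state-space likelihood of series i under cluster k's parameters).
import numpy as np

def gibbs_sweep_memberships(loglik_hat, z, alpha=1.0, rng=None):
    """loglik_hat[i, k]: particle-filter log-likelihood estimate for series i, cluster k."""
    rng = rng or np.random.default_rng()
    n_series, n_clusters = loglik_hat.shape
    for i in range(n_series):
        counts = np.bincount(np.delete(z, i), minlength=n_clusters)  # sizes without series i
        logp = np.log(counts + alpha / n_clusters) + loglik_hat[i]   # prior x estimated likelihood
        logp -= logp.max()
        p = np.exp(logp) / np.exp(logp).sum()
        z[i] = rng.choice(n_clusters, p=p)
    return z

# Toy usage with stand-in likelihood estimates
rng = np.random.default_rng(1)
loglik_hat = rng.normal(size=(6, 3))
z = rng.integers(0, 3, size=6)
print(gibbs_sweep_memberships(loglik_hat, z, rng=rng))
```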
- Award ID(s): 1712872
- PAR ID: 10089652
- Journal Name: AISTATS
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
To understand the complex nonlinear dynamics of neural circuits, we fit a structured state-space model called tree-structured recurrent switching linear dynamical system (TrSLDS) to noisy high-dimensional neural time series. TrSLDS is a multi-scale hierarchical generative model for the state-space dynamics where each node of the latent tree captures locally linear dynamics. TrSLDS can be learned efficiently and in a fully Bayesian manner using Gibbs sampling. We showcase TrSLDS' potential for inferring low-dimensional interpretable dynamical systems on a variety of examples.
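As a rough illustration of the locally linear building block, the sketch below samples from a plain two-regime switching linear dynamical system; TrSLDS additionally organizes such regimes on a latent tree, which this toy code does not attempt to model.

```python
# Minimal switching linear dynamical system sketch (not TrSLDS itself): a discrete
# regime z_t selects which locally linear dynamics drive the latent state x_t.
import numpy as np

def sample_slds(T, A_list, Q, P_switch, C, R, rng=None):
    rng = rng or np.random.default_rng()
    d = A_list[0].shape[0]
    x, z = np.zeros(d), 0
    xs, ys, zs = [], [], []
    for _ in range(T):
        z = rng.choice(len(A_list), p=P_switch[z])                      # regime transition
        x = A_list[z] @ x + rng.multivariate_normal(np.zeros(d), Q)     # locally linear dynamics
        y = C @ x + rng.multivariate_normal(np.zeros(C.shape[0]), R)    # noisy observation
        xs.append(x); ys.append(y); zs.append(z)
    return np.array(xs), np.array(ys), np.array(zs)

# Two 2-D rotations with different speeds as the regimes
rot = lambda th: np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
A_list = [0.99 * rot(0.05), 0.99 * rot(0.3)]
P_switch = np.array([[0.95, 0.05], [0.05, 0.95]])
xs, ys, zs = sample_slds(200, A_list, np.eye(2) * 0.01, P_switch, np.eye(2), np.eye(2) * 0.1)
print(zs[:20])
```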
Estimating parameters and their credible intervals for complex system dynamics models is challenging but critical to continuous model improvement and reliable communication with an increasing fraction of audiences. The purpose of this study is to integrate Amortized Bayesian Inference (ABI) methods with system dynamics. Utilizing Neural Posterior Estimation (NPE), we train neural networks using synthetic data (pairs of ground truth parameters and outcome time series) to estimate parameters of system dynamics models. We apply this method to two example models: a simple Random Walk model and a moderately complex SEIRb model. We show that the trained neural networks can output the posterior for parameters instantly given new unseen time series data. Our analysis highlights the potential of ABI to facilitate a principled, scalable, and likelihood-free inference workflow that enhances the integration of models of complex systems with data. Accompanying code streamlines application to diverse system dynamics models.
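A minimal, hedged sketch of the amortized idea for a random-walk example: a small network is trained on simulated (parameter, series) pairs and then returns an approximate posterior for new data in one forward pass. The Gaussian posterior head is a simplification of the conditional density estimators typically used by NPE, and the network size and training settings below are arbitrary choices, not the paper's.

```python
# Sketch of amortized posterior estimation for a random-walk drift parameter:
# a network maps a simulated series to the mean and log-std of an approximate
# Gaussian posterior p(theta | series).
import torch
import torch.nn as nn

def simulate(theta, T=50):
    """Random walk with drift theta and unit-variance steps."""
    return torch.cumsum(theta[:, None] + torch.randn(theta.shape[0], T), dim=1)

net = nn.Sequential(nn.Linear(50, 64), nn.ReLU(), nn.Linear(64, 2))  # -> (mu, log_sigma)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    theta = torch.randn(128) * 0.5        # draw parameters from the prior
    x = simulate(theta)                   # simulate matching time series
    mu, log_sigma = net(x).unbind(dim=1)
    # Negative log-likelihood of the true theta under the predicted Gaussian posterior
    loss = (log_sigma + 0.5 * ((theta - mu) / log_sigma.exp()) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Amortized inference: the posterior for new, unseen data is one forward pass
x_new = simulate(torch.tensor([0.3]))
mu, log_sigma = net(x_new).unbind(dim=1)
print(float(mu), float(log_sigma.exp()))
```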
Nonlinear state-space models are ubiquitous in modeling real-world dynamical systems. Sequential Monte Carlo (SMC) techniques, also known as particle methods, are a well-known class of parameter estimation methods for this general class of state-space models. Existing SMC-based techniques rely on excessive sampling of the parameter space, which makes their computation intractable for large systems or tall data sets. Bayesian optimization techniques have been used for fast inference in state-space models with intractable likelihoods. These techniques aim to find the maximum of the likelihood function by sequential sampling of the parameter space through a single SMC approximator. Various SMC approximators with different fidelities and computational costs are often available for sample-based likelihood approximation. In this paper, we propose a multi-fidelity Bayesian optimization algorithm for the inference of general nonlinear state-space models (MFBO-SSM), which enables simultaneous sequential selection of parameters and approximators. The accuracy and speed of the algorithm are demonstrated by numerical experiments using synthetic gene expression data from a gene regulatory network model and real data from the VIX stock price index.
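The notion of approximators with different fidelities can be illustrated with a bootstrap particle filter whose particle count trades computational cost against the variance of the likelihood estimate; this is a generic sketch, not the paper's approximators or models.

```python
# Bootstrap particle filter log-likelihood estimator; the particle count acts as a
# fidelity/cost knob for sample-based likelihood approximation.
import numpy as np

def bootstrap_pf_loglik(y, a, q, r, n_particles, rng=None):
    """Estimate log p(y | a, q, r) for x_t = a*x_{t-1} + N(0, q), y_t = x_t + N(0, r)."""
    rng = rng or np.random.default_rng()
    x = rng.normal(0.0, 1.0, size=n_particles)
    ll = 0.0
    for yt in y:
        x = a * x + rng.normal(0.0, np.sqrt(q), size=n_particles)   # propagate particles
        logw = -0.5 * (np.log(2 * np.pi * r) + (yt - x) ** 2 / r)   # observation weights
        m = logw.max()
        w = np.exp(logw - m)
        ll += m + np.log(w.mean())                                  # log of the average weight
        x = rng.choice(x, size=n_particles, p=w / w.sum())          # multinomial resampling
    return ll

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=100)) + rng.normal(scale=0.3, size=100)
for n in (50, 500):   # low- vs high-fidelity approximators
    print(n, bootstrap_pf_loglik(y, a=1.0, q=1.0, r=0.09, n_particles=n, rng=rng))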
We investigate approximate Bayesian inference techniques for nonlinear systems described by ordinary differential equation (ODE) models. In particular, the approximations will be based on set-valued reachability analysis approaches, yielding approximate models for the posterior distribution. Nonlinear ODEs are widely used to mathematically describe physical and biological models. However, these models are often described by parameters that are not directly measurable and have an impact on the system behaviors. Often, noisy measurement data combined with physical/biological intuition serve as the means for finding appropriate values of these parameters. Our approach operates under a Bayesian framework, given a prior distribution over the parameter space and noisy observations under a known sampling distribution. We explore subsets of the space of model parameters, computing bounds on the likelihood for each subset. This is performed using nonlinear set-valued reachability analysis that is made faster by means of linearization around a reference trajectory. The tiling of the parameter space can be adaptively refined to make bounds on the likelihood tighter. We evaluate our approach on a variety of nonlinear benchmarks and compare our results with Markov Chain Monte Carlo and Sequential Monte Carlo approaches.
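The adaptive tiling can be caricatured as follows: partition the parameter space into boxes, attach likelihood bounds to each box, and split the box with the loosest bounds. In this sketch the bounds come from point evaluations of a toy exponential-decay model, which is only a heuristic stand-in for the rigorous set-valued reachability bounds used in the paper.

```python
# Heuristic sketch of adaptive parameter-space tiling with per-box likelihood bounds.
import numpy as np

def loglik(theta, data, sigma=0.5):
    """Toy likelihood: noisy observations of exp(-theta * t) at integer times."""
    t = np.arange(len(data))
    pred = np.exp(-theta * t)
    return -0.5 * np.sum((data - pred) ** 2) / sigma ** 2

def refine(boxes, data, n_rounds=20):
    for _ in range(n_rounds):
        # Bound each box by sampled log-likelihood values (reachability analysis would
        # give rigorous bounds; endpoint/midpoint evaluation is only a heuristic here).
        bounds = []
        for lo, hi in boxes:
            vals = [loglik(th, data) for th in (lo, 0.5 * (lo + hi), hi)]
            bounds.append((min(vals), max(vals)))
        gaps = [b[1] - b[0] for b in bounds]
        i = int(np.argmax(gaps))               # split the box with the loosest bounds
        lo, hi = boxes.pop(i)
        mid = 0.5 * (lo + hi)
        boxes += [(lo, mid), (mid, hi)]
    return boxes

rng = np.random.default_rng(0)
data = np.exp(-0.7 * np.arange(20)) + rng.normal(scale=0.1, size=20)
print(sorted(refine([(0.0, 2.0)], data), key=lambda b: b[0])[:5])
```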