Suppose L simultaneous independent stochastic systems generate observations, where the observations from each system depend on the underlying model parameter of that system. The observations are unlabeled (anonymized), in the sense that an analyst does not know which observation came from which stochastic system. How can the analyst estimate the underlying model parameters of the L systems? Since the anonymized observations at each time are an unordered set of L measurements (rather than a vector), classical stochastic gradient algorithms cannot be directly used. By using symmetric polynomials, we formulate a symmetric measurement equation that maps the observation set to a unique vector. By exploiting the fact that the algebraic ring of multi-variable polynomials is a unique factorization domain over the ring of one-variable polynomials, we construct an adaptive filtering algorithm that yields a statistically consistent estimate of the underlying parameters. We analyze the asymptotic covariance of these estimates to quantify the effect of anonymization. Finally, we characterize the anonymity of the observations in terms of the error probability of the maximum a posteriori Bayesian estimator. Specifically, using Blackwell dominance of mean-preserving spreads, we construct a partial ordering of the noise densities which relates the anonymity of the observations to the asymptotic covariance of the adaptive filtering algorithm.
Adaptive Filtering Algorithms For Set-Valued Observations-Symmetric Measurement Approach To Unlabeled And Anonymized Data
Suppose L simultaneous independent stochastic systems generate observations, where the observations from each system depend on the underlying parameter of that system. The observations are unlabeled (anonymized), in the sense that an analyst does not know which observation came from which stochastic system. How can the analyst estimate the underlying parameters of the L systems? Since the anonymized observations at each time are an unordered set of L measurements (rather than a vector), classical stochastic gradient algorithms cannot be directly used. By using symmetric polynomials, we formulate a symmetric measurement equation that maps the observation set to a unique vector. We then construct an adaptive filtering algorithm that yields a statistically consistent estimate of the underlying parameters.
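The idea of mapping an unordered observation set to a unique vector can be illustrated with a minimal sketch. It assumes scalar observations and uses the elementary symmetric polynomials as the symmetric map; the abstract only says "symmetric polynomials", so this particular choice, and the function names `symmetric_measurement` and `characteristic_poly`, are assumptions made for illustration rather than the paper's construction.

```python
def symmetric_measurement(observations):
    """Map an unordered collection of L scalars to (e_1, ..., e_L),
    the elementary symmetric polynomials of the observations.
    (Illustrative choice of symmetric map; not taken from the paper.)"""
    num = len(observations)
    # e[k] accumulates the k-th elementary symmetric polynomial; e[0] = 1.
    e = [1.0] + [0.0] * num
    for y in observations:
        # Update from high degree to low so each y is used once per term.
        for k in range(num, 0, -1):
            e[k] += y * e[k - 1]
    return e[1:]


def characteristic_poly(e, s):
    """Evaluate prod_l (s - y_l) = s^L - e_1 s^(L-1) + e_2 s^(L-2) - ...
    using only the symmetric measurement vector e."""
    num = len(e)
    total = s ** num
    for k, ek in enumerate(e, start=1):
        total += (-1) ** k * ek * s ** (num - k)
    return total


# The same set presented in two different orders gives the same vector.
z_a = symmetric_measurement([3.0, 1.0, 2.0])
z_b = symmetric_measurement([1.0, 2.0, 3.0])
print(z_a, z_b)

# Each original observation is a root of the polynomial built from the vector,
# so the vector determines the set.
print([characteristic_poly(z_a, y) for y in (1.0, 2.0, 3.0)])
```

Because the vector does not depend on the order of the observations, it can be fed to a vector-valued recursive algorithm even though the labels are unknown. How the noise enters such a measurement vector, and how the parameter estimates are then recovered, is the subject of the paper and is not modeled here.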
- Award ID(s): 2112457
- PAR ID: 10425820
- Date Published:
- Journal Name: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- Page Range / eLocation ID: 1 to 5
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Chowell, Gerardo (Ed.) To support decision-making and policy for managing epidemics of emerging pathogens, we present a model for inference and scenario analysis of SARS-CoV-2 transmission in the USA. The stochastic SEIR-type model includes compartments for latent, asymptomatic, detected and undetected symptomatic individuals, and hospitalized cases, and features realistic interval distributions for presymptomatic and symptomatic periods, time-varying rates of case detection, diagnosis, and mortality. The model accounts for the effects on transmission of human mobility using anonymized mobility data collected from cellular devices, and of difficult-to-quantify environmental and behavioral factors using a latent process. The baseline transmission rate is the product of a human mobility metric obtained from data and this fitted latent process. We fit the model to incident case and death reports for each state in the USA and Washington D.C., using likelihood Maximization by Iterated particle Filtering (MIF). Observations (daily case and death reports) are modeled as arising from a negative binomial reporting process. We estimate time-varying transmission rate, parameters of a sigmoidal time-varying fraction of hospitalized cases that result in death, extra-demographic process noise, two dispersion parameters of the observation process, and the initial sizes of the latent, asymptomatic, and symptomatic classes. In a retrospective analysis covering March–December 2020, we show how mobility and transmission strength became decoupled across two distinct phases of the pandemic. The decoupling demonstrates the need for flexible, semi-parametric approaches for modeling infectious disease dynamics in real-time.
-
Multireference alignment (MRA) is the problem of estimating a signal from many noisy and cyclically shifted copies of itself. In this paper, we consider an extension called heterogeneous MRA, where K signals must be estimated, and each observation comes from one of those signals, unknown to us. This is a simplified model for the heterogeneity problem notably arising in cryo-electron microscopy. We propose an algorithm which estimates the K signals without estimating either the shifts or the classes of the observations. It requires only one pass over the data and is based on low-order moments that are invariant under cyclic shifts. Given sufficiently many measurements, one can estimate these invariant features averaged over the K signals. We then design a smooth, non-convex optimization problem to compute a set of signals which are consistent with the estimated averaged features. We find that, in many cases, the proposed approach estimates the set of signals accurately despite non-convexity, and conjecture that the number of signals K that can be resolved as a function of the signal length L is on the order of √L.
-
Optimal designs minimize the number of experimental runs (samples) needed to accurately estimate model parameters, resulting in algorithms that, for instance, efficiently minimize parameter estimate variance. Governed by knowledge of past observations, adaptive approaches adjust sampling constraints online as model parameter estimates are refined, continually maximizing expected information gained or variance reduced. We apply adaptive Bayesian inference to estimate transition rates of Markov chains, a common class of models for stochastic processes in nature. Unlike most previous studies, our sequential Bayesian optimal design is updated with each observation and can be simply extended beyond two-state models to birth–death processes and multistate models. By iteratively finding the best time to obtain each sample, our adaptive algorithm maximally reduces variance, resulting in lower overall error in ground truth parameter estimates across a wide range of Markov chain parameterizations and conformations.
-
This paper introduces a method of identifying a maximal set of safe strategies from data for stochastic systems with unknown dynamics using barrier certificates. The first step is learning the dynamics of the system via Gaussian Process (GP) regression and obtaining probabilistic errors for this estimate. Then, we develop an algorithm for constructing piecewise stochastic barrier functions to find a maximal permissible strategy set using the learned GP model, which is based on sequentially pruning the worst controls until a maximal set is identified. The permissible strategies are guaranteed to maintain probabilistic safety for the true system. This is especially important for learned systems, because a rich strategy space enables additional data collection and complex behaviors while remaining safe. Case studies on linear and nonlinear systems demonstrate that increasing the size of the dataset for learning grows the permissible strategy set.