Title: Stochastic distinguishability of Markovian trajectories
The ability to distinguish between stochastic systems based on their trajectories is crucial in thermodynamics, chemistry, and biophysics. The Kullback–Leibler (KL) divergence, $D_{\mathrm{KL}}^{AB}(0,\tau)$, quantifies the distinguishability between the two ensembles of length-$\tau$ trajectories from Markov processes A and B. However, evaluating $D_{\mathrm{KL}}^{AB}(0,\tau)$ from histograms of trajectories is hampered by insufficient sampling, and no existing theory explicitly reveals which dynamical features contribute to the distinguishability. This work provides a general formula that decomposes $D_{\mathrm{KL}}^{AB}(0,\tau)$ in space and time for any Markov processes, arbitrarily far from equilibrium or steady state, and thereby circumvents the sampling difficulty. Furthermore, it explicitly connects the trajectory KL divergence with individual transition events and their waiting-time statistics. The results provide insight into the distinguishability between Markov processes and suggest new theoretical frameworks for designing biological sensors and optimizing signal transduction.
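The paper's decomposition is derived for general Markov processes, including continuous-time ones; as a minimal sketch of the same idea in discrete time (an assumption made here for brevity, not the paper's setting), the trajectory KL divergence factors through the chain rule into an initial-distribution term plus per-step, per-state transition terms, so it can be evaluated without histogramming trajectories. The function names and example chains below are illustrative.

```python
import numpy as np

def kl(p, q):
    """KL divergence between discrete distributions (assumes q > 0 wherever p > 0)."""
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

def trajectory_kl(p0_A, P_A, p0_B, P_B, n_steps):
    """Chain-rule decomposition of the KL divergence between length-n
    trajectory ensembles of two discrete-time Markov chains:
    D = KL(p0_A || p0_B) + sum_t sum_s pA_t(s) * KL(P_A[s] || P_B[s])."""
    D = kl(p0_A, p0_B)
    p_t = p0_A.copy()
    for _ in range(n_steps):
        # each summand is the contribution of state s at time t
        D += sum(p_t[s] * kl(P_A[s], P_B[s]) for s in range(len(p_t)))
        p_t = p_t @ P_A  # propagate the marginal under process A
    return D

# Two 2-state chains with the same initial distribution but different rates.
p0 = np.array([0.5, 0.5])
P_A = np.array([[0.9, 0.1], [0.2, 0.8]])
P_B = np.array([[0.8, 0.2], [0.3, 0.7]])
print(trajectory_kl(p0, P_A, p0, P_B, n_steps=10))
```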
Award ID(s):
2145256
PAR ID:
10593830
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
American Institute of Physics
Date Published:
Journal Name:
The Journal of Chemical Physics
Volume:
160
Issue:
17
ISSN:
0021-9606
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Krause, Andreas; Brunskill, Emma; Cho, Kyunghyun; Engelhardt, Barbara; Sabato, Sivan; Scarlett, Jonathan (Ed.)
    Robust Markov decision processes (MDPs) address the challenge of model uncertainty by optimizing the worst-case performance over an uncertainty set of MDPs. In this paper, we focus on robust average-reward MDPs under the model-free setting. We first theoretically characterize the structure of solutions to the robust average-reward Bellman equation, which is essential for our later convergence analysis. We then design two model-free algorithms, robust relative value iteration (RVI) TD and robust RVI Q-learning, and theoretically prove their convergence to the optimal solution. We provide several widely used uncertainty sets as examples, including those defined by the contamination model, total variation, chi-squared divergence, Kullback–Leibler (KL) divergence, and Wasserstein distance.
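    The algorithms above are model-free; as a hedged, model-based sketch of the robust Bellman operator they approximate, the snippet below runs relative value iteration with the contamination uncertainty set, whose worst-case expectation has the closed form (1 − δ)·E_P[v] + δ·min_s v(s). The function names and reference-state convention are illustrative, not from the paper.

```python
import numpy as np

def worst_case_exp(P_row, v, delta):
    """Support function of the contamination set
    {(1 - delta) * P_row + delta * q : q any distribution}:
    the adversary moves the contaminated mass to the worst state."""
    return (1 - delta) * (P_row @ v) + delta * v.min()

def robust_rvi(P, r, delta, ref=0, iters=5000):
    """Model-based robust relative value iteration for an average-reward
    MDP with nominal kernel P (S x A x S) and rewards r (S x A)."""
    S, A = r.shape
    Q = np.zeros((S, A))
    gain = 0.0
    for _ in range(iters):
        v = Q.max(axis=1)
        TQ = np.array([[r[s, a] + worst_case_exp(P[s, a], v, delta)
                        for a in range(A)] for s in range(S)])
        gain = TQ.max(axis=1)[ref]   # robust average-reward estimate
        Q = TQ - gain                # relative value iteration step
    return Q, gain
```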
  2. Inferring underlying microscopic dynamics from low-dimensional experimental signals is a central problem in physics, chemistry, and biology. As a trade-off between molecular complexity and the low-dimensional nature of experimental data, mesoscopic descriptions such as the Markovian master equation are commonly used. The states in such descriptions usually include multiple microscopic states, and the ensuing coarse-grained dynamics are generally non-Markovian. It is frequently assumed that such dynamics can nevertheless be described as a Markov process because of the timescale separation between slow transitions from one observed coarse state to another and the fast interconversion within such states. Here, we use a simple model of a molecular motor with unobserved internal states to highlight that (1) dissipation estimated from the observed coarse dynamics may significantly underestimate microscopic dissipation even in the presence of timescale separation and even when mesoscopic states do not contain dissipative cycles and (2) timescale separation is not necessarily required for the Markov approximation to give the exact entropy production, provided that certain constraints on the microscopic rates are satisfied. When the Markov approximation is inadequate, we discuss whether including memory effects can improve the estimate. Surprisingly, when we do so in a “model-free” way by computing the Kullback–Leibler divergence between the observed probability distributions of forward trajectories and their time reverses, this leads to poorer estimates of entropy production. Finally, we argue that alternative approaches, such as hidden Markov models, may uncover the dissipative nature of the microscopic dynamics even when the observed coarse trajectories are completely time-reversible. 
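    A minimal sketch of the "model-free" estimator described above: the plug-in KL divergence between the empirical distribution of length-k observed trajectory windows and that of their time reverses, per unit window length. The function name, window length, and the handling of windows whose reverse never occurs are illustrative choices.

```python
import numpy as np
from collections import Counter

def ep_rate_plugin(traj, k):
    """Plug-in entropy-production estimate from an observed state
    trajectory: KL divergence between the empirical distributions of
    length-k windows and their time reverses, divided by k. Larger k
    captures more memory in non-Markovian coarse-grained dynamics."""
    windows = [tuple(traj[i:i + k]) for i in range(len(traj) - k + 1)]
    counts = Counter(windows)
    n = len(windows)
    D = 0.0
    for w, c in counts.items():
        c_rev = counts.get(w[::-1], 0)
        if c_rev > 0:
            D += (c / n) * np.log(c / c_rev)
        # windows whose reverse is never observed are dropped here;
        # a real estimator needs an explicit regularization choice
    return D / k
```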
  3. We give an algorithm for exact sampling from the Bingham distribution $p(x) \propto \exp(x^\top A x)$ on the sphere $S^{d-1}$ with expected runtime of $\mathrm{poly}(d, \lambda_{\max}(A) - \lambda_{\min}(A))$. The algorithm is based on rejection sampling, where the proposal distribution is a polynomial approximation of the pdf and can be sampled from by explicitly evaluating integrals of polynomials over the sphere. Our algorithm gives exact samples, assuming exact computation of the inverse function of a polynomial. This is in contrast with Markov chain Monte Carlo algorithms, which are not known to enjoy rapid mixing on this problem and only give approximate samples. As a direct application, we use this to sample from the posterior distribution of a rank-1 matrix inference problem in polynomial time.
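    The paper's efficiency comes from its polynomial proposal; purely to illustrate the rejection-sampling principle (and why the proposal matters), here is a naive exact sampler with a uniform proposal on the sphere, whose acceptance rate can be exponentially small in λmax(A) − λmin(A).

```python
import numpy as np

def sample_bingham_naive(A, rng=None):
    """Naive exact rejection sampler for p(x) ∝ exp(x^T A x) on S^{d-1}.
    Proposal: uniform on the sphere; accept with prob exp(x^T A x - λ_max),
    which is ≤ 1 since x^T A x ≤ λ_max for unit x."""
    rng = rng or np.random.default_rng()
    lam_max = np.linalg.eigvalsh(A)[-1]  # eigenvalues in ascending order
    d = A.shape[0]
    while True:
        x = rng.standard_normal(d)
        x /= np.linalg.norm(x)           # uniform draw on the unit sphere
        if rng.random() < np.exp(x @ A @ x - lam_max):
            return x
```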
  4. A loss function measures the discrepancy between the true values (observations) and their estimated fits for a given instance of data. A loss function is said to be proper (unbiased, Fisher consistent) if the fits are defined over a unit simplex and the minimizer of the expected loss is the true underlying probability of the data. Typical examples are the zero-one loss, the quadratic loss, and the Bernoulli log-likelihood loss (log-loss). In this work we show that for binary classification problems, the divergence associated with smooth, proper, and convex loss functions is bounded from above by the Kullback–Leibler (KL) divergence, up to a multiplicative normalization constant. This implies that by minimizing the log-loss (associated with the KL divergence), we minimize an upper bound on any choice of loss function from this set. This property justifies the broad use of the log-loss in regression, decision trees, deep neural networks, and many other applications. In addition, we show that the KL divergence bounds from above any separable Bregman divergence that is convex in its second argument (up to a multiplicative normalization constant). This result introduces a new set of divergence inequalities, similar to the well-known Pinsker inequality.
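    A quick numeric illustration of this type of bound in the binary case, using the binary Pinsker constant 1/2 (not necessarily the paper's normalization constant): the divergence associated with the quadratic loss, (p − q)², never exceeds KL(p‖q)/2.

```python
import numpy as np

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

# Spot-check (p - q)^2 <= KL(p || q) / 2 on a grid of Bernoulli pairs.
grid = np.linspace(0.01, 0.99, 99)
worst = max((p - q) ** 2 - 0.5 * kl_bernoulli(p, q)
            for p in grid for q in grid)
print(f"max of (p-q)^2 - KL/2 over the grid: {worst:.3e}")  # never positive
```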
  5. Ranzato, M.; Beygelzimer, A.; Dauphin, Y.; Liang, P.S.; Wortman Vaughan, J. (Ed.)
    We develop nested variational inference (NVI), a family of methods that learn proposals for nested importance samplers by minimizing a forward or reverse KL divergence at each level of nesting. NVI is applicable to many commonly used importance sampling strategies and provides a mechanism for learning intermediate densities, which can serve as heuristics to guide the sampler. Our experiments apply NVI to (a) sample from a multimodal distribution using a learned annealing path, (b) learn heuristics that approximate the likelihood of future observations in a hidden Markov model, and (c) perform amortized inference in hierarchical deep generative models. We observe that optimizing nested objectives leads to improved sample quality in terms of log average weight and effective sample size.
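    Learning nested proposals is involved; as a single-level sketch of the forward-KL proposal learning that NVI performs at each level of nesting (the toy target, step size, and sample count are illustrative), the snippet below fits a Gaussian proposal with self-normalized importance-sampling gradients.

```python
import numpy as np

rng = np.random.default_rng(0)
log_target = lambda x: -0.5 * (x - 3.0) ** 2   # toy unnormalized target, N(3, 1)

mu, log_sig = 0.0, 0.0                          # Gaussian proposal parameters
for _ in range(2000):
    sig = np.exp(log_sig)
    x = mu + sig * rng.standard_normal(256)     # sample from proposal q
    log_q = -0.5 * ((x - mu) / sig) ** 2 - log_sig
    lw = log_target(x) - log_q
    w = np.exp(lw - lw.max())
    w /= w.sum()                                # self-normalized importance weights
    # grad of KL(pi || q) wrt proposal params is -E_pi[grad log q],
    # estimated here under the self-normalized weights
    g_mu = -(w * (x - mu) / sig ** 2).sum()
    g_ls = -(w * (((x - mu) / sig) ** 2 - 1)).sum()
    mu -= 0.05 * g_mu
    log_sig -= 0.05 * g_ls

print(mu, np.exp(log_sig))                      # should approach 3 and 1
```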