skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Friday, March 22 until 6:00 AM ET on Saturday, March 23 due to maintenance. We apologize for the inconvenience.


Title: HOGWILD!-Gibbs can be PanAccurate
Asynchronous Gibbs sampling has been recently shown to be fast-mixing and an accurate method for estimating probabilities of events on a small number of variables of a graphical model satisfying Dobrushin's condition~\cite{DeSaOR16}. We investigate whether it can be used to accurately estimate expectations of functions of {\em all the variables} of the model. Under the same condition, we show that the synchronous (sequential) and asynchronous Gibbs samplers can be coupled so that the expected Hamming distance between their (multivariate) samples remains bounded by O(τlogn), where n is the number of variables in the graphical model, and τ is a measure of the asynchronicity. A similar bound holds for any constant power of the Hamming distance. Hence, the expectation of any function that is Lipschitz with respect to a power of the Hamming distance, can be estimated with a bias that grows logarithmically in n. Going beyond Lipschitz functions, we consider the bias arising from asynchronicity in estimating the expectation of polynomial functions of all variables in the model. Using recent concentration of measure results, we show that the bias introduced by the asynchronicity is of smaller order than the standard deviation of the function value already present in the true model. We perform experiments on a multi-processor machine to empirically illustrate our theoretical findings.  more » « less
Award ID(s):
1741137
NSF-PAR ID:
10125382
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Journal of neural information processing
ISSN:
2522-6517
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Asynchronous Gibbs sampling has been recently shown to be fast-mixing and an accurate method for estimating probabilities of events on a small number of variables of a graphical model satisfying Dobrushin's condition~\cite{DeSaOR16}. We investigate whether it can be used to accurately estimate expectations of functions of {\em all the variables} of the model. Under the same condition, we show that the synchronous (sequential) and asynchronous Gibbs samplers can be coupled so that the expected Hamming distance between their (multivariate) samples remains bounded by O(τlogn), where n is the number of variables in the graphical model, and τ is a measure of the asynchronicity. A similar bound holds for any constant power of the Hamming distance. Hence, the expectation of any function that is Lipschitz with respect to a power of the Hamming distance, can be estimated with a bias that grows logarithmically in n. Going beyond Lipschitz functions, we consider the bias arising from asynchronicity in estimating the expectation of polynomial functions of all variables in the model. Using recent concentration of measure results, we show that the bias introduced by the asynchronicity is of smaller order than the standard deviation of the function value already present in the true model. We perform experiments on a multi-processor machine to empirically illustrate our theoretical findings. 
    more » « less
  2. The seminal result of Kahn, Kalai and Linial shows that a coalition of O(n/(log n)) players can bias the outcome of any Boolean function {0,1}^n -> {0,1} with respect to the uniform measure. We extend their result to arbitrary product measures on {0,1}^n, by combining their argument with a completely different argument that handles very biased input bits. We view this result as a step towards proving a conjecture of Friedgut, which states that Boolean functions on the continuous cube [0,1]^n (or, equivalently, on {1,...,n}^n) can be biased using coalitions of o(n) players. This is the first step taken in this direction since Friedgut proposed the conjecture in 2004. Russell, Saks and Zuckerman extended the result of Kahn, Kalai and Linial to multi-round protocols, showing that when the number of rounds is o(log^* n), a coalition of o(n) players can bias the outcome with respect to the uniform measure. We extend this result as well to arbitrary product measures on {0,1}^n. The argument of Russell et al. relies on the fact that a coalition of o(n) players can boost the expectation of any Boolean function from epsilon to 1-epsilon with respect to the uniform measure. This fails for general product distributions, as the example of the AND function with respect to mu_{1-1/n} shows. Instead, we use a novel boosting argument alongside a generalization of our first result to arbitrary finite ranges. 
    more » « less
  3. Lysyanskaya, Anna ; Handschuh, Helena (Ed.)
    We study the black-box function inversion problem, which is the problem of finding x[N] such that f(x)=y, given as input some challenge point y in the image of a function f:[N][N], using T oracle queries to f and preprocessed advice 01S depending on f. We prove a number of new results about this problem, as follows. 1. We show an algorithm that works for any T and S satisfying TS2maxST=(N3) . In the important setting when ST, this improves on the celebrated algorithm of Fiat and Naor [STOC, 1991], which requires TS3N3. E.g., Fiat and Naor's algorithm is only non-trivial for SN23 , while our algorithm gives a non-trivial tradeoff for any SN12 . (Our algorithm and analysis are quite simple. As a consequence of this, we also give a self-contained and simple proof of Fiat and Naor's original result, with certain optimizations left out for simplicity.) 2. We show a non-adaptive algorithm (i.e., an algorithm whose ith query xi is chosen based entirely on and y, and not on the f(x1)f(xi−1)) that works for any T and S satisfying S=(Nlog(NT)) giving the first non-trivial non-adaptive algorithm for this problem. E.g., setting T=Npolylog(N) gives S=(NloglogN). This answers a question due to Corrigan-Gibbs and Kogan [TCC, 2019], who asked whether it was possible for a non-adaptive algorithm to work with parameters T and S satisfying T+SlogNo(N) . We also observe that our non-adaptive algorithm is what we call a guess-and-check algorithm, that is, it is non-adaptive and its final output is always one of the oracle queries x1xT. For guess-and-check algorithms, we prove a matching lower bound, therefore completely characterizing the achievable parameters (ST) for this natural class of algorithms. (Corrigan-Gibbs and Kogan showed that any such lower bound for arbitrary non-adaptive algorithms would imply new circuit lower bounds.) 3. We show equivalence between function inversion and a natural decision version of the problem in both the worst case and the average case, and similarly for functions f:[N][M] with different ranges. All of the above results are most naturally described in a model with shared randomness (i.e., random coins shared between the preprocessing algorithm and the online algorithm). However, as an additional contribution, we show (using a technique from communication complexity due to Newman [IPL, 1991]) how to generically convert any algorithm that uses shared randomness into one that does not. 
    more » « less
  4. null (Ed.)
    Ribonucleic acid (RNA) secondary structures and branching properties are important for determining functional ramifications in biology. While energy minimization of the Nearest Neighbor Thermodynamic Model (NNTM) is commonly used to identify such properties (number of hairpins, maximum ladder distance, etc.), it is difficult to know whether the resultant values fall within expected dispersion thresholds for a given energy function. The goal of this study was to construct a Markov chain capable of examining the dispersion of RNA secondary structures and branching properties obtained from NNTM energy function minimization independent of a specific nucleotide sequence. Plane trees are studied as a model for RNA secondary structure, with energy assigned to each tree based on the NNTM, and a corresponding Gibbs distribution is defined on the trees. Through a bijection between plane trees and 2-Motzkin paths, a Markov chain converging to the Gibbs distribution is constructed, and fast mixing time is established by estimating the spectral gap of the chain. The spectral gap estimate is obtained through a series of decompositions of the chain and also by building on known mixing time results for other chains on Dyck paths. The resulting algorithm can be used as a tool for exploring the branching structure of RNA, especially for long sequences, and to examine branching structure dependence on energy model parameters. Full exposition is provided for the mathematical techniques used with the expectation that these techniques will prove useful in bioinformatics, computational biology, and additional extended applications. 
    more » « less
  5. Embedding properties of network realizations of dissipative reduced order models Jörn Zimmerling, Mikhail Zaslavsky,Rob Remis, Shasri Moskow, Alexander Mamonov, Murthy Guddati, Vladimir Druskin, and Liliana Borcea Mathematical Sciences Department, Worcester Polytechnic Institute https://www.wpi.edu/people/vdruskin Abstract Realizations of reduced order models of passive SISO or MIMO LTI problems can be transformed to tridiagonal and block-tridiagonal forms, respectively, via dierent modications of the Lanczos algorithm. Generally, such realizations can be interpreted as ladder resistor-capacitor-inductor (RCL) networks. They gave rise to network syntheses in the rst half of the 20th century that was at the base of modern electronics design and consecutively to MOR that tremendously impacted many areas of engineering (electrical, mechanical, aerospace, etc.) by enabling ecient compression of the underlining dynamical systems. In his seminal 1950s works Krein realized that in addition to their compressing properties, network realizations can be used to embed the data back into the state space of the underlying continuum problems. In more recent works of the authors Krein's ideas gave rise to so-called nite-dierence Gaussian quadrature rules (FDGQR), allowing to approximately map the ROM state-space representation to its full order continuum counterpart on a judicially chosen grid. Thus, the state variables can be accessed directly from the transfer function without solving the full problem and even explicit knowledge of the PDE coecients in the interior, i.e., the FDGQR directly learns" the problem from its transfer function. This embedding property found applications in PDE solvers, inverse problems and unsupervised machine learning. Here we show a generalization of this approach to dissipative PDE problems, e.g., electromagnetic and acoustic wave propagation in lossy dispersive media. Potential applications include solution of inverse scattering problems in dispersive media, such as seismic exploration, radars and sonars. To x the idea, we consider a passive irreducible SISO ROM fn(s) = Xn j=1 yi s + σj , (62) assuming that all complex terms in (62) come in conjugate pairs. We will seek ladder realization of (62) as rjuj + vj − vj−1 = −shˆjuj , uj+1 − uj + ˆrj vj = −shj vj , (63) for j = 0, . . . , n with boundary conditions un+1 = 0, v1 = −1, and 4n real parameters hi, hˆi, ri and rˆi, i = 1, . . . , n, that can be considered, respectively, as the equivalent discrete inductances, capacitors and also primary and dual conductors. Alternatively, they can be viewed as respectively masses, spring stiness, primary and dual dampers of a mechanical string. Reordering variables would bring (63) into tridiagonal form, so from the spectral measure given by (62 ) the coecients of (63) can be obtained via a non-symmetric Lanczos algorithm written in J-symmetric form and fn(s) can be equivalently computed as fn(s) = u1. The cases considered in the original FDGQR correspond to either (i) real y, θ or (ii) real y and imaginary θ. Both cases are covered by the Stieltjes theorem, that yields in case (i) real positive h, hˆ and trivial r, rˆ, and in case (ii) real positive h,r and trivial hˆ,rˆ. This result allowed us a simple interpretation of (62) as the staggered nite-dierence approximation of the underlying PDE problem [2]. For PDEs in more than one variables (including topologically rich data-manifolds), a nite-dierence interpretation is obtained via a MIMO extensions in block form, e.g., [4, 3]. The main diculty of extending this approach to general passive problems is that the Stieltjes theory is no longer applicable. Moreover, the tridiagonal realization of a passive ROM transfer function (62) via the ladder network (63) cannot always be obtained in port-Hamiltonian form, i.e., the equivalent primary and dual conductors may change sign [1]. 100 Embedding of the Stieltjes problems, e.g., the case (i) was done by mapping h and hˆ into values of acoustic (or electromagnetic) impedance at grid cells, that required a special coordinate stretching (known as travel time coordinate transform) for continuous problems. Likewise, to circumvent possible non-positivity of conductors for the non-Stieltjes case, we introduce an additional complex s-dependent coordinate stretching, vanishing as s → ∞ [1]. This stretching applied in the discrete setting induces a diagonal factorization, removes oscillating coecients, and leads to an accurate embedding for moderate variations of the coecients of the continuum problems, i.e., it maps discrete coecients onto the values of their continuum counterparts. Not only does this embedding yields an approximate linear algebraic algorithm for the solution of the inverse problems for dissipative PDEs, it also leads to new insight into the properties of their ROM realizations. We will also discuss another approach to embedding, based on Krein-Nudelman theory [5], that results in special data-driven adaptive grids. References [1] Borcea, Liliana and Druskin, Vladimir and Zimmerling, Jörn, A reduced order model approach to inverse scattering in lossy layered media, Journal of Scientic Computing, V. 89, N1, pp. 136,2021 [2] Druskin, Vladimir and Knizhnerman, Leonid, Gaussian spectral rules for the three-point second dierences: I. A two-point positive denite problem in a semi-innite domain, SIAM Journal on Numerical Analysis, V. 37, N 2, pp.403422, 1999 [3] Druskin, Vladimir and Mamonov, Alexander V and Zaslavsky, Mikhail, Distance preserving model order reduction of graph-Laplacians and cluster analysis, Druskin, Vladimir and Mamonov, Alexander V and Zaslavsky, Mikhail, Journal of Scientic Computing, V. 90, N 1, pp 130, 2022 [4] Druskin, Vladimir and Moskow, Shari and Zaslavsky, Mikhail LippmannSchwingerLanczos algorithm for inverse scattering problems, Inverse Problems, V. 37, N. 7, 2021, [5] Mark Adolfovich Nudelman The Krein String and Characteristic Functions of Maximal Dissipative Operators, Journal of Mathematical Sciences, 2004, V 124, pp 49184934 Go back to Plenary Speakers Go back to Speakers Go back 
    more » « less