Title: Promise Problems Meet Pseudodeterminism
The Acceptance Probability Estimation Problem (APEP) is to additively approximate the acceptance probability of a Boolean circuit. This problem admits a probabilistic approximation scheme. A central question is whether we can design a pseudodeterministic approximation algorithm for this problem: a probabilistic polynomial-time algorithm that outputs a canonical approximation with high probability. Recently, it was shown that such an algorithm would imply that every approximation algorithm can be made pseudodeterministic (Dixon, Pavan, Vinodchandran; ITCS 2021). The main conceptual contribution of this work is to establish that the existence of a pseudodeterministic algorithm for APEP is fundamentally connected to the relationship between probabilistic promise classes and the corresponding standard complexity classes. In particular, we show the following equivalence: every promise problem in PromiseBPP has a solution in BPP if and only if APEP has a pseudodeterministic algorithm. Based on this intuition, we show that pseudodeterministic algorithms for APEP can shed light on a few central topics in complexity theory such as circuit lower bounds, probabilistic hierarchy theorems, and multi-pseudodeterminism.
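To make the probabilistic approximation scheme mentioned in the abstract concrete, here is a minimal Python sketch of the standard sampling estimator for a circuit's acceptance probability. It is an illustration only, not the construction from the paper. The function names `samples_needed` and `estimate_acceptance` and the toy circuit `majority3` are hypothetical, and modeling a Boolean circuit as a Python callable on a list of bits is an assumption made for brevity. The sample count comes from Hoeffding's inequality.

```python
import math
import random
from typing import Callable, List


def samples_needed(eps: float, delta: float) -> int:
    """Hoeffding bound: this many samples give additive error <= eps
    with probability >= 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps * eps))


def estimate_acceptance(
    circuit: Callable[[List[int]], bool],  # assumption: circuit as a callable
    n_inputs: int,
    eps: float,
    delta: float,
    rng: random.Random,
) -> float:
    """Estimate Pr_x[circuit(x) accepts] over uniform x in {0,1}^n_inputs."""
    m = samples_needed(eps, delta)
    accepted = 0
    for _ in range(m):
        x = [rng.randrange(2) for _ in range(n_inputs)]
        if circuit(x):
            accepted += 1
    return accepted / m


def majority3(x: List[int]) -> bool:
    """Toy circuit (hypothetical): accepts when at least two of three bits are 1.
    Its exact acceptance probability is 4/8 = 0.5."""
    return sum(x) >= 2


if __name__ == "__main__":
    for seed in (1, 2, 3):
        est = estimate_acceptance(majority3, 3, 0.05, 0.01, random.Random(seed))
        print(f"seed={seed}: estimate={est:.4f}")
```

Independent runs with different random bits generally return different values inside the allowed error window. This is why the sampling estimator on its own is a valid approximation scheme but not a pseudodeterministic one: no single canonical output is produced with high probability.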
Award ID(s):
1934884
NSF-PAR ID:
10221650
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Electronic colloquium on computational complexity
Volume:
28
ISSN:
1433-8092
Page Range / eLocation ID:
43
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Stefano Leonardi and Anupam Gupta (Ed.)
    A probabilistic algorithm A is pseudodeterministic if, on every input, there exists a canonical value that is output with high probability. If the algorithm outputs one of k canonical values with high probability, then it is called a k-pseudodeterministic algorithm. In the study of pseudodeterminism, the Acceptance Probability Estimation Problem (APEP), which is to additively approximate the acceptance probability of a Boolean circuit, is emerging as a central computational problem. This problem admits a 2-pseudodeterministic algorithm. Recently, it was shown that a pseudodeterministic algorithm for this problem would imply that any multi-valued function that admits a k-pseudodeterministic algorithm for a constant k (including approximation algorithms) also admits a pseudodeterministic algorithm (Dixon, Pavan, Vinodchandran; ITCS 2021). The contribution of the present work is two-fold. First, as our main conceptual contribution, we establish that the existence of a pseudodeterministic algorithm for APEP is fundamentally related to the gap between probabilistic promise classes and the corresponding standard complexity classes. In particular, we show the following equivalence: APEP has a pseudodeterministic approximation algorithm if and only if every promise problem in PromiseBPP has a solution in BPP. A conceptual interpretation of this equivalence is that the algorithmic gap between 2-pseudodeterminism and pseudodeterminism is equivalent to the gap between PromiseBPP and BPP. Based on this connection, we show that designing pseudodeterministic algorithms for APEP leads to the solution of some open problems in complexity theory, including new Boolean circuit lower bounds. This equivalence also explains how multi-pseudodeterminism is connected to problems in SearchBPP. 
In particular, we show that if APEP has a pseudodeterministic algorithm, then every problem that admits a k(n)-pseudodeterministic algorithm (for any polynomial k) is in SearchBPP and admits a pseudodeterministic algorithm. Motivated by this connection, we also explore its connection to probabilistic search problems and establish that APEP is complete for certain notions of search problems in the context of pseudodeterminism. Our second contribution is establishing query complexity lower bounds for multi-pseudodeterministic computations. We prove that for every k ≥ 1, there exists a problem whose (k+1)-pseudodeterministic query complexity, in the uniform query model, is O(1) but has a k-pseudodeterministic query complexity of Ω(n), even in the more general nonadaptive query model. A key contribution of this part of the work is the utilization of Sperner’s lemma in establishing query complexity lower bounds. 
  2. The noise sensitivity of a Boolean function f: {0,1}^n -> {0,1} is one of its fundamental properties. For noise parameter delta, the noise sensitivity is denoted as NS_{delta}[f]. This quantity is defined as follows: First, pick x = (x_1,...,x_n) uniformly at random from {0,1}^n, then pick z by flipping each x_i independently with probability delta. NS_{delta}[f] is defined to equal Pr[f(x) != f(z)]. Much of the existing literature on noise sensitivity explores the following two directions: (1) Showing that functions with low noise sensitivity are structured in certain ways. (2) Mathematically showing that certain classes of functions have low noise sensitivity. Combined, these two research directions show that certain classes of functions have low noise sensitivity and therefore have useful structure. The fundamental importance of noise sensitivity, together with this wealth of structural results, motivates the algorithmic question of approximating NS_{delta}[f] given oracle access to the function f. We show that the standard sampling approach is essentially optimal for general Boolean functions. Therefore, we focus on estimating the noise sensitivity of monotone functions, which form an important subclass of Boolean functions, since many functions of interest are either monotone or can be simply transformed into a monotone function (for example the class of unate functions consists of all the functions that can be made monotone by reorienting some of their coordinates [O'Donnell, 2014]). Specifically, we study the algorithmic problem of approximating NS_{delta}[f] for monotone f, given the promise that NS_{delta}[f] >= 1/n^{C} for constant C, and for delta in the range 1/n <= delta <= 1/2. For such f and delta, we give a randomized algorithm performing O((min(1,sqrt{n} delta log^{1.5} n))/(NS_{delta}[f]) poly (1/epsilon)) queries and approximating NS_{delta}[f] to within a multiplicative factor of (1 +/- epsilon). 
Given the same constraints on f and delta, we also prove a lower bound of Omega((min(1,sqrt{n} delta))/(NS_{delta}[f] * n^{xi})) on the query complexity of any algorithm that approximates NS_{delta}[f] to within any constant factor, where xi can be any positive constant. Thus, our algorithm's query complexity is close to optimal in terms of its dependence on n. We introduce a novel descending-ascending view of noise sensitivity, and use it as a central tool for the analysis of our algorithm. To prove lower bounds on query complexity, we develop a technique that reduces computational questions about query complexity to combinatorial questions about the existence of "thin" functions with certain properties. The existence of such "thin" functions is proved using the probabilistic method. These techniques also yield new lower bounds on the query complexity of approximating other fundamental properties of Boolean functions: the total influence and the bias. 
  3. Real-time decision making in IoT applications relies upon space-efficient evaluation of queries over streaming data. To model the uncertainty in the classification of data being processed, we consider the model of probabilistic strings --- sequences of discrete probability distributions over a finite set of events, and initiate the study of space complexity of streaming computation for different classes of queries over such probabilistic strings. We first consider the problem of computing the probability that a word, sampled from the distribution defined by the probabilistic string read so far, is accepted by a given deterministic finite automaton. We show that this regular pattern matching problem can be solved using space that is only poly-logarithmic in the string length (and polynomial in the size of the DFA) if we are allowed a multiplicative approximation error. Then we show how to generalize this result to quantitative queries specified by additive cost register automata --- these are automata that map strings to numerical values using finite control and registers that get updated using linear transformations. Finally, we consider the case when updates in such an automaton involve tests, and in particular, when there is a counter variable that can be either incremented or decremented but decrements only apply when the counter value is non-zero. In this case, the desired answer depends on the probability distribution over the set of possible counter values that can range from 0 to n for a string of length n. Under a mild assumption, namely probabilities of the individual events are bounded away from 0 and 1, we show that there is an algorithm that can compute all n entries of this probability distribution vector to within additive 1/poly(n) error using space that is only Õ(n). 
In establishing these results, we introduce several new technical ideas that may prove useful for designing space-efficient algorithms for other query models over probabilistic strings. 
  4. Tuple-independent probabilistic databases (TI-PDBs) handle uncertainty by annotating each tuple with a probability parameter; when the user submits a query, the database derives the marginal probabilities of each output-tuple, assuming input-tuples are statistically independent. While query processing in TI-PDBs has been studied extensively, limited research has been dedicated to the problems of updating or deriving the parameters from observations of query results. Addressing this problem is the main focus of this paper. We introduce Beta Probabilistic Databases (B-PDBs), a generalization of TI-PDBs designed to support both (i) belief updating and (ii) parameter learning in a principled and scalable way. The key idea of B-PDBs is to treat each parameter as a latent, Beta-distributed random variable. We show how this simple expedient enables both belief updating and parameter learning in a principled way, without imposing any burden on regular query processing. We use this model to provide the following key contributions: (i) we show how to scalably compute the posterior densities of the parameters given new evidence; (ii) we study the complexity of performing Bayesian belief updates, devising efficient algorithms for tractable classes of queries; (iii) we propose a soft-EM algorithm for computing maximum-likelihood estimates of the parameters; (iv) we show how to embed the proposed algorithms into a standard relational engine; (v) we support our conclusions with extensive experimental results. 
  5. The research problem of how to use a high-speed circuit switch, typically an optical switch, to most effectively boost the switching capacity of a datacenter network, has been extensively studied. In this work, we focus on a different but related research problem that arises when multiple (say $s$) parallel circuit switches are used: How to best split a switching workload $D$ into sub-workloads $D_1, D_2, ..., D_s$, and give them to the $s$ switches as their respective workloads, so that the overall makespan of the parallel switching system is minimized? Computing such an optimal split is unfortunately NP-hard, since the circuit/optical switch incurs a nontrivial reconfiguration delay when the switch configuration has to change. In this work, we formulate a weaker form of this problem: How to minimize the total number of nonzero entries in $D_1, D_2, ..., D_s$ (so that the overall reconfiguration cost can be kept low), under the constraint that every row or column sum of $D$ (which corresponds to the workload imposed on a sending or receiving rack respectively) is evenly split? Although this weaker problem is still NP-hard, we are able to design LESS, an approximation algorithm that has a low approximation ratio of only $1+\epsilon$ in practice and a low computational complexity of only $O(m^2)$, where $m = \|D\|_0$ is the number of nonzero entries in $D$. Our simulation studies show that LESS achieves excellent overall makespan performance under realistic datacenter traffic workloads and parameter settings. 
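The definition of NS_{delta}[f] given in item 2 above translates directly into the standard sampling estimator, sketched here in Python. This is only the baseline sampling approach that the abstract refers to, not the query-efficient algorithm for monotone functions that the paper contributes. The names `estimate_noise_sensitivity` and `dictator` are hypothetical, and the function f is assumed to be available as a Python callable standing in for oracle access.

```python
import random
from typing import Callable, List


def estimate_noise_sensitivity(
    f: Callable[[List[int]], int],  # assumption: oracle access as a callable
    n: int,
    delta: float,
    trials: int,
    rng: random.Random,
) -> float:
    """Monte Carlo estimate of NS_delta[f] = Pr[f(x) != f(z)]."""
    disagreements = 0
    for _ in range(trials):
        # Pick x uniformly at random from {0,1}^n.
        x = [rng.randrange(2) for _ in range(n)]
        # Obtain z by flipping each bit of x independently with probability delta.
        z = [bit ^ 1 if rng.random() < delta else bit for bit in x]
        if f(x) != f(z):
            disagreements += 1
    return disagreements / trials


def dictator(x: List[int]) -> int:
    """Toy monotone function (hypothetical example): f(x) = x_1.
    Its noise sensitivity is exactly delta."""
    return x[0]


if __name__ == "__main__":
    est = estimate_noise_sensitivity(dictator, 8, 0.25, 20000, random.Random(0))
    print(f"estimated NS_0.25[dictator] = {est:.4f} (exact value is 0.25)")
```

Each trial makes two queries to f, so the number of queries grows with the number of trials needed to resolve a small NS_{delta}[f]. That cost is what motivates looking for better algorithms on restricted classes such as monotone functions.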