skip to main content


This content will become publicly available on November 6, 2024

Title: Triplet Reconstruction and all other Phylogenetic CSPs are Approximation Resistant
We study the natural problem of Triplet Reconstruction (also known as Rooted Triplets Consistency or Triplet Clustering), originally motivated by applications in computational biology and relational databases (Aho, Sagiv, Szymanski, and Ullman, 1981) [2]: given n datapoints, we want to embed them onto the n leaves of a rooted binary tree (also known as a hierarchical clustering, or ultrametric embedding) such that a given set of m triplet constraints is satisfied. A triplet constraint i j · k for points i, j, k indicates that 'i, j are more closely related to each other than to k,' (in terms of distances d(i, j) ≤ d(i, k) and d(i, j) ≤ d(j, k)) and we say that a tree satisfies the triplet i j · k if the distance in the tree between i, j is smaller than the distance between i, k (or j, k). Among all possible trees with n leaves, can we efficiently find one that satisfies a large fraction of the m given triplets? Aho et al. (1981) [2] studied the decision version and gave an elegant polynomial-time algorithm that determines whether or not there exists a tree that satisfies all of the m constraints. Moreover, it is straightforward to see that a random binary tree achieves a constant 13-approximation, since there are only 3 distinct triplets i j|k, i k| j, j k · i (each will be satisfied w.p. 13). Unfortunately, despite more than four decades of research by various communities, there is no better approximation algorithm for this basic Triplet Reconstruction problem.Our main theorem-which captures Triplet Reconstruction as a special case-is a general hardness of approximation result about Constraint Satisfaction Problems (CSPs) over infinite domains (CSPs where instead of boolean values {0,1} or a fixed-size domain, the variables can be mapped to any of the n leaves of a tree). Specifically, we prove that assuming the Unique Games Conjecture [57], Triplet Reconstruction and more generally, every Constraint Satisfaction Problem (CSP) over hierarchies is approximation resistant, i.e., there is no polynomial-time algorithm that does asymptotically better than a biased random assignment.Our result settles the approximability not only for Triplet Reconstruction, but for many interesting problems that have been studied by various scientific communities such as the popular Quartet Reconstruction and Subtree/Supertree Aggregation Problems. More broadly, our result significantly extends the list of approximation resistant predicates by pointing to a large new family of hard problems over hierarchies. Our main theorem is a generalization of Guruswami, Håstad, Manokaran, Raghavendra, and Charikar (2011) [36], who showed that ordering CSPs (CSPs over permutations of n elements, e.g., Max Acyclic Subgraph, Betweenness, Non-Betweenness) are approximation resistant. The main challenge in our analyses stems from the fact that trees have topology (in contrast to permutations and ordering CSPs) and it is the tree topology that determines whether a given constraint on the variables is satisfied or not. As a byproduct, we also present some of the first CSPs where their approximation resistance is proved against biased random assignments, instead of uniformly random assignments.  more » « less
Award ID(s):
1955351
NSF-PAR ID:
10519115
Author(s) / Creator(s):
;
Publisher / Repository:
IEEE
Date Published:
ISSN:
9798350318944
ISBN:
979-8-3503-1894-4
Page Range / eLocation ID:
253 to 284
Format(s):
Medium: X
Location:
Santa Cruz, CA, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Nikhil, Bansal ; Nagarajan, Viswanath (Ed.)
    We initiate a study of the streaming complexity of constraint satisfaction problems (CSPs) when the constraints arrive in a random order. We show that there exists a CSP, namely Max-DICUT, for which random ordering makes a provable difference. Whereas a 4/9 ≈ 0.445 approximation of DICUT requires space with adversarial ordering, we show that with random ordering of constraints there exists a 0.483-approximation algorithm that only needs O(log n) space. We also give new algorithms for Max-DICUT in variants of the adversarial ordering setting. Specifically, we give a two-pass O(log n) space 0.483-approximation algorithm for general graphs and a single-pass space 0.483-approximation algorithm for bounded-degree graphs. On the negative side, we prove that CSPs where the satisfying assignments of the constraints support a one-wise independent distribution require -space for any non-trivial approximation, even when the constraints are randomly ordered. This was previously known only for adversarially ordered constraints. Extending the results to randomly ordered constraints requires switching the hard instances from a union of random matchings to simple Erdős-Renyi random (hyper)graphs and extending tools that can perform Fourier analysis on such instances. The only CSP to have been considered previously with random ordering is Max-CUT where the ordering is not known to change the approximability. Specifically it is known to be as hard to approximate with random ordering as with adversarial ordering, for space algorithms. Our results show a richer variety of possibilities and motivate further study of CSPs with randomly ordered constraints. 
    more » « less
  2. We introduce a notion of \emph{generic local algorithm} which strictly generalizes existing frameworks of local algorithms such as \emph{factors of i.i.d.} by capturing local \emph{quantum} algorithms such as the Quantum Approximate Optimization Algorithm (QAOA). Motivated by a question of Farhi et al. [arXiv:1910.08187, 2019] we then show limitations of generic local algorithms including QAOA on random instances of constraint satisfaction problems (CSPs). Specifically, we show that any generic local algorithm whose assignment to a vertex depends only on a local neighborhood with o(n) other vertices (such as the QAOA at depth less than ϵlog(n)) cannot arbitrarily-well approximate boolean CSPs if the problem satisfies a geometric property from statistical physics called the coupled overlap-gap property (OGP) [Chen et al., Annals of Probability, 47(3), 2019]. We show that the random MAX-k-XOR problem has this property when k≥4 is even by extending the corresponding result for diluted k-spin glasses. Our concentration lemmas confirm a conjecture of Brandao et al. [arXiv:1812.04170, 2018] asserting that the landscape independence of QAOA extends to logarithmic depth -- in other words, for every fixed choice of QAOA angle parameters, the algorithm at logarithmic depth performs almost equally well on almost all instances. One of these concentration lemmas is a strengthening of McDiarmid's inequality, applicable when the random variables have a highly biased distribution, and may be of independent interest. 
    more » « less
  3. Chakrabarti, Amit ; Swamy, Chaitanya (Ed.)
    We analyze the sketching approximability of constraint satisfaction problems on Boolean domains, where the constraints are balanced linear threshold functions applied to literals. In particular, we explore the approximability of monarchy-like functions where the value of the function is determined by a weighted combination of the vote of the first variable (the president) and the sum of the votes of all remaining variables. The pure version of this function is when the president can only be overruled by when all remaining variables agree. For every k ≥ 5, we show that CSPs where the underlying predicate is a pure monarchy function on k variables have no non-trivial sketching approximation algorithm in o(√n) space. We also show infinitely many weaker monarchy functions for which CSPs using such constraints are non-trivially approximable by O(log(n)) space sketching algorithms. Moreover, we give the first example of sketching approximable asymmetric Boolean CSPs. Our results work within the framework of Chou, Golovnev, Sudan, and Velusamy (FOCS 2021) that characterizes the sketching approximability of all CSPs. Their framework can be applied naturally to get a computer-aided analysis of the approximability of any specific constraint satisfaction problem. The novelty of our work is in using their work to get an analysis that applies to infinitely many problems simultaneously. 
    more » « less
  4. The paper continues the development of the rigorous supersymmetric transfer matrix approach to the random band matrices started in (J Stat Phys 164:1233–1260, 2016; Commun Math Phys 351:1009–1044, 2017). We consider random Hermitian block band matrices consisting of $W\times W$ random Gaussian blocks (parametrized by $j,k \in\Lambda=[1,n]^d\cap \mathbb{Z}^d$) with a fixed entry's variance $J_{jk}=\delta_{j,k}W^{-1}+\beta\Delta_{j,k}W^{-2}$, $\beta>0$ in each block. Taking the limit $W\to\infty$ with fixed $n$ and $\beta$, we derive the sigma-model approximation of the second correlation function similar to Efetov's one. Then, considering the limit $\beta, n\to\infty$, we prove that in the dimension $d=1$ the behaviour of the sigma-model approximation in the bulk of the spectrum, as $\beta\gg n$, is determined by the classical Wigner -- Dyson statistics. 
    more » « less
  5. Leonardi, Stefano ; Gupta, Anupam (Ed.)
    We consider the approximability of constraint satisfaction problems in the streaming setting. For every constraint satisfaction problem (CSP) on n variables taking values in {0,…,q−1}, we prove that improving over the trivial approximability by a factor of q requires Ω(n) space even on instances with O(n) constraints. We also identify a broad subclass of problems for which any improvement over the trivial approximability requires Ω(n) space. The key technical core is an optimal, q−(k−1)-inapproximability for the Max k-LIN-mod q problem, which is the Max CSP problem where every constraint is given by a system of k−1 linear equations mod q over k variables. Our work builds on and extends the breakthrough work of Kapralov and Krachun (Proc. STOC 2019) who showed a linear lower bound on any non-trivial approximation of the MaxCut problem in graphs. MaxCut corresponds roughly to the case of Max k-LIN-mod q with k=q=2. For general CSPs in the streaming setting, prior results only yielded Ω(√n) space bounds. In particular no linear space lower bound was known for an approximation factor less than 1/2 for any CSP. Extending the work of Kapralov and Krachun to Max k-LIN-mod q to k>2 and q>2 (while getting optimal hardness results) is the main technical contribution of this work. Each one of these extensions provides non-trivial technical challenges that we overcome in this work. 
    more » « less