skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Lossy Kernelization for (Implicit) Hitting Set Problems
We re-visit the complexity of polynomial time pre-processing (kernelization) for the d-Hitting Set problem. This is one of the most classic problems in Parameterized Complexity by itself, and, furthermore, it encompasses several other of the most well-studied problems in this field, such as Vertex Cover, Feedback Vertex Set in Tournaments (FVST) and Cluster Vertex Deletion (CVD). In fact, d-Hitting Set encompasses any deletion problem to a hereditary property that can be characterized by a finite set of forbidden induced subgraphs. With respect to bit size, the kernelization complexity of d-Hitting Set is essentially settled: there exists a kernel with 𝒪(k^d) bits (𝒪(k^d) sets and 𝒪(k^{d-1}) elements) and this it tight by the result of Dell and van Melkebeek [STOC 2010, JACM 2014]. Still, the question of whether there exists a kernel for d-Hitting Set with fewer elements has remained one of the most major open problems in Kernelization. In this paper, we first show that if we allow the kernelization to be lossy with a qualitatively better loss than the best possible approximation ratio of polynomial time approximation algorithms, then one can obtain kernels where the number of elements is linear for every fixed d. Further, based on this, we present our main result: we show that there exist approximate Turing kernelizations for d-Hitting Set that even beat the established bit-size lower bounds for exact kernelizations - in fact, we use a constant number of oracle calls, each with "near linear" (𝒪(k^{1+ε})) bit size, that is, almost the best one could hope for. Lastly, for two special cases of implicit 3-Hitting set, namely, FVST and CVD, we obtain the "best of both worlds" type of results - (1+ε)-approximate kernelizations with a linear number of vertices. In terms of size, this substantially improves the exact kernels of Fomin et al. [SODA 2018, TALG 2019], with simpler arguments.  more » « less
Award ID(s):
2008838
PAR ID:
10608200
Author(s) / Creator(s):
; ; ; ; ;
Editor(s):
Gørtz, Inge Li; Farach-Colton, Martin; Puglisi, Simon J; Herman, Grzegorz
Publisher / Repository:
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Date Published:
Volume:
274
ISSN:
1868-8969
ISBN:
978-3-95977-295-2
Page Range / eLocation ID:
49:1-49:14
Subject(s) / Keyword(s):
Hitting Set Lossy Kernelization Theory of computation → Parameterized complexity and exact algorithms
Format(s):
Medium: X Size: 14 pages; 875348 bytes Other: application/pdf
Size(s):
14 pages 875348 bytes
Right(s):
Creative Commons Attribution 4.0 International license; info:eu-repo/semantics/openAccess
Sponsoring Org:
National Science Foundation
More Like this
  1. Leroux, Jérôme; Lombardy, Sylvain; Peleg, David (Ed.)
    Feedback Vertex Set (FVS) is one of the most studied vertex deletion problems in the field of graph algorithms. In the decision version of the problem, given a graph G and an integer k, the question is whether there exists a set S of at most k vertices in G such that G-S is acyclic. It is one of the first few problems which were shown to be NP-complete, and has been extensively studied from the viewpoint of approximation and parameterized algorithms. The best-known polynomial time approximation algorithm for FVS is a 2-factor approximation, while the best known deterministic and randomized FPT algorithms run in time 𝒪^*(3.460^k) and 𝒪^*(2.7^k) respectively. In this paper, we contribute to the newly established area of parameterized approximation, by studying FVS in this paradigm. In particular, we combine the approaches of parameterized and approximation algorithms for the study of FVS, and achieve an approximation guarantee with a factor better than 2 in randomized FPT running time, that improves over the best known parameterized algorithm for FVS. We give three simple randomized (1+ε) approximation algorithms for FVS, running in times 𝒪^*(2^{εk}⋅ 2.7^{(1-ε)k}), 𝒪^*(({(4/(1+ε))^{(1+ε)}}⋅{(ε/3)^ε})^k), and 𝒪^*(4^{(1-ε)k}) respectively for every ε ∈ (0,1). Combining these three algorithms, we obtain a factor (1+ε) approximation algorithm for FVS, which has better running time than the best-known (randomized) FPT algorithm for every ε ∈ (0, 1). This is the first attempt to look at a parameterized approximation of FVS to the best of our knowledge. Our algorithms are very simple, and they rely on some well-known reduction rules used for arriving at FPT algorithms for FVS. 
    more » « less
  2. We study the bit complexity of two related fundamental computational problems in linear algebra and control theory. Our results are: (1) An Õ(n^{ω+3}a+n⁴a²+n^ωlog(1/ε)) time algorithm for finding an ε-approximation to the Jordan Normal form of an integer matrix with a-bit entries, where ω is the exponent of matrix multiplication. (2) An Õ(n⁶d⁶a+n⁴d⁴a²+n³d³log(1/ε)) time algorithm for ε-approximately computing the spectral factorization P(x) = Q^*(x)Q(x) of a given monic n× n rational matrix polynomial of degree 2d with rational a-bit coefficients having a-bit common denominators, which satisfies P(x)⪰0 for all real x. The first algorithm is used as a subroutine in the second one. Despite its being of central importance, polynomial complexity bounds were not previously known for spectral factorization, and for Jordan form the best previous best running time was an unspecified polynomial in n of degree at least twelve [Cai, 1994]. Our algorithms are simple and judiciously combine techniques from numerical and symbolic computation, yielding significant advantages over either approach by itself. 
    more » « less
  3. Beyersdorff, Olaf; Pilipczuk, Michał; Pimentel, Elaine; Thắng, Nguyễn Kim (Ed.)
    For a length n text over an alphabet of size σ, we can encode the suffix tree data structure in 𝒪(nlog σ) bits of space. It supports suffix array (SA), inverse suffix array (ISA), and longest common extension (LCE) queries in 𝒪(log^ε_σ n) time, which enables efficient pattern matching; here ε > 0 is an arbitrarily small constant. Further improvements are possible for LCE queries, where 𝒪(1) time queries can be achieved using an index of space 𝒪(nlog σ) bits. However, compactly indexing a two-dimensional text (i.e., an n× n matrix) has been a major open problem. We show progress in this direction by first presenting an 𝒪(n²log σ)-bit structure supporting LCE queries in near 𝒪((log_σ n)^{2/3}) time. We then present an 𝒪(n²log σ + n²log log n)-bit structure supporting ISA queries in near 𝒪(log n ⋅ (log_σ n)^{2/3}) time. Within a similar space, achieving SA queries in poly-logarithmic (even strongly sub-linear) time is a significant challenge. However, our 𝒪(n²log σ + n²log log n)-bit structure can support SA queries in 𝒪(n²/(σ log n)^c) time, where c is an arbitrarily large constant, which enables pattern matching in time faster than what is possible without preprocessing. We then design a repetition-aware data structure. The δ_2D compressibility measure for two-dimensional texts was recently introduced by Carfagna and Manzini [SPIRE 2023]. The measure ranges from 1 to n², with smaller δ_2D indicating a highly compressible two-dimensional text. The current data structure utilizing δ_2D allows only element access. We obtain the first structure based on δ_2D for LCE queries. It takes 𝒪^{~}(n^{5/3} + n^{8/5}δ_2D^{1/5}) space and answers queries in 𝒪(log n) time. 
    more » « less
  4. Etessami, Kousha; Feige, Uriel; Puppis, Gabriele (Ed.)
    In the Min k-Cut problem, the input is a graph G and an integer k. The task is to find a partition of the vertex set of G into k parts, while minimizing the number of edges that go between different parts of the partition. The problem is NP-complete, and admits a simple 3ⁿ⋅n^𝒪(1) time dynamic programming algorithm, which can be improved to a 2ⁿ⋅n^𝒪(1) time algorithm using the fast subset convolution framework by Björklund et al. [STOC'07]. In this paper we give an algorithm for Min k-Cut with running time 𝒪((2-ε)ⁿ), for ε > 10^{-50}. This is the first algorithm for Min k-Cut with running time 𝒪(cⁿ) for c < 2. 
    more » « less
  5. Gørtz, Inge Li; Farach-Colton, Martin; Puglisi, Simon J; Herman, Grzegorz (Ed.)
    In the Minimum Bisection problem input is a graph G and the goal is to partition the vertex set into two parts A and B, such that ||A|-|B|| ≤ 1 and the number k of edges between A and B is minimized. The problem is known to be NP-hard, and assuming the Unique Games Conjecture even NP-hard to approximate within a constant factor [Khot and Vishnoi, J.ACM'15]. On the other hand, a 𝒪(log n)-approximation algorithm [Räcke, STOC'08] and a parameterized algorithm [Cygan et al., ACM Transactions on Algorithms'20] running in time k^𝒪(k) n^𝒪(1) is known. The Minimum Bisection problem can be viewed as a clustering problem where edges represent similarity and the task is to partition the vertices into two equally sized clusters while minimizing the number of pairs of similar objects that end up in different clusters. Motivated by a number of egregious examples of unfair bias in AI systems, many fundamental clustering problems have been revisited and re-formulated to incorporate fairness constraints. In this paper we initiate the study of the Minimum Bisection problem with fairness constraints. Here the input is a graph G, positive integers c and k, a function χ:V(G) → {1, …, c} that assigns a color χ(v) to each vertex v in G, and c integers r_1,r_2,⋯,r_c. The goal is to partition the vertex set of G into two almost-equal sized parts A and B with at most k edges between them, such that for each color i ∈ {1, …, c}, A has exactly r_i vertices of color i. Each color class corresponds to a group which we require the partition (A, B) to treat fairly, and the constraints that A has exactly r_i vertices of color i can be used to encode that no group is over- or under-represented in either of the two clusters. We first show that introducing fairness constraints appears to make the Minimum Bisection problem qualitatively harder. Specifically we show that unless FPT=W[1] the problem admits no f(c)n^𝒪(1) time algorithm even when k = 0. On the other hand, our main technical contribution shows that is that this hardness result is simply a consequence of the very strict requirement that each color class i has exactly r_i vertices in A. In particular we give an f(k,c,ε)n^𝒪(1) time algorithm that finds a balanced partition (A, B) with at most k edges between them, such that for each color i ∈ [c], there are at most (1±ε)r_i vertices of color i in A. Our approximation algorithm is best viewed as a proof of concept that the technique introduced by [Lampis, ICALP'18] for obtaining FPT-approximation algorithms for problems of bounded tree-width or clique-width can be efficiently exploited even on graphs of unbounded width. The key insight is that the technique of Lampis is applicable on tree decompositions with unbreakable bags (as introduced in [Cygan et al., SIAM Journal on Computing'14]). An important ingredient of our approximation scheme is a combinatorial result that may be of independent interest, namely that for every k, every graph G admits a tree decomposition with adhesions of size at most 𝒪(k), unbreakable bags, and logarithmic depth. 
    more » « less