skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Differentially Private Partial Set Cover with Applications to Facility Location
Set Cover is a fundamental problem in combinatorial optimization which has been studied for many decades due to its various applications across multiple domains. In many of these domains, the input data consists of locations, relationships, and other sensitive information of individuals which may leaked due to the set cover output. Attempts have been made to design privacy-preserving algorithms to solve the Set Cover problem under privacy constraints. Under differential privacy, it has been proved that the Set Cover problem has strong impossibility results and no explicit forms of the output can be released to the public. In this work, we observe that these hardness results dissolve when we turn to the Partial Set Cover problem, where we only need to cover a constant fraction of the elements. We show that this relaxation enables us to avoid the impossibility results, and give the first algorithm which outputs an explicit form of set cover with non-trivial utility guarantees under differential privacy. Using our algorithm as a subroutine, we design a differentially-private bicriteria algorithm to solve a recently-proposed facility-location problem for vaccine distribution which generalizes k-supplier with outliers. Our analysis shows that relaxing the covering requirement also allows us to circumvent the inherent hardness of k-supplier and give the first nontrivial guarantees.  more » « less
Award ID(s):
1918749
PAR ID:
10497732
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
International Joint Conferences on Artificial Intelligence Organization
Date Published:
Journal Name:
International Joint Conference on Artificial Intelligence (IJCAI)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Directed Steiner Tree (DST) is a central problem in combinatorial optimization and theoretical computer science: Given a directed graph G = (V, E) with edge costs c ∈ ℝ_{≥ 0}^E, a root r ∈ V and k terminals K ⊆ V, we need to output a minimum-cost arborescence in G that contains an rrightarrow t path for every t ∈ K. Recently, Grandoni, Laekhanukit and Li, and independently Ghuge and Nagarajan, gave quasi-polynomial time O(log²k/log log k)-approximation algorithms for the problem, which are tight under popular complexity assumptions. In this paper, we consider the more general Degree-Bounded Directed Steiner Tree (DB-DST) problem, where we are additionally given a degree bound d_v on each vertex v ∈ V, and we require that every vertex v in the output tree has at most d_v children. We give a quasi-polynomial time (O(log n log k), O(log² n))-bicriteria approximation: The algorithm produces a solution with cost at most O(log nlog k) times the cost of the optimum solution that violates the degree constraints by at most a factor of O(log²n). This is the first non-trivial result for the problem. While our cost-guarantee is nearly optimal, the degree violation factor of O(log²n) is an O(log n)-factor away from the approximation lower bound of Ω(log n) from the Set Cover hardness. The hardness result holds even on the special case of the Degree-Bounded Group Steiner Tree problem on trees (DB-GST-T). With the hope of closing the gap, we study the question of whether the degree violation factor can be made tight for this special case. We answer the question in the affirmative by giving an (O(log nlog k), O(log n))-bicriteria approximation algorithm for DB-GST-T. 
    more » « less
  2. Differential obliviousness (DO) is a privacy notion which guarantees that the access patterns of a program satisfies differential privacy. Differential obliviousness was studied in a sequence of recent works as a relaxation of full obliviousness. Earlier works showed that DO not only allows us to circumvent the logarithmic-overhead barrier of fully oblivious algorithms, in many cases, it also allows us to achieve polynomial speedup over full obliviousness, since it avoids “padding to the worst-case” behavior of fully oblivious algorithms. Despite the promises of differential obliviousness (DO), a significant barrier that hinders its broad application is the lack of composability. In particular, when we apply one DO algorithm to the output of another DO algorithm, the composed algorithm may no longer be DO (with reasonable parameters). Specifically, the outputs of the first DO algorithm on two neighboring inputs may no longer be neighboring, and thus we cannot directly benefit from the DO guarantee of the second algorithm. In this work, we are the first to explore a theory of composition for differentially oblivious algorithms. We propose a refinement of the DO notion called (ε, δ)-neighbor-preserving-DO, or (ε,δ)-NPDO for short, and we prove that our new notion indeed provides nice compositional guarantees. In this way, the algorithm designer can easily track the privacy loss when composing multiple DO algorithms. We give several example applications to showcase the power and expressiveness of our new NPDO notion. One of these examples is a result of independent interest: we use the com- positional framework to prove an optimal privacy amplification theorem for the differentially oblivious shuffle model. In other words, we show that for a class of distributed differentially private mechanisms in the shuffle-model, one can replace the perfectly secure shuffler with a DO shuffler, and nonetheless enjoy almost the same privacy amplification enabled by a shuffler. 
    more » « less
  3. What is a minimal set of tuples to delete from a database in order to eliminate all query answers? This problem is called the resilience of a query and is one of the key algorithmic problems underlying various forms of reverse data management, such as view maintenance, deletion propagation and causal responsibility. A long-open question is determining the conjunctive queries (CQs) for which resilience can be solved in PTIME. We shed new light on this problem by proposing a unified Integer Linear Programming (ILP) formulation. It is unified in that it can solve both previously studied restrictions (e.g., self-join-free CQs under set semantics that allow a PTIME solution) and new cases (all CQs under set or bag semantics). It is also unified in that all queries and all database instances are treated with the same approach, yet the algorithm is guaranteed to terminate in PTIME for all known PTIME cases. In particular, we prove that for all known easy cases, the optimal solution to our ILP is identical to a simpler Linear Programming (LP) relaxation, which implies that standard ILP solvers return the optimal solution to the original ILP in PTIME. Our approach allows us to explore new variants and obtain new complexity results. 1) It works under bag semantics, for which we give the first dichotomy results in the problem space. 2) We extend our approach to the related problem of causal responsibility and give a more fine-grained analysis of its complexity. 3) We recover easy instances for generally hard queries, including instances with read-once provenance and instances that become easy because of Functional Dependencies in the data. 4) We solve an open conjecture about a unified hardness criterion from PODS 2020 and prove the hardness of several queries of previously unknown complexity. 5) Experiments confirm that our findings accurately predict the asymptotic running times, and that our universal ILP is at times even quicker than a previously proposed dedicated flow algorithm. 
    more » « less
  4. We study the classic Maximum Independent Set problem under the notion of stability introduced by Bilu and Linial (2010): a weighted instance of Independent Set is γ-stable if it has a unique optimal solution that remains the unique optimal solution under multiplicative perturbations of the weights by a factor of at most γ ≥ 1. The goal then is to efficiently recover this “pronounced” optimal solution exactly. In this work, we solve stable instances of Independent Set on several classes of graphs: we improve upon previous results by solving \tilde{O}(∆/sqrt(log ∆))-stable instances on graphs of maximum degree ∆, (k − 1)-stable instances on k-colorable graphs and (1 + ε)-stable instances on planar graphs (for any fixed ε > 0), using both combinatorial techniques as well as LPs and the Sherali-Adams hierarchy. For general graphs, we give an algorithm for (εn)-stable instances, for any fixed ε > 0, and lower bounds based on the planted clique conjecture. As a by-product of our techniques, we give algorithms as well as lower bounds for stable instances of Node Multiway Cut (a generalization of Edge Multiway Cut), by exploiting its connections to Vertex Cover. Furthermore, we prove a general structural result showing that the integrality gap of convex relaxations of several maximization problems reduces dramatically on stable instances. Moreover, we initiate the study of certified algorithms for Independent Set. The notion of a γ-certified algorithm was introduced very recently by Makarychev and Makarychev (2018) and it is a class of γ-approximation algorithms that satisfy one crucial property: the solution returned is optimal for a perturbation of the original instance, where perturbations are again multiplicative up to a factor of γ ≥ 1 (hence, such algorithms not only solve γ-stable instances optimally, but also have guarantees even on unstable instances). Here, we obtain ∆-certified algorithms for Independent Set on graphs of maximum degree ∆, and (1 + ε)-certified algorithms on planar graphs. Finally, we analyze the algorithm of Berman and Fürer (1994) and prove that it is a ((∆+1)/3 + ε)-certified algorithm for Independent Set on graphs of maximum degree ∆ where all weights are equal to 1. 
    more » « less
  5. Hardt, Moritz (Ed.)
    The change-point detection problem seeks to identify distributional changes at an unknown change-point k^∗ in a stream of data. This problem appears in many important practical settings involving personal data, including biosurveillance, fault detection, finance, signal detection, and security systems. The field of differential privacy offers data analysis tools that provide powerful worst-case privacy guarantees. We study the statistical problem of change-point detection through the lens of differential privacy. We give private algorithms for both online and offline change-point detection, analyze these algorithms theoretically, and provide empirical validation of our results. 
    more » « less