Approximation Algorithms for Socially Fair Clustering
We present an $e^{O(p)} (\log \ell) / (\log \log \ell)$-approximation algorithm for socially fair clustering with the $\ell_p$-objective. In this problem, we are given a set of points in a metric space. Each point belongs to one (or several) of $\ell$ groups. The goal is to find a $k$-medians, $k$-means, or, more generally, $\ell_p$-clustering that is simultaneously good for all of the groups. More precisely, we need to find a set of $k$ centers $C$ so as to minimize the maximum over all groups $j$ of $\sum_{u \text{ in group } j} d(u, C)^p$. The socially fair clustering problem was independently proposed by Abbasi, Bhaskara, and Venkatasubramanian (2021) and Ghadiri, Samadi, and Vempala (2021). Our algorithm improves and generalizes their $O(\ell)$-approximation algorithms for the problem. The natural LP relaxation for the problem has an integrality gap of $\Omega(\ell)$. In order to obtain our result, we introduce a strengthened LP relaxation and show that it has an integrality gap of $\Theta((\log \ell) / (\log \log \ell))$ for a fixed p. Additionally, we present a bicriteria approximation algorithm, which generalizes the bicriteria approximation of Abbasi et al. (2021).
Authors:
;
Editors:
;
Award ID(s):
Publication Date:
NSF-PAR ID:
10336944
Journal Name:
Proceedings of the Conference on Learning Theory, PMLR
Volume:
134
Page Range or eLocation-ID:
3246-3264
We consider the problem of covering multiple submodular constraints. Given a finite ground setN, a weight function$$w: N \rightarrow \mathbb {R}_+$$$w:N\to {R}_{+}$,rmonotone submodular functions$$f_1,f_2,\ldots ,f_r$$${f}_{1},{f}_{2},\dots ,{f}_{r}$overNand requirements$$k_1,k_2,\ldots ,k_r$$${k}_{1},{k}_{2},\dots ,{k}_{r}$the goal is to find a minimum weight subset$$S \subseteq N$$$S\subseteq N$such that$$f_i(S) \ge k_i$$${f}_{i}\left(S\right)\ge {k}_{i}$for$$1 \le i \le r$$$1\le i\le r$. We refer to this problem asMulti-Submod-Coverand it was recently considered by Har-Peled and Jones (Few cuts meet many point sets. CoRR.arxiv:abs1808.03260Har-Peled and Jones 2018) who were motivated by an application in geometry. Even with$$r=1$$$r=1$Multi-Submod-Covergeneralizes the well-known Submodular Set Cover problem (Submod-SC), and it can also be easily reduced toSubmod-SC. A simple greedy algorithm gives an$$O(\log (kr))$$$O\left(log\left(kr\right)\right)$approximation where$$k = \sum _i k_i$$$k={\sum }_{i}{k}_{i}$and this ratio cannot be improved in the general case. In this paper, motivated by several concrete applications, we consider two ways to improve upon the approximation given by the greedy algorithm. First, we give a bicriteria approximation algorithm forMulti-Submod-Coverthat covers each constraint to within a factor of$$(1-1/e-\varepsilon )$$$\left(1-1/e-\epsilon \right)$while incurring an approximation of$$O(\frac{1}{\epsilon }\log r)$$$O\left(\frac{1}{ϵ}logr\right)$in the cost. Second, we consider the special case when each$$f_i$$${f}_{i}$is a obtained from a truncated coverage function and obtain an algorithm that generalizes previous work on partial set cover (Partial-SC), covering integer programs (CIPs) and multiple vertex cover constraintsmore »
4. The densest subgraph problem in a graph (\dsg), in the simplest form, is the following. Given an undirected graph $G=(V,E)$ find a subset $S \subseteq V$ of vertices that maximizes the ratio $|E(S)|/|S|$ where $E(S)$ is the set of edges with both endpoints in $S$. \dsg and several of its variants are well-studied in theory and practice and have many applications in data mining and network analysis. In this paper we study fast algorithms and structural aspects of \dsg via the lens of \emph{supermodularity}. For this we consider the densest supermodular subset problem (\dssp): given a non-negative supermodular function $f: 2^V \rightarrow \mathbb{R}_+$, maximize $f(S)/|S|$. For \dsg we describe a simple flow-based algorithm that outputs a $(1-\eps)$-approximation in deterministic $\tilde{O}(m/\eps)$ time where $m$ is the number of edges. Our algorithm is the first to have a near-linear dependence on $m$ and $1/\eps$ and improves previous methods based on an LP relaxation. It generalizes to hypergraphs, and also yields a faster algorithm for directed \dsg. Greedy peeling algorithms have been very popular for \dsg and several variants due to their efficiency, empirical performance, and worst-case approximation guarantees. We describe a simple peeling algorithm for \dssp and analyze its approximation guarantee inmore »