We study the problem of fair kmedian where each cluster is required to have a fair representation of individuals from different groups. In the fair representation kmedian problem, we are given a set of points X in a metric space. Each point x ∈ X belongs to one of ℓ groups. Further, we are given fair representation parameters αj and β_j for each group j ∈ [ℓ]. We say that a kclustering C_1, ⋅⋅⋅, C_k fairly represents all groups if the number of points from group j in cluster C_i is between α_j C_i and β_j C_i for every j ∈ [ℓ] and i ∈ [k]. The goal is to find a set of k centers and an assignment such that the clustering defined by fairly represents all groups and minimizes the ℓ_1objective ∑_{x ∈ X} d(x, ϕ(x)).
We present an O(log k)approximation algorithm that runs in time n^{O(ℓ)}. Note that the known algorithms for the problem either (i) violate the fairness constraints by an additive term or (ii) run in time that is exponential in both k and ℓ. We also consider an important special case of the problem where and for all j ∈ [ℓ]. For this special case, we present an O(log k)approximation algorithm that runs in time.
more »
« less
Approximation Algorithms for Socially Fair Clustering
We present an $e^{O(p)} (\log \ell) / (\log \log \ell)$approximation algorithm for socially fair clustering with the $\ell_p$objective. In this problem, we are given a set of points in a metric space. Each point belongs to one (or several) of $\ell$ groups. The goal is to find a $k$medians, $k$means, or, more generally, $\ell_p$clustering that is simultaneously good for all of the groups. More precisely, we need to find a set of $k$ centers $C$ so as to minimize the maximum over all groups $j$ of $\sum_{u \text{ in group } j} d(u, C)^p$. The socially fair clustering problem was independently proposed by Abbasi, Bhaskara, and Venkatasubramanian (2021) and Ghadiri, Samadi, and Vempala (2021). Our algorithm improves and generalizes their $O(\ell)$approximation algorithms for the problem. The natural LP relaxation for the problem has an integrality gap of $\Omega(\ell)$. In order to obtain our result, we introduce a strengthened LP relaxation and show that it has an integrality gap of $\Theta((\log \ell) / (\log \log \ell))$ for a fixed p. Additionally, we present a bicriteria approximation algorithm, which generalizes the bicriteria approximation of Abbasi et al. (2021).
more »
« less
 NSFPAR ID:
 10336944
 Editor(s):
 Belkin, Mikhail; Kpotufe, Samor
 Date Published:
 Journal Name:
 Proceedings of the Conference on Learning Theory, PMLR
 Volume:
 134
 Page Range / eLocation ID:
 32463264
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


Megow, Nicole ; Smith, Adam (Ed.)In this paper, we study the weighted kserver problem on the uniform metric in both the offline and online settings. We start with the offline setting. In contrast to the (unweighted) kserver problem which has a polynomialtime solution using mincost flows, there are strong computational lower bounds for the weighted kserver problem, even on the uniform metric. Specifically, we show that assuming the unique games conjecture, there are no polynomialtime algorithms with a subpolynomial approximation factor, even if we use cresource augmentation for c < 2. Furthermore, if we consider the natural LP relaxation of the problem, then obtaining a bounded integrality gap requires us to use at least 𝓁 resource augmentation, where 𝓁 is the number of distinct server weights. We complement these results by obtaining a constantapproximation algorithm via LP rounding, with a resource augmentation of (2+ε)𝓁 for any constant ε > 0. In the online setting, an exp(k) lower bound is known for the competitive ratio of any randomized algorithm for the weighted kserver problem on the uniform metric. In contrast, we show that 2𝓁resource augmentation can bring the competitive ratio down by an exponential factor to only O(𝓁² log 𝓁). Our online algorithm uses the twostage approach of first obtaining a fractional solution using the online primaldual framework, and then rounding it online.more » « less

Meila, Marina ; Zhang, Tong (Ed.)In the Correlation Clustering problem, we are given a complete weighted graph $G$ with its edges labeled as “similar" and “dissimilar" by a noisy binary classifier. For a clustering $\mathcal{C}$ of graph $G$, a similar edge is in disagreement with $\mathcal{C}$, if its endpoints belong to distinct clusters; and a dissimilar edge is in disagreement with $\mathcal{C}$ if its endpoints belong to the same cluster. The disagreements vector, $\mathbf{disagree}$, is a vector indexed by the vertices of $G$ such that the $v$th coordinate $\mathbf{disagree}_v$ equals the weight of all disagreeing edges incident on $v$. The goal is to produce a clustering that minimizes the $\ell_p$ norm of the disagreements vector for $p\geq 1$. We study the $\ell_p$ objective in Correlation Clustering under the following assumption: Every similar edge has weight in $[\alpha\mathbf{w},\mathbf{w}]$ and every dissimilar edge has weight at least $\alpha\mathbf{w}$ (where $\alpha \leq 1$ and $\mathbf{w}>0$ is a scaling parameter). We give an $O\left((\frac{1}{\alpha})^{\frac{1}{2}\frac{1}{2p}}\cdot \log\frac{1}{\alpha}\right)$ approximation algorithm for this problem. Furthermore, we show an almost matching convex programming integrality gap.more » « less

null (Ed.)Directed Steiner Tree (DST) is a central problem in combinatorial optimization and theoretical computer science: Given a directed graph G = (V, E) with edge costs c ∈ ℝ_{≥ 0}^E, a root r ∈ V and k terminals K ⊆ V, we need to output a minimumcost arborescence in G that contains an rrightarrow t path for every t ∈ K. Recently, Grandoni, Laekhanukit and Li, and independently Ghuge and Nagarajan, gave quasipolynomial time O(log²k/log log k)approximation algorithms for the problem, which are tight under popular complexity assumptions. In this paper, we consider the more general DegreeBounded Directed Steiner Tree (DBDST) problem, where we are additionally given a degree bound d_v on each vertex v ∈ V, and we require that every vertex v in the output tree has at most d_v children. We give a quasipolynomial time (O(log n log k), O(log² n))bicriteria approximation: The algorithm produces a solution with cost at most O(log nlog k) times the cost of the optimum solution that violates the degree constraints by at most a factor of O(log²n). This is the first nontrivial result for the problem. While our costguarantee is nearly optimal, the degree violation factor of O(log²n) is an O(log n)factor away from the approximation lower bound of Ω(log n) from the Set Cover hardness. The hardness result holds even on the special case of the DegreeBounded Group Steiner Tree problem on trees (DBGSTT). With the hope of closing the gap, we study the question of whether the degree violation factor can be made tight for this special case. We answer the question in the affirmative by giving an (O(log nlog k), O(log n))bicriteria approximation algorithm for DBGSTT.more » « less

Del Pia, Alberto ; Kaibel, Volker (Ed.)A longstanding conjecture for the traveling salesman problem (TSP) states that the integrality gap of the standard linear programming relaxation of the TSP (sometimes called the Subtour LP or the HeldKarp bound) is at most 4/3 for symmetric instances of the TSP obeying the triangle inequality. In this paper we consider the halfintegral case, in which a feasible solution to the LP has solution values in {0,1/2,1} . Karlin, Klein, and Oveis Gharan [9], in a breakthrough result, were able to show that in the halfintegral case, the integrality gap is at most 1.49993; Gupta et al. [6] showed a slight improvement of this result to 1.4983. Both of these papers consider a hierarchy of critical tight sets in the support graph of the LP solution, in which some of the sets correspond to cycle cuts and the others to degree cuts. Here we show that if all the sets in the hierarchy correspond to cycle cuts, then we can find a distribution of tours whose expected cost is at most 4/3 times the value of the halfintegral LP solution; sampling from the distribution gives us a randomized 4/3approximation algorithm. We note that known bad cases for the integrality gap have a gap of 4/3 and have a halfintegral LP solution in which all the critical tight sets in the hierarchy are cycle cuts; thus our result is tight.more » « less