We study the problem of fair k-median where each cluster is required to have a fair representation of individuals from different groups. In the fair representation k-median problem, we are given a set of points X in a metric space. Each point x ∈ X belongs to one of ℓ groups. Further, we are given fair representation parameters α_j and β_j for each group j ∈ [ℓ]. We say that a k-clustering C_1, …, C_k fairly represents all groups if the number of points from group j in cluster C_i is between α_j |C_i| and β_j |C_i| for every j ∈ [ℓ] and i ∈ [k]. The goal is to find a set of k centers and an assignment ϕ of points to centers such that the clustering they define fairly represents all groups and minimizes the ℓ_1-objective ∑_{x ∈ X} d(x, ϕ(x)).
We present an O(log k)-approximation algorithm that runs in time n^{O(ℓ)}. Note that the known algorithms for the problem either (i) violate the fairness constraints by an additive term or (ii) run in time that is exponential in both k and ℓ. We also consider an important special case of the problem where and for all j ∈ [ℓ]. For this special case, we present an O(log k)-approximation algorithm that runs in time.
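To make the definitions above concrete, here is a minimal Python sketch that evaluates the ℓ_1-objective and checks the fair representation constraint for a given assignment. It only evaluates a candidate solution and is not the approximation algorithm from the abstract. The function names, the dictionary-based encoding of ϕ and of group membership, and the toy data are all assumptions made for illustration.

```python
from collections import Counter

def l1_cost(points, phi, dist):
    """Sum of d(x, phi(x)) over all points x (the l_1-objective)."""
    return sum(dist(x, phi[x]) for x in points)

def fairly_represents(phi, group_of, alpha, beta):
    """Check alpha_j*|C_i| <= (# group-j points in C_i) <= beta_j*|C_i|
    for every cluster C_i induced by phi and every group j."""
    clusters = {}
    for x, center in phi.items():
        clusters.setdefault(center, []).append(x)
    for members in clusters.values():
        counts = Counter(group_of[x] for x in members)
        size = len(members)
        for j in alpha:
            if not (alpha[j] * size <= counts[j] <= beta[j] * size):
                return False
    return True

# Hypothetical toy instance: four points on a line, two groups.
points = [0, 1, 10, 11]
group_of = {0: "a", 1: "b", 10: "a", 11: "b"}
alpha = {"a": 0.5, "b": 0.5}
beta = {"a": 0.5, "b": 0.5}
dist = lambda x, y: abs(x - y)

balanced = {0: 0, 1: 0, 10: 10, 11: 10}   # each cluster has one "a" and one "b"
by_group = {0: 0, 10: 0, 1: 1, 11: 1}     # each cluster contains a single group
```

In this toy instance the `balanced` assignment satisfies the constraints, while `by_group` does not because each of its clusters contains only one group.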
Approximation Algorithms for Socially Fair Clustering
We present an $e^{O(p)} (\log \ell) / (\log \log \ell)$-approximation algorithm for socially fair clustering with the $\ell_p$-objective. In this problem, we are given a set of points in a metric space. Each point belongs to one (or several) of $\ell$ groups. The goal is to find a $k$-medians, $k$-means, or, more generally, $\ell_p$-clustering that is simultaneously good for all of the groups. More precisely, we need to find a set of $k$ centers $C$ so as to minimize the maximum over all groups $j$ of $\sum_{u \text{ in group } j} d(u, C)^p$. The socially fair clustering problem was independently proposed by Abbasi, Bhaskara, and Venkatasubramanian (2021) and Ghadiri, Samadi, and Vempala (2021). Our algorithm improves and generalizes their $O(\ell)$-approximation algorithms for the problem. The natural LP relaxation for the problem has an integrality gap of $\Omega(\ell)$. In order to obtain our result, we introduce a strengthened LP relaxation and show that it has an integrality gap of $\Theta((\log \ell) / (\log \log \ell))$ for a fixed $p$. Additionally, we present a bicriteria approximation algorithm, which generalizes the bicriteria approximation of Abbasi et al. (2021).
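As an illustration of the objective defined above, the following Python sketch evaluates the socially fair cost of a fixed set of centers, taking $d(u, C)$ to be the distance from $u$ to its nearest center in $C$. It is not the approximation algorithm from the abstract. The function name, the dictionary encoding of groups, and the example data are assumptions made for illustration.

```python
def socially_fair_cost(groups, centers, dist, p):
    """Return max over groups j of sum_{u in group j} d(u, C)^p,
    where d(u, C) is the distance from u to the nearest center in C."""
    def d_to_centers(u):
        return min(dist(u, c) for c in centers)
    return max(
        sum(d_to_centers(u) ** p for u in members)
        for members in groups.values()
    )

# Hypothetical toy instance on the real line; a point may appear in several groups.
groups = {"g1": [0, 1], "g2": [10, 14]}
centers = [0, 10]
dist = lambda x, y: abs(x - y)
```

With $p = 1$ the cost corresponds to the $k$-medians case and with $p = 2$ to the $k$-means case; in both, the group with the largest total cost determines the value.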
NSF-PAR ID: 10336944
Editor(s): Belkin, Mikhail; Kpotufe, Samory
Date Published:
Journal Name: Proceedings of the Conference on Learning Theory, PMLR
Volume: 134
Page Range / eLocation ID: 3246-3264
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this


Chan, Timothy ; Fischer, Johannes ; Iacono, John ; Herman, Grzegorz (Ed.) We consider two-cost network design models in which edges of the input graph have an associated cost and length. We build upon recent advances in hop-constrained oblivious routing to obtain two sets of results. We address multicommodity buy-at-bulk network design in the nonuniform setting. Existing polylogarithmic approximations are based on the junction tree approach [Chekuri et al., 2010; Guy Kortsarz and Zeev Nutov, 2011]. We obtain a new polylogarithmic approximation via a natural LP relaxation. This establishes an upper bound on its integrality gap and affirmatively answers an open question raised in [Chekuri et al., 2010]. The rounding is based on recent results in hop-constrained oblivious routing [Ghaffari et al., 2021], and this technique yields a polylogarithmic approximation in more general settings such as set connectivity. Our algorithm for buy-at-bulk network design is based on an LP-based reduction to h-hop constrained network design, for which we obtain LP-based bicriteria approximation algorithms. We also consider a fault-tolerant version of h-hop constrained network design where one wants to design a low-cost network to guarantee short paths between a given set of source-sink pairs even when k-1 edges can fail. This model has been considered in network design [Luis Gouveia and Markus Leitner, 2017; Gouveia et al., 2018; Arslan et al., 2020] but no approximation algorithms were known. We obtain polylogarithmic bicriteria approximation algorithms for the single-source setting for any fixed k. We build upon the single-source algorithm and the junction-tree approach to obtain an approximation algorithm for the multicommodity setting when at most one edge can fail.

Megow, Nicole ; Smith, Adam (Ed.) In this paper, we study the weighted k-server problem on the uniform metric in both the offline and online settings. We start with the offline setting. In contrast to the (unweighted) k-server problem, which has a polynomial-time solution using min-cost flows, there are strong computational lower bounds for the weighted k-server problem, even on the uniform metric. Specifically, we show that assuming the unique games conjecture, there are no polynomial-time algorithms with a sub-polynomial approximation factor, even if we use c-resource augmentation for c < 2. Furthermore, if we consider the natural LP relaxation of the problem, then obtaining a bounded integrality gap requires us to use at least 𝓁-resource augmentation, where 𝓁 is the number of distinct server weights. We complement these results by obtaining a constant-approximation algorithm via LP rounding, with a resource augmentation of (2+ε)𝓁 for any constant ε > 0. In the online setting, an exp(k) lower bound is known for the competitive ratio of any randomized algorithm for the weighted k-server problem on the uniform metric. In contrast, we show that 2𝓁-resource augmentation can bring the competitive ratio down by an exponential factor to only O(𝓁² log 𝓁). Our online algorithm uses the two-stage approach of first obtaining a fractional solution using the online primal-dual framework, and then rounding it online.

Meila, Marina ; Zhang, Tong (Ed.) In the Correlation Clustering problem, we are given a complete weighted graph $G$ with its edges labeled as “similar” and “dissimilar” by a noisy binary classifier. For a clustering $\mathcal{C}$ of graph $G$, a similar edge is in disagreement with $\mathcal{C}$ if its endpoints belong to distinct clusters, and a dissimilar edge is in disagreement with $\mathcal{C}$ if its endpoints belong to the same cluster. The disagreements vector, $\mathbf{disagree}$, is a vector indexed by the vertices of $G$ such that the $v$-th coordinate $\mathbf{disagree}_v$ equals the weight of all disagreeing edges incident on $v$. The goal is to produce a clustering that minimizes the $\ell_p$ norm of the disagreements vector for $p\geq 1$. We study the $\ell_p$ objective in Correlation Clustering under the following assumption: every similar edge has weight in $[\alpha\mathbf{w},\mathbf{w}]$ and every dissimilar edge has weight at least $\alpha\mathbf{w}$ (where $\alpha \leq 1$ and $\mathbf{w}>0$ is a scaling parameter). We give an $O\left((\frac{1}{\alpha})^{\frac{1}{2}-\frac{1}{2p}}\cdot \log\frac{1}{\alpha}\right)$ approximation algorithm for this problem. Furthermore, we show an almost matching convex programming integrality gap.
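The disagreements vector and its $\ell_p$ norm can be computed directly from the definitions above. The following Python sketch does so for a fixed clustering; it is not the approximation algorithm from the abstract. The function names, the edge encoding as `(label, weight)` pairs, and the example graph are assumptions made for illustration.

```python
def disagreements(vertices, edges, cluster_of):
    """Return disagree_v for each vertex v: the total weight of disagreeing
    edges incident on v. `edges` maps (u, v) -> (label, weight), with label
    either "similar" or "dissimilar"."""
    disagree = {v: 0.0 for v in vertices}
    for (u, v), (label, weight) in edges.items():
        same_cluster = cluster_of[u] == cluster_of[v]
        if (label == "similar" and not same_cluster) or \
           (label == "dissimilar" and same_cluster):
            disagree[u] += weight
            disagree[v] += weight
    return disagree

def lp_norm(vec, p):
    """l_p norm of a vector stored as a dict of non-negative values, p >= 1."""
    return sum(x ** p for x in vec.values()) ** (1.0 / p)

# Hypothetical toy instance: a complete graph on three vertices.
vertices = ["a", "b", "c"]
edges = {
    ("a", "b"): ("similar", 1.0),
    ("a", "c"): ("dissimilar", 1.0),
    ("b", "c"): ("similar", 0.5),
}
cluster_of = {"a": 0, "b": 0, "c": 1}
vec = disagreements(vertices, edges, cluster_of)
```

Here only the similar edge between `b` and `c` is cut by the clustering, so it is the only edge that contributes to the disagreements vector.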

Directed Steiner Tree (DST) is a central problem in combinatorial optimization and theoretical computer science: Given a directed graph G = (V, E) with edge costs c ∈ ℝ_{≥ 0}^E, a root r ∈ V and k terminals K ⊆ V, we need to output a minimum-cost arborescence in G that contains an r → t path for every t ∈ K. Recently, Grandoni, Laekhanukit and Li, and independently Ghuge and Nagarajan, gave quasi-polynomial time O(log² k / log log k)-approximation algorithms for the problem, which are tight under popular complexity assumptions. In this paper, we consider the more general Degree-Bounded Directed Steiner Tree (DBDST) problem, where we are additionally given a degree bound d_v on each vertex v ∈ V, and we require that every vertex v in the output tree has at most d_v children. We give a quasi-polynomial time (O(log n log k), O(log² n))-bicriteria approximation: The algorithm produces a solution with cost at most O(log n log k) times the cost of the optimum solution that violates the degree constraints by at most a factor of O(log² n). This is the first non-trivial result for the problem. While our cost guarantee is nearly optimal, the degree violation factor of O(log² n) is an O(log n)-factor away from the approximation lower bound of Ω(log n) from the Set Cover hardness. The hardness result holds even on the special case of the Degree-Bounded Group Steiner Tree problem on trees (DBGSTT). With the hope of closing the gap, we study the question of whether the degree violation factor can be made tight for this special case. We answer the question in the affirmative by giving an (O(log n log k), O(log n))-bicriteria approximation algorithm for DBGSTT.