Fair Clustering Under a Bounded Cost
Clustering is a fundamental unsupervised learning problem where a dataset is partitioned into clusters that consist of nearby points in a metric space. A recent variant, fair clustering, associates a color with each point representing its group membership and requires that each color has (approximately) equal representation in each cluster to satisfy group fairness. In this model, the cost of the clustering objective increases due to enforcing fairness in the algorithm. The relative increase in the cost, the “price of fairness,” can indeed be unbounded. Therefore, in this paper we propose to treat an upper bound on the clustering objective as a constraint on the clustering problem, and to maximize equality of representation subject to it. We consider two fairness objectives: the group utilitarian objective and the group egalitarian objective, as well as the group leximin objective which generalizes the group egalitarian objective. We derive fundamental lower bounds on the approximation of the utilitarian and egalitarian objectives and introduce algorithms with provable guarantees for them. For the leximin objective we introduce an effective heuristic algorithm. We further derive impossibility results for other natural fairness objectives. We conclude with experimental results on real-world datasets that demonstrate the validity of our algorithms.
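As a rough illustration of this bounded-cost framing, the Python sketch below evaluates how fairly a given clustering represents each group and picks, among candidate clusterings whose cost stays within a budget, the one with the best fairness value. The per-group "representation error" used here (the worst gap, over clusters, between a group's share of a cluster and its share of the whole dataset) and all function names are illustrative assumptions rather than the paper's exact objectives or algorithms.

```python
# Illustrative sketch only: the "representation error" below is an assumed
# proxy for equality of representation, not necessarily the paper's exact
# utilitarian/egalitarian objectives.
from collections import Counter

def representation_errors(labels, colors):
    """For each group (color), the worst deviation of its share of a cluster
    from its share of the whole dataset, taken over all clusters."""
    n = len(labels)
    overall = {c: cnt / n for c, cnt in Counter(colors).items()}
    clusters = {}
    for lab, col in zip(labels, colors):
        clusters.setdefault(lab, []).append(col)
    errors = {c: 0.0 for c in overall}
    for members in clusters.values():
        counts = Counter(members)
        for c in overall:
            frac = counts.get(c, 0) / len(members)
            errors[c] = max(errors[c], abs(frac - overall[c]))
    return errors

def utilitarian(errors):
    # total representation error across groups (smaller is fairer overall)
    return sum(errors.values())

def egalitarian(errors):
    # worst-off group's representation error (smaller is fairer to that group)
    return max(errors.values())

def best_within_budget(candidates, costs, colors, budget, objective=egalitarian):
    """Bounded-cost framing: among candidate clusterings whose clustering cost
    (e.g. k-median cost) is at most `budget`, return the fairest one."""
    feasible = [lab for lab, cost in zip(candidates, costs) if cost <= budget]
    if not feasible:
        return None
    return min(feasible, key=lambda lab: objective(representation_errors(lab, colors)))
```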
- PAR ID: 10315237
- Date Published:
- Journal Name: Advances in Neural Information Processing Systems 34
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- The classic house allocation problem is primarily concerned with finding a matching between a set of agents and a set of houses that guarantees some notion of economic efficiency (e.g., utilitarian welfare). While recent works have shifted focus to achieving fairness (e.g., minimizing the number of envious agents), fairness often comes at a notable cost in efficiency notions such as utilitarian or egalitarian welfare. We investigate the trade-offs between these welfare measures and several natural fairness measures that rely on the number of envious agents, the total (aggregate) envy of all agents, and the maximum total envy of any agent. In particular, by focusing on envy-free allocations, we first show that, should one exist, finding an envy-free allocation with maximum utilitarian or egalitarian welfare is computationally tractable. We highlight a rather stark contrast between utilitarian and egalitarian welfare by showing that finding utilitarian-welfare-maximizing allocations that minimize the aforementioned fairness measures can be done in polynomial time, while their egalitarian counterparts remain intractable (for the most part) even under binary valuations. We complement our theoretical findings with insights into the relationship between the different fairness measures and with an empirical analysis. (A small illustrative computation of these envy measures appears after this list.)
- We study the problem of fair k-median, where each cluster is required to have a fair representation of individuals from different groups. In the fair representation k-median problem, we are given a set of points X in a metric space. Each point x ∈ X belongs to one of ℓ groups. Further, we are given fair representation parameters α_j and β_j for each group j ∈ [ℓ]. We say that a k-clustering C_1, ⋅⋅⋅, C_k fairly represents all groups if the number of points from group j in cluster C_i is between α_j |C_i| and β_j |C_i| for every j ∈ [ℓ] and i ∈ [k]. The goal is to find a set of k centers and an assignment ϕ such that the clustering defined by ϕ fairly represents all groups and minimizes the ℓ_1-objective ∑_{x ∈ X} d(x, ϕ(x)). We present an O(log k)-approximation algorithm that runs in time n^{O(ℓ)}. Note that the known algorithms for the problem either (i) violate the fairness constraints by an additive term or (ii) run in time that is exponential in both k and ℓ. We also consider an important special case of the problem where and for all j ∈ [ℓ]. For this special case, we present an O(log k)-approximation algorithm that runs in time. (A small sketch of the fair-representation constraint and the ℓ_1-objective appears after this list.)
- We study the problem of fairly allocating a set of indivisible goods among agents with matroid rank valuations: every good provides a marginal value of 0 or 1 when added to a bundle, and valuations are submodular. We present a simple algorithmic framework, called General Yankee Swap, that can efficiently compute allocations that maximize any justice criterion (or fairness objective) satisfying some mild assumptions. Along with maximizing a justice criterion, General Yankee Swap is guaranteed to maximize utilitarian social welfare, ensure strategyproofness, and use at most a quadratic number of valuation queries. We show how General Yankee Swap can be used to compute allocations for five different well-studied justice criteria: (a) prioritized Lorenz dominance, (b) maximin fairness, (c) weighted leximin, (d) max weighted Nash welfare, and (e) max weighted p-mean welfare. In particular, this framework provides the first polynomial-time algorithms to compute weighted leximin, max weighted Nash welfare, and max weighted p-mean welfare allocations for agents with matroid rank valuations. We also extend this framework to the setting of binary chores (items with marginal values -1 or 0) and similarly show that it can be used to maximize any justice criterion satisfying some mild assumptions. (An illustrative example of a matroid rank valuation appears after this list.)
- Gørtz, Inge Li; Farach-Colton, Martin; Puglisi, Simon J.; Herman, Grzegorz (Eds.) In the Minimum Bisection problem, the input is a graph G and the goal is to partition the vertex set into two parts A and B such that ||A|-|B|| ≤ 1 and the number k of edges between A and B is minimized. The problem is known to be NP-hard, and assuming the Unique Games Conjecture even NP-hard to approximate within a constant factor [Khot and Vishnoi, J.ACM'15]. On the other hand, an 𝒪(log n)-approximation algorithm [Räcke, STOC'08] and a parameterized algorithm [Cygan et al., ACM Transactions on Algorithms'20] running in time k^𝒪(k) n^𝒪(1) are known. The Minimum Bisection problem can be viewed as a clustering problem where edges represent similarity and the task is to partition the vertices into two equally sized clusters while minimizing the number of pairs of similar objects that end up in different clusters. Motivated by a number of egregious examples of unfair bias in AI systems, many fundamental clustering problems have been revisited and re-formulated to incorporate fairness constraints. In this paper we initiate the study of the Minimum Bisection problem with fairness constraints. Here the input is a graph G, positive integers c and k, a function χ:V(G) → {1, …, c} that assigns a color χ(v) to each vertex v in G, and c integers r_1, r_2, ⋯, r_c. The goal is to partition the vertex set of G into two almost-equal-sized parts A and B with at most k edges between them, such that for each color i ∈ {1, …, c}, A has exactly r_i vertices of color i. Each color class corresponds to a group which we require the partition (A, B) to treat fairly, and the constraint that A has exactly r_i vertices of color i can be used to encode that no group is over- or under-represented in either of the two clusters. We first show that introducing fairness constraints appears to make the Minimum Bisection problem qualitatively harder. Specifically, we show that unless FPT = W[1] the problem admits no f(c)n^𝒪(1) time algorithm even when k = 0. On the other hand, our main technical contribution shows that this hardness result is simply a consequence of the very strict requirement that each color class i has exactly r_i vertices in A. In particular, we give an f(k,c,ε)n^𝒪(1) time algorithm that finds a balanced partition (A, B) with at most k edges between them, such that for each color i ∈ [c], there are at most (1±ε)r_i vertices of color i in A. Our approximation algorithm is best viewed as a proof of concept that the technique introduced by [Lampis, ICALP'18] for obtaining FPT-approximation algorithms for problems of bounded tree-width or clique-width can be efficiently exploited even on graphs of unbounded width. The key insight is that the technique of Lampis is applicable on tree decompositions with unbreakable bags (as introduced in [Cygan et al., SIAM Journal on Computing'14]). An important ingredient of our approximation scheme is a combinatorial result that may be of independent interest, namely that for every k, every graph G admits a tree decomposition with adhesions of size at most 𝒪(k), unbreakable bags, and logarithmic depth. (A small feasibility check for this fair bisection formulation appears after this list.)
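For the house allocation abstract above, here is a minimal Python sketch of the three envy-based fairness measures it mentions (number of envious agents, total aggregate envy, and the maximum total envy of any agent), together with utilitarian and egalitarian welfare. The matrix-based valuation interface and names are assumptions made for illustration, not the paper's implementation.

```python
# Illustrative only: `value[i][h]` is agent i's (assumed numeric) value for
# house h, and `assignment[i]` is the house matched to agent i.
def envy_measures(value, assignment):
    n = len(assignment)
    per_agent_envy = []
    for i in range(n):
        own = value[i][assignment[i]]
        # agent i's envy toward j: how much more i values j's house than its own
        envy_i = sum(max(0, value[i][assignment[j]] - own) for j in range(n) if j != i)
        per_agent_envy.append(envy_i)
    return {
        "num_envious": sum(1 for e in per_agent_envy if e > 0),
        "total_envy": sum(per_agent_envy),
        "max_envy": max(per_agent_envy),
    }

def utilitarian_welfare(value, assignment):
    # sum of agents' values for their own houses
    return sum(value[i][assignment[i]] for i in range(len(assignment)))

def egalitarian_welfare(value, assignment):
    # value of the worst-off agent
    return min(value[i][assignment[i]] for i in range(len(assignment)))
```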
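For the fair representation k-median abstract, the sketch below only verifies a given solution: it checks the α_j |C_i| ≤ · ≤ β_j |C_i| constraints and evaluates the ℓ_1-objective. It is not the O(log k)-approximation algorithm itself, and the dictionary-based interface and names are assumptions.

```python
# Illustrative verification of the fair-representation constraint and the
# l1 objective; `assignment` maps each point to its chosen center, `color`
# maps each point to its group, `alpha`/`beta` are the per-group bounds,
# and `dist` is any metric on points and centers.
from collections import Counter, defaultdict

def fairly_represents(assignment, color, alpha, beta):
    clusters = defaultdict(list)
    for x, center in assignment.items():
        clusters[center].append(color[x])
    for members in clusters.values():
        counts = Counter(members)
        size = len(members)
        for j in alpha:
            # require alpha_j * |C_i| <= (# points of group j in C_i) <= beta_j * |C_i|
            if not (alpha[j] * size <= counts.get(j, 0) <= beta[j] * size):
                return False
    return True

def l1_objective(assignment, dist):
    # sum over points of the distance to their assigned center
    return sum(dist(x, center) for x, center in assignment.items())
```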
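For the matroid-rank-valuations abstract, here is one concrete, purely illustrative example of such a valuation: a partition-matroid rank function, for which every marginal value is 0 or 1 and the function is submodular. It does not reproduce the General Yankee Swap framework, and the item and category names are hypothetical.

```python
# Illustrative partition-matroid rank valuation: an agent has a per-category
# capacity, and a bundle's value is the number of items that fit within
# those capacities, so each added good is worth 0 or 1 at the margin.
from collections import Counter

def partition_matroid_rank(bundle, category, capacity):
    counts = Counter(category[g] for g in bundle)
    return sum(min(cnt, capacity.get(cat, 0)) for cat, cnt in counts.items())

# Hypothetical example: an agent who can use at most two books and one tool.
capacity = {"book": 2, "tool": 1}
category = {"g1": "book", "g2": "book", "g3": "book", "g4": "tool"}
print(partition_matroid_rank({"g1", "g2", "g3"}, category, capacity))  # 2
print(partition_matroid_rank({"g1", "g4"}, category, capacity))        # 2
```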
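For the fair Minimum Bisection abstract, the sketch below checks feasibility of a candidate partition: nearly equal sides, at most k cut edges, and per-color counts in A matching r_i (exactly when eps = 0, or within (1 ± eps)·r_i as in the relaxed version the approximation algorithm targets). The interface and names are assumptions for illustration.

```python
# Illustrative feasibility check for the fair Minimum Bisection formulation.
# `vertices` is the vertex set, `edges` an iterable of (u, v) pairs, `A` one
# side of the partition, `chi` maps vertices to colors, and `r` maps each
# color to its required count in A.
from collections import Counter

def is_fair_bisection(vertices, edges, A, chi, r, k, eps=0.0):
    A = set(A)
    B = set(vertices) - A
    if abs(len(A) - len(B)) > 1:           # the two sides must be almost equal
        return False
    cut = sum(1 for u, v in edges if (u in A) != (v in A))
    if cut > k:                            # at most k edges may cross the partition
        return False
    counts = Counter(chi[v] for v in A)
    for color, target in r.items():
        # exact version: eps = 0; relaxed version allows (1 ± eps) * r_i
        if not ((1 - eps) * target <= counts.get(color, 0) <= (1 + eps) * target):
            return False
    return True
```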