skip to main content


Title: Metric-Fair Classifier Derandomization
We study the problem of classifier derandomization in machine learning: given a stochastic binary classifier f:X→[0,1], sample a deterministic classifier f̂ :X→{0,1} that approximates the output of f in aggregate over any data distribution. Recent work revealed how to efficiently derandomize a stochastic classifier with strong output approximation guarantees, but at the cost of individual fairness -- that is, if f treated similar inputs similarly, f̂ did not. In this paper, we initiate a systematic study of classifier derandomization with metric fairness guarantees. We show that the prior derandomization approach is almost maximally metric-unfair, and that a simple ``random threshold'' derandomization achieves optimal fairness preservation but with weaker output approximation. We then devise a derandomization procedure that provides an appealing tradeoff between these two: if f is α-metric fair according to a metric d with a locality-sensitive hash (LSH) family, then our derandomized f̂ is, with high probability, O(α)-metric fair and a close approximation of f. We also prove generic results applicable to all (fair and unfair) classifier derandomization procedures, including a bias-variance decomposition and reductions between various notions of metric fairness.  more » « less
Award ID(s):
2040800 2143895
NSF-PAR ID:
10403107
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
International Conference on Machine Learning (ICML)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Roth, A (Ed.)
    It is well understood that classification algorithms, for example, for deciding on loan applications, cannot be evaluated for fairness without taking context into account. We examine what can be learned from a fairness oracle equipped with an underlying understanding of “true” fairness. The oracle takes as input a (context, classifier) pair satisfying an arbitrary fairness definition, and accepts or rejects the pair according to whether the classifier satisfies the underlying fairness truth. Our principal conceptual result is an extraction procedure that learns the underlying truth; moreover, the procedure can learn an approximation to this truth given access to a weak form of the oracle. Since every “truly fair” classifier induces a coarse metric, in which those receiving the same decision are at distance zero from one another and those receiving different decisions are at distance one, this extraction process provides the basis for ensuring a rough form of metric fairness, also known as individual fairness. Our principal technical result is a higher fidelity extractor under a mild technical constraint on the weak oracle’s conception of fairness. Our framework permits the scenario in which many classifiers, with differing outcomes, may all be considered fair. Our results have implications for interpretablity – a highly desired but poorly defined property of classification systems that endeavors to permit a human arbiter to reject classifiers deemed to be“unfair” or illegitimately derived. 
    more » « less
  2. Blum, A (Ed.)
    Algorithmic fairness, and in particular the fairness of scoring and classification algorithms, has become a topic of increasing social concern and has recently witnessed an explosion of research in theoretical computer science, machine learning, statistics, the social sciences, and law. Much of the literature considers the case of a single classifier (or scoring function) used once, in isolation. In this work, we initiate the study of the fairness properties of systems composed of algorithms that are fair in isolation; that is, we study fairness under composition. We identify pitfalls of naïve composition and give general constructions for fair composition, demonstrating both that classifiers that are fair in isolation do not necessarily compose into fair systems and also that seemingly unfair components may be carefully combined to construct fair systems. We focus primarily on the individual fairness setting proposed in [Dwork, Hardt, Pitassi, Reingold, Zemel, 2011], but also extend our results to a large class of group fairness definitions popular in the recent literature, exhibiting several cases in which group fairness definitions give misleading signals under composition. 
    more » « less
  3. Clustering is a fundamental problem in unsupervised machine learning, and due to its numerous societal implications fair variants of it have recently received significant attention. In this work we introduce a novel definition of individual fairness for clustering problems. Specifically, in our model, each point j has a set of other points S(j) that it perceives as similar to itself, and it feels that it is being fairly treated if the quality of service it receives in the solution is α-close (in a multiplicative sense, for some given α ≥ 1) to that of the points in S(j). We begin our study by answering questions regarding the combinatorial structure of the problem, namely for what values of α the problem is well-defined, and what the behavior of the Price of Fairness (PoF) for it is. For the well-defined region of α, we provide efficient and easily-implementable approximation algorithms for the k-center objective, which in certain cases also enjoy bounded-PoF guarantees. We finally complement our analysis by an extensive suite of experiments that validates the effectiveness of our theoretical results. 
    more » « less
  4. Fairness in classification has become an increasingly relevant and controversial issue as com- puters replace humans in many of today’s classification tasks. In particular, a subject of much recent debate is that of finding, and subsequently achieving, suitable definitions of fairness in an algorithmic context. In this work, following the work of Hardt et al. (NIPS’16), we consider and formalize the task of sanitizing an unfair classifier C into a classifier C′ satisfying an approximate notion of “equalized odds” or fair treatment. Our main result shows how to take any (possibly unfair) classifier C over a finite outcome space, and transform it—by just perturbing the out- put of C—according to some distribution learned by just having black-box access to samples of labeled, and previously classified, data, to produce a classifier C′ that satisfies fair treatment; we additionally show that our derived classifier is near-optimal in terms of accuracy. We also experimentally evaluate the performance of our method. 
    more » « less
  5. Minimizing risk with fairness constraints is one of the popular approaches to learning a fair classifier. Recent works showed that this approach yields an unfair classifier if the training set is corrupted. In this work, we study the minimum amount of data corruption required for a successful flipping attack. First, we find lower/upper bounds on this quantity and show that these bounds are tight when the target model is the unique unconstrained risk minimizer. Second, we propose a computationally efficient data poisoning attack algorithm that can compromise the performance of fair learning algorithms. 
    more » « less