

Title: Advancing Subgroup Fairness via Sleeping Experts
We study methods for improving fairness to subgroups in settings with overlapping populations and sequential predictions. Classical notions of fairness focus on the balance of some property across different populations. However, in many applications the goal of the different groups is not to be predicted equally but rather to be predicted well. We demonstrate that the task of satisfying this guarantee for multiple overlapping groups is not straightforward, and show that for the simple objective of the unweighted average of false negative and false positive rates, satisfying this for overlapping populations can be statistically impossible even when we are provided predictors that perform well separately on each subgroup. On the positive side, we show that when individuals are equally important to the different groups they belong to, this goal is achievable; to do so, we draw a connection to the sleeping experts literature in online learning. Motivated by the one-sided feedback in natural settings of interest, we extend our results to such a feedback model. We also provide a game-theoretic interpretation of our results, examining the incentives of participants to join the system and to provide the system with full information about predictors they may possess. We end with several interesting open problems concerning the strength of guarantees that can be achieved in a computationally efficient manner.
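The sleeping-experts connection above can be illustrated with a minimal sketch of the standard sleeping-experts multiplicative-weights update, in which only the experts "awake" in a given round participate in the prediction and in the update. This is an illustrative textbook variant, not the specific algorithm from the paper; the function name and the learning rate `eta` are assumptions for the example.

```python
import math

def sleeping_experts_round(weights, awake, losses, eta=0.5):
    """One round of a standard sleeping-experts update (illustrative sketch,
    not the paper's algorithm). `weights` maps expert -> current weight,
    `awake` is the set of experts active this round, and `losses` maps each
    awake expert to its loss in [0, 1]."""
    total = sum(weights[e] for e in awake)
    # Prediction distribution is renormalized over awake experts only.
    probs = {e: weights[e] / total for e in awake}
    expected_loss = sum(probs[e] * losses[e] for e in awake)
    # Multiplicative update applied only to awake experts; comparing each
    # expert's loss to the algorithm's own loss means that merely being
    # asleep (or average) does not change an expert's weight.
    for e in awake:
        weights[e] *= math.exp(-eta * (losses[e] - expected_loss))
    return probs, expected_loss
```

Experts that beat the algorithm's expected loss in a round gain weight, those that do worse lose weight, and sleeping experts are untouched, which is what makes the framework natural for overlapping subgroups that are only "active" on some individuals.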
Award ID(s):
1815011 1733556
PAR ID:
10190439
Date Published:
Journal Name:
Innovations in Theoretical Computer Science Conference (ITCS)
Volume:
11
Page Range / eLocation ID:
55:1--55:24
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Making fair decisions is crucial to ethically implementing machine learning algorithms in social settings. In this work, we consider the celebrated definition of counterfactual fairness. We begin by showing that an algorithm which satisfies counterfactual fairness also satisfies demographic parity, a far simpler fairness constraint. Similarly, we show that all algorithms satisfying demographic parity can be trivially modified to satisfy counterfactual fairness. Together, our results indicate that counterfactual fairness is basically equivalent to demographic parity, which has important implications for the growing body of work on counterfactual fairness. We then validate our theoretical findings empirically, analyzing three existing algorithms for counterfactual fairness against three simple benchmarks. We find that two simple benchmark algorithms outperform all three existing algorithms---in terms of fairness, accuracy, and efficiency---on several data sets. Our analysis leads us to formalize a concrete fairness goal: to preserve the order of individuals within protected groups. We believe transparency around the ordering of individuals within protected groups makes fair algorithms more trustworthy. By design, the two simple benchmark algorithms satisfy this goal while the existing algorithms do not. 
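The demographic parity constraint that the abstract above relates to counterfactual fairness can be checked directly from a classifier's decisions: compute the positive-prediction rate per protected group and measure the largest gap. This is a minimal illustrative sketch; the function name is an assumption for the example.

```python
def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups.
    `predictions` are 0/1 decisions; `groups` gives each individual's group
    label. Demographic parity asks for this gap to be (near) zero."""
    counts = {}
    for pred, g in zip(predictions, groups):
        n_pos, n = counts.get(g, (0, 0))
        counts[g] = (n_pos + pred, n + 1)
    group_rates = {g: n_pos / n for g, (n_pos, n) in counts.items()}
    return max(group_rates.values()) - min(group_rates.values()), group_rates
```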
  2. Ensuring that technological advancements benefit all groups of people equally is crucial. The first step towards fairness is identifying existing inequalities. The naive comparison of group error rates may lead to wrong conclusions. We introduce a new method to determine whether a speaker verification system is fair toward several population subgroups. We propose to model miss and false alarm probabilities as a function of multiple factors, including the population group effects, e.g., male and female, and a series of confounding variables, e.g., speaker effects, language, nationality, etc. This model can estimate error rates related to a group effect without the influence of confounding effects. We experiment with a synthetic dataset where we control group and confounding effects. Our metric achieves significantly lower false positive and false negative rates relative to the baseline. We also experiment with VoxCeleb and NIST SRE21 datasets on different ASV systems and present our conclusions. 
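The point about confounders in the abstract above can be illustrated with a simple stratified adjustment: compute error rates within each (group, confounder) cell and average the per-stratum rates, so that imbalance in stratum sizes does not contaminate the group comparison. This is a deliberately minimal sketch, not the paper's model, which fits group and multiple confounding effects jointly.

```python
def adjusted_group_error_rates(errors, groups, confounder):
    """Per-group error rates controlling for one confounding variable by
    averaging within-stratum rates (a simple stratified adjustment;
    illustrative only). `errors` are 0/1 outcomes (e.g., miss or false
    alarm), `groups` and `confounder` give each trial's labels."""
    cells = {}
    for e, g, c in zip(errors, groups, confounder):
        n_err, n = cells.get((g, c), (0, 0))
        cells[(g, c)] = (n_err + e, n + 1)
    per_group = {}
    for (g, c), (n_err, n) in cells.items():
        per_group.setdefault(g, []).append(n_err / n)
    # Unweighted average over strata removes the effect of stratum sizes.
    return {g: sum(rates) / len(rates) for g, rates in per_group.items()}
```

A naive pooled error rate would instead weight each stratum by its sample count, which is exactly how a confounder distributed unevenly across groups produces a spurious group difference.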
  3. Rated preference aggregation is conventionally performed by averaging ratings from multiple evaluators to create a consensus ordering of candidates from highest to lowest average rating. Ideally, the consensus is fair, meaning critical opportunities are not withheld from marginalized groups of candidates, even if group biases may be present in the to-be-combined ratings. Prior work operationalizing fairness in preference aggregation is limited to settings where evaluators provide rankings of candidates (e.g., Joe > Jack > Jill). Yet, in practice, many evaluators assign ratings such as Likert scales or categories (e.g., yes, no, maybe) to each candidate. Ratings convey different information than rankings, leading to distinct fairness issues during their aggregation. The existing literature neither characterizes these fairness concerns nor provides applicable bias-mitigation solutions. Unlike the ranked setting studied previously, two unique forms of bias arise in rating aggregation. First, biased rating stems from group disparities in the to-be-aggregated evaluator ratings. Second, biased tie-breaking occurs because ties in average ratings must be resolved when aggregating ratings into a consensus ranking, and this tie-breaking step can unfairly advantage certain groups. To address this gap, we define the open fair rated preference aggregation problem and introduce the corresponding Fate methodology. Fate offers the first group fairness metric specifically for rated preference data. We propose two Fate algorithms. Fate-Break works in settings where ties need to be broken, explicitly enhancing the fairness of such processes without lowering consensus utility. Fate-Rate mitigates disparities in how groups are rated by using a Markov-chain approach to generate outcomes where groups are, as much as possible, equally represented. Our experimental study illustrates that the Fate methods provide greater bias mitigation than adaptations of prior methods to fair tie-breaking and rating aggregation. 
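The biased tie-breaking problem described above can be made concrete with a small sketch: average each candidate's ratings, then break ties by round-robin across groups so that no single group monopolizes the tied positions. This is an illustrative toy, not the Fate-Break algorithm itself; the function name and the deterministic within-tie ordering are assumptions for the example.

```python
from collections import defaultdict

def fair_consensus(ratings, groups):
    """Rank candidates by average rating (highest first), breaking ties by
    round-robin over groups. `ratings` maps candidate -> list of ratings;
    `groups` maps candidate -> group label. Illustrative sketch only."""
    avg = {c: sum(rs) / len(rs) for c, rs in ratings.items()}
    # Bucket candidates that are tied on average rating.
    buckets = defaultdict(list)
    for c, a in avg.items():
        buckets[a].append(c)
    consensus = []
    for a in sorted(buckets, reverse=True):
        by_group = defaultdict(list)
        for c in sorted(buckets[a]):  # deterministic starting order
            by_group[groups[c]].append(c)
        queues = [by_group[g] for g in sorted(by_group)]
        # Alternate between groups within the tie instead of letting one
        # group's candidates fill all the tied slots first.
        while any(queues):
            for q in queues:
                if q:
                    consensus.append(q.pop(0))
    return consensus
```

A naive alphabetical tie-break on the same input would place both g1 candidates ahead of the g2 candidate; the round-robin interleaves them.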
  4. Settings such as lending and policing can be modeled by a centralized agent allocating a scarce resource (e.g. loans or police officers) amongst several groups, in order to maximize some objective (e.g. loans given that are repaid, or criminals that are apprehended). Often in such problems fairness is also a concern. One natural notion of fairness, based on general principles of equality of opportunity, asks that conditional on an individual being a candidate for the resource in question, the probability of actually receiving it is approximately independent of the individual’s group. For example, in lending this would mean that equally creditworthy individuals in different racial groups have roughly equal chances of receiving a loan. In policing it would mean that two individuals committing the same crime in different districts would have roughly equal chances of being arrested. In this paper, we formalize this general notion of fairness for allocation problems and investigate its algorithmic consequences. Our main technical results include an efficient learning algorithm that converges to an optimal fair allocation even when the allocator does not know the frequency of candidates (i.e. creditworthy individuals or criminals) in each group. This algorithm operates in a censored feedback model in which only the number of candidates who received the resource in a given allocation can be observed, rather than the true number of candidates in each group. This models the fact that we do not learn the creditworthiness of individuals we do not give loans to and do not learn about crimes committed if the police presence in a district is low. 
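The fairness notion in the abstract above — that a candidate's probability of receiving the resource should be roughly independent of their group — can be audited directly when candidate counts are known. The sketch below shows that check; it is illustrative only, since the paper's contribution is a learning algorithm that works precisely when the candidate counts are unknown and feedback is censored.

```python
def opportunity_gap(allocations, candidates):
    """For each group, the probability that a candidate received the
    resource, plus the largest gap between any two groups. `candidates[g]`
    is the number of candidates (e.g., creditworthy individuals) in group g;
    `allocations[g]` is how many of them received the resource.
    Illustrative audit of the equality-of-opportunity condition."""
    rates = {g: allocations[g] / candidates[g] for g in candidates}
    return max(rates.values()) - min(rates.values()), rates
```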
  5. To achieve a goal, people have to keep track of how much effort they are putting in (effort monitoring) and how well they are performing (performance monitoring), which can be informed by endogenous signals, or exogenous signals providing explicit feedback about whether they have met their goal. Interventions to improve performance often focus on adjusting feedback to direct the individual on how to better invest their efforts, but is it possible that this feedback itself plays a role in shaping the experience of how effortful the task feels? Here, we examine this question directly by assessing the relationship between effort monitoring and performance monitoring. Participants (N = 68) performed a task in which their goal was to squeeze a handgrip to within a target force level (not lower or higher) for a minimum duration. On most trials, they were given no feedback as to whether they met their goal, and were largely unable to detect how they had performed. On a subset of trials, however, we provided participants with (false) feedback indicating that they had either succeeded or failed at meeting their goal (positive vs. negative feedback blocks, respectively). Sporadically, participants rated their experience of effort exertion, fatigue, and confidence in having met the target grip force on that trial. Despite being non-veridical to their actual performance, we found that the type of feedback participants received influenced their experience of effort. When receiving negative (vs. positive) feedback, participants fatigued faster and adjusted their grip strength more for higher target force levels. We also found that confidence gradually increased with increasing positive feedback and decreased with increasing negative feedback, again despite feedback being uniformly uninformative. 
These results suggest differential influences of feedback on experiences related to effort and further shed light on the relationship between experiences related to performance monitoring and effort monitoring. 