Title: What can experimental studies of bias tell us about real-world group disparities?
Abstract: This article questions the widespread use of experimental social psychology to understand real-world group disparities. Standard experimental practice is to design studies in which participants make judgments of targets who vary only on the social categories to which they belong. This is typically done under simplified decision landscapes and with untrained decision-makers. For example, to understand racial disparities in police shootings, researchers show pictures of armed and unarmed Black and White men to undergraduates and have them press “shoot” and “don't shoot” buttons. Having demonstrated categorical bias under these conditions, researchers then use such findings to claim that real-world disparities are also due to decision-maker bias. I describe three flaws inherent in this approach, flaws which undermine any direct contribution of experimental studies to explaining group disparities. First, the decision landscapes used in experimental studies lack crucial components present in actual decisions (missing information flaw). Second, categorical effects in experimental studies are not interpreted in light of other effects on outcomes, including behavioral differences across groups (missing forces flaw). Third, there is no systematic testing of whether the contingencies required to produce experimental effects are present in real-world decisions (missing contingencies flaw). I apply this analysis to three research topics to illustrate the scope of the problem. I discuss how this research tradition has skewed our understanding of the human mind within and beyond the discipline and how results from experimental studies of bias are generally misunderstood. I conclude by arguing that the current research tradition should be abandoned.
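To make the shooter-task paradigm concrete, here is a minimal sketch of how categorical bias is commonly quantified in such experiments, using a standard signal detection analysis; the function, trial counts, and numbers are illustrative assumptions, not material from the article.

```python
# Illustrative sketch (not from the article): the standard signal
# detection analysis of a "shoot / don't shoot" task. A hit is
# shooting an armed target; a false alarm is shooting an unarmed one.
from statistics import NormalDist

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return sensitivity (d') and decision criterion (c)."""
    z = NormalDist().inv_cdf
    # Log-linear correction keeps rates away from 0 and 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # lower c = readier to shoot
    return d_prime, criterion

# Hypothetical trial counts for each target group.
d_b, c_b = sdt_measures(hits=46, misses=4, false_alarms=12, correct_rejections=38)
d_w, c_w = sdt_measures(hits=45, misses=5, false_alarms=6, correct_rejections=44)
print(f"criterion: Black targets {c_b:.2f}, White targets {c_w:.2f}")
```

In this framing, the article's critique is that a group difference in the decision criterion c, measured under a stripped-down laboratory task, is then taken as an explanation of real-world shooting disparities.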
Award ID(s): 1756092
PAR ID: 10439219
Author(s) / Creator(s):
Date Published:
Journal Name: Behavioral and Brain Sciences
Volume: 45
ISSN: 0140-525X
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Algorithmic decision making is becoming more prevalent, increasingly impacting people’s daily lives. Recently, discussions have emerged about the fairness of decisions made by machines. Researchers have proposed different approaches for improving the fairness of these algorithms. While these approaches can help machines make fairer decisions, they have been developed and validated on fairly clean data sets. Unfortunately, most real-world data have complexities that make them dirtier. This work considers two of these complexities by analyzing the impact of two real-world data issues, missing values and selection bias, on fairness for categorical data. After formulating this problem and showing its existence, we propose repair algorithms for data sets containing missing values and/or selection bias; these algorithms use different forms of reweighting and resampling based on the missing-value generation process. We conduct an extensive empirical evaluation on both real-world and synthetic data using various fairness metrics, and demonstrate how missing values generated by different mechanisms, as well as selection bias, impact prediction fairness even when prediction accuracy remains fairly constant. (A toy reweighting sketch appears after this list.)
  2. A significant body of research in the data sciences considers unfair discrimination against social categories such as race or gender that could occur or be amplified as a result of algorithmic decisions. At the same time, real-world disparities continue to exist, even before algorithmic decisions are made. In this work, we draw on insights from the social sciences, brought into the realm of causal modeling and constrained optimization, and develop a novel algorithmic framework for tackling pre-existing real-world disparities. The purpose of our framework, which we call the “impact remediation framework,” is to measure real-world disparities and discover the optimal intervention policies that could help improve equity or access to opportunity for those who are underserved with respect to an outcome of interest. We develop a disaggregated approach to tackling pre-existing disparities that relaxes the typical set of assumptions required for the use of social categories in structural causal models. Our approach flexibly incorporates counterfactuals and is compatible with various ontological assumptions about the nature of social categories. We demonstrate impact remediation with a hypothetical case study and compare our disaggregated approach with an existing state-of-the-art approach in both structure and resulting policy recommendations. In contrast to most work on optimal policy learning, we explore disparity reduction itself as an objective, explicitly focusing the power of algorithms on reducing inequality. (A toy budget-allocation sketch appears after this list.)
  3. The current study examines the effect of sleep deprivation and caffeine use on racial bias in the decision to shoot. Participants deprived of sleep for 24 hr (vs. rested participants) made more errors in a shooting task and were more likely to shoot unarmed targets. A diffusion decision model analysis revealed that sleep deprivation decreased participants’ ability to extract information from the stimuli, whereas caffeine affected threshold separation, reflecting reduced caution. Neither sleep deprivation nor caffeine moderated anti-Black racial bias in shooting decisions or at the process level. We discuss how our results clarify discrepancies in past work testing the impact of fatigue on racial bias in shooting decisions. (A toy diffusion-model simulation appears after this list.)
  4. The use of AI-based decision aids in diverse domains has inspired many empirical investigations into how AI models’ decision recommendations affect humans’ decision accuracy in AI-assisted decision making, while explorations of the impacts on humans’ decision fairness are largely lacking despite their clear importance. In this paper, using a real-world business decision-making scenario (bidding in rental housing markets) as our testbed, we present an experimental study of how the bias level of an AI-based decision aid, as well as the provision of AI explanations, affects the fairness of humans’ decisions, both during and after their use of the decision aid. Our results suggest that when people are assisted by an AI-based decision aid, both a higher level of racial bias in the decision aid and, surprisingly, the presence of AI explanations lead to less fair human decisions across racial groups. Moreover, these impacts operate partly by triggering humans’ “disparate interactions” with the AI. However, regardless of the AI bias level and the presence of AI explanations, when people return to making independent decisions after using the AI-based decision aid, their decisions no longer exhibit significant unfairness across racial groups.
  5. Algorithmic decision-making systems are increasingly used throughout the public and private sectors to make important decisions, or to assist humans in making them, with real social consequences. While there has been substantial research in recent years on building fair decision-making algorithms, there has been less research on the factors that affect people's perceptions of fairness in these systems, which we argue is also important for their broader acceptance. In this research, we conduct an online experiment to better understand perceptions of fairness, focusing on three sets of factors: algorithm outcomes, algorithm development and deployment procedures, and individual differences. We find that people rate an algorithm as fairer when it predicts in their favor, an effect strong enough to outweigh the negative effect of describing an algorithm as heavily biased against particular demographic groups. We find that this effect is moderated by several variables, including participants' education level, gender, and several aspects of the development procedure. Our findings suggest that systems that evaluate algorithmic fairness through users' feedback must account for the possibility of an "outcome favorability" bias.
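For the first related abstract, here is a minimal sketch of one common repair strategy for missing data, inverse-probability reweighting of complete cases; the function, field names, and toy data are illustrative assumptions, not the paper's actual algorithms.

```python
# Hypothetical sketch of inverse-probability reweighting for a data set
# with missing values, in the spirit of the abstract above; the paper's
# actual fixing algorithms are not reproduced here.
from collections import Counter

def observation_weights(rows, group_key, observed_key):
    """Weight each fully observed row by 1 / P(observed | group),
    so that the reweighted complete cases mimic the full population."""
    totals = Counter(r[group_key] for r in rows)
    observed = Counter(r[group_key] for r in rows if r[observed_key])
    weighted = []
    for r in rows:
        if not r[observed_key]:
            continue  # incomplete rows are dropped, compensated by weights
        p_obs = observed[r[group_key]] / totals[r[group_key]]
        weighted.append((r, 1.0 / p_obs))
    return weighted

# Toy data: the label is missing more often for group "b" (selection bias).
rows = [
    {"group": "a", "label_observed": True},
    {"group": "a", "label_observed": True},
    {"group": "b", "label_observed": True},
    {"group": "b", "label_observed": False},
]
for row, w in observation_weights(rows, "group", "label_observed"):
    print(row["group"], round(w, 2))  # group "b" rows get weight 2.0
```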
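For the second related abstract, here is a deliberately simplified sketch of treating disparity reduction itself as the objective of a constrained optimization; the greedy allocation rule, group names, and numbers are assumptions for illustration and stand in for the paper's far richer causal framework.

```python
# Hypothetical sketch: allocate a fixed intervention budget across
# subgroups to shrink gaps to a target outcome level. Illustrative only.
def remediate(outcomes, effect_per_unit, budget, target):
    """Greedy allocation: each budget unit goes to the subgroup
    currently furthest below the target outcome."""
    allocation = {g: 0 for g in outcomes}
    for _ in range(budget):
        worst = min(outcomes, key=outcomes.get)
        if outcomes[worst] >= target:
            break  # every subgroup has reached the target
        outcomes[worst] += effect_per_unit[worst]
        allocation[worst] += 1
    return allocation, outcomes

alloc, final = remediate(
    outcomes={"g1": 0.80, "g2": 0.55, "g3": 0.60},   # access rates per group
    effect_per_unit={"g1": 0.01, "g2": 0.02, "g3": 0.02},
    budget=10,
    target=0.75,
)
print(alloc, final)  # budget flows to the underserved groups g2 and g3
```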
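For the third related abstract, here is a minimal simulation of the diffusion decision model it references: evidence accumulates at drift rate v toward one of two boundaries separated by a, where a lower v maps onto worse information extraction (sleep deprivation) and a smaller a onto reduced caution (caffeine). The parameters and code are illustrative assumptions and do not reproduce the paper's fitted model.

```python
# Hypothetical sketch of a drift-diffusion trial. Illustrative only.
import random

def ddm_trial(v, a, dt=0.001, noise=1.0):
    """Simulate one trial; return (choice, reaction_time_in_seconds)."""
    x = a / 2           # unbiased starting point midway between boundaries
    t = 0.0
    sd = noise * dt ** 0.5
    while 0.0 < x < a:  # accumulate until a boundary is crossed
        x += v * dt + random.gauss(0.0, sd)
        t += dt
    return ("upper" if x >= a else "lower"), t

random.seed(0)
trials = [ddm_trial(v=1.0, a=1.5) for _ in range(2000)]
accuracy = sum(c == "upper" for c, _ in trials) / len(trials)
mean_rt = sum(t for _, t in trials) / len(trials)
# Lowering v reduces accuracy; lowering a speeds responses but costs errors.
print(f"accuracy={accuracy:.2f}, mean RT={mean_rt:.2f}s")
```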