skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Translation Tutorial: Engineering for Fairness: How a Firm Conceptual Distinction between Unfairness and Bias Makes it Easier to Address Un/Fairness
This translation tutorial will demonstrate that making fairness a tractable engineering goal requires demarcating a clear conceptual difference between bias and unfairness. Making this distinction demonstrates that bias is a property of technical judgments, whereas fairness is a property of value judgments. With that distinction in place, it is easier to articulate how engineering for fairness is an organizational or social function not reducible to a mathematical description. Using hypothetical and real-life examples I will show how product design practices would benefit from an explicit focus on value-driven decision processes. The upshot of these claims is that designing for fairness (and other ethical commitments) is more tractable if organizations build out capacity for the “soft” aspects of engineering practice. See: https://www.youtube.com/watch?v=y-Dj0cw5OtE&t=7s for the presentation.  more » « less
Award ID(s):
1704425
PAR ID:
10112010
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the Conference on Fairness, Accountability, and Transparency
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Abstract Algorithmic decision making is becoming more prevalent, increasingly impacting people’s daily lives. Recently, discussions have been emerging about the fairness of decisions made by machines. Researchers have proposed different approaches for improving the fairness of these algorithms. While these approaches can help machines make fairer decisions, they have been developed and validated on fairly clean data sets. Unfortunately, most real-world data have complexities that make them more dirty . This work considers two of these complexities by analyzing the impact of two real-world data issues on fairness—missing values and selection bias—for categorical data. After formulating this problem and showing its existence, we propose fixing algorithms for data sets containing missing values and/or selection bias that use different forms of reweighting and resampling based upon the missing value generation process. We conduct an extensive empirical evaluation on both real-world and synthetic data using various fairness metrics, and demonstrate how different missing values generated from different mechanisms and selection bias impact prediction fairness, even when prediction accuracy remains fairly constant. 
    more » « less
  2. A goal of software engineering research is advancing software quality and the success of the software engineering process. However, while recent studies have demonstrated a new kind of defect in software related to its ability to operate in fair and unbiased manner, software engineering has not yet wholeheartedly tackled these new kinds of defects, thus leaving software vulnerable. This paper outlines a vision for how software engineering research can help reduce fairness defects and represents a call to action by the software engineering research community to reify that vision. Modern software is riddled with examples of biased behavior, from automated translation injecting gender stereotypes, to vision systems failing to see faces of certain races, to the US criminal justice system relying on biased computational assessments of crime recidivism. While systems may learn bias from biased data, bias can also emerge from ambiguous or incomplete requirement specification, poor design, implementation bugs, and unintended component interactions. We argue that software fairness is analogous to software quality, and that numerous software engineering challenges in the areas of requirements, specification, design, testing, and verification need to be tackled to solve this problem. 
    more » « less
  3. The plethora of fairness metrics developed for ranking-based decision-making raises the question: which metrics align best with people’s perceptions of fairness, and why? Most prior studies examining people’s perceptions of fairness metrics tend to use ordinal rating scales (e.g., Likert scales). However, such scales can be ambiguous in their interpretation across participants, and can be influenced by interface features used to capture responses.We address this gap by exploring the use of two-alternative forced choice methodologies— used extensively outside the fairness community for comparing visual stimuli— to quantitatively compare participant perceptions across fairness metrics and ranking characteristics. We report a crowdsourced experiment with 224 participants across four conditions: two alternative rank fairness metrics, ARP and NDKL, and two ranking characteristics, lists of 20 and 100 candidates, resulting in over 170,000 individual judgments. Quantitative results show systematic differences in how people interpert these metrics, and surprising exceptions where fairness metrics disagree with people’s perceptions. Qualitative analyses of participant comments reveals an interplay between cognitive and visual strategies that affects people’s perceptions of fairness. From these results, we discuss future work in aligning fairness metrics with people’s perceptions, and highlight the need and benefits of expanding methodologies for fairness studies. 
    more » « less
  4. The development of Artificial Intelligence (AI) systems involves a significant level of judgment and decision making on the part of engineers and designers to ensure the safety, robustness, and ethical design of such systems. However, the kinds of judgments that practitioners employ while developing AI platforms are rarely foregrounded or examined to explore areas practitioners might need ethical support. In this short paper, we employ the concept of design judgment to foreground and examine the kinds of sensemaking software engineers use to inform their decisionmaking while developing AI systems. Relying on data generated from two exploratory observation studies of student software engineers, we connect the concept of fairness to the foregrounded judgments to implicate their potential algorithmic fairness impacts. Our findings surface some ways in which the design judgment of software engineers could adversely impact the downstream goal of ensuring fairness in AI systems. We discuss the implications of these findings in fostering positive innovation and enhancing fairness in AI systems, drawing attention to the need to provide ethical guidance, support, or intervention to practitioners as they engage in situated and contextual judgments while developing AI systems. 
    more » « less
  5. Deep neural networks (DNNs) are increasingly used in real-world applications (e.g. facial recognition). This has resulted in concerns about the fairness of decisions made by these models. Various notions and measures of fairness have been proposed to ensure that a decision-making system does not disproportionately harm (or benefit) particular subgroups of the population. In this paper, we argue that traditional notions of fairness that are only based on models' outputs are not sufficient when the model is vulnerable to adversarial attacks. We argue that in some cases, it may be easier for an attacker to target a particular subgroup, resulting in a form of robustness bias. We show that measuring robustness bias is a challenging task for DNNs and propose two methods to measure this form of bias. We then conduct an empirical study on state-of-the-art neural networks on commonly used real-world datasets such as CIFAR-10, CIFAR-100, Adience, and UTKFace and show that in almost all cases there are subgroups (in some cases based on sensitive attributes like race, gender, etc) which are less robust and are thus at a disadvantage. We argue that this kind of bias arises due to both the data distribution and the highly complex nature of the learned decision boundary in the case of DNNs, thus making mitigation of such biases a non-trivial task. Our results show that robustness bias is an important criterion to consider while auditing real-world systems that rely on DNNs for decision making. Code to reproduce all our results can be found here: https://github.com/nvedant07/Fairness-Through-Robustness 
    more » « less