
Title: On Measures of Biases and Harms in NLP
Recent studies show that Natural Language Processing (NLP) technologies propagate societal biases about demographic groups associated with attributes such as gender, race, and nationality. To create interventions and mitigate these biases and associated harms, it is vital to be able to detect and measure such biases. While existing works propose bias evaluation and mitigation methods for various tasks, there remains a need to cohesively understand the biases and the specific harms they measure, and how different measures compare with each other. To address this gap, this work presents a practical framework of harms and a series of questions that practitioners can answer to guide the development of bias measures. As a validation of our framework and documentation questions, we also present several case studies of how existing bias measures in NLP (both intrinsic measures of bias in representations and extrinsic measures of bias in downstream applications) can be aligned with different harms, and how our proposed documentation questions facilitate a more holistic understanding of what bias measures are measuring.
Journal Name:
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022
Sponsoring Org:
National Science Foundation
More Like this
  1. This work-in-progress paper expands on a collaboration between engineering education researchers and machine learning researchers to automate the analysis of written responses to conceptually challenging questions in statics and dynamics courses (Authors, 2022). Using the Concept Warehouse (Koretsky et al., 2014), written justifications of ConcepTests (CTs) were gathered from statics and dynamics courses in a diverse set of two- and four-year institutions. Written justifications for CTs have been used to support active learning pedagogies, which makes it important to investigate how students put together their problem-solving narratives of understanding. However, despite the large benefit that analysis of student written responses may provide to instructors and researchers, manual review of responses is cumbersome, limits analysis, and can be prone to human bias. In efforts to improve the analysis of student written responses, machine learning has been used in various educational contexts to analyze short and long texts (Burstein et al., 2020; Burstein et al., 2021). Natural Language Processing (NLP) uses transformer-based machine learning models (Brown et al., 2020; Raffel et al., 2019), which can be used through fine-tuning or in-context learning methods. NLP can be used to train algorithms that can automate the coding of written responses. Only a few studies for educational applications have leveraged transformer-based machine learning models, further prompting an investigation into their use in STEM education. However, work in NLP has been criticized for heightening the possibility of perpetuating and even amplifying harmful stereotypes and implicit biases (Chang et al., 2019; Mayfield et al., 2019). In this study, we detail the aim to use NLP for linguistic justice. Using methods like text summary, topic modeling, and text classification, we identify key aspects of student narratives of understanding in written responses to mechanics and statics CTs. 
Through this process, we seek to use machine learning to identify different ways students talk about a problem and their understanding at any point in their narrative formation process. Thus, we hope to help reduce human bias in the classroom and through technology by giving instructors and researchers a diverse set of narratives that include insight into their students’ histories, identities, and understanding. These can then be used towards connecting technological knowledge to students’ everyday lives. 
  2.
    Background Mobile health technology has demonstrated the ability of smartphone apps and sensors to collect data pertaining to patient activity, behavior, and cognition. It also offers the opportunity to understand how everyday passive mobile metrics such as battery life and screen time relate to mental health outcomes through continuous sensing. Impulsivity is an underlying factor in numerous physical and mental health problems. However, few studies have been designed to help us understand how mobile sensors and self-report data can improve our understanding of impulsive behavior. Objective The objective of this study was to explore the feasibility of using mobile sensor data to detect and monitor self-reported state impulsivity and impulsive behavior passively via a cross-platform mobile sensing application. Methods We enrolled 26 participants who were part of a larger study of impulsivity to take part in a real-world, continuous mobile sensing study over 21 days on both Apple operating system (iOS) and Android platforms. The mobile sensing system (mPulse) collected data from call logs, battery charging, and screen checking. To validate the model, we used mobile sensing features to predict common self-reported impulsivity traits, objective mobile behavioral and cognitive measures, and ecological momentary assessment (EMA) of state impulsivity and constructs related to impulsive behavior (ie, risk-taking, attention, and affect). Results Overall, the findings suggested that passive measures of mobile phone use such as call logs, battery charging, and screen checking can predict different facets of trait and state impulsivity and impulsive behavior. For impulsivity traits, the models significantly explained variance in sensation seeking, planning, and lack of perseverance traits but failed to explain motor, urgency, lack of premeditation, and attention traits. 
Passive sensing features from call logs, battery charging, and screen checking were particularly useful in explaining and predicting trait-based sensation seeking. On a daily level, the model successfully predicted objective behavioral measures such as present bias in delay discounting tasks, commission and omission errors in a cognitive attention task, and total gains in a risk-taking task. Our models also predicted daily EMA questions on positivity, stress, productivity, healthiness, and emotion and affect. Perhaps most intriguingly, the model failed to predict daily EMA designed to measure previous-day impulsivity using face-valid questions. Conclusions The study demonstrated the potential for developing trait and state impulsivity phenotypes and detecting impulsive behavior from everyday mobile phone sensors. Limitations of the current research and suggestions for building more precise passive sensing models are discussed. Trial Registration NCT03006653; 
  3. Increased social media use has contributed to the greater prevalence of abusive, rude, and offensive textual comments. Machine learning models have been developed to detect toxic comments online, yet these models tend to show biases against users with marginalized or minority identities (e.g., females and African Americans). Established research in debiasing toxicity classifiers often (1) takes a static or batch approach, assuming that all information is available and then making a one-time decision; and (2) uses a generic strategy to mitigate different biases (e.g., gender and racial biases) that assumes the biases are independent of one another. However, in real scenarios, the input typically arrives as a sequence of comments/words over time instead of all at once. Thus, decisions based on partial information must be made while additional input is arriving. Moreover, social bias is complex by nature. Each type of bias is defined within its unique context, which, consistent with intersectionality theory within the social sciences, might be correlated with the contexts of other forms of bias. In this work, we consider debiasing toxicity detection as a sequential decision-making process where different biases can be interdependent. In particular, we study debiasing toxicity detection with two aims: (1) to examine whether different biases tend to correlate with each other; and (2) to investigate how to jointly mitigate these correlated biases in an interactive manner to minimize the total amount of bias. At the core of our approach is a framework built upon theories of sequential Markov Decision Processes that seeks to maximize the prediction accuracy and minimize the bias measures tailored to individual biases. Evaluations on two benchmark datasets empirically validate the hypothesis that biases tend to be correlated and corroborate the effectiveness of the proposed sequential debiasing strategy. 
  4. The leading difficulty in achieving the contrast necessary to directly image exoplanets and associated structures (e.g., protoplanetary disks) at wavelengths ranging from the visible to the infrared is quasi-static speckles (QSSs). QSSs are hard to distinguish from planets at the necessary level of precision to achieve high contrast. QSSs are the result of hardware aberrations that are not compensated for by the adaptive optics (AO) system; these aberrations are called non-common path aberrations (NCPAs). In 2013, Frazin showed how simultaneous millisecond telemetry from the wavefront sensor (WFS) and a science camera behind a stellar coronagraph can be used as input into a regression scheme that simultaneously and self-consistently estimates NCPAs and the sought-after image of the planetary system (exoplanet image). When run in a closed-loop configuration, the WFS measures the corrected wavefront, called the AO residual (AOR) wavefront. The physical principle underlying the regression method is rather simple: when an image is formed at the science camera, the AOR modulates both the speckles arising from NCPAs and the planetary image. Therefore, the AOR can be used as a probe to estimate NCPAs and the exoplanet image via regression techniques. The regression approach is made more difficult by the fact that the AOR is not exactly known since it can be estimated only from the WFS telemetry. The simulations in the Part I paper provide results on the joint regression on NCPAs and the exoplanet image from three different methods, called ideal, naïve, and bias-corrected estimators. The ideal estimator is not physically realizable (it is useful as a benchmark for simulation studies), but the other two are. The ideal estimator uses true AOR values (available in simulation studies), but it treats the noise in focal plane images via standard linearized regression. 
Naïve regression uses the same regression equations as the ideal estimator, except that it substitutes the estimated values of the AOR for true AOR values in the regression formulas, which can result in problematic biases (however, Part I provides an example in which the naïve estimate makes a useful estimate of NCPAs). The bias-corrected estimator treats the errors in AOR estimates, but it requires the probability distribution that governs the errors in AOR estimates. This paper provides the regression equations for ideal, naïve, and bias-corrected estimators, as well as a supporting technical discussion.

  5. Abstract Background

    Estimation of genetic relatedness, or kinship, is used occasionally for recreational purposes and in forensic applications. While numerous methods have been developed to estimate kinship, they suffer from high computational requirements and often make an untenable assumption of homogeneous population ancestry of the samples. Moreover, genetic privacy is generally overlooked in the usage of kinship estimation methods. There can be ethical concerns about finding unknown familial relationships in third-party databases. Similar ethical concerns may arise while estimating and reporting sensitive population-level statistics such as inbreeding coefficients, because of the risk of marginalization and stigmatization.


    Here, we present SIGFRIED, which makes use of existing reference panels with a projection-based approach that simplifies kinship estimation in the admixed populations. We use simulated and real datasets to demonstrate the accuracy and efficiency of kinship estimation. We present a secure federated kinship estimation framework and implement a secure kinship estimator using homomorphic encryption-based primitives for computing relatedness between samples in two different sites while genotype data are kept confidential. Source code and documentation for our methods can be found at


    Analysis of relatedness is fundamentally important for identifying relatives, in association studies, and for estimation of population-level estimates of inbreeding. As the awareness of individual and group genomic privacy is growing, privacy-preserving methods for the estimation of relatedness are needed. Presented methods alleviate the ethical and privacy concerns in the analysis of relatedness in admixed, historically isolated and underrepresented populations.

    Short Abstract

    Genetic relatedness is a central quantity used for finding relatives in databases, correcting biases in genome wide association studies and for estimating population-level statistics. Methods for estimating genetic relatedness have high computational requirements, and occasionally do not consider individuals from admixed ancestries. Furthermore, the ethical concerns around using genetic data and calculating relatedness are not considered. We present a projection-based approach that can efficiently and accurately estimate kinship. We implement our method using encryption-based techniques that provide provable security guarantees to protect genetic data while kinship statistics are computed among multiple sites.
