Title: Critical companionship: Some sensibilities for studying the lived experience of data subjects
What are the challenges of turning data subjects into research participants, and how can we approach this task responsibly? In this paper, we develop a methodology for studying the lived experiences of people who are subject to automated scoring systems. Unlike most media technologies, automated scoring systems are designed to track and rate specific qualities of people without their active participation. Credit scoring, risk assessments, and predictive policing all operate obliquely in the background long before they come to matter. In doing so, they constitute a problem not only for those subject to these systems but also for researchers who try to study their experience. Specifically, we identify three challenges that are distinct to studying experiences of automated scoring: limited awareness, embeddedness, and ongoing inquiry. Starting from the observation that coming to terms with one's position as a data subject constitutes a form of learning in its own right, we propose a research strategy called critical companionship. Originally articulated in the context of nursing research, critical companionship invites us to accompany a data subject over time, paying critical attention to how the participant's and the researcher's inquiries complicate and constitute each other. We illustrate the strengths and limitations of this methodology with materials from a recent study we conducted about people's credit repair practices and sketch a set of sensibilities for studying contemporary scoring systems from the margins.
Award ID(s):
1848286
PAR ID:
10541751
Author(s) / Creator(s):
 ;  
Publisher / Repository:
SAGE Publications
Date Published:
Journal Name:
Big Data & Society
Volume:
8
Issue:
2
ISSN:
2053-9517
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The need for responsible data management intensifies with the growing impact of data on society. One central locus of the societal impact of data is Automated Decision Systems (ADS), socio-legal-technical systems that are used broadly in industry, nonprofits, and government. ADS process data about people, help make decisions that are consequential to people's lives, are designed with the stated goals of improving efficiency and promoting equitable access to opportunity, involve a combination of human and automated decision making, and are subject to auditing for legal compliance and to public disclosure. They may or may not use AI, and may or may not operate with a high degree of autonomy, but they rely heavily on data. In this article, we argue that the data management community is uniquely positioned to lead the responsible design, development, use, and oversight of ADS. We outline a technical research agenda that requires that we step outside our comfort zone of engineering for efficiency and accuracy, to also incorporate reasoning about values and beliefs. This seems high-risk, but one of the upsides is being able to explain to our children what we do and why it matters.
  2. Recent data reveal that a higher percentage of Black women (9.7%) are enrolled in college than any other group, topping Asian women (8.7%), White women (7.1%) and White men (6.1%). Despite these gains in college attendance, Black women are often underrepresented in the fields of engineering and computer science. This paper presents the findings from a qualitative study that investigated the identity and experiences of Black women who are pursuing doctoral degrees in engineering and computer science. This research is grounded in the tenet that one cannot effectively serve or impact a community until one genuinely understands the issues and challenges facing the people who are its members. This work explores how Black female doctoral students persist in environments where they are grossly underrepresented. Content analysis is used to examine interview data obtained from 13 Black women who are pursuing doctoral degrees in engineering and computer science. This paper concludes with some of the key challenges these women face in their programs on a daily basis. The goal of this research is to bring awareness to not only the challenges, but also potential strategies to increase the retention and persistence of Black women in engineering and computer science across all academic levels.
  3. Automated systems like self-driving cars and “smart” thermostats are a challenge for fault-based legal regimes like negligence because they have the potential to behave in unpredictable ways. How can people who build and deploy complex automated systems be said to be at fault when they could not have reasonably anticipated the behavior (and thus risk) of their tools?
  4. Research Problem. Computer science (CS) education researchers conducting studies that target high school students have likely seen their studies impacted by COVID-19. Interpreting research findings impacted by COVID-19 presents unique challenges that will require a deeper understanding as to how the pandemic has affected underserved and underrepresented students studying or unable to study computing. Research Question. Our research question for this study was: In what ways has the high school computer science educational ecosystem for students been impacted by COVID-19, particularly when comparing schools based on relative socioeconomic status of a majority of students? Methodology. We used an exploratory sequential mixed methods study to understand the types of impacts high school CS educators have seen in their practice over the past year using the CAPE theoretical disaggregation framework to measure schools’ Capacity to offer CS, student Access to CS education, student Participation in CS, and Experiences of students taking CS. Data Collection Procedure. We developed an instrument to collect qualitative data from open-ended questions, then collected data from CS high school educators (n = 21) and coded them across CAPE. We used the codes to create a quantitative instrument. We collected data from a wider set of CS high school educators (n = 185), analyzed the data, and considered how these findings shape research conducted over the last year. Findings. Overall, practitioner perspectives revealed that capacity for CS Funding, Policy & Curriculum in both types of schools grew during the pandemic, while the capacity to offer physical and human resources decreased. While access to extracurricular activities decreased, there was still a significant increase in the number of CS courses offered. Fewer girls took CS courses and attendance decreased. Student learning and engagement in CS courses were significantly impacted, while other noncognitive factors like interest in CS and relevance of technology saw increases. Practitioner perspectives also indicated that schools serving students from lower-income families had 1) a greater decrease in the number of students who received information about CS/CTE pathways; 2) a greater decrease in the number of girls enrolled in CS classes; 3) a greater decrease in the number of students receiving college credit for dual-credit CS courses; 4) a greater decrease in student attendance; and 5) a greater decrease in the number of students interested in taking additional CS courses. On the flip side, schools serving students from higher-income families had significantly higher increases in the number of students interested in taking additional CS courses.
  5. Furht, Borko; Khoshgoftaar, Taghi (Ed.)
    Acquiring labeled datasets often incurs substantial costs, primarily due to the requirement of expert human intervention to produce accurate and reliable class labels. In the modern data landscape, an overwhelming proportion of newly generated data is unlabeled. This is especially evident in domains such as fraud detection, including datasets for credit card fraud detection. Such data are also highly class imbalanced, which poses its own challenges for machine learning and classification. Our research addresses these challenges by extensively evaluating a novel methodology for synthesizing class labels for highly imbalanced credit card fraud data. The methodology uses an autoencoder as its underlying learner to effectively learn from dataset features to produce an error metric for use in creating new binary class labels. The methodology aims to automatically produce new labels with minimal expert input. These class labels are then used to train supervised classifiers for fraud detection. Our empirical results show that the synthesized labels are of high enough quality to produce classifiers that significantly outperform a baseline learner comparison when using area under the precision-recall curve (AUPRC). We also present results of varying levels of positive-labeled instances and their effect on classifier performance. Results show that AUPRC performance improves as more instances are labeled positive and belong to the minority class. Our methodology thereby effectively addresses the concerns of high class imbalance in machine learning by creating new and effective class labels.
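As a rough illustration of the idea described in item 5 (turning a reconstruction error into binary class labels and scoring the result with AUPRC), the following minimal sketch may help. It is not the authors' implementation. It swaps in a one-unit linear autoencoder, which at its optimum projects onto the first principal component, in place of a neural autoencoder, and it runs on made-up three-feature data. The function names, the choice of five positive labels, and the use of average precision as the AUPRC estimate are all assumptions made for illustration.

```python
import random


def top_component(rows, iters=200):
    """Return the feature means and the leading covariance eigenvector (power iteration)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v


def reconstruction_errors(rows, mean, v):
    """Encode each row to one number, decode it, and return the squared error."""
    errors = []
    for r in rows:
        c = [x - m for x, m in zip(r, mean)]
        code = sum(ci * vi for ci, vi in zip(c, v))   # encoder: project onto v
        recon = [code * vi for vi in v]               # decoder: map back to features
        errors.append(sum((ci - ri) ** 2 for ci, ri in zip(c, recon)))
    return errors


def synthesize_labels(errors, n_positive):
    """Label the n_positive rows with the largest error as 1, all others as 0."""
    ranked = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    positives = set(ranked[:n_positive])
    return [1 if i in positives else 0 for i in range(len(errors))]


def average_precision(y_true, scores):
    """Average precision, a common estimate of the area under the PR curve."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(ranked, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def demo():
    """Build synthetic imbalanced data, synthesize labels, and score them."""
    rng = random.Random(0)
    rows, y_true = [], []
    # 200 hypothetical "legitimate" rows lying close to a line in feature space.
    for _ in range(200):
        t = rng.gauss(0.0, 3.0)
        rows.append([t + rng.gauss(0, 0.1), 2 * t + rng.gauss(0, 0.1), -t + rng.gauss(0, 0.1)])
        y_true.append(0)
    # 5 hypothetical "fraud" rows pushed well off that line.
    for k in range(5):
        t = rng.gauss(0.0, 3.0)
        s = 4.0 if k % 2 == 0 else -4.0
        rows.append([t + s, 2 * t, -t + s])
        y_true.append(1)
    mean, v = top_component(rows)
    errors = reconstruction_errors(rows, mean, v)
    labels = synthesize_labels(errors, n_positive=5)
    return labels, y_true, average_precision(y_true, errors)


if __name__ == "__main__":
    labels, y_true, ap = demo()
    print("synthesized positives:", sum(labels), "average precision:", round(ap, 3))
```

In this toy setting the ground truth is known only because the data are synthetic. With real unlabeled data, the number of rows to mark positive is a tuning choice, which is the quantity the abstract reports varying.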