Data science pipelines inform and influence many daily decisions, from what we buy to who we work for and even where we live. When designed incorrectly, these pipelines can easily propagate social inequity and harm. Traditional solutions are technical in nature; e.g., mitigating biased algorithms. In this vision paper, we introduce a novel lens for promoting responsible data science using theories of behavior change that emphasize not only technical solutions but also the behavioral responsibility of practitioners. By integrating behavior change theories from cognitive psychology with data science workflow knowledge and ethics guidelines, we present a new perspective on responsible data science. We present example data science interventions in machine learning and visual data analysis, contextualized in behavior change theories that could be implemented to interrupt and redirect potentially suboptimal or negligent practices while reinforcing ethically conscious behaviors. We conclude with a call to action to our community to explore this new research area of behavior change interventions for responsible data science.
Data Integration as Coordination: The Articulation of Data Work in an Ocean Science Collaboration
Recent CSCW research on the collaborative design and development of research infrastructures for the natural sciences has increasingly focused on the challenges of open data sharing. This qualitative study describes and analyzes how multidisciplinary, geographically distributed ocean scientists are integrating highly diverse data as part of an effort to develop a new research infrastructure to advance science. This paper identifies different kinds of coordination that are necessary to align processes of data collection, production, and analysis. Some of the hard work to integrate data is undertaken before data integration can even become a technical problem. After data integration becomes a technical problem, social and organizational means continue to be critical for resolving differences in assumptions, methods, practices, and priorities. This work calls attention to the diversity of coordinative, social, and organizational practices and concerns needed to integrate data, and to how, in highly innovative work, the process of integrating data helps to define scientific problem spaces themselves.
- Award ID(s): 1954620
- PAR ID: 10310978
- Date Published:
- Journal Name: Proceedings of the ACM on Human-Computer Interaction
- Volume: 4
- Issue: CSCW3
- ISSN: 2573-0142
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Multiple studies call for engineering education to integrate social justice into classroom instruction. Yet, there is uncertainty regarding whether integrating these social topics into the engineering curriculum will support or detract from the learning of technical concepts. This study focuses on evaluating how reframing technical assessments to include social justice concepts impacts student learning and investigates how well students integrate social justice into engineering decision making. Using a within-subject design, in which students were exposed to both conditions (questions with and without social justice context), we evaluate how social justice framing impacts overall student learning of technical topics. Social justice prompts are added to homework questions, and we assess students’ demonstration of knowledge of original technical content of the course, as well as their ability to consider social justice implications of engineering design. In the earlier homework assignment, the experimental group showed a significant decrease in learning when technical concepts were framed to include social justice. As the students became more familiar with social justice considerations, their learning of technical concepts became comparable to that of students who did not have the social justice components in their assignment. Their evaluation of the social implications of technical decisions also improved. History: This paper has been accepted for the INFORMS Transactions on Education Special Issue on DEI in ORMS Classrooms. Funding: This work was supported by Carnegie Mellon University’s Wimmer Faculty Fellowship and the National Science Foundation [Grant 2053856]. D. Nock also acknowledges support from the Wilton E. Scott Institute for Energy Innovation, where she is an energy fellow.
Data curation is the process of making a dataset fit-for-use and archivable. It is critical to data-intensive science because it makes complex data pipelines possible, studies reproducible, and data reusable. Yet the complexities of the hands-on, technical, and intellectual work of data curation are frequently overlooked or downplayed. Obscuring the work of data curation not only renders the labor and contributions of data curators invisible but also hides the impact that curators' work has on the later usability, reliability, and reproducibility of data. To better understand the work and impact of data curation, we conducted a close examination of data curation at a large social science data repository, the Inter-university Consortium for Political and Social Research (ICPSR). We asked: What does curatorial work entail at ICPSR, and what work is more or less visible to different stakeholders and in different contexts? And how is that curatorial work coordinated across the organization? We triangulated accounts of data curation from interviews and records of curation in Jira tickets to develop a rich and detailed account of curatorial work. While we identified numerous curatorial actions performed by ICPSR curators, we also found that curators rely on a number of craft practices to perform their jobs. The reality of their work practices defies the rote sequence of events implied by many life cycle or workflow models. Further, we show that craft practices are needed to enact data curation best practices and standards. The craft that goes into data curation is often invisible to end users, but it is well recognized by ICPSR curators and their supervisors. Explicitly acknowledging and supporting data curators as craftspeople is important in creating sustainable and successful curatorial infrastructures.
Human-designed systems are increasingly leveraged by data-driven methods and artificial intelligence. This leads to an urgent need for responsible design and ethical use. The goal of this conceptual paper is two-fold. First, we will introduce the Framework for Design Reasoning in Data Life-cycle Ethical Management, which integrates three existing frameworks: 1) the design reasoning quadrants framework (representing engineering design research), 2) the data life-cycle model (representing data management), and 3) the reflexive principles framework (representing ethical decision-making). The integration of three critical components of the framework (design reasoning, data reasoning, and ethical reasoning) is accomplished by centering on the conscientious negotiation of design risks and benefits. Second, we will present an example of a student design project report to demonstrate how this framework guides educators towards delineating and integrating data reasoning, ethical reasoning, and design reasoning in settings where ethical issues (e.g., AI solutions) are commonly experienced. The framework can be implemented to design courses through design review conversations that seamlessly integrate ethical reasoning into the technical and data decision-making processes.
Advances in data infrastructure are often led by disciplinary initiatives aimed at innovation in federation and sharing of data and related research materials. In library and information science (LIS), the data services area has focused on data curation and stewardship to support description and deposit of data for access, reuse, and preservation. At the same time, solutions to societal grand challenges are thought to lie in convergence research, characterized by a problem-focused orientation and deep cross-disciplinary integration, requiring access to highly varied data sources with differing resolutions or scales. We argue that data curation and stewardship work in LIS should expand to foster convergence research based on a robust understanding of the dynamics of disciplinary and interdisciplinary research methods and practices. Highlighting unique contributions by Dr. Linda C. Smith to the field of LIS, we outline how her work illuminates problems that are core to current directions in convergence research. Drawing on advances in data infrastructure in the earth and geosciences and trends in qualitative domains, we emphasize the importance of metastructures and the necessary influence of disciplinary practice on principles, standards, and provisions for ethical use across the evolving data ecosystem.

