Language-based text provides valuable insights into people's lived experiences. While traditional qualitative analysis is used to capture these nuances, new paradigms are needed to scale qualitative research effectively. Artificial intelligence presents an unprecedented opportunity to expand the scale of analysis for obtaining such nuances. This study tests the application of GPT-4, a large language model, in qualitative data analysis using an existing set of text data derived from 60 qualitative interviews. Specifically, the study provides a practical guide for social and behavioral researchers, illustrating core elements and key processes, demonstrating its reliability by comparing GPT-generated codes with researchers' codes, and evaluating its capacity for theory-driven qualitative analysis. The study followed a three-step approach: (1) prompt engineering, (2) reliability assessment by comparison of GPT-generated codes with researchers' codes, and (3) evaluation of theory-driven thematic analysis on psychological constructs. The study underscores the utility of GPT's capabilities in coding and analyzing text data with established qualitative methods while highlighting the need for qualitative expertise to guide GPT applications. Recommendations for further exploration are also discussed.
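Step (2) of the approach described above, comparing GPT-generated codes against researchers' codes, is typically quantified with an inter-rater agreement statistic such as Cohen's kappa. The sketch below is a minimal illustration of that comparison; the code labels and excerpt data are hypothetical, not taken from the study.

```python
# Hedged sketch: inter-rater reliability between GPT-generated codes and
# researcher codes on the same interview excerpts. All labels below are
# hypothetical illustrations, not data from the study.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical codes."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of excerpts coded identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal code frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned to six interview excerpts.
researcher = ["coping", "stress", "coping", "support", "stress", "coping"]
gpt_codes  = ["coping", "stress", "support", "support", "stress", "coping"]

print(round(cohens_kappa(researcher, gpt_codes), 3))  # → 0.75
```

A kappa near 1 indicates agreement well beyond chance; values around 0.6 to 0.8 are conventionally read as substantial agreement, which is the kind of benchmark a reliability assessment like step (2) would report.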
Why the Data Revolution Needs Qualitative Methods
This essay draws on qualitative social science to propose a critical intellectual infrastructure for data science of social phenomena. Qualitative sensibilities, interpretivism, abductive reasoning, and reflexivity in particular, could address methodological problems that have emerged in data science and help extend the frontiers of social knowledge. First, an interpretivist lens, which is concerned with the construction of meaning in a given context, can enable the deeper insights that are requisite to understanding high-level behavioral patterns from digital trace data. Without such contextual insights, researchers often misinterpret what they find in large-scale analysis. Second, abductive reasoning, which is the process of using observations to generate a new explanation, grounded in prior assumptions about the world, is common in data science, but its application often is not systematized. Incorporating norms and practices from qualitative traditions for executing, describing, and evaluating the application of abduction would allow for greater transparency and accountability. Finally, data scientists would benefit from increased reflexivity, which is the process of evaluating how researchers' own assumptions, experiences, and relationships influence their research. Studies demonstrate that such aspects of a researcher's experience, which typically go unmentioned in quantitative traditions, can influence research findings. Qualitative researchers have long faced these same concerns, and their training in how to deconstruct and document personal and intellectual starting points could prove instructive for data scientists. We believe these and other qualitative sensibilities have tremendous potential to facilitate the production of data science research that is more meaningful, reliable, and ethical.
- Award ID(s): 1823547
- PAR ID: 10302851
- Journal Name: Harvard Data Science Review
- Sponsoring Org: National Science Foundation
More Like this
-
What conditions enable novel intellectual contributions to diffuse and become integrated into later scientific work? Prior work tends to focus on whole cultural products, such as patents and articles, and emphasizes external social factors as important. This article focuses on concepts as reflections of ideas, and we identify the combined influence that social factors and internal intellectual structures have on ideational diffusion. To develop this perspective, we use computational techniques to identify nearly 60,000 new ideas introduced over two decades (1993 to 2016) in the Web of Science and follow their diffusion across 38 million later publications. We find new ideas diffuse more widely when they socially and intellectually resonate. New ideas become core concepts of science when they reach expansive networks of unrelated authors, achieve consistent intellectual usage, are associated with other prominent ideas, and fit with extant research traditions. These ecological conditions play an increasingly decisive role later in an idea's career, after its relations with the environment are established. This work advances the systematic study of scientific ideas by moving beyond products to focus on the content of ideas themselves, and it applies a relational perspective that takes seriously the contingency of their success.
-
Quantitative and qualitative studies of science have historically played radically different roles with opposing epistemological commitments. Using large-scale text analysis, we see that qualitative studies generate and value new theory, especially regarding the complex social and political contexts of scientific action, while quantitative approaches confirm existing theory and evaluate the performance of scientific institutions. Large-scale digital data and emerging computational methods could allow us to refigure these positions, turning qualitative artifacts into quantitative patterns into qualitative insights across many scales, heralding a new era of theory development, engagement, and relevance for scientists, policy-makers, and society.
-
This Work-in-Progress paper in the Research Category uses a retrospective mixed-methods study to better understand the factors that mediate learning of computational modeling by life scientists. Key stakeholders, including leading scientists, universities, and funding agencies, have promoted computational modeling to enable life sciences research and improve the translation of genetic and molecular biology high-throughput data into clinical results. Software platforms to facilitate computational modeling by biologists who lack advanced mathematical or programming skills have had some success, but none has achieved widespread use among life scientists. Because computational modeling is a core engineering skill of value to other STEM fields, it is critical for engineering and computer science educators to consider how we help students from across STEM disciplines learn computational modeling. Currently we lack sufficient research on how best to help life scientists learn computational modeling. To address this gap, in 2017, we observed a short-format summer course designed for life scientists to learn computational modeling. The course used a simulation environment designed to lower programming barriers. We used semi-structured interviews to understand students' experiences while taking the course and in applying computational modeling after the course. We conducted interviews with graduate students and post-doctoral researchers who had completed the course. We also interviewed students who took the course between 2010 and 2013. Among these past attendees, we selected equal numbers of interview subjects who had and had not successfully published journal articles that incorporated computational modeling. This Work-in-Progress paper applies social cognitive theory to analyze the motivations of life scientists who seek training in computational modeling and their attitudes towards computational modeling.
Additionally, we identify important social and environmental variables that influence successful application of computational modeling after course completion. The findings from this study may therefore help us educate biomedical and biological engineering students more effectively. Although this study focuses on life scientists, its findings can inform engineering and computer science education more broadly. Insights from this study may be especially useful in aiding incoming engineering and computer science students who do not have advanced mathematical or programming skills and in preparing undergraduate engineering students for collaborative work with life scientists.
-
Trust is fundamental to effective visual data communication between the visualization designer and the reader. Although personal experience and preference influence readers' trust in visualizations, visualization designers can leverage design techniques to create visualizations that evoke a "calibrated trust," at which readers arrive after critically evaluating the information presented. To systematically understand what drives readers to engage in "calibrated trust," we must first equip ourselves with reliable and valid methods for measuring trust. Computer science and data visualization researchers have not yet reached a consensus on a trust definition or metric, which are essential to building a comprehensive trust model in human-data interaction. On the other hand, social scientists and behavioral economists have developed and perfected metrics that can measure generalized and interpersonal trust, which the visualization community can reference, modify, and adapt for our needs. In this paper, we gather existing methods for evaluating trust from other disciplines and discuss how we might use them to measure, define, and model trust in data visualization research. Specifically, we discuss quantitative surveys from social sciences, trust games from behavioral economics, measuring trust through measuring belief updating, and measuring trust through perceptual methods. We assess the potential issues with these methods and consider how we can systematically apply them to visualization research.