This study investigates the data science inquiry process of high school students from populations historically excluded in computing-related fields. We analyzed 213 student-generated questions from the final project of a newly implemented interest-driven data science curriculum. We used a qualitative analytic approach to identify dominant themes of interest and assess question complexity and scope through four stages of data collection. Findings reveal a shift from descriptive to more complex, evaluative, and exploratory questions. Students asked questions from diverse themes, with music and animals being the most common. These insights highlight the importance of scaffolding, culturally relevant content, and adaptive instructional strategies in data science education to empower students from marginalized backgrounds and foster their engagement and success in the field.
This content will become publicly available on February 18, 2026
Investigating the Evolution of Interest-Driven Data Science Questions Posed by High-School Students
In today's data-driven world, students must be able to explore and analyze the data surrounding them. A crucial aspect of this process is formulating meaningful research questions that can be addressed with the available data. This study investigates the data science inquiry process of high school students. We analyzed 213 student-generated questions from the final project of an innovative interest-driven data science curriculum. Through a qualitative analytic approach, we examined changes in question types, complexity, and scope across four stages of data collection. The findings reveal a shift from descriptive to more complex, evaluative, and exploratory questions. They also highlight the importance of providing scaffolding, culturally relevant content, and adaptive instructional strategies in data science education. These elements are essential for empowering students from marginalized backgrounds and fostering their engagement and success in the field.
- Award ID(s):
- 2141655
- PAR ID:
- 10653766
- Publisher / Repository:
- Data Science Education K-12 Conference
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
This paper describes the design and classroom implementation of a week-long unit that aims to integrate computational thinking (CT) into middle school science classes using programmable sensor technology. The goals of this sensor immersion unit are to help students understand why and how to use sensor and visualization technology as a powerful data-driven tool for scientific inquiry in ways that align with modern scientific practice. The sensor immersion unit is anchored in the investigation of classroom data where students engage with the sensor technology to ask questions about and design displays of the collected data. Students first generate questions about how data displays work and then proceed through a set of programming exercises to help them understand how to collect and display data collected from their classrooms by building their own mini data displays. Throughout the unit, students draw and update their hand-drawn models representing their current understanding of how the data displays work. The sensor immersion unit was implemented by ten middle school science teachers during the 2019/2020 school year. Student-drawn models of the classroom data displays from four of these teachers were analyzed to examine students’ understandings in four areas: function of sensor components, process models of data flow, design of data displays, and control of the display. Students showed the best understanding when describing sensor components. Students exhibited greater confusion when describing the process of how data streams moved through displays and how programming controlled the data displays.
The emphasis on conceptual learning and the development of adaptive instructional design are both emerging areas in science and engineering education. Instructors are writing their own conceptual questions to promote active learning during class and utilizing pools of these questions in assessments. For adaptive assessment strategies, these questions need to be rated based on difficulty level (DL). Historically, DL has been determined from the performance of a suitable number of students. The research study reported here investigates whether instructors can save time by predicting the DL of newly made conceptual questions without the need for student data. In this paper, we report on the development of one component in an adaptive learning module for materials science, specifically on the topic of crystallography. The summative assessment element consists of five DL scales and 15 conceptual questions. This adaptive assessment directs students based on their previous performances and the DL of the questions. Our five expert participants are faculty members who have taught the introductory Materials Science course multiple times. They provided predictions for how many students would answer each question correctly during a two-step process. First, predictions were made individually without an answer key. Second, experts had the opportunity to revise their predictions after being provided an answer key in a group discussion. We compared expert predictions with actual student performance using results from over 400 students spanning multiple courses and terms. We found no clear correlation between expert predictions of the DL and the measured DL from students. Some evidence shows that discussion during the second step made expert predictions closer to student performance. We suggest that, in determining the DL for conceptual questions, using predictions of the DL by experts who have taught the course is not a valid route. The findings in this paper can be applied to assessments in in-person, hybrid, and online settings and are applicable to subject matter beyond materials science.
Despite the elevated importance of Data Science in Statistics, there exists limited research investigating how students learn the computing concepts and skills necessary for carrying out data science tasks. Computer Science educators have investigated how students debug their own code and how students reason through foreign code. While these studies illuminate different aspects of students’ programming behavior or conceptual understanding, a method has yet to be employed that can shed light on students’ learning processes. This type of inquiry necessitates qualitative methods, which allow for a holistic description of the skills a student uses throughout the computing code they produce, the organization of these descriptions into themes, and a comparison of the emergent themes across students or across time. In this article we share how to conceptualize and carry out the qualitative coding process with students’ computing code. Drawing on the Block Model to frame our analysis, we explore two types of research questions which could be posed about students’ learning. Supplementary materials for this article are available online.
