We examine how developers of data science curricula determine what makes a pedagogically effective dataset, one that enables 10–14-year-old students (“middle school” in the United States) to engage in the data investigation cycle by posing their own questions about relationships among variables. We describe strategies for curating existing datasets to address goals for learning about data, and for optimizing the use of these datasets once they are curated. We investigate how data science educators can transform existing datasets into ones appropriate for students with little data experience, drawing on our experience working with several publicly available datasets, which students explored in CODAP (the Common Online Data Analysis Platform).
This content will become publicly available on December 10, 2025
Functional Data Science for Secondary-School Students
CODAP is a widely used programming environment for secondary school data science. Its direct-manipulation–based design offers many advantages to learners, especially younger students. Unfortunately, these same advantages can become a liability when it comes to repeating operations consistently, replaying operations (for reproducibility), and learning abstraction. In response, we have extended CODAP with CODAP Transformers, which add a notion of functions to CODAP. These provide a gentle introduction to reuse and abstraction in the data science context. We present a critique of CODAP that justifies our extension, describe the extension, and showcase some novel operations. Our extension has been integrated into the CODAP codebase, and is now part of the standard CODAP tool. It is already in use by the Bootstrap curriculum.
- Award ID(s): 2208731
- PAR ID: 10585424
- Publisher / Repository: Vilnius University
- Date Published:
- Journal Name: Informatics in Education
- ISSN: 1648-5831
- Page Range / eLocation ID: 723 to 734
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Innovative dynamic data tools afford opportunities for K-12 students and teachers to explore multivariate data and create linked data representations. These tools also support engagement in data moves, which are transnumerative actions to process, organize, and visualize data. The current study sought to understand how prospective K-12 mathematics teachers (PMTs) use data moves in the Common Online Data Analysis Platform (CODAP) to create and interpret visualizations and statistical measures to make sense of state-level data about education in the United States. Extending the work of Erickson et al. (2019), a framework is presented to characterize data moves and provide examples of actions within CODAP that illustrate each data move. Based on analysis of thirty screencasts created by PMTs, four examples highlight PMTs’ use of data moves to investigate data in CODAP.
-
The science objectives of the LISA mission have been defined under the implicit assumption of a 4-year continuous data stream. Based on the performance of LISA Pathfinder, it is now expected that LISA will have a duty cycle of $\approx 0.75$, which would reduce the effective span of usable data to 3 years. This paper reports the results of a study by the LISA Science Group, which was charged with assessing the additional science return of increasing the mission lifetime. We explore various observational scenarios to assess the impact of mission duration on the main science objectives of the mission. We find that the science investigations most affected by mission duration concern the search for seed black holes at cosmic dawn, as well as the study of stellar-origin black holes and of their formation channels via multi-band and multi-messenger observations. We conclude that an extension to 6 years of mission operations is recommended.
-
Through the “COVID-Inspired Data Science through Epidemiology Education” project, 400 underserved middle-school youth across the United States are engaging in a 20-hour out-of-school data club centered on a novel. The narrative is integrated with hands-on data activities and modeling (e.g., creating graphs of infections over time in CODAP; modeling disease transmission rates in NetLogo). Youth learn to: 1) Use data tools to track the spread of a variety of infectious diseases; 2) Ask and address their own questions of data; and 3) Use data to communicate to local audiences about epidemiological patterns and challenges. The project breaks new ground in integrating data science with epidemiology education for 11–14-year-old youth.
-
In-memory key-value stores that use kernel-bypass networking serve millions of operations per second per machine with microseconds of latency. They are fast in part because they are simple, but their simple interfaces force applications to move data across the network. This is inefficient for operations that aggregate over large amounts of data, and it causes delays when traversing complex data structures. Ideally, applications could push small functions to storage to avoid round trips and data movement; however, pushing code to these fast systems is challenging. Any extra complexity for interpreting or isolating code cuts into their latency and throughput benefits. We present Splinter, a low-latency key-value store that clients extend by pushing code to it. Splinter is designed for modern multi-tenant data centers; it allows mutually distrusting tenants to write their own fine-grained extensions and push them to the store at runtime. The core of Splinter’s design relies on type- and memory-safe extension code to avoid conventional hardware isolation costs. This still allows for bare-metal execution, avoids data copying across trust boundaries, and makes granular storage functions that perform less than a microsecond of compute practical. Our measurements show that Splinter can process 3.5 million remote extension invocations per second with a median round-trip latency of less than 9 μs at densities of more than 1,000 tenants per server. We provide an implementation of Facebook’s TAO as an 800 line extension that, when pushed to a Splinter server, improves performance by 400 Kop/s to perform 3.2 Mop/s over online graph data with 30 μs remote access times.
