Title: Putting the Data in MoDa: Integrating Agent-Based Modeling, Quantitative Data Analysis, and Teacher Responsivity to Investigate Complex Phenomena
This tech demo introduces key enhancements to MoDa, a free, open-source, web-based modeling and data analysis system designed to support students in making sense of complex systems. MoDa is a domain-specific, block-based computational modeling tool that lets students build models side by side with real-world data. Our latest enhancements integrate quantitative data into the system using the Common Online Data Analysis Platform (CODAP) and Activity Player to support structured, data-rich investigations. Students can draw on a wider variety of real-world data sources to refine their models toward reproducing important features of system behavior, both qualitatively, by reproducing important visual and dynamic behaviors, and quantitatively, by comparing patterns, relationships, and variability more closely.
Award ID(s):
2445609
PAR ID:
10665604
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
International Society of the Learning Sciences
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Background: In clinical research, important variables may be collected from multiple data sources. Physical pooling of patient-level data from multiple sources often raises several challenges, including proper protection of patient privacy and proprietary interests. We previously developed an SAS-based package to perform distributed regression (a suite of privacy-protecting methods that perform multivariable-adjusted regression analysis using only summary-level information) with horizontally partitioned data, a setting where distinct cohorts of patients are available from different data sources. We integrated the package with PopMedNet, an open-source file transfer software, to facilitate secure file transfer between the analysis center and the data-contributing sites. The feasibility of using PopMedNet to facilitate distributed regression analysis (DRA) with vertically partitioned data, a setting where the data attributes from a cohort of patients are available from different data sources, was unknown. Objective: The objective of the study was to describe the feasibility of using PopMedNet and enhancements to PopMedNet to facilitate automatable vertical DRA (vDRA) in real-world settings. Methods: We gathered the statistical and informatic requirements of using PopMedNet to facilitate automatable vDRA. We enhanced PopMedNet based on these requirements to improve its technical capability to support vDRA. Results: PopMedNet can enable automatable vDRA. We identified and implemented two enhancements to PopMedNet that improved its technical capability to perform automatable vDRA in real-world settings. The first was the ability to simultaneously upload and download multiple files, and the second was the ability to directly transfer summary-level information between the data-contributing sites without a third-party analysis center. Conclusions: PopMedNet can be used to facilitate automatable vDRA to protect patient privacy and support clinical research in real-world settings.
  2. In an increasingly data-driven society, it is essential that students understand and critically engage with the data that surrounds them. A key aspect of accomplishing this is helping students understand the importance of data and the impact it can have on their lives. This paper examines the role of real-world authenticity in a high school interest-driven data science curriculum. Through student reflections and project outcomes analysis, the study highlights how real-world data use fosters data practices by allowing students to see data science as relevant and applicable to real-life issues. Findings indicate that students perceived the data exploration activities as authentic and valued the meaningfulness of the data, recognizing its relevance to real-life contexts. 
  3. This article describes the design, development, and evaluation of an undergraduate learning module that builds students' skills in using data analysis and numerical modeling to analyze and design water resources engineering projects. The module follows a project-based approach, using a hydrologic restoration project in a coastal basin in south Louisiana, USA. The module has two main phases, a feasibility analysis phase and a hydraulic design phase, and follows an active learning approach in which students perform a set of quantitative learning activities that involve extensive data and modeling analyses. The module is designed using open resources, including online datasets, hydraulic simulation models, and geographical information system software that are typically used by the engineering industry and research communities. Upon completing the module, students develop skills that involve model formulation, parameter calibration, sensitivity analysis, and the use of data and models to assess and design a proposed hydrologic engineering project. Guided by a design-based research framework, the implementation and evaluation of the module focused primarily on assessing students' perceptions of the module's usability and design attributes, its perceived contribution to their learning, and their overall receptiveness to the module and how it affects their interest in the subject and in future careers. Following an improvement-focused evaluation approach, the design attributes found most critical to students were the use of user-support resources and self-checking mechanisms. These aspects were identified as key features that facilitate students' self-learning and independent completion of tasks, while still enriching their learning experiences when using data- and modeling-rich applications. Evaluation data showed that the following attributes contributed the most to students' learning and potential value for future careers: application of modern engineering data analysis; use of real-world hydrologic datasets; and appreciation of uncertainties and challenges imposed by data scarcity. The evaluation results were used to formulate a set of guiding principles on how to design effective undergraduate learning experiences that adopt technology-enhanced, data- and modeling-based strategies, on how to enhance users' experiences with free and open-source engineering analysis tools, and on how to strike a pedagogical balance between module complexity, student engagement, and flexibility to fit within existing curricular limitations.
  4. When learning about scientific phenomena, students are expected to mechanistically explain how underlying interactions produce the observable phenomenon and conceptually connect the observed phenomenon to canonical scientific knowledge. This paper investigates how the integration of the complementary processes of designing and refining computational models using real-world data can support students in developing mechanistic and canonically accurate explanations of diffusion. Specifically, we examine two types of shifts in how students explain diffusion as they create and refine computational models using real-world data: a shift towards mechanistic reasoning and a shift from noncanonical to canonical explanations. We present descriptive statistics for the whole class as well as three student work examples to illustrate these two shifts as 6th grade students engage in an 8-day unit on the diffusion of ink in hot and cold water. Our findings show that (1) students develop mechanistic explanations as they build agent-based models, (2) students' mechanistic reasoning can co-exist with noncanonical explanations, and (3) students shift their thinking toward canonical explanations after comparing their models against data. These findings could inform the design of modeling tools that support learners in both expressing a diverse range of mechanistic explanations of scientific phenomena and aligning those explanations with canonical science.
  5. Driven by steady progress in deep generative modeling, simulation-based inference (SBI) has emerged as the workhorse for inferring the parameters of stochastic simulators. However, recent work has demonstrated that model misspecification can compromise the reliability of SBI, preventing its adoption in important applications where only misspecified simulators are available. This work introduces robust posterior estimation (RoPE), a framework that overcomes model misspecification with a small real-world calibration set of ground-truth parameter measurements. We formalize the misspecification gap as the solution of an optimal transport (OT) problem between learned representations of real-world and simulated observations, allowing RoPE to learn a model of the misspecification without placing additional assumptions on its nature. RoPE demonstrates how OT and a calibration set provide a controllable balance between calibrated uncertainty and informative inference, even under severely misspecified simulators. Results on four synthetic tasks and two real-world problems with ground-truth labels demonstrate that RoPE outperforms baselines and consistently returns informative and calibrated credible intervals.