Abstract In this paper, we explore the crucial role and challenges of computational reproducibility in geosciences, drawing insights from the Climate Informatics Reproducibility Challenge (CICR) in 2023. The competition aimed at (1) identifying common hurdles to reproduce computational climate science; and (2) creating interactive reproducible publications for selected papers of the Environmental Data Science journal. Based on lessons learned from the challenge, we emphasize the significance of open research practices, mentorship, transparency guidelines, as well as the use of technologies such as executable research objects for the reproduction of geoscientific published research. We propose a supportive framework of tools and infrastructure for evaluating reproducibility in geoscientific publications, with a case study for the climate informatics community. While the recommendations focus on future CIRCs, we expect they would be beneficial for wider umbrella of reproducibility initiatives in geosciences.
more »
« less
Initial Thoughts on Cybersecurity And Reproducibility
Cybersecurity, which serves to protect computer systems and data from malicious and accidental abuse and changes, both supports and challenges the reproducibility of computational science. This position paper explores a research agenda by enumerating a set of two types of challenges that emerge at the intersection of cybersecurity and reproducibility: challenges that cybersecurity has in supporting the reproducibility of computational science, and challenges cybersecurity creates for reproducibility of computational science.
more »
« less
- PAR ID:
- 10155431
- Date Published:
- Journal Name:
- 2nd International Workshop on Practical Reproducible Evaluation of Computer Systems (P-RECS'19)
- Page Range / eLocation ID:
- 13 to 15
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
What new questions could ecophysiologists answer if physio-logging research was fully reproducible? We argue that technical debt (computational hurdles resulting from prioritizing short-term goals over long-term sustainability) stemming from insufficient cyberinfrastructure (field-wide tools, standards, and norms for analyzing and sharing data) trapped physio-logging in a scientific silo. This debt stifles comparative biological analyses and impedes interdisciplinary research. Although physio-loggers (e.g., heart rate monitors and accelerometers) opened new avenues of research, the explosion of complex datasets exceeded ecophysiology’s informatics capacity. Like many other scientific fields facing a deluge of complex data, ecophysiologists now struggle to share their data and tools. Adapting to this new era requires a change in mindset, from “data as a noun” (e.g., traits, counts) to “data as a sentence”, where measurements (nouns) are associate with transformations (verbs), parameters (adverbs), and metadata (adjectives). Computational reproducibility provides a framework for capturing the entire sentence. Though usually framed in terms of scientific integrity, reproducibility offers immediate benefits by promoting collaboration between individuals, groups, and entire fields. Rather than a tax on our productivity that benefits some nebulous greater good, reproducibility can accelerate the pace of discovery by removing obstacles and inviting a greater diversity of perspectives to advance science and society. In this article, we 1) describe the computational challenges facing physio-logging scientists and connect them to the concepts of technical debt and cyberinfrastructure , 2) demonstrate how other scientific fields overcame similar challenges by embracing computational reproducibility, and 3) present a framework to promote computational reproducibility in physio-logging, and bio-logging more generally.more » « less
-
Reproducibility is fundamental to science, and an important component of reproducibility is computational reproducibility: the ability of a researcher to recreate the results of a published study using the original author’s raw data and code. Although most people agree that computational reproducibility is important, it is still difficult to achieve in practice. In this article, the authors describe their approach to enabling computational reproducibility for the 12 articles in this special issue of Socius about the Fragile Families Challenge. The approach draws on two tools commonly used by professional software engineers but not widely used by academic researchers: software containers (e.g., Docker) and cloud computing (e.g., Amazon Web Services). These tools made it possible to standardize the computing environment around each submission, which will ease computational reproducibility both today and in the future. Drawing on their successes and struggles, the authors conclude with recommendations to researchers and journals.more » « less
-
The scientific computing community has long taken a leadership role in understanding and assessing the relationship of reproducibility to cyberinfrastructure, ensuring that computational results - such as those from simulations - are "reproducible", that is, the same results are obtained when one re-uses the same input data, methods, software and analysis conditions. Starting almost a decade ago, the community has regularly published and advocated for advances in this area. In this article we trace this thinking and relate it to current national efforts, including the 2019 National Academies of Science, Engineering, and Medicine report on "Reproducibility and Replication in Science". To this end, this work considers high performance computing workflows that emphasize workflows combining traditional simulations (e.g. Molecular Dynamics simulations) with in situ analytics. We leverage an analysis of such workflows to (a) contextualize the 2019 National Academies of Science, Engineering, and Medicine report's recommendations in the HPC setting and (b) envision a path forward in the tradition of community driven approaches to reproducibility and the acceleration of science and discovery. The work also articulates avenues for future research at the intersection of transparency, reproducibility, and computational infrastructure that supports scientific discovery.more » « less
-
With increasing recognition of the importance of reproducibility in computer science research, a wide range of efforts to promote reproducible research have been implemented across various sub-disciplines of computer science. These include artifact review and badging processes, and dedicated reproducibility tracks at conferences. However, these initiatives primarily engage active researchers and students already involved in research in their respective areas. In this paper, we present an argument for expanding the scope of these efforts to include a much larger audience, by introducing more reproducibility content into computer science courses. We describe various ways to integrate reproducibility content into the curriculum, drawing on our own experiences, as well as published experience reports from several sub-disciplines of computer science and computational science.more » « less
An official website of the United States government

