This content will become publicly available on July 29, 2026

Title: Summer of Reproducibility: Building Global Capacity for Practical Reproducibility through Hands-On Mentorship
Thanks to increasing awareness of the importance of reproducibility in computer science research, initiatives such as artifact review and badging have been introduced to encourage reproducible research in this field. However, making "practical reproducibility" truly widespread requires more than just incentives. It demands an increase in capacity for reproducible research among computer scientists - more tools, workflows, and exemplar artifacts, and more human resources trained in best practices for reproducibility. In this paper, we describe our experiences in the first two years of the Summer of Reproducibility (SoR), a mentoring program that seeks to build global capacity by enabling students around the world to work with expert mentors while producing reproducibility artifacts, tools, and education materials. We give an overview of the program, report preliminary outcomes, and discuss plans to evolve this program.
Award ID(s):
2226408
PAR ID:
10636854
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
ACM
Date Published:
Format(s):
Medium: X
Location:
Vancouver, BC, Canada
Sponsoring Org:
National Science Foundation
More Like this
  1. Tirthankar Ghosal, Sergi Blanco-Cuaresma (Ed.)
    Reproducibility is an important feature of science; experiments are retested, and analyses are repeated. Trust in the findings increases when consistent results are achieved. Despite the importance of reproducibility, significant work is often involved in these efforts, and some published findings may not be reproducible due to oversights or errors. In this paper, we examine a myriad of features in scholarly articles published in computer science conferences and journals and test how they correlate with reproducibility. We collected data from three different sources that labeled publications as either reproducible or irreproducible and employed statistical significance tests to identify features of those publications that hold clues about reproducibility. We found the readability of the scholarly article and accessibility of the software artifacts through hyperlinks to be strong signals noticeable amongst reproducible scholarly articles. 
  2. With increasing recognition of the importance of reproducibility in computer science research, a wide range of efforts to promote reproducible research have been implemented across various sub-disciplines of computer science. These include artifact review and badging processes, and dedicated reproducibility tracks at conferences. However, these initiatives primarily engage active researchers and students already involved in research in their respective areas. In this paper, we present an argument for expanding the scope of these efforts to include a much larger audience, by introducing more reproducibility content into computer science courses. We describe various ways to integrate reproducibility content into the curriculum, drawing on our own experiences, as well as published experience reports from several sub-disciplines of computer science and computational science. 
  3. Practical reproducibility is the ability to reproduce results in a manner that is cost-effective enough to become a vehicle of mainstream scientific exploration. Since computational research artifacts usually require some form of computing to interpret, open and programmable infrastructure, such as a range of NSF-supported testbeds spanning infrastructure from datacenter through networks to wireless systems, is a necessary – but not sufficient – requirement for reproducibility. The question arises as to what other services and tools should build on the availability of such programmable infrastructure to foster the development and sharing of findable, accessible, integrated, and reusable (FAIR) experiments that underpin practical reproducibility. In this paper, we propose three such services addressing the problems of packaging for reuse, findability, and accessibility, respectively. We describe how we developed these services in Chameleon, an NSF-funded testbed for computer science research which has supported the research of a community of 8,000+ users, and discuss their strengths and limitations.
  4. Reproducibility of research in Computer Science and in the field of networking in particular is a well-recognized problem. For several reasons, including the sensitive and/or proprietary nature of some Internet measurements, the networking research community pays limited attention to the reproducibility of results, instead tending to accept papers that appear plausible. This article summarises a 2.5-day-long Dagstuhl seminar on Encouraging Reproducibility in Scientific Research of the Internet held in October 2018. The seminar discussed challenges to improving reproducibility of scientific Internet research, and developed a set of recommendations that we as a community can undertake to initiate a cultural change toward reproducibility of our work. It brought together people both from academia and industry to set expectations and formulate concrete recommendations for reproducible research. This iteration of the seminar was scoped to computer networking research, although the outcomes are likely relevant for a broader audience from multiple interdisciplinary fields.
  5. What new questions could ecophysiologists answer if physio-logging research was fully reproducible? We argue that technical debt (computational hurdles resulting from prioritizing short-term goals over long-term sustainability) stemming from insufficient cyberinfrastructure (field-wide tools, standards, and norms for analyzing and sharing data) trapped physio-logging in a scientific silo. This debt stifles comparative biological analyses and impedes interdisciplinary research. Although physio-loggers (e.g., heart rate monitors and accelerometers) opened new avenues of research, the explosion of complex datasets exceeded ecophysiology’s informatics capacity. Like many other scientific fields facing a deluge of complex data, ecophysiologists now struggle to share their data and tools. Adapting to this new era requires a change in mindset, from “data as a noun” (e.g., traits, counts) to “data as a sentence”, where measurements (nouns) are associated with transformations (verbs), parameters (adverbs), and metadata (adjectives). Computational reproducibility provides a framework for capturing the entire sentence. Though usually framed in terms of scientific integrity, reproducibility offers immediate benefits by promoting collaboration between individuals, groups, and entire fields. Rather than a tax on our productivity that benefits some nebulous greater good, reproducibility can accelerate the pace of discovery by removing obstacles and inviting a greater diversity of perspectives to advance science and society. In this article, we 1) describe the computational challenges facing physio-logging scientists and connect them to the concepts of technical debt and cyberinfrastructure, 2) demonstrate how other scientific fields overcame similar challenges by embracing computational reproducibility, and 3) present a framework to promote computational reproducibility in physio-logging, and bio-logging more generally.