Title: Three Pillars of Practical Reproducibility
Practical reproducibility is the ability to reproduce results in a manner that is cost-effective enough to become a vehicle of mainstream scientific exploration. Since computational research artifacts usually require some form of computing to interpret, open and programmable infrastructure, such as a range of NSF-supported testbeds spanning infrastructure from datacenter through networks to wireless systems, is a necessary, but not sufficient, requirement for reproducibility. The question then arises of what other services and tools should build on the availability of such programmable infrastructure to foster the development and sharing of findable, accessible, integrated, and reusable (FAIR) experiments that underpin practical reproducibility. In this paper, we propose three such services addressing the problems of packaging for reuse, findability, and accessibility, respectively. We describe how we developed these services in Chameleon, an NSF-funded testbed for computer science research which has supported the research of a community of 8,000+ users, and discuss their strengths and limitations.
Award ID(s):
2226406
PAR ID:
10560382
Publisher / Repository:
ReWorDS 23 workshop at eScience'23
Format(s):
Medium: X
Location:
Limassol, Cyprus
Sponsoring Org:
National Science Foundation
More Like this
  1. Communications infrastructures and compute resources are critical to enabling advanced science research projects. Science cyberinfrastructures must meet clear performance requirements, must be adjustable to changing requirements, and must facilitate reproducibility. These characteristics can be met by a programmable infrastructure with guaranteed resources, such as the BRIDGES infrastructure enabling cross-Atlantic research projects. While programmability should be a foundational design principle for research cyberinfrastructures, by itself it might not be sufficient to enable scientists who have little or no experience with advanced IT technologies to operate their testbeds independently of IT support teams. The trend of offering “no code” platforms that enable users without IT core competency to achieve business goals should manifest itself in the context of research and educational infrastructures as well. In this paper we describe the architecture of a “no code” platform which would enable scientists to easily configure and modify a programmable infrastructure by using a large language model-based interface integrated with the composable services language of the infrastructure. The BRIDGES testbed is used as an example of such an integration, where the functionality benefits projects operated by large, diverse teams.
  2. Large scientific facilities are unique and complex infrastructures that have become fundamental instruments for enabling high quality, world-leading research to tackle scientific problems at unprecedented scales. Cyberinfrastructure (CI) is an essential component of these facilities, providing the user community with access to data, data products, and services with the potential to transform data into knowledge. However, the timely evolution of the CI available at large facilities is challenging and can result in science communities' requirements not being fully satisfied. Furthermore, integrating CI across multiple facilities as part of a scientific workflow is hard, resulting in data silos. In this paper, we explore how science gateways can provide improved user experiences and services that may not be offered at large facility datacenters. Using a science gateway supported by the Science Gateway Community Institute, which provides subscription-based delivery of streamed data and data products from the NSF Ocean Observatories Initiative (OOI), we propose a system that enables streaming-based capabilities and workflows using data from large facilities, such as the OOI, in a scalable manner. We leverage data infrastructure building blocks, such as the Virtual Data Collaboratory, which provides data and computing capabilities in the continuum to efficiently and collaboratively integrate multiple data-centric CIs, build data-driven workflows, and connect large facilities' data sources with NSF-funded CI, such as XSEDE. We also introduce architectural solutions for running these workflows using dynamically provisioned federated CI.
  3. FABRIC is a unique national research infrastructure to enable cutting-edge and exploratory research at scale in networking, cybersecurity, distributed computing and storage systems, machine learning, and science applications. It is an everywhere-programmable nationwide instrument comprised of novel extensible network elements equipped with large amounts of compute and storage, interconnected by high speed, dedicated optical links. It will connect a number of specialized testbeds for cloud research (NSF Cloud testbeds CloudLab and Chameleon), for research beyond 5G technologies (Platforms for Advanced Wireless Research or PAWR), as well as production high-performance computing facilities and science instruments to create a rich fabric for a wide variety of experimental activities.
  4. Thanks to increasing awareness of the importance of reproducibility in computer science research, initiatives such as artifact review and badging have been introduced to encourage reproducible research in this field. However, making "practical reproducibility" truly widespread requires more than just incentives. It demands an increase in capacity for reproducible research among computer scientists - more tools, workflows, and exemplar artifacts, and more human resources trained in best practices for reproducibility. In this paper, we describe our experiences in the first two years of the Summer of Reproducibility (SoR), a mentoring program that seeks to build global capacity by enabling students around the world to work with expert mentors while producing reproducibility artifacts, tools, and education materials. We give an overview of the program, report preliminary outcomes, and discuss plans to evolve this program. 
  5. In this paper, we explore the crucial role and challenges of computational reproducibility in geosciences, drawing insights from the Climate Informatics Reproducibility Challenge (CIRC) in 2023. The competition aimed at (1) identifying common hurdles to reproducing computational climate science; and (2) creating interactive reproducible publications for selected papers of the Environmental Data Science journal. Based on lessons learned from the challenge, we emphasize the significance of open research practices, mentorship, transparency guidelines, as well as the use of technologies such as executable research objects for the reproduction of geoscientific published research. We propose a supportive framework of tools and infrastructure for evaluating reproducibility in geoscientific publications, with a case study for the climate informatics community. While the recommendations focus on future CIRCs, we expect they would be beneficial for the wider umbrella of reproducibility initiatives in geosciences.