A Framework to capture and reproduce the Absolute State of Jupyter Notebooks

Wannipurage, Dimuthu; Marru, Suresh; Pierce, Marlon

doi:10.1145/3491418.3530296

Jupyter Notebooks are an enormously popular tool for creating and narrating computational research projects. They also have enormous potential for creating reproducible scientific research artifacts. Capturing the complete state of a notebook has additional benefits; for instance, the notebook execution may be split between local and remote resources, where the latter may have more powerful processing capabilities or store large or access-limited data. Making notebooks fully reproducible poses several challenges when examined in detail. The notebook code must be replicated entirely, and the underlying Python runtime environments must be identical. More subtle problems arise in replicating referenced data, external library dependencies, and runtime variable states. This paper presents solutions to these problems using Jupyter's standard extension mechanisms to create an archivable system state for a running notebook. We show that these additional mechanisms, which involve interacting with the underlying Linux kernel, do not introduce substantial execution-time overhead, demonstrating the approach's feasibility.
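To make two of the challenges named in the abstract concrete (external library dependencies and runtime variable states), the following is a minimal illustrative sketch in Python. It is not the authors' implementation: it does not use Jupyter's extension mechanisms or interact with the Linux kernel, and the names `capture_state` and `restore_variables` are hypothetical. It assumes that pickling is an acceptable way to serialize variables and simply skips anything that cannot be pickled.

```python
import importlib.metadata
import pickle
import platform
import types


def capture_state(namespace):
    """Snapshot a namespace: Python version, imported package versions, picklable variables.

    Hypothetical helper for illustration only; not the framework described in the paper.
    """
    variables, skipped, modules = {}, [], set()
    for name, value in namespace.items():
        if name.startswith("_"):
            continue  # ignore private and interpreter-internal names
        if isinstance(value, types.ModuleType):
            # Record the top-level package name of each imported module.
            modules.add(value.__name__.split(".")[0])
            continue
        try:
            variables[name] = pickle.dumps(value)
        except Exception:
            # Lambdas, open handles, locks, etc. cannot be pickled.
            skipped.append(name)

    packages = {}
    for mod in sorted(modules):
        try:
            packages[mod] = importlib.metadata.version(mod)
        except importlib.metadata.PackageNotFoundError:
            # Standard-library module, or distribution name differs from module name.
            packages[mod] = None

    return {
        "python": platform.python_version(),
        "packages": packages,
        "variables": variables,
        "skipped": sorted(skipped),
    }


def restore_variables(snapshot):
    """Rebuild the captured variables from their pickled form."""
    return {name: pickle.loads(blob) for name, blob in snapshot["variables"].items()}
```

In a notebook cell, one might call `capture_state(globals())` and archive the result. The `skipped` list hints at why the abstract calls runtime variable state a subtle problem: some live objects cannot be serialized at the language level at all.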

Award ID(s): 2005506

PAR ID: 10359394

Author(s) / Creator(s): Wannipurage, Dimuthu; Marru, Suresh; Pierce, Marlon

Date Published: 2022-07-08

Journal Name: PEARC '22: Practice and Experience in Advanced Research Computing

Page Range / eLocation ID: 1 to 8

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1145/3491418.3530296
