Title: A Case for Integrating Experimental Containers with Notebooks
Computational notebooks have gained much popularity as a way of documenting research processes; they allow users to express research narrative by integrating ideas expressed as text, process expressed as code, and results in one executable document. However, the environments in which the code can run are currently limited, often containing only a fraction of the resources of one node, posing a barrier to many computations. In this paper, we make the case that integrating complex experimental environments, such as virtual clusters or complex networking environments that can be provisioned via infrastructure clouds, into computational notebooks will significantly broaden their reach and at the same time help realize the potential of clouds as a platform for repeatable research. To support our argument, we describe the integration of Jupyter notebooks into the Chameleon cloud testbed, which allows the user to define complex experimental environments and then assign processes to elements of this environment similarly to the way a laptop user may switch between different desktops. We evaluate our approach on an actual experiment from both the development and replication perspective.
Award ID(s):
1743358
PAR ID:
10195660
Journal Name:
Proceedings of the 11th IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2019)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The computational notebook serves as a versatile tool for data analysis. However, its conventional user interface falls short of keeping pace with the ever-growing data-related tasks, signaling the need for novel approaches. With the rapid development of interaction techniques and computing environments, there is a growing interest in integrating emerging technologies in data-driven workflows. Virtual reality, in particular, has demonstrated its potential in interactive data visualizations. In this work, we aimed to experiment with adapting computational notebooks into VR and verify the potential benefits VR can bring. We focus on the navigation and comparison aspects as they are primitive components in analysts' workflow. To further improve comparison, we have designed and implemented a Branching&Merging functionality. We tested computational notebooks on the desktop and in VR, both with and without the added Branching&Merging capability. We found VR significantly facilitated navigation compared to desktop, and the ability to create branches enhanced comparison. 
  2. Data scientists have embraced computational notebooks to author analysis code and accompanying visualizations within a single document. Currently, although these media may be interleaved, they remain siloed: interactive visualizations must be manually specified as they are divorced from the analysis provenance expressed via dataframes, while code cells have no access to users' interactions with visualizations, and hence no way to operate on the results of interaction. To bridge this divide, we present B2, a set of techniques grounded in treating data queries as a shared representation between the code and interactive visualizations. B2 instruments data frames to track the queries expressed in code and synthesize corresponding visualizations. These visualizations are displayed in a dashboard to facilitate interactive analysis. When an interaction occurs, B2 reifies it as a data query and generates a history log in a new code cell. Subsequent cells can use this log to further analyze interaction results and, when marked as reactive, to ensure that code is automatically recomputed when new interaction occurs. In an evaluative study with data scientists, we find that B2 promotes a tighter feedback loop between coding and interacting with visualizations. All participants frequently moved from code to visualization and vice-versa, which facilitated their exploratory data analysis in the notebook.
  3. The Protein Data Bank (PDB) holds an extensive amount of information, and can be a vital tool when performing background research for biochemical work. In an attempt to make the information in the PDB more accessible, the RCSB Search API was employed within Jupyter Notebooks to create more customizable and user-friendly tools with Python code. Areas of focus include searches targeting ligands with specific characteristics, searches for FDA Approved Drugs, as well as sequence searches, used to search for entries based on different sequence characteristics. This code has been built into Jupyter Notebook templates that include examples of these searches as well as annotated code that users can customize to more efficiently run advanced searches on the PDB and download structure and small molecule files returned by the search. These notebooks also walk users through different ways to organize or utilize the returns from advanced searches. Future plans include increasing the amount and type of information available from a search, improved ease of access for visualizing and downloading search results, and expanding the scope of our notebooks to cover more types of searches. This research was supported by NSF-IUSE award number 2142033. 
  4. Current computational notebooks, such as Jupyter, are a popular tool for data science and analysis. However, they use a 1D list structure for cells that introduces and exacerbates user issues, such as messiness, tedious navigation, inefficient use of large screen space, performance of non-linear analyses, and presentation of non-linear narratives. To ameliorate these issues, we designed a prototype extension for Jupyter Notebooks that enables 2D organization of computational notebook cells into multiple columns. In this paper, we present two evaluative studies to determine whether such “2D computational notebooks” provide advantages over the current computational notebook structure. From these studies, we found empirical evidence that our multi-column 2D computational notebooks provide enhanced efficiency and usability. We also gathered design feedback which may inform future works. Overall, the prototype was positively received, with some users expressing a clear preference for 2D computational notebooks even at this early stage of development.