Large-scale analysis of source code, and in particular scientific source code, holds the promise of better understanding the data science process, identifying analytical best practices, and providing insights to the builders of scientific toolkits. However, large corpora have remained unanalyzed in depth, as descriptive labels are absent and require expert domain knowledge to generate. We propose a novel weakly supervised transformer-based architecture for computing joint representations of code from both abstract syntax trees and surrounding natural language comments. We then evaluate the model on a new classification task for labeling computational notebook cells as stages in the data analysis process, from data import to wrangling, exploration, modeling, and evaluation. We show that our model, leveraging only easily available weak supervision, achieves a 38% increase in accuracy over expert-supplied heuristics and outperforms a suite of baselines. Our model enables us to examine a set of 118,000 Jupyter Notebooks to uncover common data analysis patterns. Focusing on notebooks with relationships to academic articles, we conduct the largest study of scientific code to date and find that notebooks that devote a higher fraction of code to the typically labor-intensive process of wrangling data are associated, in expectation, with lower citation counts for the corresponding papers. We also show significant differences between academic and non-academic notebooks, including that academic notebooks devote substantially more code to wrangling and exploring data, and less to modeling.
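As an illustration of the kind of weak supervision this abstract refers to, the sketch below assigns a weak analysis-stage label to a notebook cell from simple API-usage heuristics. It is a minimal, assumed example: the keyword lists and the labeling function are hypothetical and are not the paper's implementation; only the stage names follow the pipeline described above.

```python
# Minimal illustrative sketch: weak stage labels for notebook cells from
# API-usage heuristics. The keyword patterns below are assumptions, not the
# paper's actual supervision source.
import re
from typing import Optional

WEAK_LABEL_RULES = {
    "import":   [r"\bread_csv\b", r"\bread_sql\b", r"\bload_dataset\b"],
    "wrangle":  [r"\bmerge\b", r"\bdropna\b", r"\bfillna\b", r"\bgroupby\b"],
    "explore":  [r"\bdescribe\b", r"\bplot\b", r"\bhist\b", r"\bscatter\b"],
    "model":    [r"\bfit\(", r"\bLogisticRegression\b", r"\bRandomForest"],
    "evaluate": [r"\bscore\(", r"\baccuracy_score\b", r"\bcross_val_score\b"],
}

def weak_label(cell_source: str) -> Optional[str]:
    """Return a weak stage label for a code cell, or None if no rule fires."""
    for stage, patterns in WEAK_LABEL_RULES.items():
        if any(re.search(p, cell_source) for p in patterns):
            return stage
    return None

print(weak_label("df = pd.read_csv('data.csv')"))  # -> 'import'
```

Labels of this kind are cheap to produce at corpus scale, which is what makes the weakly supervised setup attractive compared with expert annotation.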
Automated Assessment of Quality of Jupyter Notebooks Using Artificial Intelligence and Big Code
We present in this paper an automated method to assess the quality of Jupyter notebooks. The quality of notebooks is assessed in terms of reproducibility and executability. Specifically, we automatically extract a number of expert-defined features for each notebook, perform a feature selection step, and then train supervised binary classifiers to predict whether a notebook is reproducible and executable, respectively. We also experiment with semantic code embeddings to capture the notebooks' semantics. We evaluated these methods on a dataset of 306,539 notebooks and achieved an F1 score of 0.87 for reproducibility and 0.96 for executability (using expert-defined features), and an F1 score of 0.81 for reproducibility and 0.78 for executability (using code embeddings). Our results suggest that semantic code embeddings can determine, with good performance, the reproducibility and executability of Jupyter notebooks, and since they can be derived automatically, they do not require expert involvement to define features.
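As a rough sketch of the feature-based pipeline described above (expert-defined features, a feature-selection step, then a supervised binary classifier evaluated with F1), the snippet below uses scikit-learn with placeholder data. The feature matrix, the choice of estimator, and the selection method are assumptions for illustration only, not the authors' configuration.

```python
# Sketch only: expert-defined notebook features -> feature selection ->
# binary classifier for reproducibility. Data and model choices are assumed.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Hypothetical per-notebook features, e.g. cell count, markdown ratio,
# number of imports, whether execution counts are monotonic.
X = np.random.rand(200, 10)           # placeholder feature matrix
y = np.random.randint(0, 2, 200)      # 1 = reproducible, 0 = not

clf = Pipeline([
    ("select", SelectKBest(f_classif, k=5)),      # feature selection step
    ("model", LogisticRegression(max_iter=1000)),  # supervised binary classifier
])

# F1 is the metric reported in the abstract.
scores = cross_val_score(clf, X, y, cv=5, scoring="f1")
print(scores.mean())
```

An analogous classifier for executability would be trained separately on the same features, matching the paper's description of one binary predictor per quality dimension.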
- Award ID(s): 1918751
- PAR ID: 10311570
- Date Published:
- Journal Name: The International FLAIRS Conference Proceedings
- Volume: 34
- Issue: 1
- ISSN: 2334-0762
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Jupyter Notebooks are an enormously popular tool for creating and narrating computational research projects. They also have enormous potential for creating reproducible scientific research artifacts. Capturing the complete state of a notebook has additional benefits; for instance, the notebook execution may be split between local and remote resources, where the latter may have more powerful processing capabilities or store large or access-limited data. When examined in detail, there are several challenges to making notebooks fully reproducible: the notebook code must be replicated entirely, and the underlying Python runtime environments must be identical. More subtle problems arise in replicating referenced data, external library dependencies, and runtime variable states. This paper presents solutions to these problems using Jupyter's standard extension mechanisms to create an archivable system state for a running notebook. We show that these additional mechanisms, which involve interacting with the underlying Linux kernel, do not introduce substantial execution-time overhead, demonstrating the approach's feasibility. (See the dependency-snapshot sketch after this list.)
- Current computational notebooks, such as Jupyter, are a popular tool for data science and analysis. However, they use a 1D list structure for cells that introduces and exacerbates user issues, such as messiness, tedious navigation, inefficient use of large screen space, performance of non-linear analyses, and presentation of non-linear narratives. To ameliorate these issues, we designed a prototype extension for Jupyter Notebooks that enables 2D organization of computational notebook cells into multiple columns. In this paper, we present two evaluative studies to determine whether such “2D computational notebooks” provide advantages over the current computational notebook structure. From these studies, we found empirical evidence that our multi-column 2D computational notebooks provide enhanced efficiency and usability. We also gathered design feedback which may inform future work. Overall, the prototype was positively received, with some users expressing a clear preference for 2D computational notebooks even at this early stage of development.
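Relating to the notebook-state-capture abstract in the list above: one small piece of the state that must be archived to replicate a notebook's environment is the exact set of installed package versions. The sketch below records them using only the Python standard library; the output file name and JSON format are assumptions, not the paper's mechanism.

```python
# Illustrative sketch: snapshot installed package versions so a notebook's
# library dependencies can later be reconstructed. Not the paper's extension.
import json
from importlib import metadata

def snapshot_dependencies(path: str = "environment_snapshot.json") -> dict:
    """Record installed package versions to a JSON file and return them."""
    versions = {dist.metadata["Name"]: dist.version
                for dist in metadata.distributions()}
    with open(path, "w") as fh:
        json.dump(versions, fh, indent=2, sort_keys=True)
    return versions

snapshot_dependencies()
```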