Title: Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases
Abstract

Computational workflows are widely used in data analysis, enabling automated tracking of steps and storage of provenance information, leading to innovation and decision-making in the scientific community. However, the growing popularity of workflows has raised concerns about reproducibility and reusability, which can hinder collaboration between institutions and users. To address these concerns, it is important to standardize workflows or provide tools that offer a framework for describing workflows and enabling computational reusability. One such set of standards that has recently emerged is the Common Workflow Language (CWL), which offers a robust and flexible framework for data analysis tools and workflows. To promote portability, reproducibility, and interoperability of AI/ML workflows, we developed geoweaver_cwl, a Python package that automatically describes AI/ML workflows from a workflow management system (WfMS) named Geoweaver into CWL. In this paper, we test our Python package on multiple use cases from different domains. Our objective is to demonstrate and verify the utility of this package. We make all the code and datasets openly available online and briefly describe the experimental implementation of the package in this paper, confirming that geoweaver_cwl can lead to a well-versed AI process while disclosing opportunities for further extensions. The geoweaver_cwl package is publicly released online at https://pypi.org/project/geoweaver-cwl/0.0.1/ and exemplar results are accessible at: https://github.com/amrutakale08/geoweaver_cwl-usecases.

 
PAR ID:
10430573
Author(s) / Creator(s):
; ;
Publisher / Repository:
Springer Science + Business Media
Date Published:
Journal Name:
Earth Science Informatics
Volume:
16
Issue:
3
ISSN:
1865-0473
Page Range / eLocation ID:
p. 2955-2961
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Summary

    dadi is a popular software package for inferring models of demographic history and natural selection from population genomic data. But using dadi requires Python scripting and manual parallelization of optimization jobs. We developed dadi-cli to simplify dadi usage and also enable straightforward distributed computing.

    Availability and Implementation

    dadi-cli is implemented in Python and released under the Apache License 2.0. The source code is available at https://github.com/xin-huang/dadi-cli. dadi-cli can be installed via PyPI and conda, and is also available through Cacao on Jetstream2 https://cacao.jetstream-cloud.org/.

     
  2. Abstract Background

    Existing software for comparison of species delimitation models does not provide a (true) metric or distance function between species delimitation models, nor a way to compare these models in terms of relative clustering differences along a lattice of partitions.

    Results

    piikun is a Python package for analyzing and visualizing species delimitation models in an information theoretic framework that, in addition to classic measures of information such as the entropy and mutual information [1], provides for the calculation of the Variation of Information (VI) criterion [2], a true metric or distance function for species delimitation models that is aligned with the lattice of partitions.

    Conclusions

    piikun is available under the MIT license from its public repository (https://github.com/jeetsukumaran/piikun), and can be installed locally using the Python package manager ‘pip’.

     
  3. Abstract

    The Locust simulation package is a new C++ software tool developed to simulate the measurement of time-varying electromagnetic fields using RF detection techniques. Modularity and flexibility allow for arbitrary input signals, while concurrently supporting tight integration with physics-based simulations as input. External signals driven by the Kassiopeia particle tracking package are discussed, demonstrating conditional feedback between Locust and Kassiopeia during software execution. An application of the simulation to the Project 8 experiment is described. Locust is publicly available at https://github.com/project8/locust_mc.

     
  4. Abstract

    We present νDoBe, a Python tool for the computation of neutrinoless double beta decay (0νββ) rates in terms of lepton-number-violating operators in the Standard Model Effective Field Theory (SMEFT). The tool can be used for automated calculations of 0νββ rates, electron spectra and angular correlations for all isotopes of experimental interest, for lepton-number-violating operators up to and including dimension 9. The tool takes care of renormalization-group running to lower energies and provides the matching to the low-energy effective field theory and, at lower scales, to a chiral effective field theory description of 0νββ rates. The user can specify different sets of nuclear matrix elements from various many-body methods and hadronic low-energy constants. The tool can be used to quickly generate analytical and numerical expressions for 0νββ rates and to generate a large variety of plots. In this work, we provide examples of possible use along with a detailed code documentation. The code can be accessed through:

    GitHub: https://github.com/OScholer/nudobe

    Online User-Interface: https://oscholer-nudobe-streamlit-4foz22.streamlit.app/

     
  5. With the ever‐expanding toolkit of molecular viewers, the ability to visualize macromolecular structures has never been more accessible. Yet, the idiosyncratic technical intricacies across tools and the integration complexities associated with handling structure annotation data present significant barriers to seamless interoperability and steep learning curves for many users. The necessity for reproducible data visualizations is at the forefront of the current challenges. Recently, we introduced MolViewSpec (homepage: https://molstar.org/mol-view-spec/, GitHub project: https://github.com/molstar/mol-view-spec), a specification approach that defines molecular visualizations, decoupling them from the varying implementation details of different molecular viewers. Through the protocols presented herein, we demonstrate how to use MolViewSpec and its 3D view–building Python library for creating sophisticated, customized 3D views covering all standard molecular visualizations. MolViewSpec supports representations like cartoon and ball‐and‐stick with coloring, labeling, and applying complex transformations such as superposition to any macromolecular structure file in mmCIF, BinaryCIF, and PDB formats. These examples showcase progress towards reusability and interoperability of molecular 3D visualization in an era when handling molecular structures at scale is a timely and pressing matter in structural bioinformatics as well as research and education across the life sciences.