Title: Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases
Abstract
Computational workflows are widely used in data analysis, enabling automated tracking of steps and storage of provenance information, leading to innovation and decision-making in the scientific community. However, the growing popularity of workflows has raised concerns about reproducibility and reusability which can hinder collaboration between institutions and users. In order to address these concerns, it is important to standardize workflows or provide tools that offer a framework for describing workflows and enabling computational reusability. One such set of standards that has recently emerged is the Common Workflow Language (CWL), which offers a robust and flexible framework for data analysis tools and workflows. To promote portability, reproducibility, and interoperability of AI/ML workflows, we developed geoweaver_cwl, a Python package that automatically describes AI/ML workflows from a workflow management system (WfMS) named Geoweaver into CWL. In this paper, we test our Python package on multiple use cases from different domains. Our objective is to demonstrate and verify the utility of this package. We make all the code and dataset open online and briefly describe the experimental implementation of the package in this paper, confirming that geoweaver_cwl can lead to a well-versed AI process while disclosing opportunities for further extensions. The geoweaver_cwl package is publicly released online at https://pypi.org/project/geoweaver-cwl/0.0.1/ and exemplar results are accessible at: https://github.com/amrutakale08/geoweaver_cwl-usecases.
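To make the idea of describing a workflow in CWL concrete, the following is a minimal sketch in Python. It does not use or reproduce the geoweaver_cwl API. The function name build_cwl_workflow, the step names, the output name "result", and the convention that each step has a tool description in "<name>.cwl" are all hypothetical assumptions chosen for illustration. It only shows how a list of processes and the edges between them could be mapped onto the structure of a CWL v1.2 Workflow document.

```python
import json


def build_cwl_workflow(steps, edges):
    """Build a minimal CWL v1.2 Workflow document as a plain dict.

    steps: list of step names (hypothetical); each step is assumed to be
           described by a separate tool file named '<name>.cwl' that
           declares an output called 'result'.
    edges: list of (source, target) pairs, meaning that the target step
           consumes the 'result' output of the source step.
    """
    workflow = {
        "cwlVersion": "v1.2",
        "class": "Workflow",
        "inputs": {},
        "outputs": {},
        "steps": {},
    }
    for name in steps:
        workflow["steps"][name] = {
            "run": f"{name}.cwl",  # assumed file naming convention
            "in": {},
            "out": ["result"],     # assumed output name
        }
    for source, target in edges:
        # Wire the upstream step's output into the downstream step's input.
        workflow["steps"][target]["in"][f"{source}_result"] = f"{source}/result"
    return workflow


if __name__ == "__main__":
    # Hypothetical three-step workflow: prepare -> train -> evaluate.
    wf = build_cwl_workflow(
        ["prepare", "train", "evaluate"],
        [("prepare", "train"), ("train", "evaluate")],
    )
    # CWL documents may be written as JSON as well as YAML.
    print(json.dumps(wf, indent=2))
```

A real translation would also need to declare workflow-level inputs and outputs and to generate the per-step tool descriptions; the sketch leaves these empty on purpose.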
Huang, Xin; Struck, Travis J; Davey, Sean W; Gutenkunst, Ryan N (bioRxiv)
Abstract
Summary
dadi is a popular software package for inferring models of demographic history and natural selection from population genomic data. But using dadi requires Python scripting and manual parallelization of optimization jobs. We developed dadi-cli to simplify dadi usage and also enable straightforward distributed computing.
Availability and Implementation
dadi-cli is implemented in Python and released under the Apache License 2.0. The source code is available at https://github.com/xin-huang/dadi-cli. dadi-cli can be installed via PyPI and conda, and is also available through Cacao on Jetstream2 (https://cacao.jetstream-cloud.org/).
Existing software for comparison of species delimitation models does not provide a (true) metric or distance function between species delimitation models, nor a way to compare these models in terms of relative clustering differences along a lattice of partitions.
Results
piikun is a Python package for analyzing and visualizing species delimitation models in an information theoretic framework that, in addition to classic measures of information such as the entropy and mutual information [1], provides for the calculation of the Variation of Information (VI) criterion [2], a true metric or distance function for species delimitation models that is aligned with the lattice of partitions.
Conclusions
piikun is available under the MIT license from its public repository (https://github.com/jeetsukumaran/piikun), and can be installed locally using the Python package manager 'pip'.
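The Variation of Information criterion named in the Results above can be illustrated with a short, self-contained sketch. This is not the package's own code or API. It is a plain-Python computation, under the assumption that each species delimitation model is given as a list of cluster labels, one label per individual, using the standard identity VI(A, B) = H(A) + H(B) - 2 I(A; B). The function names are chosen here for illustration only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a partition given as a label list."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two partitions of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def variation_of_information(a, b):
    """VI distance between two partitions: H(A) + H(B) - 2 * I(A; B)."""
    if len(a) != len(b):
        raise ValueError("both partitions must label the same individuals")
    return entropy(a) + entropy(b) - 2.0 * mutual_information(a, b)


if __name__ == "__main__":
    # Two hypothetical delimitations of four individuals.
    model_1 = ["sp1", "sp1", "sp2", "sp2"]
    model_2 = ["spA", "spB", "spA", "spB"]
    print(variation_of_information(model_1, model_1))
    print(variation_of_information(model_1, model_2))
```

Because VI depends only on which individuals are grouped together, relabeling the clusters of a model leaves the distance unchanged, and two identical groupings are at distance zero (up to floating-point rounding).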
Ashtari Esfahani, A.; Böser, S.; Buzinsky, N.; Cervantes, R.; Claessens, C.; Viveiros, L. de; Fertl, M.; Formaggio, J. A.; Gladstone, L.; Guigue, M.; et al. (New Journal of Physics)
Abstract
The Locust simulation package is a new C++ software tool developed to simulate the measurement of time-varying electromagnetic fields using RF detection techniques. Modularity and flexibility allow for arbitrary input signals, while concurrently supporting tight integration with physics-based simulations as input. External signals driven by the Kassiopeia particle tracking package are discussed, demonstrating conditional feedback between Locust and Kassiopeia during software execution. An application of the simulation to the Project 8 experiment is described. Locust is publicly available at https://github.com/project8/locust_mc.
Scholer, Oliver; de Vries, Jordy; Gráf, Lukáš (Journal of High Energy Physics)
Abstract
We present νDoBe, a Python tool for the computation of neutrinoless double beta decay (0νββ) rates in terms of lepton-number-violating operators in the Standard Model Effective Field Theory (SMEFT). The tool can be used for automated calculations of 0νββ rates, electron spectra and angular correlations for all isotopes of experimental interest, for lepton-number-violating operators up to and including dimension 9. The tool takes care of renormalization-group running to lower energies and provides the matching to the low-energy effective field theory and, at lower scales, to a chiral effective field theory description of 0νββ rates. The user can specify different sets of nuclear matrix elements from various many-body methods and hadronic low-energy constants. The tool can be used to quickly generate analytical and numerical expressions for 0νββ rates and to generate a large variety of plots. In this work, we provide examples of possible use along with a detailed code documentation. The code can be accessed through:
With the ever-expanding toolkit of molecular viewers, the ability to visualize macromolecular structures has never been more accessible. Yet, the idiosyncratic technical intricacies across tools and the integration complexities associated with handling structure annotation data present significant barriers to seamless interoperability and steep learning curves for many users. The necessity for reproducible data visualizations is at the forefront of the current challenges. Recently, we introduced MolViewSpec (homepage: https://molstar.org/mol-view-spec/, GitHub project: https://github.com/molstar/mol-view-spec), a specification approach that defines molecular visualizations, decoupling them from the varying implementation details of different molecular viewers. Through the protocols presented herein, we demonstrate how to use MolViewSpec and its 3D view-building Python library for creating sophisticated, customized 3D views covering all standard molecular visualizations. MolViewSpec supports representations like cartoon and ball-and-stick with coloring, labeling, and applying complex transformations such as superposition to any macromolecular structure file in mmCIF, BinaryCIF, and PDB formats. These examples showcase progress towards reusability and interoperability of molecular 3D visualization in an era when handling molecular structures at scale is a timely and pressing matter in structural bioinformatics as well as research and education across the life sciences.
Kale, Amruta, Sun, Ziheng, and Ma, Xiaogang. Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases. Earth Science Informatics 16.3 Web. doi:10.1007/s12145-023-01045-0.
Kale, Amruta, Sun, Ziheng, & Ma, Xiaogang. Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases. Earth Science Informatics, 16 (3). https://doi.org/10.1007/s12145-023-01045-0
Kale, Amruta, Sun, Ziheng, and Ma, Xiaogang. "Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases". Earth Science Informatics 16 (3). Springer Science + Business Media. https://doi.org/10.1007/s12145-023-01045-0. https://par.nsf.gov/biblio/10430573.
@article{osti_10430573,
title = {Utility of the Python package Geoweaver_cwl for improving workflow reusability: an illustration with multidisciplinary use cases},
url = {https://par.nsf.gov/biblio/10430573},
DOI = {10.1007/s12145-023-01045-0},
journal = {Earth Science Informatics},
volume = {16},
number = {3},
publisher = {Springer Science + Business Media},
author = {Kale, Amruta and Sun, Ziheng and Ma, Xiaogang},
}