Title: ipyrad: Interactive assembly and analysis of RADseq datasets
AbstractSummary
ipyrad is a free and open source tool for assembling and analyzing restriction site-associated DNA sequence datasets using de novo and/or reference-based approaches. It is designed to be massively scalable to hundreds of taxa and thousands of samples, and can be efficiently parallelized on high performance computing clusters. It is available both as a command line interface and as a Python package with an application programming interface, the latter of which can be used interactively to write complex, reproducible scripts and implement a suite of downstream analysis tools.
Availability and implementation
ipyrad is a free and open source program written in Python. Source code is available from the GitHub repository (https://github.com/dereneaton/ipyrad/), and Linux and MacOS installs are distributed through the conda package manager. Complete documentation, including numerous tutorials, and Jupyter notebooks demonstrating example assemblies and applications of downstream analysis tools are available online: https://ipyrad.readthedocs.io/.
Harris, Leonard A.; Nobile, Marco S.; Pino, James C.; Lubbock, Alexander L. R.; Besozzi, Daniela; Mauri, Giancarlo; Cazzaniga, Paolo; Lopez, Carlos F.; Wren, ed., Jonathan(
, Bioinformatics)
AbstractSummary
A major barrier to the practical utilization of large, complex models of biochemical systems is the lack of open-source computational tools to evaluate model behaviors over high-dimensional parameter spaces. This is due to the high computational expense of performing thousands to millions of model simulations required for statistical analysis. To address this need, we have implemented a user-friendly interface between cupSODA, a GPU-powered kinetic simulator, and PySB, a Python-based modeling and simulation framework. For three example models of varying size, we show that for large numbers of simulations PySB/cupSODA achieves order-of-magnitude speedups relative to a CPU-based ordinary differential equation integrator.
Availability and implementation
The PySB/cupSODA interface has been integrated into the PySB modeling framework (version 1.4.0), which can be installed from the Python Package Index (PyPI) using a Python package manager such as pip. cupSODA source code and precompiled binaries (Linux, Mac OS/X, Windows) are available at github.com/aresio/cupSODA (requires an Nvidia GPU; developer.nvidia.com/cuda-gpus). Additional information about PySB is available at pysb.org.
Supplementary information
Supplementary data are available at Bioinformatics online.
Welsh, Ciaran; Xu, Jin; Smith, Lucian; König, Matthias; Choi, Kiri; Sauro, Herbert M.; Martelli, ed., Pier Luigi(
, Bioinformatics)
AbstractMotivation
This article presents libRoadRunner 2.0, an extensible, high-performance, cross-platform, open-source software library for the simulation and analysis of models expressed using the systems biology markup language (SBML).
Results
libRoadRunner is a self-contained library, able to run either as a component inside other tools via its C++, C and Python APIs, or interactively through its Python or Julia interface. libRoadRunner uses a custom just-in-time (JIT) compiler built on the widely used LLVM JIT compiler framework. It compiles SBML-specified models directly into native machine code for a large variety of processors, making it fast enough to simulate extremely large models or repeated runs in reasonable timeframes. libRoadRunner is flexible, supporting the bulk of the SBML specification (except for delay and non-linear algebraic equations) as well as several SBML extensions such as hierarchical composition and probability distributions. It offers multiple deterministic and stochastic integrators, as well as tools for steady-state, sensitivity, stability and structural analyses.
Availability and implementation
libRoadRunner binary distributions for Windows, Mac OS and Linux, Julia and Python bindings, source code and documentation are all available at https://github.com/sys-bio/roadrunner, and Python bindings are also available via pip. The source code can be compiled for the supported systems as well as in principle any system supported by LLVM-13, such as ARM-based computers like the Raspberry Pi. The library is licensed under the Apache License Version 2.0.
Hopf, Thomas A.; Green, Anna G.; Schubert, Benjamin; Mersmann, Sophia; Schärfe, Charlotta P. I.; Ingraham, John B.; Toth-Petroczy, Agnes; Brock, Kelly; Riesselman, Adam J.; Palmedo, Perry; et al(
, Bioinformatics)
AbstractSummary
Coevolutionary sequence analysis has become a commonly used technique for de novo prediction of the structure and function of proteins, RNA, and protein complexes. We present the EVcouplings framework, a fully integrated open-source application and Python package for coevolutionary analysis. The framework enables generation of sequence alignments, calculation and evaluation of evolutionary couplings (ECs), and de novo prediction of structure and mutation effects. The combination of an easy to use, flexible command line interface and an underlying modular Python package makes the full power of coevolutionary analyses available to entry-level and advanced users.
PmagPy Online: Jupyter Notebooks, the PmagPy Software Package and the Magnetics Information Consortium (MagIC) Database
Lisa Tauxe$^1$, Rupert Minnett$^2$, Nick Jarboe$^1$, Catherine Constable$^1$, Anthony Koppers$^2$, Lori Jonestrask$^1$, Nick Swanson-Hysell$^3$
$^1$Scripps Institution of Oceanography, United States of America; $^2$ Oregon State University; $^3$ University of California, Berkely; ltauxe@ucsd.edu
The Magnetics Information Consortium (MagIC), hosted at http://earthref.org/MagIC is a database that serves as a Findable, Accessible, Interoperable, Reusable (FAIR) archive for paleomagnetic and rock magnetic data. It has a flexible, comprehensive data model that can accomodate most kinds of paleomagnetic data. The PmagPy software package is a cross-platform and open-source set of tools written in Python for the analysis of paleomagnetic data that serves as one interface to MagIC, accommodating various levels of user expertise. It is available through github.com/PmagPy. Because PmagPy requires installation of Python, several non-standard Python modules, and the PmagPy software package, there is a speed bump for many practitioners on beginning to use the software. In order to make the software and MagIC more accessible to the broad spectrum of scientists interested in paleo and rock magnetism, we have prepared a set of Jupyter notebooks, hosted on jupyterhub.earthref.org which serve a set of purposes. 1) There is a complete course in Python for Earth Scientists, 2) a set of notebooks that introduce PmagPy (pulling the software package from the github repository) and illustrate how it can be used to create data products and figures for typical papers, and 3) show how to prepare data from the laboratory to upload into the MagIC database. The latter will satisfy expectations from NSF for data archiving and for example the AGU publication data archiving requirements.
Getting started
To use the PmagPy notebooks online, go to website at https://jupyterhub.earthref.org/. Create an Earthref account using your ORCID and log on. [This allows you to keep files in a private work space.]
Open the PmagPy Online - Setup notebook and execute the two cells. Then click on File = > Open and click on the PmagPy_Online folder. Open the PmagPy_online notebook and work through the examples. There are other notebooks that are useful for the working paleomagnetist.
Alternatively, you can install Python and the PmagPy software package on your computer (see https://earthref.org/PmagPy/cookbook for instructions). Follow the instructions for "Full PmagPy install and update" through section 1.4 (Quickstart with PmagPy notebooks). This notebook is in the collection of PmagPy notebooks.
Overview of MagIC
The Magnetics Information Consortium (MagIC), hosted at http://earthref.org/MagIC is a database that serves as a Findable, Accessible, Interoperable, Reusable (FAIR) archive for paleomagnetic and rock magnetic data. Its datamodel is fully described here: https://www2.earthref.org/MagIC/data-models/3.0. Each contribution is associated with a publication via the DOI. There are nine data tables:
contribution: metadata of the associated publication.
locations: metadata for locations, which are groups of sites (e.g., stratigraphic section, region, etc.)
sites: metadata and derived data at the site level (units with a common expectation)
samples: metadata and derived data at the sample level.
specimens: metadata and derived data at the specimen level.
criteria: criteria by which data are deemed acceptable
ages: ages and metadata for sites/samples/specimens
images: associated images and plots.
Overview of PmagPy
The functionality of PmagPy is demonstrated within notebooks in the PmagPy repository:
PmagPy_online.ipynb: serves as an introdution to PmagPy and MagIC (this conference). It highlights the link between PmagPy and the Findable Accessible Interoperable Reusabe (FAIR) database maintained by the Magnetics Information Consortium (MagIC) at https://earthref.org/MagIC.
Other notebooks of interest are:
PmagPy_calculations.ipynb: demonstrates many of the PmagPy calculation functions such as those that rotate directions, return statistical parameters, and simulate data from specified distributions.
PmagPy_plots_analysis.ipynb: demonstrates PmagPy functions that can be used to visual data as well as those that conduct statistical tests that have associated visualizations.
PmagPy_MagIC.ipynb: demonstrates how PmagPy can be used to read and write data to and from the MagIC database format including conversion from many individual lab measurement file formats.
Please see also our YouTube channel with more presentations from the 2020 MagIC workshop here: https://www.youtube.com/playlist?list=PLirL2unikKCgUkHQ3m8nT29tMCJNBj4kj
MicroRNAs (miRNAs) are small RNA molecules (∼22 nucleotide long) involved in post-transcriptional gene regulation. Advances in high-throughput sequencing technologies led to the discovery of isomiRs, which are miRNA sequence variants. While many miRNA-seq analysis tools exist, the diversity of output formats hinders accurate comparisons between tools and precludes data sharing and the development of common downstream analysis methods.
Results
To overcome this situation, we present here a community-based project, miRNA Transcriptomic Open Project (miRTOP) working towards the optimization of miRNA analyses. The aim of miRTOP is to promote the development of downstream isomiR analysis tools that are compatible with existing detection and quantification tools. Based on the existing GFF3 format, we first created a new standard format, mirGFF3, for the output of miRNA/isomiR detection and quantification results from small RNA-seq data. Additionally, we developed a command line Python tool, mirtop, to create and manage the mirGFF3 format. Currently, mirtop can convert into mirGFF3 the outputs of commonly used pipelines, such as seqbuster, isomiR-SEA, sRNAbench, Prost! as well as BAM files. Some tools have also incorporated the mirGFF3 format directly into their code, such as, miRge2.0, IsoMIRmap and OptimiR. Its open architecture enables any tool or pipeline to output or convert results into mirGFF3. Collectively, this isomiR categorization system, along with the accompanying mirGFF3 and mirtop API, provide a comprehensive solution for the standardization of miRNA and isomiR annotation, enabling data sharing, reporting, comparative analyses and benchmarking, while promoting the development of common miRNA methods focusing on downstream steps of miRNA detection, annotation and quantification.
Availability and implementation
https://github.com/miRTop/mirGFF3/ and https://github.com/miRTop/mirtop.
Contact
desvignes@uoneuro.uoregon.edu or lpantano@iscb.org
Supplementary information
Supplementary data are available at Bioinformatics online.
Eaton, Deren A. R., Overcast, Isaac, and Schwartz, ed., Russell. ipyrad: Interactive assembly and analysis of RADseq datasets. Bioinformatics . Web. doi:10.1093/bioinformatics/btz966.
Eaton, Deren A. R., Overcast, Isaac, & Schwartz, ed., Russell. ipyrad: Interactive assembly and analysis of RADseq datasets. Bioinformatics, (). https://doi.org/10.1093/bioinformatics/btz966
@article{osti_10131249,
place = {Country unknown/Code not available},
title = {ipyrad: Interactive assembly and analysis of RADseq datasets},
url = {https://par.nsf.gov/biblio/10131249},
DOI = {10.1093/bioinformatics/btz966},
abstractNote = {Abstract Summaryipyrad is a free and open source tool for assembling and analyzing restriction site-associated DNA sequence datasets using de novo and/or reference-based approaches. It is designed to be massively scalable to hundreds of taxa and thousands of samples, and can be efficiently parallelized on high performance computing clusters. It is available both as a command line interface and as a Python package with an application programming interface, the latter of which can be used interactively to write complex, reproducible scripts and implement a suite of downstream analysis tools. Availability and implementationipyrad is a free and open source program written in Python. Source code is available from the GitHub repository (https://github.com/dereneaton/ipyrad/), and Linux and MacOS installs are distributed through the conda package manager. Complete documentation, including numerous tutorials, and Jupyter notebooks demonstrating example assemblies and applications of downstream analysis tools are available online: https://ipyrad.readthedocs.io/.},
journal = {Bioinformatics},
publisher = {Oxford University Press},
author = {Eaton, Deren A. R. and Overcast, Isaac and Schwartz, ed., Russell},
}
Warning: Leaving National Science Foundation Website
You are now leaving the National Science Foundation website to go to a non-government website.
Website:
NSF takes no responsibility for and exercises no control over the views expressed or the accuracy of
the information contained on this site. Also be aware that NSF's privacy policy does not apply to this site.