PmagPy Online: Jupyter Notebooks, the PmagPy Software Package and the Magnetics Information Consortium (MagIC) Database
Lisa Tauxe$^1$, Rupert Minnett$^2$, Nick Jarboe$^1$, Catherine Constable$^1$, Anthony Koppers$^2$, Lori Jonestrask$^1$, Nick Swanson-Hysell$^3$
$^1$Scripps Institution of Oceanography, United States of America; $^2$Oregon State University; $^3$University of California, Berkeley; ltauxe@ucsd.edu
The Magnetics Information Consortium (MagIC), hosted at http://earthref.org/MagIC, is a database that serves as a Findable, Accessible, Interoperable, Reusable (FAIR) archive for paleomagnetic and rock magnetic data. It has a flexible, comprehensive data model that can accommodate most kinds of paleomagnetic data. The PmagPy software package is a cross-platform, open-source set of tools written in Python for the analysis of paleomagnetic data. It serves as one interface to MagIC and accommodates various levels of user expertise. It is available through github.com/PmagPy. Because PmagPy requires installation of Python, several non-standard Python modules, and the PmagPy software package itself, many practitioners face a barrier when they begin to use the software. To make the software and MagIC more accessible to the broad spectrum of scientists interested in paleomagnetism and rock magnetism, we have prepared a set of Jupyter notebooks, hosted on jupyterhub.earthref.org, which serve three purposes: 1) a complete course in Python for Earth Scientists, 2) a set of notebooks that introduce PmagPy (pulling the software package from the GitHub repository) and illustrate how it can be used to create data products and figures for typical papers, and 3) notebooks that show how to prepare data from the laboratory for upload into the MagIC database. The last of these will satisfy NSF expectations for data archiving and, for example, the AGU publication data archiving requirements.
Getting started
To use the PmagPy notebooks online, go to https://jupyterhub.earthref.org/. Create an Earthref account using your ORCID and log on. [This allows you to keep files in a private work space.] Open the PmagPy Online - Setup notebook and execute the two cells. Then click on File => Open and click on the PmagPy_Online folder. Open the PmagPy_online notebook and work through the examples. There are other notebooks that are useful for the working paleomagnetist.
Alternatively, you can install Python and the PmagPy software package on your computer (see https://earthref.org/PmagPy/cookbook for instructions). Follow the instructions for "Full PmagPy install and update" through section 1.4 (Quickstart with PmagPy notebooks). This notebook is in the collection of PmagPy notebooks.
Overview of MagIC
The Magnetics Information Consortium (MagIC), hosted at http://earthref.org/MagIC, is a database that serves as a Findable, Accessible, Interoperable, Reusable (FAIR) archive for paleomagnetic and rock magnetic data. Its data model is fully described at https://www2.earthref.org/MagIC/data-models/3.0. Each contribution is associated with a publication via the DOI. There are nine data tables:
contribution: metadata of the associated publication.
locations: metadata for locations, which are groups of sites (e.g., stratigraphic section, region, etc.).
sites: metadata and derived data at the site level (units with a common expectation).
samples: metadata and derived data at the sample level.
specimens: metadata and derived data at the specimen level.
measurements: the laboratory measurement data.
criteria: criteria by which data are deemed acceptable.
ages: ages and metadata for sites/samples/specimens.
images: associated images and plots.
Overview of PmagPy
The functionality of PmagPy is demonstrated within notebooks in the PmagPy repository:
PmagPy_online.ipynb: serves as an introduction to PmagPy and MagIC (this conference). It highlights the link between PmagPy and the Findable, Accessible, Interoperable, Reusable (FAIR) database maintained by the Magnetics Information Consortium (MagIC) at https://earthref.org/MagIC.
Other notebooks of interest are:
PmagPy_calculations.ipynb: demonstrates many of the PmagPy calculation functions, such as those that rotate directions, return statistical parameters, and simulate data from specified distributions.
PmagPy_plots_analysis.ipynb: demonstrates PmagPy functions that can be used to visualize data, as well as those that conduct statistical tests with associated visualizations.
PmagPy_MagIC.ipynb: demonstrates how PmagPy can be used to read and write data to and from the MagIC database format, including conversion from many individual lab measurement file formats.
Please see also our YouTube channel, with more presentations from the 2020 MagIC workshop: https://www.youtube.com/playlist?list=PLirL2unikKCgUkHQ3m8nT29tMCJNBj4kj
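As an illustration of the kind of statistical parameters mentioned for the calculation notebooks, the sketch below computes a Fisher mean direction from declination/inclination pairs using only the Python standard library. It is not PmagPy's own code or API: the function name `fisher_mean`, its return keys, and the sample directions are assumptions made for this example.

```python
import math


def fisher_mean(directions, confidence=0.95):
    """Fisher mean of (declination, inclination) pairs given in degrees.

    Hypothetical helper written for illustration; it is not a PmagPy function.
    """
    n = len(directions)
    if n < 2:
        raise ValueError("need at least two directions")

    # Sum the unit vectors (x = north, y = east, z = down).
    x = y = z = 0.0
    for dec, inc in directions:
        d, i = math.radians(dec), math.radians(inc)
        x += math.cos(i) * math.cos(d)
        y += math.cos(i) * math.sin(d)
        z += math.sin(i)

    r = math.sqrt(x * x + y * y + z * z)  # resultant vector length
    if r == 0.0:
        raise ValueError("directions cancel exactly; mean is undefined")

    mean_dec = math.degrees(math.atan2(y, x)) % 360.0
    mean_inc = math.degrees(math.asin(max(-1.0, min(1.0, z / r))))

    spread = max(n - r, 0.0)
    if spread == 0.0:
        # Perfectly clustered data: infinite precision, zero-width cone.
        k, alpha = math.inf, 0.0
    else:
        k = (n - 1) / spread  # estimate of the precision parameter
        factor = (1.0 / (1.0 - confidence)) ** (1.0 / (n - 1)) - 1.0
        cos_alpha = 1.0 - (spread / r) * factor
        alpha = math.degrees(math.acos(max(-1.0, min(1.0, cos_alpha))))

    return {"dec": mean_dec, "inc": mean_inc, "n": n, "r": r, "k": k, "alpha": alpha}


# Made-up directions, for demonstration only.
sample_directions = [(10.0, 45.0), (350.0, 45.0), (5.0, 50.0), (355.0, 40.0)]
print(fisher_mean(sample_directions))
```

The returned `alpha` is the half-angle of the confidence cone about the mean direction (the familiar alpha-95 when `confidence=0.95`).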
Vivarium: an interface and engine for integrative multiscale modeling in computational biology
Abstract
Motivation: This article introduces Vivarium, software born of the idea that it should be as easy as possible for computational biologists to define any imaginable mechanistic model, combine it with existing models and execute them together as an integrated multiscale model. Integrative multiscale modeling confronts the complexity of biology by combining heterogeneous datasets and diverse modeling strategies into unified representations. These integrated models are then run to simulate how the hypothesized mechanisms operate as a whole. But building such models has been a labor-intensive process that requires many contributors, and they are still primarily developed on a case-by-case basis with each project starting anew. New software tools that streamline the integrative modeling effort and facilitate collaboration are therefore essential for future computational biologists.
Results: Vivarium is a software tool for building integrative multiscale models. It provides an interface that makes individual models into modules that can be wired together in large composite models, parallelized across multiple CPUs and run with Vivarium’s discrete-event simulation engine. Vivarium’s utility is demonstrated by building composite models that combine several modeling frameworks: agent-based models, ordinary differential equations, stochastic reaction systems, constraint-based models, solid-body physics and spatial diffusion. This demonstrates just the beginning of what is possible: Vivarium will be able to support future efforts that integrate many more types of models and at many more biological scales.
Availability and implementation: The specific models, simulation pipelines and notebooks developed for this article are all available at the vivarium-notebooks repository: https://github.com/vivarium-collective/vivarium-notebooks. Vivarium-core is available at https://github.com/vivarium-collective/vivarium-core, and has been released on the Python Package Index. The Vivarium Collective (https://vivarium-collective.github.io) is a repository of freely available Vivarium processes and composites, including the processes used in Section 3. Supplementary Materials provide an extensive methodology section, with several code listings that demonstrate the basic interfaces.
Supplementary information: Supplementary data are available at Bioinformatics online.
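The idea of wiring independent modules to shared state and advancing them with a discrete-event engine can be sketched in plain Python. This is a toy illustration only and does not use Vivarium's actual interface: the classes `Uptake` and `Growth`, the function `run_composite`, and the state keys are all hypothetical names chosen for this example.

```python
import heapq


class Uptake:
    """Toy process: moves nutrient from the environment into the cell."""
    timestep = 1.0

    def update(self, state, dt):
        amount = 1.0 * dt  # assumed constant uptake rate of 1 unit per time
        return {"external": -amount, "internal": amount}


class Growth:
    """Toy process: converts half of the internal nutrient into mass."""
    timestep = 2.0

    def update(self, state, dt):
        amount = 0.5 * state["internal"]
        return {"internal": -amount, "mass": amount}


def run_composite(processes, state, total_time):
    """Advance all processes on their own timesteps via an event queue.

    Each queue entry is (time at which the process's interval ends, index).
    The earliest event is always handled first, so processes with different
    timesteps interleave correctly while sharing one state dictionary.
    """
    queue = [(p.timestep, idx) for idx, p in enumerate(processes)]
    heapq.heapify(queue)
    while queue:
        time, idx = heapq.heappop(queue)
        if time > total_time:
            break  # every remaining event is at least this late
        process = processes[idx]
        delta = process.update(state, process.timestep)
        for key, change in delta.items():
            state[key] += change  # processes only communicate through state
        heapq.heappush(queue, (time + process.timestep, idx))
    return state


final_state = run_composite(
    [Uptake(), Growth()],
    {"external": 10.0, "internal": 0.0, "mass": 1.0},
    total_time=4.0,
)
print(final_state)
```

Because each process returns only a change to the shared state, either one could be swapped for a different model of the same variables without touching the other, which is the modularity the abstract describes.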
- Award ID(s): 1903477
- PAR ID: 10337881
- Editor(s): Valencia, Alfonso
- Date Published:
- Journal Name: Bioinformatics
- Volume: 38
- Issue: 7
- ISSN: 1367-4803
- Page Range / eLocation ID: 1972 to 1979
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
EnGRaiN: a supervised ensemble learning method for recovery of large-scale gene regulatory networks
Martelli, Pier Luigi (Ed.)
Abstract
Motivation: Reconstruction of genome-scale networks from gene expression data is an actively studied problem. A wide range of methods that differ between the types of interactions they uncover with varying trade-offs between sensitivity and specificity have been proposed. To leverage benefits of multiple such methods, ensemble network methods that combine predictions from resulting networks have been developed, promising results better than or as good as the individual networks. Perhaps owing to the difficulty in obtaining accurate training examples, these ensemble methods hitherto are unsupervised.
Results: In this article, we introduce EnGRaiN, the first supervised ensemble learning method to construct gene networks. The supervision for training is provided by small training datasets of true edge connections (positives) and edges known to be absent (negatives) among gene pairs. We demonstrate the effectiveness of EnGRaiN using simulated datasets as well as a curated collection of Arabidopsis thaliana datasets we created from microarray datasets available from public repositories. EnGRaiN shows better results not only in terms of receiver operating characteristic and PR characteristics for both real and simulated datasets compared with unsupervised methods for ensemble network construction, but also generates networks that can be mined for elucidating complex biological interactions.
Availability and implementation: EnGRaiN software and the datasets used in the study are publicly available at the GitHub repository: https://github.com/AluruLab/EnGRaiN.
Supplementary information: Supplementary data are available at Bioinformatics online.
Abstract
Motivation: Two-dimensional [15N-1H] separated local field solid-state nuclear magnetic resonance (NMR) experiments of membrane proteins aligned in lipid bilayers provide tilt and rotation angles for α-helical segments using Polar Index Slant Angle (PISA)-wheel models. No integrated software has been made available for data analysis and visualization.
Results: We have developed the PISA-SPARKY plugin to seamlessly integrate PISA-wheel modeling into the NMRFAM-SPARKY platform. The plugin performs basic simulations, exhaustive fitting against experimental spectra, error analysis and dipolar and chemical shift wave plotting. The plugin also supports PyMOL integration and handling of parameters that describe variable alignment and dynamic scaling encountered with magnetically aligned media, ensuring optimal fitting and generation of restraints for structure calculation.
Availability and implementation: PISA-SPARKY is freely available in the latest version of NMRFAM-SPARKY from the National Magnetic Resonance Facility at Madison (http://pine.nmrfam.wisc.edu/download_packages.html), the NMRbox Project (https://nmrbox.org) and to subscribers of the SBGrid (https://sbgrid.org). The pisa.py script is available and documented on GitHub (https://github.com/weberdak/pisa.py) along with a tutorial video and sample data.
Supplementary information: Supplementary data are available at Bioinformatics online.
Abstract
Summary: New advances in single-cell multi-omics experiments have allowed biologists to examine how various biological factors regulate processes in concert on the cellular level. However, measuring multiple cellular features for a single cell can be quite resource-intensive or impossible with the current technology. By using optimal transport (OT) to align cells and features across disparate datasets produced by separate assays, Single Cell alignment using Optimal Transport + (SCOT+), our unsupervised single-cell alignment software suite, allows biologists to align their data without the need for any correspondence. SCOT+ implements a generic optimal transport solution that can be reduced to multiple different previously studied OT optimization procedures including SCOT, SCOTv2, SCOOTR, and AGW for single cell, each of which provides state-of-the-art single-cell alignment performance. Outside of giving a unified framework to interact with prior formulations, the generality of SCOT+ optimization naturally gives rise to a new OT loss, Unbalanced Augmented Gromov-Wasserstein (UAGW), and a corresponding optimizer. With our user-friendly website and tutorials, this new package will help improve biological analyses by allowing for more accurate downstream analyses on multi-omics single-cell measurements.
Implementation and Availability: Our algorithm is implemented in PyTorch and available on PyPI and GitHub (https://github.com/scotplus/scotplus). Additionally, we have many tutorials available in a separate GitHub repository (https://github.com/scotplus/book_source) and on our website (https://scotplus.github.io/).
Simulation optimization involves optimizing some objective function that can only be estimated via stochastic simulation. Many important problems can be profitably viewed within this framework. Whereas many solvers (implementations of simulation-optimization algorithms) exist or are in development, comparisons among solvers are not standardized and are often limited in scope. Such comparisons help advance solver development, clarify the relative performance of solvers, and identify classes of problems that defy efficient solution, among many other uses. We develop performance measures and plots, and estimators thereof, to evaluate and compare solvers and diagnose their strengths and weaknesses on a testbed of simulation-optimization problems. We explain the need for two-level simulation in this context and provide supporting convergence theory. We also describe how to use bootstrapping to obtain error estimates for the estimators.
History: Accepted by Bruno Tuffin, area editor for simulation.
Funding: This work was supported by the National Science Foundation [Grants CMMI-2035086, CMMI-2206972, and TRIPODS+X DMS-1839346].
Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplementary Information [ https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2022.1261 ] or is available from the IJOC GitHub software repository ( https://github.com/INFORMSJoC ) at [ http://dx.doi.org/10.5281/zenodo.7329235 ].
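The bootstrapping step mentioned above can be illustrated with a generic, minimal sketch: resample a set of noisy objective estimates with replacement and use the spread of the recomputed statistic as its error estimate. This is the textbook nonparametric bootstrap, not the paper's two-level procedure; the function name `bootstrap_standard_error` and the simulated data are assumptions for this example.

```python
import math
import random
import statistics


def bootstrap_standard_error(values, estimator=statistics.mean,
                             n_resamples=2000, seed=0):
    """Nonparametric bootstrap estimate of an estimator's standard error."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(values)
    replicates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement and recompute the statistic.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        replicates.append(estimator(resample))
    return statistics.stdev(replicates)


# Made-up data: 50 noisy estimates of a solver's objective value.
data_rng = random.Random(1)
objective_estimates = [data_rng.gauss(10.0, 2.0) for _ in range(50)]

standard_error = bootstrap_standard_error(objective_estimates)
print(standard_error)
```

For the sample mean the result should land close to the usual `stdev / sqrt(n)` formula; the value of the bootstrap is that the same function works for estimators with no closed-form error, by passing a different `estimator`.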

