-
Abstract Motivation: The Galaxy application is a popular open-source framework for data-intensive sciences, counting thousands of monthly users across more than 100 public servers. To support a growing number of users and a greater variety of use cases, the complexity of a production-grade Galaxy installation has also grown, requiring more administration effort. There is a need for a rapid and reproducible Galaxy deployment method that can be maintained at high-availability with minimal maintenance. Results: We describe the Galaxy Helm chart that codifies all elements of a production-grade Galaxy installation into a single package. Deployable on Kubernetes clusters, the chart encapsulates supporting software services and implements the best-practices model for running Galaxy. It is also the most rapid method available for deploying a scalable, production-grade Galaxy instance on one’s own infrastructure. The chart is highly configurable, allowing systems administrators to swap dependent services if desired. Notable uses of the chart include on-demand, fully-automated deployments on AnVIL, providing training infrastructure for the Bioconductor project, and as the AWS-recommended solution for running Galaxy on the Amazon cloud. Availability and implementation: The source code for Galaxy Helm is available at https://github.com/galaxyproject/galaxy-helm, the corresponding Helm package at https://github.com/CloudVE/helm-charts, and the required Galaxy container image at https://github.com/galaxyproject/galaxy-docker-k8s.
-
Background: Molecular Dynamics (MD) simulation of biomolecules provides important insights into conformational changes and dynamic behavior, revealing critical information about folding and interactions with other molecules. This enables advances in drug discovery and the design of therapeutic interventions. The collection of simulations stored in computers across the world holds immense potential to serve as training data for future Machine Learning models that will transform the prediction of structure, dynamics, drug interactions, and more. A need: Ideally, there should exist an open access repository that enables scientists to submit and store their MD simulations of proteins and protein-drug interactions, and to find, retrieve, analyze, and visualize simulations produced by others. However, despite the ubiquity of MD simulation in structural biology, no such repository exists; as a result, simulations are instead stored in scattered locations without uniform metadata or access protocols. A solution: Here, we introduce MDRepo, a robust infrastructure that supports a relatively simple process for standardized community contribution of simulations, activates common downstream analyses on stored data, and enables search, retrieval, and visualization of contributed data. MDRepo is built on top of the open-source CyVerse research cyberinfrastructure, and is capable of storing petabytes of simulations, while providing high bandwidth upload and download capabilities and laying a foundation for cloud-based access to its stored data.
Free, publicly-accessible full text available July 12, 2025
-
Abstract The Hawai‘i Climate Data Portal (HCDP) is designed to facilitate streamlined access to a wide variety of climate data and information for the State of Hawai‘i. Prior to the development of the HCDP, gridded climate products and point datasets were fragmented, outdated, not easily accessible, and not available in near–real time. To address these limitations, HCDP researchers developed the cyberinfrastructure necessary to 1) operationalize data acquisition and product production in a near-real-time environment and 2) make data and products easily accessible to a wide range of users. The HCDP hosts several high-resolution (250 m) gridded products including monthly rainfall and daily temperature (maximum, minimum, and mean), station data, and gridded future projections of rainfall and temperature. HCDP users can visualize both gridded and point data, create and download custom maps, and query station and gridded data for export with relative ease. The “virtual station” feature allows users to create a climate time series at any grid point. The primary objective of the HCDP is to promote sharing and access to data and information to streamline research activities, improve awareness, and promote the development of tools and resources that can help to build adaptive capacities. The HCDP products have the potential to serve a wide range of users including researchers, resource managers, city planners, engineers, teachers, students, civil society organizations, and the broader community.
-
dadi-cli: Automated and distributed population genetic model inference from allele frequency spectra
Abstract Summary: dadi is a popular software package for inferring models of demographic history and natural selection from population genomic data. However, using dadi requires Python scripting and manual parallelization of optimization jobs. We developed dadi-cli to simplify dadi usage and also enable straightforward distributed computing. Availability and Implementation: dadi-cli is implemented in Python and released under the Apache License 2.0. The source code is available at https://github.com/xin-huang/dadi-cli. dadi-cli can be installed via PyPI and conda, and is also available through Cacao on Jetstream2 at https://cacao.jetstream-cloud.org/.
-
Abstract Neuroscience is advancing standardization and tool development to support rigor and transparency. Consequently, data pipeline complexity has increased, hindering FAIR (findable, accessible, interoperable and reusable) access. brainlife.io was developed to democratize neuroimaging research. The platform provides data standardization, management, visualization and processing and automatically tracks the provenance history of thousands of data objects. Here, brainlife.io is described and evaluated for validity, reliability, reproducibility, replicability and scientific utility using four data modalities and 3,200 participants.
-
Over the past decade, Cloud and High-Performance Computing (HPC) have moved significantly toward convergence. We explore the evolution, motivations, and practicalities of establishing on-premise research cloud infrastructure and its complementary relationship with HPC and commercial resources, under the belief that research clouds serve a unique role within research and education as a convergence accelerator. This role is highlighted through exploring the design tradeoffs in architecting research clouds versus HPC resources, focusing on the balance between utility, availability, and hardware utilization. The discussion provides insights from experiences with the National Science Foundation-supported Jetstream and Jetstream2 systems, showcasing convergence technologies and challenges. A variety of real-world use cases show the interplay between these computing paradigms, exploring their use in research and education for interactive and iterative development, as an on-ramp to large-scale resources, as a powerful tool for education and workforce development, and for domain-specific science gateways.
-
Lu, Baochuan; Smallwood, Pam (Ed.)
As research and education advance, so does their need for advanced computational resources. While some universities are fortunate to be able to provide these resources in abundance, many do not have free access to such cyberinfrastructure for their research, much less for their instruction. Through Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS), advanced computing resources such as Jetstream2 are shared with educators for free. This sharing of resources provides access to educators who normally would not have access to such platforms.