The C-MĀIKI gateway is a science gateway that uses Tapis, an API for managing computational workloads, to support modern, interoperable, and scalable microbiome data analysis. This project migrates the existing C-MĀIKI gateway pipeline from Tapis v2 to Tapis v3 so that it can take advantage of the new Tapis v3 features. The migration requires three major steps: 1) containerizing each existing microbiome workflow; 2) creating a new app definition for each workflow; and 3) enabling job submission to a SLURM scheduler from inside a Singularity container to support the Nextflow workflow manager. This work presents the experience and challenges of upgrading the pipeline.
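As a rough illustration of the second step, the sketch below builds one app definition per containerized workflow as a plain Python dictionary and serializes it to JSON. The field names follow the general shape of a Tapis v3 app definition as we understand it and should be checked against the Tapis documentation. The app id, container image, execution system name, and resource values are hypothetical placeholders, not the gateway's actual configuration.

```python
import json


def build_app_definition(workflow: str, image: str, exec_system: str) -> dict:
    """Return a Tapis v3-style app definition for one containerized workflow.

    All concrete values here are illustrative assumptions.
    """
    return {
        "id": f"cmaiki-{workflow}",            # hypothetical naming scheme
        "version": "0.1.0",
        "description": f"Containerized {workflow} microbiome workflow",
        "jobType": "BATCH",                    # run through a batch scheduler such as SLURM
        "runtime": "SINGULARITY",
        "runtimeOptions": ["SINGULARITY_RUN"],
        "containerImage": image,
        "jobAttributes": {
            "execSystemId": exec_system,       # hypothetical execution system
            "nodeCount": 1,
            "coresPerNode": 4,
            "memoryMB": 16000,
            "maxMinutes": 240,
        },
    }


app = build_app_definition(
    workflow="example-16s",
    image="docker://example.org/cmaiki/example-16s:latest",
    exec_system="example-slurm-cluster",
)
print(json.dumps(app, indent=2))
```

Generating the definitions from a small function keeps the per-workflow differences (name and image) in one place, which can help when several workflows are migrated together.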
The C-MĀIKI Gateway: A Modern Science Platform for Analyzing Microbiome Data
In collaboration with the Center for Microbiome Analysis through Island Knowledge and Investigations (C-MĀIKI), the Hawaii EPSCoR Ike Wai project and the Hawaii Data Science Institute, a new science gateway, the C-MĀIKI gateway, was developed to support modern, interoperable and scalable microbiome data analysis. This gateway provides a web-based interface for accessing high-performance computing resources and storage to enable and support reproducible microbiome data analysis. The C-MĀIKI gateway is accelerating the analysis of microbiome data for Hawaii through ease of use and centralized infrastructure.
- PAR ID: 10343597
- Date Published:
- Journal Name: Practice and Experience in Advanced Research Computing
- Volume: 2022
- Page Range / eLocation ID: 1 to 7
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
The Change Hawaii (Change(HI)) project is fundamentally addressing the existential threat of climate change in Hawaii by integrating data and climate science to foster statewide resilience, enhance decision science, and support workforce development in critical fields. A cornerstone of this initiative is the Hawaii Climate Data Portal (HCDP), which operates as a vital science gateway and data hub. The HCDP's primary objective is to build capacity through advanced data science and artificial intelligence (AI), serving as a robust resource for monitoring, visualizing, and communicating environmental change [longman_hawaii_2024]. Its critical role is highlighted by its extensive provision of climate data and its Application Programming Interface (API), which is instrumental in the development and functionality of diverse decision support tools tailored for various stakeholders across the state. This paper details the HCDP's integration with the Tapis API platform and its successful application in developing actionable climate science outcomes for Hawaii.
-
Community growth is one of the cornerstones contributing to the sustainability of a science gateway. Achieving community growth requires careful planning and a multifaceted approach. The Science Gateways Community Institute (SGCI) and the Center of Excellence for Science Gateways (SGX3) offer services such as UX advice, sustainability training via the Focus Week, and an annual conference to support the science gateway community of developers and users. This panel will discuss four successful use cases (QUBES, MyGeoHub, CHEESE, and the Hawaii Behavioral Health Dashboard) where the teams utilized various SGCI/SGX3 services, which significantly contributed to their community growth. The discussion will highlight specific strategies and outcomes from these use cases, providing valuable insights into the effective practices that drive community engagement and sustainability in science gateways. Additionally, panelists will share lessons learned and good practices that can be applied to other science gateways seeking to enhance their community presence and impact.
-
Granting agencies invest millions of dollars in the generation and analysis of data, making these products extremely valuable. However, without sufficient annotation of the methods used to collect and analyze the data, the ability to reproduce and reuse those products suffers. This lack of assurance of the quality and credibility of the data at the different stages in the research process essentially wastes much of the investment of time and funding and fails to drive research forward as far as would be possible if everything were effectively annotated and disseminated to the wider research community. To address this issue for the Hawai‘i Established Program to Stimulate Competitive Research (EPSCoR) project, a water science gateway called the ‘Ike Wai Gateway was developed at the University of Hawai‘i (UH). In Hawaiian, ‘Ike means knowledge and Wai means water. The gateway supports research in hydrology and water management by providing tools to address questions of water sustainability in Hawai‘i. The gateway provides a framework for data acquisition, analysis, model integration, and display of data products. The gateway is intended to complement and integrate with the capabilities of the Consortium of Universities for the Advancement of Hydrologic Science's (CUAHSI) Hydroshare by providing sound data and metadata management capabilities for multi-domain field observations, analytical lab actions, and modeling outputs. Functionality provided by the gateway is supported by a subset of CUAHSI's Observations Data Model (ODM) delivered as centralized web-based user interfaces and APIs supporting multi-domain data management, computation, analysis, and visualization tools to support reproducible science, modeling, data discovery, and decision support for the Hawai‘i EPSCoR ‘Ike Wai research team and wider Hawai‘i hydrology community.
By leveraging the Tapis platform, UH has constructed a gateway that ties data and advanced computing resources together to support diverse research domains including microbiology, geochemistry, geophysics, economics, and humanities, coupled with computational and modeling workflows delivered in a user-friendly web interface that includes workflows for effectively annotating the project data and products. Disseminating results for the ‘Ike Wai project through the ‘Ike Wai data gateway and Hydroshare makes the research products accessible and reusable.
-
To improve accessibility and community knowledge of applications in the Lidar Radar Open Software Environment (LROSE), a team from the National Science Foundation (NSF) National Center for Atmospheric Research, Colorado State University, and NSF Unidata has developed a lidar and radar meteorology science gateway deployed on the NSF Jetstream2 cloud. Utilizing the “Zero to JupyterHub with Kubernetes” workflow, the science gateway integrates LROSE with other lidar and radar meteorology software packages. This integration allows users to execute applications directly from the JupyterLab terminal, streamlining the creation of datasets for further analysis and visualization within Jupyter notebooks. By combining traditional command-line operations with modern Python-based tools for data analysis and visualization, this gateway provides a robust end-to-end solution that caters to both educational and research needs. The gateway has already facilitated LROSE instructional workshops and classroom exercises. Our work demonstrates the significant potential of merging established scientific computing techniques with advanced Python environments, opening new avenues for computational science education and research. The LROSE team has acquired successive allocations on the NSF Jetstream2 cloud at Indiana University through ACCESS. To develop the LROSE Science Gateway, we employed the “Zero to JupyterHub with Kubernetes” workflow ported to the NSF Jetstream2 cloud, enabling rapid and scalable deployment to accommodate a variable number of users. Authentication is managed through either GitHub OAuth or temporary credentials, depending on the situation. Since LROSE is a collection of C/C++ applications, we configured Docker containers based on the Jupyter Docker Stack to integrate the LROSE software, available via the JupyterLab terminal. 
These containers also include Conda package manager environments equipped with Python packages like Py-ART, CSU RadarTools, and MetPy for further data analysis. A shared drive accessible to all participants contains instructional datasets for lidar and radar data analysis. Tutorials take the form of Jupyter notebooks for use by individuals, in classroom exercises, or at instructional workshops. Some tutorials are complete with pre-loaded examples to quickly visualize workflows and results. Other tutorials guide students through running the applications independently. All tutorials are hosted on the LROSE Science Gateway GitHub repository, which is open to contributions from colleagues and community members. Future plans include an "intermediate" level workshop on SAMURAI, one of the multi-Doppler wind applications of the LROSE suite. Additionally, work is currently underway to run GUI applications in the same browser-based JupyterLab environment. GUI applications for radar and lidar data visualization utilize the Qt framework and present unique technical challenges. The techniques to accomplish GUI access have immediate applications for other GUI programs, such as NSF Unidata's IDV and their version of the AWIPS CAVE data visualization tools. Lastly, as demand for the resources found on the gateway increases, it becomes increasingly important to efficiently manage the Jetstream2 resources allocated by the ACCESS program. LROSE, NSF Unidata, San Diego Supercomputer Center (SDSC), and Indiana University staff are working together to deploy and evaluate Kubernetes cluster auto-scaling. With auto-scaling, resources will no longer sit idle while awaiting new logins and will instead be provisioned on demand.

