Title: Composable Infrastructures for an Academic Research Environment: Lessons Learned
Composable infrastructure holds the promise of accelerating the pace of academic research and discovery by enabling researchers to tailor the resources of a machine (e.g., GPUs, storage, NICs), on demand, to address application needs. We were first introduced to composable infrastructure in 2018, and at the same time, there was growing demand among our College of Engineering faculty for GPU systems for data science, artificial intelligence / machine learning / deep learning, and visualization. Many purchased their own individual desktop or deskside systems, a few pursued more costly cloud and HPC solutions, and others looked to the College or campus computer center for GPU resources which, at the time, were scarce. After surveying the diverse needs of our faculty and studying product offerings by a few nascent startups in the composable infrastructure sector, we applied for and received a grant from the National Science Foundation in November 2019 to purchase a mid-scale system, configured to our specifications, for use by faculty and students for research and research training. This paper describes our composable infrastructure solution and implementation for our academic community. Given how modern workflows are progressively moving to containers and cloud frameworks (using Kubernetes) and to programming notebooks (primarily Jupyter), both for ease of use and for ensuring reproducible experiments, we initially adapted these tools for our system. We have since made it simpler to use our system, and now provide our users with a public-facing JupyterHub server. We also added an expansion chassis to our system to enable composable co-location, which is a shared central architecture in which our researchers can insert and integrate specialized resources (GPUs, accelerators, networking cards, etc.) needed for their research.
In February 2020, installation of our system was completed, the system was made operational, and we began providing access to faculty in the College of Engineering. Now, two years later, it is used by over 40 faculty and students, plus some external collaborators, for research and research training. Their use cases and experiences are briefly described in this paper. Composable infrastructure has proven to be a useful computational system for workload variability, uneven applications, and modern workflows in academic environments.
Award ID(s):
1828265
PAR ID:
10356861
Author(s) / Creator(s):
Date Published:
Journal Name:
36th IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW)
Page Range / eLocation ID:
1209-1214
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In today's Big Data era, data scientists require new computational instruments to quickly analyze large-scale datasets using complex codes and quicken the rate of scientific progress. While Federally-funded computer resources, from supercomputers to clouds, are beneficial, they are often limiting, particularly for deep learning and visualization, as they have few Graphics Processing Units (GPUs). GPUs are at the center of modern high-performance computing and artificial intelligence; they efficiently perform mathematical operations that can be massively parallelized, speeding up codes used for deep learning, visualization, and image processing far more than general-purpose microprocessors, or Central Processing Units (CPUs), can. The University of Illinois at Chicago is acquiring a much-in-demand GPU-based instrument, COMPaaS DLV (COMposable Platform as a Service Instrument for Deep Learning & Visualization), based on composable infrastructure, an advanced architecture that disaggregates the underlying compute, storage, and network resources for scaling needs but operates as a single cohesive infrastructure for management and workload purposes. We are experimenting with a small system and learning a great deal about composability, and we believe COMPaaS DLV users will benefit from the varied workflows that composable infrastructure allows.
  2. In today’s Big Data era, data scientists require modern workflows to quickly analyze large-scale datasets using complex codes and maintain the rate of scientific progress. These scientists often rely on available campus resources or off-the-shelf computational systems for their applications. Unified infrastructure or over-provisioned servers can quickly become bottlenecks for specific tasks, wasting time and resources. Composable infrastructure helps solve these problems by providing users with new ways to increase resource utilization. Composable infrastructure disaggregates a computer’s components (CPU, GPU and other accelerators, storage, and networking) into fluid pools of resources, but typically relies on infrastructure engineers to architect individual machines. Infrastructure is managed with specialized command-line utilities, user interfaces, or specification files. These management models are cumbersome and difficult to incorporate into data-science workflows. We developed a high-level software API, Composastructure, which, when integrated into modern workflows, can be used by infrastructure engineers as well as data scientists to reorganize composable resources on demand. Composastructure enables infrastructures to be programmable, secure, persistent, and reproducible. Our API composes machines, frees resources, supports multi-rack operations, and includes a Python module for Jupyter Notebooks.
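The abstract above describes a high-level API that composes logical machines from pools of disaggregated resources and frees them back on demand. Composastructure's real interface is not shown in the abstract, so the sketch below is only a hedged illustration of that compose/free workflow: every class, method, and resource name here is invented for the example, not Composastructure's actual API.

```python
# Minimal sketch of a compose/free workflow over disaggregated resources.
# All names are hypothetical; this only models the pattern the abstract
# describes, not Composastructure itself.

class ResourcePool:
    """Fluid pool of disaggregated components (e.g., GPUs, NICs)."""
    def __init__(self, **counts):
        self.free = dict(counts)

    def allocate(self, **request):
        # Check the whole request first so a failure leaves the pool intact.
        for kind, n in request.items():
            if self.free.get(kind, 0) < n:
                raise RuntimeError(f"not enough {kind} available")
        for kind, n in request.items():
            self.free[kind] -= n
        return dict(request)

    def release(self, held):
        for kind, n in held.items():
            self.free[kind] = self.free.get(kind, 0) + n


class ComposedMachine:
    """A logical machine assembled from pooled resources."""
    def __init__(self, pool, name, **request):
        self.pool, self.name = pool, name
        self.resources = pool.allocate(**request)

    def free(self):
        self.pool.release(self.resources)
        self.resources = {}


pool = ResourcePool(gpu=8, nic=4)
node = ComposedMachine(pool, "dl-train", gpu=4, nic=1)
print(pool.free)   # → {'gpu': 4, 'nic': 3}
node.free()
print(pool.free)   # → {'gpu': 8, 'nic': 4}
```

In a notebook-integrated API of this kind, the same compose/free calls could run from a Jupyter cell, which is presumably what makes such an interface usable by data scientists rather than only infrastructure engineers.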
  3. The goal of a robust cyberinfrastructure (CI) ecosystem is to catalyse discovery and innovation. Tapis does this by offering a sustainable production-quality set of API services to support modern science and engineering research, which increasingly span geographically distributed data centers, instruments, experimental facilities, and a network of national and regional CI. Leveraging frameworks such as Tapis enables researchers to accomplish computational and data-intensive research in a secure, scalable, and reproducible way and allows them to focus on their research instead of the technology needed to accomplish it. This project aims to enable the integration of the Google Cloud Platform (GCP) and CloudyCluster resources into Tapis-supported science gateways to provide on-demand scaling needed by computational workflows. The new functionality uses Tapis event-driven Abaco Actors and CloudyCluster to create an elastic distributed cloud computing system on demand. This integration allows researchers and science gateways to augment cloud resources on top of existing local and national computing resources.
  4. Background: The digitization of biological specimens has revolutionized morphology, generating massive 3D datasets such as microCT scans. While open-source platforms like 3D Slicer and SlicerMorph have democratized access to advanced visualization and analysis software, a significant “compute gap” persists. Processing high-resolution 3D data requires high-end GPUs and substantial RAM, resources that are frequently unavailable at Primarily Undergraduate Institutions (PUIs) and other educational settings. This “digital divide” prevents many researchers and students from utilizing the very data and software that have been made open to them. Methods: We present MorphoCloud, a platform designed to bridge this hardware barrier by providing on-demand, research-grade computing environments via a web browser. MorphoCloud utilizes an “IssuesOps” architecture, where users manage their remote workstations entirely through GitHub Issues using natural-language commands (e.g., /create, /unshelve). The technology stack leverages GitHub Issues and Actions for front-end and orchestration respectively, JetStream2 for backend compute, and Apache Guacamole to deliver a high-performance, GPU-accelerated desktop experience to any modern browser. Results: The platform enables a streamlined lifecycle for remote instances, which come pre-configured with the SlicerMorph ecosystem, R/RStudio, and AI-assisted segmentation tools like NNInteractive and MEMOs. Users have access to a persistent storage volume that is decoupled from the instance. For educational purposes, MorphoCloud supports “Workshop” instances that allow for bulk provisioning and stay online continuously for short-term events. This identical environment ensures that instructors can conduct complex 3D workflows without the typical troubleshooting delays caused by heterogeneous student hardware.
Conclusion: MorphoCloud demonstrates that true scientific accessibility requires not just open data and software, but also open infrastructure. By abstracting the complexities of cloud administration into a simple, command-driven interface, MorphoCloud empowers researchers at under-resourced institutions to engage in high-performance morphological analysis and AI-assisted segmentation.
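The "IssuesOps" model above drives instance management through slash-commands typed into GitHub Issues; the abstract names /create and /unshelve. The real orchestration runs in GitHub Actions, so the toy dispatcher below is only a sketch of the command-parsing pattern such a workflow might use; the /shelve command, the handler logic, and the state names are assumptions, not MorphoCloud's actual implementation.

```python
# Toy sketch of slash-command dispatch, the pattern behind an "IssuesOps"
# interface. /create and /unshelve appear in the abstract; /shelve and the
# state model are invented for this illustration.

def parse_command(comment: str):
    """Extract a slash-command and its arguments from an issue comment."""
    stripped = comment.strip()
    line = stripped.splitlines()[0] if stripped else ""
    if not line.startswith("/"):
        return None, []
    parts = line[1:].split()
    return parts[0], parts[1:]


class InstanceController:
    """Toy state machine for one user's remote workstation."""
    def __init__(self):
        self.state = "absent"

    def handle(self, comment: str) -> str:
        cmd, _args = parse_command(comment)
        if cmd is None:
            return f"ignored (state={self.state})"   # ordinary comment
        if cmd == "create":
            self.state = "running"
        elif cmd == "shelve":
            self.state = "shelved"    # stopped, persistent volume retained
        elif cmd == "unshelve":
            self.state = "running"
        else:
            return f"unknown command /{cmd}"
        return f"ok (state={self.state})"


ctl = InstanceController()
print(ctl.handle("/create"))     # → ok (state=running)
print(ctl.handle("/shelve"))     # → ok (state=shelved)
print(ctl.handle("/unshelve"))   # → ok (state=running)
```

In the real system a GitHub Actions workflow would presumably react to issue-comment events and call out to the cloud backend; the dispatch step itself reduces to this parse-and-route shape.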
  5. Wastewater surveillance for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is an emerging approach to help identify the risk of a coronavirus disease (COVID-19) outbreak. This tool can contribute to public health surveillance at both community (wastewater treatment system) and institutional (e.g., colleges, prisons, and nursing homes) scales. This paper explores the successes, challenges, and lessons learned from initial wastewater surveillance efforts at colleges and university systems to inform future research, development and implementation. We present the experiences of 25 college and university systems in the United States that monitored campus wastewater for SARS-CoV-2 during the fall 2020 academic period. We describe the broad range of approaches, findings, resources, and impacts from these initial efforts. These institutions range in size, social and political geographies, and include both public and private institutions. Our analysis suggests that wastewater monitoring at colleges requires consideration of local information needs, sewage infrastructure, resources for sampling and analysis, college and community dynamics, approaches to interpretation and communication of results, and follow-up actions. Most colleges reported that a learning process of experimentation, evaluation, and adaptation was key to progress. This process requires ongoing collaboration among diverse stakeholders including decision-makers, researchers, faculty, facilities staff, students, and community members. 