We describe progress on building the SLATE (Services Layer at the Edge) platform. The high-level goal of SLATE is to facilitate creation of multi-institutional science computing systems by augmenting the canonical Science DMZ pattern with a generic, "programmable", secure and trusted underlayment platform. This platform permits hosting of advanced container-centric services needed for higher-level capabilities such as data transfer nodes, software and data caches, workflow services and science gateway components. SLATE uses best-of-breed data center virtualization and containerization components and, where available, software-defined networking to enable distributed automation of deployment and service lifecycle management tasks by domain experts. As such, it will simplify creation of scalable platforms that connect research teams, institutions and resources to accelerate science while reducing operational costs and development cycle times.
SLATE and the Mobility of Capability
SLATE (Services Layer at the Edge) is a new project that, when complete, will implement “cyberinfrastructure as code” by augmenting the canonical Science DMZ pattern with a generic, programmable, secure and trusted underlayment platform. This platform will host advanced container-centric services needed for higher-level capabilities such as data transfer nodes, software and data caches, workflow services and science gateway components. SLATE will use best-of-breed data center virtualization components and, where available, software-defined networking to enable distributed automation of deployment and service lifecycle management tasks by domain experts. As such, it will simplify creation of scalable platforms that connect research teams, institutions and resources to accelerate science while reducing operational costs and development cycle times. Because SLATE will be designed to require only commodity components for its functional layers, its potential for building distributed systems should extend across all data center types and scales, thus enabling creation of ubiquitous, science-driven cyberinfrastructure. By providing automation and programmatic interfaces to distributed HPC backends and other cyberinfrastructure resources, SLATE will amplify the reach of science gateways and therefore the domain communities they support.
- Journal Name: Science Gateways 2017
- Sponsoring Org: National Science Foundation
More Like this
The Deep Learning Epilepsy Detection Challenge: Design, Implementation, and Test of a New Crowd-Sourced AI Challenge Ecosystem
Isabell Kiral*, Subhrajit Roy*, Todd Mummert*, Alan Braz*, Jason Tsay, Jianbin Tang, Umar Asif, Thomas Schaffter, Eren Mehmet, The IBM Epilepsy Consortium◊, Joseph Picone, Iyad Obeid, Bruno De Assis Marques, Stefan Maetschke, Rania Khalaf†, Michal Rosen-Zvi†, Gustavo Stolovitzky†, Mahtab Mirmomeni†, Stefan Harrer†
J. Picone and I. Obeid are with Temple University, USA. T. Schaffter is with Sage Bionetworks, USA. E. Mehmet is with the University of Illinois at Urbana-Champaign, USA. All other authors are with IBM Research in USA, Israel and Australia. (* These authors contributed equally to this work. † Corresponding authors. ◊ Members of the IBM Epilepsy Consortium are listed in the Acknowledgements section.)
Introduction
This decade has seen an ever-growing number of scientific fields benefitting from the advances in machine learning technology and tooling. More recently, this trend reached the medical domain, with applications ranging from cancer diagnosis to the development of brain-machine interfaces. While Kaggle has pioneered the crowd-sourcing of machine learning challenges to incentivise data scientists from around the world to advance algorithm and model design, the increasing complexity of problem statements demands of participants to be expert data…
Collections Management and High-Throughput Digitization using Distributed Cyberinfrastructure Resources
Collections digitization relies increasingly upon computational and data management resources that occasionally exceed the capacity of natural history collections and their managers and curators. Digitization of many tens of thousands of micropaleontological specimen slides, as evidenced by the effort presented here by the Indiana University Paleontology Collection, has been a concerted effort in adherence to the recommended practices of multifaceted aspects of collections management for both physical and digital collections resources. This presentation highlights the contributions of distributed cyberinfrastructure from the National Science Foundation-supported Extreme Science and Engineering Discovery Environment (XSEDE) for web-hosting of collections management system resources and distributed processing of millions of digital images and metadata records of specimens from our collections. The Indiana University Center for Biological Research Collections is currently hosting its instance of the Specify collections management system (CMS) on a virtual server hosted on Jetstream, the cloud service for on-demand computational resources as provisioned by XSEDE. This web-service allows the CMS to be flexibly hosted on the cloud with additional services that can be provisioned on an as-needed basis for generating and integrating digitized collections objects in both web-friendly and digital preservation contexts. On-demand computing resources can be used for the manipulation of digital…
Science Gateway Development to aid Cyber and Software Automation for Neuroscience Researchers and Educators
Neuroscientists are increasingly relying on parallel and distributed computing resources for analysis and visualization of their neuron simulations. This requires expert knowledge of programming and cyberinfrastructure configuration, which is beyond the repertoire of most neuroscience programs. This paper presents early experiences from a one-credit graduate research training course titled ECE 8001 “Software and Cyber Automation in Neuroscience” at the University of Missouri for engendering multi-disciplinary collaborations between computational neuroscience and cyberinfrastructure students and faculty. Specifically, we discuss the course organization and exemplar outcomes involving a next-generation science gateway for training novice users on exemplar neuroscience use cases that involve using tools such as NEURON and MATLAB on local as well as Neuroscience Gateway resources. We also discuss our vision towards a course sequence curriculum for graduate/undergraduate students from biological/psychological sciences and computer science/engineering to jointly build “self-service” training modules using Jupyter Notebook platforms. Thus, our efforts show how we can create scalable and sustainable cyber and software automation for fulfilling a broad set of neuroscience research and education use cases.
One of the most costly factors in providing a global computing infrastructure such as the WLCG is the human effort in deployment, integration, and operation of the distributed services supporting collaborative computing, data sharing and delivery, and analysis of extreme scale datasets. Furthermore, the time required to roll out global software updates, introduce new service components, or prototype novel systems requiring coordinated deployments across multiple facilities is often increased by communication latencies, staff availability, and in many cases expertise required for operations of bespoke services. While the WLCG (and distributed systems implemented throughout HEP) is a global service platform, it lacks the capability and flexibility of a modern platform-as-a-service including continuous integration/continuous delivery (CI/CD) methods, development-operations capabilities (DevOps, where developers assume a more direct role in the actual production infrastructure), and automation. Most importantly, tooling which reduces required training, bespoke service expertise, and the operational effort throughout the infrastructure, most notably at the resource endpoints (sites), is entirely absent in the current model. In this paper, we explore ideas and questions around potential NoOps models in this context: what is realistic given organizational policies and constraints? How should operational responsibility be organized across teams and facilities? What are the technical…