Title: Opportunities for enhancing MLCommons efforts while leveraging insights from educational MLCommons earthquake benchmarks efforts
MLCommons is an effort to develop and improve the artificial intelligence (AI) ecosystem through benchmarks, public data sets, and research. It consists of members from start-ups, leading companies, academics, and non-profits from around the world. The goal is to make machine learning better for everyone. Educational institutions provide valuable opportunities for engagement that can broaden participation. In this article, we identify numerous insights obtained from different viewpoints as part of efforts to utilize high-performance computing (HPC) big data systems in existing education while developing and conducting science benchmarks for earthquake prediction. As this activity was conducted across multiple educational efforts, we assess whether and how such efforts can be made available on a wider scale. This includes the integration of sophisticated benchmarks into courses and research activities at universities, exposing students and researchers to topics that, in our practical experience across multiple organizations, are typically not sufficiently covered in current curricula. We outline the many lessons we learned throughout these efforts, culminating in the need for benchmark carpentry for scientists using advanced computational resources. The article also presents the analysis of an earthquake prediction code benchmark, focusing on the accuracy of the results and not only on the runtime; notably, this benchmark was created as a result of our lessons learned. Energy traces were produced throughout these benchmarks, which are vital to analyzing the power expenditure within HPC environments. Additionally, one of the insights is that, given the short duration of the project and limited student availability, the activity was only possible by utilizing a benchmark runtime pipeline while developing and using software to automatically generate jobs from the permutation of hyperparameters. It integrates a templated job management framework for executing tasks and experiments based on hyperparameters while leveraging hybrid compute resources available at different institutions. The software is part of a collection called cloudmesh with its newly developed components, cloudmesh-ee (experiment executor) and cloudmesh-cc (compute coordinator).
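The idea of generating jobs from a template and a set of hyperparameters can be illustrated with a minimal Python sketch. This is not the cloudmesh-ee interface, which the abstract does not describe. It assumes that "permutation of hyperparameters" means every combination of the listed values (their Cartesian product). The template text, the script name `train.py`, the parameter names, and the function `generate_jobs` are all hypothetical.

```python
from itertools import product
from string import Template

# Hypothetical batch-script template; the placeholders and the train.py
# command line are illustrative assumptions, not taken from cloudmesh.
JOB_TEMPLATE = Template(
    "#!/bin/bash\n"
    "#SBATCH --job-name=$name\n"
    "#SBATCH --gres=gpu:1\n"
    "python train.py --epochs $epochs --batch-size $batch_size --lr $lr\n"
)


def generate_jobs(grid):
    """Return {job_name: script_text}, one entry per hyperparameter combination."""
    keys = sorted(grid)  # fixed key order keeps job names reproducible
    jobs = {}
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        name = "_".join(f"{k}-{params[k]}" for k in keys)
        jobs[name] = JOB_TEMPLATE.substitute(name=name, **params)
    return jobs


if __name__ == "__main__":
    grid = {"epochs": [2, 10], "batch_size": [32, 64], "lr": [0.001]}
    for job_name in generate_jobs(grid):
        print(job_name)
```

Each generated script could then be written to its own file and submitted to whichever scheduler a given institution uses. Keeping the template separate from the parameter grid means adding one value to the grid expands the experiment set without editing any script by hand.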
Award ID(s):
2210266 2204115 2200409 2151597
PAR ID:
10473591
Author(s) / Creator(s):
; ; ; ; ; ;
Publisher / Repository:
Frontiers
Date Published:
Journal Name:
Frontiers in High Performance Computing
Volume:
1
ISSN:
2813-7337
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. High performance computing (HPC) has led to remarkable advances in science and engineering and has become an indispensable tool for research. Unfortunately, HPC use and adoption by many researchers is often hindered by the complex way these resources are accessed. Indeed, while the web has become the dominant access mechanism for remote computing services in virtually every computing area, HPC is a notable exception. Open OnDemand is an open source project negating this trend by providing web‐based access to HPC resources (https://openondemand.org). This article describes the challenges to adoption and other lessons learned over the 3‐year project that may be relevant to other science gateway projects. We end with a description of future plans the project team has during the Open OnDemand 2.0 project including specific developments in machine learning and GPU monitoring.
  2. Jetstream2 will be a category I production cloud resource that is part of the National Science Foundation’s Innovative HPC Program. The project’s aim is to accelerate science and engineering by providing “on-demand” programmable infrastructure built around a core system at Indiana University and four regional sites. Jetstream2 is an evolution of the Jetstream platform, which functions primarily as an Infrastructure-as-a-Service cloud. The lessons learned in cloud architecture, distributed storage, and container orchestration have inspired changes in both hardware and software for Jetstream2. These lessons have wide implications as institutions converge HPC and cloud technology while building on prior work when deploying their own cloud environments. Jetstream2’s next-generation hardware, robust open-source software, and enhanced virtualization will provide a significant platform to further cloud adoption within the US research and education communities.
  3. Wastewater surveillance for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is an emerging approach to help identify the risk of a coronavirus disease (COVID-19) outbreak. This tool can contribute to public health surveillance at both community (wastewater treatment system) and institutional (e.g., colleges, prisons, and nursing homes) scales. This paper explores the successes, challenges, and lessons learned from initial wastewater surveillance efforts at colleges and university systems to inform future research, development and implementation. We present the experiences of 25 college and university systems in the United States that monitored campus wastewater for SARS-CoV-2 during the fall 2020 academic period. We describe the broad range of approaches, findings, resources, and impacts from these initial efforts. These institutions range in size, social and political geographies, and include both public and private institutions. Our analysis suggests that wastewater monitoring at colleges requires consideration of local information needs, sewage infrastructure, resources for sampling and analysis, college and community dynamics, approaches to interpretation and communication of results, and follow-up actions. Most colleges reported that a learning process of experimentation, evaluation, and adaptation was key to progress. This process requires ongoing collaboration among diverse stakeholders including decision-makers, researchers, faculty, facilities staff, students, and community members. 
  4. The Billion Oyster Project and Curriculum and Community Enterprise for the Restoration of New York Harbor with New York City Public Schools (BOP-CCERS) program is a National Science Foundation (NSF) supported initiative and collaboration of multiple institutions and organizations led by Pace University. The NSF project, Innovative Technology Experiences for Students and Teachers (ITEST), generated a large amount of data through engagement with teachers and students throughout New York City public schools. This article presents the second part of a large data collection study focused on Underrepresented Minority (URM) student interest in STEM and on engagement with teachers to support them in teaching science through experiential learning and lessons that connect science to the real world, particularly through science in the New York Harbor. The first component of the study focused on URM student interest in STEM. This second component of the study focuses on teacher engagement in the program and what the researchers learned in the process. Overall, teachers reported very favorable opinions on the impact of the BOP-CCERS activities as ways to generate student interest in STEM majors and careers. Teacher participants were generally positive about the amount of support and resources they received as members of the project, as well as the oyster-related knowledge and practices they learned to use with their own students in oyster field research. Data from the study provided evidence that the teacher activities were successful and met the project’s goals to provide support and resources for teachers to engage students in oyster restoration research.
  5. The Nation's research enterprise faces a shortage of data scientists. Expanding the pipeline of data science students, particularly from underrepresented populations, requires educational institutions to increase awareness of data science and inspire a passion for data in students as they begin their academic careers. In this tutorial we discuss the development and delivery of a free seminar designed to provide hands-on lessons in the use of both Apache Spark and Jupyter notebooks to students from any academic background in an approachable, no-risk environment. An explanation of the seminar resources, exercises, and implementation guidelines is included, as are lessons learned from several successful seminars held both in-person and virtually at two institutions of higher education.