Title: Software Challenges to Exascale Computing
Supercomputers are used to power discoveries and to reduce the time-to-results in a wide variety of disciplines such as engineering, physical sciences, and healthcare. They are globally considered vital for staying competitive in defense, the financial sector, several mainstream businesses, and even agriculture. An integral requirement for using supercomputers, as for any other computer, is the availability of software. Scalable and efficient software is typically required to make optimal use of large-scale supercomputing platforms and thereby to effectively leverage the investments in advanced CyberInfrastructure (CI). However, developing and maintaining such software is challenging due to several factors, such as (1) the absence of well-defined processes or guidelines for writing software that can ensure high performance on supercomputers, and (2) a shortfall of trained workforce with skills in both software engineering and supercomputing. With rapid advances in computer architecture, the complexity of the processors used in supercomputers is also increasing, and, in turn, the task of developing efficient software for supercomputers is becoming even more challenging and complex. To mitigate these challenges, there is a need for a common platform that brings together different stakeholders from the areas of supercomputing and software engineering. To provide such a platform, the second workshop on Software Challenges to Exascale Computing (SCEC) was organized in Delhi, India, during December 13–14, 2018. The SCEC 2018 workshop informed participants about the challenges in large-scale HPC software development and steered them toward building international collaborations for finding solutions to those challenges.
The workshop provided a forum through which hardware vendors and software developers can communicate with each other and influence the architecture of the next-generation supercomputing systems and the supporting software stack. By fostering cross-disciplinary associations, the workshop served as a stepping-stone towards innovations in the future. We are very grateful to the Organizing and Program Committees (listed below), the sponsors (US National Science Foundation, Indian National Supercomputing Mission, Atos, Mellanox, Centre for Development of Advanced Computing, San Diego Supercomputing Center, Texas Advanced Computing Center), and the participants for their contributions to making the SCEC 2018 workshop a success.
Award ID(s):
1849519
PAR ID:
10116991
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Second Workshop, SCEC 2018 Delhi, India, December 13–14, 2018 Proceedings
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Volunteer Computing (VC) is a computing model that uses donated computing cycles on devices such as laptops, desktops, and tablets to do scientific computing. BOINC is the most popular software framework for VC, and it helps connect projects that need computing cycles with volunteers interested in donating the computing cycles of their resources. It has already enabled projects with high societal impact to harness several PetaFLOPs of donated computing cycles. Given its potential for elastically augmenting the capacity of existing supercomputing resources for running High-Throughput Computing (HTC) jobs, we have extended the BOINC software infrastructure and made it amenable to integration with supercomputing and cloud computing environments. We have named this extension BOINC@TACC, and are using it to route *qualified* HTC jobs from the supercomputers at the Texas Advanced Computing Center (TACC) not only to the typically volunteered devices but also to cloud computing resources such as Jetstream and Chameleon. BOINC@TACC can be extremely useful for researchers and scholars who are running low on allocations of compute cycles on the supercomputers, or who are interested in reducing the turnaround time of their HTC jobs when the supercomputers are over-subscribed. We have also developed a web application for TACC users so that, through the convenience of their web browser, they can submit their HTC jobs for running on the resources volunteered by the community. An overview of the BOINC@TACC project is presented in this paper. The BOINC@TACC software infrastructure is open-source and can be easily adapted for use by other supercomputing centers that are interested in building their volunteer community and connecting it with researchers needing multi-petascale (and even exascale) computing power for their HTC jobs.
  2. Volunteer Computing (VC) is a computing model that uses donated computing cycles on devices such as laptops, desktops, and tablets to do scientific computing. BOINC is the most popular software framework for VC, and it helps connect projects that need computing cycles with volunteers interested in donating the computing cycles of their resources. It has already enabled projects with high societal impact to harness several PetaFLOPs of donated computing cycles. Given its potential for elastically augmenting the capacity of existing supercomputing resources for running High-Throughput Computing (HTC) jobs, we have extended the BOINC software infrastructure and made it amenable to integration with supercomputing and cloud computing environments. We have named this extension BOINC@TACC, and are using it to route *qualified* HTC jobs from the supercomputers at the Texas Advanced Computing Center (TACC) not only to the typically volunteered devices but also to cloud computing resources such as Jetstream and Chameleon. BOINC@TACC can be extremely useful for researchers and scholars who are running low on allocations of compute cycles on the supercomputers, or who are interested in reducing the turnaround time of their HTC jobs when the supercomputers are over-subscribed. We have also developed a web application for TACC users so that, through the convenience of their web browser, they can submit their HTC jobs for running on the resources volunteered by the community. An overview of the BOINC@TACC project is presented in this paper. The BOINC@TACC software infrastructure is open-source and can be easily adapted for use by other supercomputing centers that are interested in building their volunteer community and connecting it with researchers needing multi-petascale (and even exascale) computing power for their HTC jobs.
  3. Welcome to the 4th Workshop on Education for High Performance Computing (EduHiPC 2022). The EduHiPC 2022 workshop, held in conjunction with the IEEE International Conference on High Performance Computing Data & Analytics (HiPC 2022), is devoted to the development and assessment of educational and curricular innovations and resources for undergraduate and graduate education in Parallel and Distributed Computing (PDC) and High Performance Computing (HPC). EduHiPC brings together individuals from academia, industry, and other educational and research institutes to explore new ideas, challenges, and experiences related to PDC pedagogy and curricula. The workshop is designed in coordination with the IEEE TCPP curriculum initiative on parallel and distributed computing (https://tcpp.cs.gsu.edu/curriculum/) for undergraduates majoring in computer science and computer engineering. It is supported by C-DAC, India, and by the Center for Parallel and Distributed Computing Curriculum Development and Educational Resources (CDER), which is supported by the US National Science Foundation (NSF). Details for attending the workshop are available on the HiPC webpage (HiPC). The effect of the pandemic on the academic and research community now seems to be receding globally, as was evident from the enthusiastic in-person participation of conference delegates. Please visit the EduHiPC-22 webpage for the complete online proceedings, including copies of papers and presentation slides: EduHiPC 2022 | NSF/IEEE-TCPP Curriculum Initiative.
  4. The CSSI 2019 workshop was held on October 28-29, 2019, in Austin, Texas. The main objectives of this workshop were to (1) understand the impact of the CSSI program on the community over the last 9 years, (2) engage workshop participants in identifying gaps and opportunities in the current CSSI landscape, (3) gather ideas on the cyberinfrastructure needs and expectations of the community with respect to the CSSI program, and (4) prepare a report summarizing the feedback gathered from the community that can inform the future solicitations of the CSSI program. The workshop participants included a diverse mix of researchers and practitioners from academia, industry, and national laboratories. The participants belonged to diverse domains such as quantum physics, computational biology, High Performance Computing (HPC), and library science. Almost 50% of the participants were from the computer science domain and roughly 50% were from non-computer science domains. As per the self-reported statistics, roughly 27% of the participants were from the different underrepresented groups as defined by the National Science Foundation (NSF). The workshop brought together different stakeholders interested in provisioning sustainable cyberinfrastructure that can power discoveries impacting the various fields of science and technology and maintain the nation's competitiveness in areas such as scientific software, HPC, networking, cybersecurity, and data/information science. The workshop served as a venue for gathering community feedback on the current state of the CSSI program and its future directions. Before they arrived at the workshop, the participants were encouraged to take an online survey on the challenges that they face in using the current cyberinfrastructure and the importance of the CSSI program in enabling cutting-edge research. The workshop included 16 brainstorming sessions of one hour each.
Additionally, the workshop program included 16 lightning talks and an extempore session. The information collected from the survey, brainstorming sessions, lightning talks, and the extempore session is summarized in this report and can potentially be useful for the NSF in formulating the future CSSI solicitations. The workshop fostered an environment in which the participants were encouraged to identify gaps and opportunities in the current cyberinfrastructure landscape, and develop thoughts for proposing new projects.
  5. The computing landscape is evolving rapidly. Exascale computers, which can perform 10^18 mathematical operations per second, have arrived. At the same time, quantum supremacy has been demonstrated, where quantum computers have outperformed these fastest supercomputers for certain problems. Meanwhile, artificial intelligence (AI) is transforming every aspect of science and engineering. A highly anticipated application of the emerging nexus of exascale computing, quantum computing, and AI is the computational design of new materials with desired functionalities, which has been the elusive goal of the federal materials genome initiative. The rapid change in the computing landscape resulting from these developments has not been matched by the pedagogical developments needed to train the next generation of the materials engineering cyberworkforce. This gap in curricula across colleges and universities offers a unique opportunity to create educational tools that enable decentralized training of the cyberworkforce. To achieve this, we have developed training modules for a new generation of quantum materials simulator, named AIQ-XMaS (AI and quantum-computing enabled exascale materials simulator), which integrates exascalable quantum, reactive, and neural-network molecular dynamics simulations with unique AI and quantum-computing capabilities to study a wide range of materials and devices of high societal impact, such as optoelectronics and health. As a single-entry access point to these training modules, we have also built a CyberMAGICS (cyber training on materials genome innovation for computational software) portal, which includes step-by-step instructions in Jupyter notebooks and associated tutorials, while providing an online cloud service for those who do not have access to an adequate computing platform.
The modules are incorporated into our open-source AIQ-XMaS software suite as tutorial examples and are piloted in classroom and workshop settings to directly train many users at the University of Southern California (USC) and Howard University—one of the largest historically black colleges and universities (HBCUs), with a strong focus on underrepresented groups. In this paper, we summarize these educational developments, including findings from the first CyberMAGICS Workshop for Underrepresented Groups, along with an introduction to the AIQ-XMaS software suite. Our training modules also include a new generation of open programming languages for exascale computing (e.g., OpenMP target) and quantum computing (e.g., Qiskit) used in our scalable simulation and AI engines that underlie AIQ-XMaS. Our training modules essentially support unique dual-degree opportunities at USC in the emerging exa-quantum-AI era: Ph.D. in science or engineering, concurrently with MS in computer science specialized in high-performance computing and simulations, MS in quantum information science or MS in materials engineering with machine learning. The developed modular cyber-training pedagogy is applicable to broad engineering education at large. 