Scientific workflows drive most modern large-scale science breakthroughs by allowing scientists to define their computations as a set of jobs executed in a given order based on their data dependencies. Workflow management systems (WMSs) have become key to automating scientific workflows-executing computational jobs and orchestrating data transfers between those jobs running on complex high-performance computing (HPC) platforms. Traditionally, WMSs use files to communicate between jobs: a job writes out files that are read by other jobs. However, HPC machines face a growing gap between their storage and compute capabilities. To address that concern, the scientific community has adopted a new approach called in situ, which bypasses costly parallel filesystem I/O operations with faster in-memory or in-network communications. When using in situ approaches, communication and computations can be interleaved. In this work, we leverage the Decaf in situ dataflow framework to accelerate task-based scientific workflows managed by the Pegasus WMS, by replacing file communications with faster MPI messaging. We propose a new execution engine that uses Decaf to manage communications within a sub-workflow (i.e., set of jobs) to optimize inter-job communications. We consider two workflows in this study: (i) a synthetic workflow that benchmarks and compares file- and MPI-based communication; andmore »
DFMan: A Graph-based Optimization of Dataflow Scheduling on High-Performance Computing Systems
Scientific research and development campaigns are materialized by workflows of applications executing on high-performance computing (HPC) systems. These applications con-sist of tasks that can have inter- or intra-application flows of data to achieve the research goals successfully. These dataflows create dependencies among the tasks and cause resource con-tention on shared storage systems, thus limiting the aggregated I/O bandwidth achieved by the workflow. However, these I/O performance issues are often solved by tedious and manual efforts that demand holistic knowledge about the data dependencies in the workflow and the information about the infrastructure being utilized. Taking this into consideration, we design DFMan, a graph-based dataflow management and optimization framework for maximizing I/O bandwidth by leveraging the powerful storage stack on HPC systems to manage data sharing optimally among the tasks in the workflows. In particular, we devise a graph-based optimization algorithm that can leverage an intuitive graph representation of dataflow- and system-related information, and automatically carry out co-scheduling of task and data placement. According to our experiments, DFMan optimizes a wide variety of scientific workflows such as Hurricane 3D on Cloud Model 1 (CM1), Montage Carina Nebula (NGC3372), and an emulated dataflow kernel of the Multiscale Machine-learned Modeling Infrastructure (MuMMI I/O) more »
- Publication Date:
- NSF-PAR ID:
- 10414632
- Journal Name:
- 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
- Page Range or eLocation-ID:
- 368 to 378
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
The prevalence of scientific workflows with high computational demands calls for their execution on various distributed computing platforms, including large-scale leadership-class high-performance computing (HPC) clusters. To handle the deployment, monitoring, and optimization of workflow executions, many workflow systems have been developed over the past decade. There is a need for workflow benchmarks that can be used to evaluate the performance of workflow systems on current and future software stacks and hardware platforms. We present a generator of realistic workflow benchmark specifications that can be translated into benchmark code to be executed with current workflow systems. Our approach generates workflow tasks with arbitrary performance characteristics (CPU, memory, and I/O usage) and with realistic task dependency structures based on those seen in production workflows. We present experimental results that show that our approach generates benchmarks that are representative of production workflows, and conduct a case study to demonstrate the use and usefulness of our generated benchmarks to evaluate the performance of workflow systems under different configuration scenarios.
-
Parallel I/O is an effective method to optimize data movement between memory and storage for many scientific applications. Poor performance of traditional disk-based file systems has led to the design of I/O libraries which take advantage of faster memory layers, such as on-node memory, present in high-performance computing (HPC) systems. By allowing caching and prefetching of data for applications alternating computation and I/O phases, a faster memory layer also provides opportunities for hiding the latency of I/O phases by overlapping them with computation phases, a technique called asynchronous I/O. Since asynchronous parallel I/O in HPC systems is still in the initial stages of development, there hasn't been a systematic study of the factors affecting its performance.In this paper, we perform a systematic study of various factors affecting the performance and efficacy of asynchronous I/O, we develop a performance model to estimate the aggregate I/O bandwidth achievable by iterative applications using synchronous and asynchronous I/O based on past observations, and we evaluate the performance of the recently developed asynchronous I/O feature of a parallel I/O library (HDF5) using benchmarks and real-world science applications. Our study covers parallel file systems on two large-scale HPC systems: Summit and Cori, the former with amore »
-
As large-scale scientific simulations and big data analyses become more popular, it is increasingly more expensive to store huge amounts of raw simulation results to perform post-analysis. To minimize the expensive data I/O, “in-situ” analysis is a promising approach, where data analysis applications analyze the simulation generated data on the fly without storing it first. However, it is challenging to organize, transform, and transport data at scales between two semantically different ecosystems due to the distinct software and hardware difference. To tackle these challenges, we design and implement the X-Composer framework. X-Composer connects cross-ecosystem applications to form an “in-situ” scientific workflow, and provides a unified approach and recipe for supporting such hybrid in-situ workflows on distributed heterogeneous resources. X-Composer reorganizes simulation data as continuous data streams and feeds them seamlessly into the Cloud-based stream processing services to minimize I/O overheads. For evaluation, we use X-Composer to set up and execute a cross-ecosystem workflow, which consists of a parallel Computational Fluid Dynamics simulation running on HPC, and a distributed Dynamic Mode Decomposition analysis application running on Cloud. Our experimental results show that X-Composer can seamlessly couple HPC and Big Data jobs in their own native environments, achieve good scalability, and providemore »
-
The Twitter-Based Knowledge Graph for Researchers project is an effort to construct a knowledge graph of computation-based tasks and corresponding outputs. It will be utilized by subject matter experts, statisticians, and developers. A knowledge graph is a directed graph of knowledge accumulated from a variety of sources. For our application, Subject Matter Experts (SMEs) are experts in their respective non-computer science fields, but are not necessarily experienced with running heavy computation on datasets. As a result, they find it difficult to generate workflows for their projects involving Twitter data and advanced analysis. Workflow management systems and libraries that facilitate computation are only practical when the users of these systems understand what analysis they need to perform. Our goal is to bridge this gap in understanding. Our queryable knowledge graph will generate a visual workflow for these experts and researchers to achieve their project goals. After meeting with our client, we established two primary deliverables. First, we needed to create an ontology of all Twitter-related information that an SME might want to answer. Secondly, we needed to build a knowledge graph based on this ontology and produce a set of APIs to trigger a set of network algorithms based on themore »