Assessing Resource Provisioning and Allocation of Ensembles of In Situ Workflows

Do, Tu Mai; Pottier, Loïc; Ferreira da Silva, Rafael; Caíno-Lores, Silvina; Taufer, Michela; Deelman, Ewa

doi:10.1145/3458744.3474051

Scientific breakthroughs in biomolecular methods and improvements in hardware technology have shifted from a single long-running simulation to a large set of shorter simulations running simultaneously, called an ensemble. In an ensemble, each independent simulation is usually coupled with several analyses that apply identical or distinct algorithms on data produced by the corresponding simulation. Today, In situ methods are used to analyze large volumes of data generated by scientific simulations at runtime. This work studies the execution of ensemble-based simulations paired with In situ analyses using in-memory staging methods. Because simulations and analyses forming an ensemble typically run concurrently, deploying an ensemble requires efficient co-location-aware strategies, making sure the data flow between simulations and analyses that form an In situ workflow is efficient. Using an ensemble of molecular dynamics In situ workflows with multiple simulations and analyses, we first show that collecting traditional metrics such as makespan, instructions per cycle, memory usage, or cache miss ratio is not sufficient to characterize the complex behaviors of ensembles. Thus, we propose a method to evaluate the performance of ensembles of workflows that captures resource usage (efficiency), resource allocation, and component placement. Experimental results demonstrate that our proposed method can effectively capture the performance of different component placements in an ensemble. By evaluating different co-location scenarios, our performance indicator demonstrates improvements of up to four orders of magnitude when co-locating simulation and coupled analyses within a single computational host.

More Like this