Title: Solving the Container Explosion Problem for Distributed High Throughput Computing
Container technologies are seeing wider use at advanced computing facilities for managing highly complex applications that must execute at multiple sites. However, in a distributed high throughput computing setting, the unrestricted use of containers can result in the container explosion problem. If a new container image is generated for each variation of a job dispatched to a site, shared storage is soon exceeded. On the other hand, if a single large container image is used to meet multiple needs, the size of that container may become a problem for storage and transport. To address this problem, we observe that many containers have an internal structure generated by a structured package manager, and this information can be used to strategically combine and share container images. We develop LANDLORD to exploit this property and evaluate its performance through a combination of simulation studies and empirical measurement of high energy physics applications.
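The core intuition can be illustrated with a small sketch. The Python fragment below is a hypothetical illustration of package-set merging, not LANDLORD's actual algorithm or API: each job's package requirements are compared against existing images by Jaccard similarity, and an image is shared and grown when the overlap clears a threshold. The threshold value, class names, and package names are all assumptions for illustration.

```python
# Illustrative sketch (not LANDLORD's actual API): greedily assign each
# incoming job's package requirements to an existing container image when
# the overlap is high enough, otherwise mint a new image.

def jaccard(a: set, b: set) -> float:
    """Similarity between two package sets."""
    return len(a & b) / len(a | b) if a | b else 1.0

class ImagePool:
    def __init__(self, threshold: float = 0.75):
        self.threshold = threshold    # hypothetical tuning knob
        self.images: list[set] = []   # each image = a set of packages

    def place(self, required: set) -> int:
        """Return the index of the image that will serve this job."""
        best, best_sim = None, 0.0
        for i, img in enumerate(self.images):
            sim = jaccard(required, img)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= self.threshold:
            self.images[best] |= required   # grow the shared image
            return best
        self.images.append(set(required))   # no good match: new image
        return len(self.images) - 1

pool = ImagePool()
pool.place({"python3", "numpy", "root"})
pool.place({"python3", "numpy", "scipy"})  # likely merged with the first
```

In this toy model, a high threshold favors many small images (more storage pressure, less transport per job) while a low threshold favors a few large shared images, which is exactly the trade-off the abstract describes.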
Award ID(s):
1642409
NSF-PAR ID:
10210593
Journal Name:
International Parallel and Distributed Processing Symposium
Page Range / eLocation ID:
388 to 398
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Workflows are a widely used abstraction for describing large scientific applications and running them on distributed systems. However, most workflow systems have been silent on the question of what execution environment each task in the workflow is expected to run in. Consequently, a workflow may run successfully in the environment in which it was created, but fail on other platforms due to differences in execution environment. Container-based schedulers have recently arisen as a potential solution to this problem, adopting containers to distribute computing resources and deliver well-defined execution environments to applications. In this paper, we consider how to connect a workflow system to a container scheduler with minimal performance loss and higher system efficiency. As an example of current technology, we use Makeflow and Mesos. We present five design challenges, and address them with four configurations that connect the workflow system to the container scheduler at different levels of the infrastructure. In order to take full advantage of the resource sharing scheme of Mesos, we enable the resource monitor of Makeflow to dynamically update each task's resource requirement. We explore the performance of a large bioinformatics workflow, and observe that using Makeflow, Work Queue, and the resource monitor together not only increases transfer throughput but also achieves the highest resource usage rate.
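The feedback loop between a resource monitor and task requirements can be sketched as follows. This is an illustrative Python fragment under assumed names, not Makeflow's or Mesos's real interface: completed tasks report their measured peaks, and subsequent tasks of the same category request the largest observed peak plus some headroom.

```python
# Hypothetical sketch of dynamic resource updating: measured peak usage
# from completed tasks refines the request made for later tasks of the
# same category. All names and the headroom factor are illustrative.

from dataclasses import dataclass, field

@dataclass
class ResourceEstimate:
    memory_mb: int = 1024                       # static default request
    observed: list = field(default_factory=list)

    def record(self, peak_mb: int) -> None:
        """Feed back a peak measured by the resource monitor."""
        self.observed.append(peak_mb)

    def next_request(self, headroom: float = 1.2) -> int:
        """Request the largest observed peak plus headroom."""
        if not self.observed:
            return self.memory_mb               # no data yet: use default
        return int(max(self.observed) * headroom)

est = ResourceEstimate()
est.record(800)
est.record(950)
print(est.next_request())  # 1140 MB instead of the static 1024 MB
```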
  2. Large-scale, high-throughput computational science faces an accelerating convergence of software and hardware. Software container-based solutions have become common in cloud-based datacenter environments, and are considered promising tools for addressing heterogeneity and portability concerns. However, container solutions reflect a set of assumptions which complicate their adoption by developers and users of scientific workflow applications. Nor are containers a universal solution for deployment in high-performance computing (HPC) environments, which have specialized and vertically integrated scheduling and runtime software stacks. In this paper, we present a container design and deployment approach which uses modular layering to ease the deployment of containers into existing HPC environments. This layered approach allows operating system integrations, support for different communication and performance monitoring libraries, and application code to be defined and interchanged in isolation. We describe in this paper the details of our approach, including specifics about container deployment and orchestration for different HPC scheduling systems. We also describe how this layering method can be used to build containers for two separate applications, each deployed on clusters with different batch schedulers, MPI networking support, and performance monitoring requirements. Our experience indicates that the layered approach is a viable strategy for building applications intended to provide similar behavior across widely varying deployment targets.
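A minimal sketch of the layering idea, not the authors' actual tooling: a container spec is assembled from interchangeable layers so that, for example, the MPI library or monitoring hooks can be swapped per target cluster while the application layer stays fixed. All layer names, package names, and the render format below are hypothetical.

```python
# Assemble a Dockerfile-like spec from interchangeable layers. The layer
# catalog and emitted instructions are illustrative assumptions.

from typing import Optional

LAYERS = {
    "base":    {"centos":  "FROM centos:7"},
    "mpi":     {"openmpi": "RUN yum -y install openmpi-devel",
                "mvapich": "RUN yum -y install mvapich2-devel"},
    "monitor": {"papi":    "RUN yum -y install papi-devel"},
    "app":     {"solver":  "COPY solver /opt/app"},
}

def render(base: str, mpi: str, app: str,
           monitor: Optional[str] = None) -> str:
    """Emit one instruction per chosen layer, bottom to top."""
    lines = [LAYERS["base"][base], LAYERS["mpi"][mpi]]
    if monitor:
        lines.append(LAYERS["monitor"][monitor])
    lines.append(LAYERS["app"][app])
    return "\n".join(lines)

# Same application, two clusters with different MPI stacks:
print(render("centos", "openmpi", "solver"))
print(render("centos", "mvapich", "solver", monitor="papi"))
```

The design point this illustrates is isolation of concerns: changing the target cluster's interconnect or monitoring requirements touches one layer choice, not the application layer.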
  3. Application containers, such as those provided by Docker, have recently gained popularity as a solution for agile and seamless software deployment. These light-weight virtualization environments run applications that are packed together with their resources and configuration information, and thus can be deployed across various software platforms. Unfortunately, the ease with which containers can be created is oftentimes a double-edged sword, encouraging the packaging of logically distinct applications, and the inclusion of a significant number of unnecessary components, within a single container. These practices needlessly increase the container size - sometimes by orders of magnitude. They also decrease overall security, as each included component - necessary or not - may bring in security issues of its own, and there is no isolation between multiple applications packaged within the same container image. We propose algorithms and a tool called Cimplifier, which address these concerns: given a container and simple user-defined constraints, our tool partitions it into simpler containers, which (i) are isolated from each other, only communicating as necessary, and (ii) only include enough resources to perform their functionality. Our evaluation on real-world containers demonstrates that Cimplifier preserves the original functionality, leads to a reduction in image size of up to 95%, and processes even large containers in under thirty seconds.
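The partitioning idea can be sketched in a few lines. This is an illustrative Python fragment, not Cimplifier itself: given which files each process was observed to access during execution, and a user constraint grouping processes, emit one slim container per group containing only the resources that group actually touched. The process names, file paths, and grouping are invented example data.

```python
# Illustrative partitioning sketch: one container per user-defined
# process group, each holding only the files that group accessed.

accesses = {                       # hypothetical execution-trace data
    "nginx":   {"/usr/sbin/nginx", "/etc/nginx/nginx.conf"},
    "php-fpm": {"/usr/sbin/php-fpm", "/etc/php/php.ini"},
    "mysqld":  {"/usr/sbin/mysqld", "/var/lib/mysql"},
}

constraints = [                    # user-defined partition of processes
    {"nginx", "php-fpm"},          # web tier in one container
    {"mysqld"},                    # database isolated in another
]

def partition(accesses: dict, constraints: list) -> list:
    """One container per group: the union of the group's accessed files."""
    containers = []
    for group in constraints:
        files = set()
        for proc in group:
            files |= accesses[proc]
        containers.append({"processes": group, "files": files})
    return containers

for c in partition(accesses, constraints):
    print(sorted(c["processes"]), "->", sorted(c["files"]))
```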
  4. Emerging computing paradigms provide field-level service responses for users, for example edge computing, fog computing, and multi-access edge computing (MEC). Edge virtualization technologies represented by Docker can provide a platform-independent, low-resource-consumption operating environment for edge services. The image-pulling time of Docker is a crucial factor affecting the start-up speed of edge services. The layer reuse mechanism of native Docker cannot fully utilize the duplicate data of node-local images. In this paper, we propose a chunk reuse mechanism (CRM), which effectively targets node-local duplicate data during container updates and reduces the volume of data transmission required for image building. We orchestrate the CRM process for cloud and remote-cloud nodes to ensure that the resource overhead from container update data preparation and image reconstruction stays within an acceptable range. The experimental results show that the proposed CRM can effectively utilize node-local duplicate data in the synchronous update of containers across multiple nodes, reduce the volume of data transmission, and significantly improve container update efficiency.
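A minimal sketch of chunk-level reuse, assuming fixed-size chunks and SHA-256 digests; the real CRM defines its own chunking and orchestration, so everything below is an assumption for illustration. A node hashes the chunks it already holds, and only the chunks of the new image that are missing locally need to be transferred.

```python
# Chunk-level dedup sketch: hash fixed-size chunks and download only
# the ones absent from the node's local store.

import hashlib

CHUNK = 4096  # hypothetical fixed chunk size in bytes

def chunk_hashes(data: bytes) -> list[str]:
    """Split a blob into chunks and hash each one."""
    return [hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)]

def plan_transfer(new_image: bytes, local_chunks: set[str]) -> list[int]:
    """Indices of chunks the node must actually download."""
    return [i for i, h in enumerate(chunk_hashes(new_image))
            if h not in local_chunks]

old = bytes(64 * 1024)                          # data already on the node
new = old[:48 * 1024] + b"\x01" * (16 * 1024)   # updated image content
missing = plan_transfer(new, set(chunk_hashes(old)))
print(f"transfer {len(missing)} of {len(new) // CHUNK} chunks")
```

In this toy run, only the 4 changed chunks out of 16 are transferred, which is the effect the abstract reports at the scale of real container updates.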
  5. Supporting smooth movement of mobile clients is important when offloading services on an edge computing platform. Interruption-free client mobility demands seamless migration of the offloading service to nearby edge servers. However, fast migration of offloading services across edge servers in a WAN environment poses significant challenges to the handoff service design. In this paper, we present a novel service handoff system which seamlessly migrates offloading services to the nearest edge server while the mobile client is moving. Service handoff is achieved via container migration. We identify an important performance problem during Docker container migration. Based on our systematic study of container layer management and image stacking, we propose a migration method which leverages the layered storage system to reduce file-system synchronization overhead, without depending on a distributed file system. We implement a prototype system and conduct experiments using real-world product applications. Evaluation results reveal that, compared to state-of-the-art service handoff systems designed for edge computing platforms, our system reduces the total service handoff time by 80% at 5 Mbps network bandwidth (56% at 20 Mbps).
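The intuition behind leveraging layered storage can be sketched briefly. This Python fragment uses made-up layer IDs and is not the authors' migration protocol: if the destination edge server already holds the image's read-only layers, only the container's thin writable layer needs to move during handoff.

```python
# Layered-migration sketch: ship only the layers the destination lacks.

def layers_to_ship(container_layers: list[str],
                   dest_has: set[str]) -> list[str]:
    """Return only the layers missing at the destination server."""
    return [layer for layer in container_layers if layer not in dest_has]

stack = ["base-os", "runtime", "app", "writable-top"]  # bottom to top
dest = {"base-os", "runtime", "app"}   # read-only image layers pre-pulled

print(layers_to_ship(stack, dest))     # ['writable-top']
```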