The wide adoption of Docker containers for supporting agile and elastic enterprise applications has led to a broad proliferation of container images. The associated storage performance and capacity requirements place a high pressure on the infrastructure ofcontainer registriesthat store and distribute images andcontainer storage systemson the Docker client side that manage image layers and store ephemeral data generated at container runtime. The storage demand is worsened by the large amount of duplicate data in images. Moreover, container storage systems that use Copy-on-Write (CoW) file systems as storage drivers exacerbate the redundancy. Exploiting the high file redundancy in real-world images is a promising approach to drastically reduce the growing storage requirements of container registries and improve the space efficiency of container storage systems. However, existing deduplication techniques significantly degrade the performance of both registries and container storage systems because of data reconstruction overhead as well as the deduplication cost. We propose DupHunter, an end-to-end deduplication scheme that deduplicates layers for both Docker registries and container storage systems while maintaining a high image distribution speed and container I/O performance. DupHunter is divided into three tiers: registry tier, middle tier, and client tier. Specifically, we first build a high-performance deduplication engine at the registry tier that not only natively deduplicates layers for space savings but also reduces layer restore overhead. Then, we use deduplication offloading at the middle tier to eliminate the redundant files from the client tier and avoid bringing deduplication overhead to the clients. To further reduce the data duplicates caused by CoWs and improve the container I/O performance, we utilize a container-aware storage system at the client tier that reserves space for each container and arranges the placement of files and their modifications on the disk to preserve locality. Under real workloads, DupHunter reduces storage space by up to 6.9× and reduces theGETlayer latency up to 2.8× compared to the state-of-the-art. Moreover, DupHunter can improve the container I/O performance by up to 93% for reads and 64% for writes.
more »
« less
Solving the Container Explosion Problem for Distributed High Throughput Computing
Container technologies are seeing wider use at advanced computing facilities for managing highly complex applications that must execute at multiple sites. However, in a distributed high throughput computing setting, the unrestricted use of containers can result in the container explosion problem.If a new container image is generated for each variation of a job dispatched to a site, shared storage is soon exceeded. On the other hand, if a single large container image is used to meet multiple needs, the size of that container may become a problem for storage and transport. To address this problem, we observe that many containers have an internal structure generated by a structured package manager, and this information could be used to strategically combine and share container images. We develop LANDLORD to exploit this property and evaluate its performance through a combination of simulation studies and empirical measurement of high energy physics applications.
more »
« less
- Award ID(s):
- 1642409
- PAR ID:
- 10210593
- Date Published:
- Journal Name:
- International Parallel and Distributed Processing Symposium
- Page Range / eLocation ID:
- 388 to 398
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Scalability and flexibility of modern cloud application can be mainly attributed to virtual machines (VMs) and containers, where virtual machines are isolated operating systems that run on a hypervisor while containers are lightweight isolated processes that share the Host OS kernel. To achieve the scalability and flexibility required for modern cloud applications, each bare-metal server in the data center often houses multiple virtual machines, each of which runs multiple containers and multiple containerized applications that often share the same set of libraries and code, often referred to as images. However, while container frameworks are optimized for sharing images within a single VM, sharing images across multiple VMs, even if the VMs are within the same bare-metal server, is nearly non-existent due to the nature of VM isolation, leading to repetitive downloads, causing redundant added network traffic and latency. This work aims to resolve this problem by utilizing SmartNICs, which are specialized network hardware that provide hardware acceleration and offload capabilities for networking tasks, to optimize image retrieval and sharing between containers across multiple VMs on the same server. The method proposed in this work shows promise in cutting down container cold start time by up to 92%, reducing network traffic by 99.9%. Furthermore, the result is even more promising as the performance benefit is directly proportional to the number of VMs in a server that concurrently seek the same image, which guarantees increased efficiency as bare metal machine specifications improve.more » « less
-
Application containers, such as those provided by Docker, have recently gained popularity as a solution for agile and seamless software deployment. These light-weight virtualization environments run applications that are packed together with their resources and configuration information, and thus can be deployed across various software platforms. Unfortunately, the ease with which containers can be created is oftentimes a double-edged sword, encouraging the packaging of logically distinct applications, and the inclusion of significant amount of unnecessary components, within a single container. These practices needlessly increase the container size - sometimes by orders of magnitude. They also decrease the overall security, as each included component - necessary or not - may bring in security issues of its own, and there is no isolation between multiple applications packaged within the same container image. We propose algorithms and a tool called Cimplifier, which address these concerns: given a container and simple user-defined constraints, our tool partitions it into simpler containers, which (i) are isolated from each other, only communicating as necessary, and (ii) only include enough resources to perform their functionality. Our evaluation on real-world containers demonstrates that Cimplifier preserves the original functionality, leads to reduction in image size of up to 95%, and processes even large containers in under thirty seconds.more » « less
-
null (Ed.)Large-scale, high-throughput computational science faces an accelerating convergence of software and hardware. Software container-based solutions have become common in cloud-based datacenter environments, and are considered promising tools for addressing heterogeneity and portability concerns. However, container solutions reflect a set of assumptions which complicate their adoption by developers and users of scientific workflow applications. Nor are containers a universal solution for deployment in high-performance computing (HPC) environments which have specialized and vertically integrated scheduling and runtime software stacks. In this paper, we present a container design and deployment approach which uses modular layering to ease the deployment of containers into existing HPC environments. This layered approach allows operating system integrations, support for different communication and performance monitoring libraries, and application code to be defined and interchanged in isolation. We describe in this paper the details of our approach, including specifics about container deployment and orchestration for different HPC scheduling systems. We also describe how this layering method can be used to build containers for two separate applications, each deployed on clusters with different batch schedulers, MPI networking support, and performance monitoring requirements. Our experience indicates that the layered approach is a viable strategy for building applications intended to provide similar behavior across widely varying deployment targets.more » « less
-
null (Ed.)This paper focuses on the need for emerging domains such as serverless and in-network computing, where applications are often hosted on virtualized compute instances (e.g., containers and unikernels), to have applications startup as quickly as possible. We provide a qualitative and quantitative analysis of containers and unikernels with regard to the startup time. We analyze these in-depth and identify the key components and their impact under scale on the startup latency. We study how startup time scales as we launch multiple instances concurrently. We study the contribution of popular Container Networking Interfaces (CNIs), to the startup time.more » « less
An official website of the United States government

