skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Insight from a Containerized Kubernetes Workload Introspection
Developments in virtual containers, especially in the cloud infrastructure, have led to diversification of jobs that containers are being used to support, particularly in the big data and machine learning spaces. The diversification has been powered by the adoption of orchestration systems that marshal fleets of containers to accomplish complex programming tasks. The additional components in the vertical technology stack, plus the continued horizontal scaling have led to questions regarding how to forensically analyze complicated technology stacks. This paper proposed a solution through the use of introspection. An exploratory case study has been conducted on a bare-metal cloud that utilizes Kubernetes, the introspection tool Prometheus, and Apache Spark. The contribution of this research is two-fold. First, it provides empirical support that introspection tools can acquire forensically viable data from different levels of a technology stack. Second, it provides the ground work for comparisons between different virtual container platforms.  more » « less
Award ID(s):
1726069
PAR ID:
10210489
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the 54th Hawaii International Conference on System Sciences
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Large-scale adoption of virtual containers has stimulated concerns by practitioners and academics about the viability of data acquisition and reliability due to the decreasing window to gather relevant data points. These concerns prompted the idea that introspection tools, which are able to acquire data from a system as it is running, can be utilized as both an early warning system to protect that system and as a data capture system that collects data that would be valuable from a digital forensic perspective. An exploratory case study was conducted utilizing a Docker engine and Prometheus as the introspection tool. The research contribution of this research is two-fold. First, it provides empirical support for the idea that introspection tools can be utilized to ascertain differences between pristine and infected containers. Second, it provides the ground work for future research conducting an analysis of large-scale containerized applications in a virtual cloud. 
    more » « less
  2. The HPC community is actively researching and evaluating tools to support execution of scientific applications in cloud-based environ- ments. Among the various technologies, containers have recently gained importance as they have significantly better performance compared to full-scale virtualization, support for microservices and DevOps, and work seamlessly with workflow and orchestration tools. Docker is currently the leader in containerization technology because it offers low overhead, flexibility, portability of applications, and reproducibility. Singularity is another container solution that is of interest as it is designed specifically for scientific applications. It is important to conduct performance and feature analysis of the container technologies to understand their applicability for each application and target execution environment. This paper presents a (1) performance evaluation of Docker and Singularity on bare metal nodes in the Chameleon cloud (2) mecha- nism by which Docker containers can be mapped with InfiniBand hardware with RDMA communication and (3) analysis of mapping elements of parallel workloads to the containers for optimal re- source management with container-ready orchestration tools. Our experiments are targeted toward application developers so that they can make informed decisions on choosing the container tech- nologies and approaches that are suitable for their HPC workloads on cloud infrastructure. Our performance analysis shows that sci- entific workloads for both Docker and Singularity based containers can achieve near-native performance. Singularity is designed specifically for HPC workloads. However, Docker still has advantages over Singularity for use in clouds as it provides overlay networking and an intuitive way to run MPI applications with one container per rank for fine-grained resources allocation. Both Docker and Singularity make it possible to directly use the underlying network fabric from the containers for coarse- grained resource allocation. 
    more » « less
  3. In cloud-native environments, containers are often deployed within lightweight virtual machines (VMs) to ensure strong security isolation and privacy protection. With the growing demand for customized cloud services, third-party vendors are turning to infrastructure-as-a-service (IaaS) cloud providers to build their own cloud-native platforms, necessitating the need to run a VM or a guest that hosts containers inside another VM instance leased from an IaaS cloud. State-of-the-art nested virtualization in the x86 architecture relies heavily on the host hypervisor to expose hardware virtualization support to the guest hypervisor, not only complicating cloud management but also raising concerns about an increased attack surface at the host hypervisor. This paper presents the design and implementation of PVM, a high-performance guest hypervisor for KVM that is transparent to the host hypervisor and assumes no hardware virtualization support. PVM leverages two key designs: 1) a minimal shared memory region between the guest and guest hypervisor to facilitate state transition between different privilege levels and 2) an efficient shadow page table design to reduce the cost of memory virtualization. PVM has been adopted by a major IaaS cloud provider for hosting tens of thousands of secure containers on a daily basis. Our experiments demonstrate that PVM significantly outperforms current nested virtualization in KVM for memory virtualization, particularly for concurrent workloads, while maintaining comparable performance in CPU and I/O virtualization. 
    more » « less
  4. In a new effort to make our research transparent and reproducible by others, we developed a workflow to run and share computational studies on the public cloud Microsoft Azure. It uses Docker containers to create an image of the application software stack. We also adopt several tools that facilitate creating and managing virtual machines on compute nodes and submitting jobs to these nodes. The configuration files for these tools are part of an expanded "reproducibility package" that includes workflow definitions for cloud computing, input files and instructions. This facilitates re-creating the cloud environment to re-run the computations under identical conditions. We also show that cloud offerings are now adequate to complete computational fluid dynamics studies with in-house research software that uses parallel computing with GPUs. We share with readers what we have learned from nearly two years of using Azure cloud to enhance transparency and reproducibility in our computational simulations. 
    more » « less
  5. This paper describes the deployment of a private cloud and the development of virtual laboratories and companion material to teach and train engineering students and Information Technology (IT) professionals in high-throughput networks and cybersecurity. The material and platform, deployed at the University of South Carolina, are also used by other institutions to support regular academic courses, self-pace training of professional IT staff, and workshops across the country. The private cloud is used to deploy scenarios consisting of high-speed networks (up to 50 Gbps), multi-domain environments emulating internetworks, and infrastructures under cyber-attacks using live traffic. For regular academic courses, the virtual laboratories have been adopted by institutions in different states to supplement theoretical material with hands-on activities in IT, electrical engineering, and computer science programs. Topics include Local Area Networks (LANs), congestion-control algorithms, performance tools used to emulate wide area networks (WANs) and their attributes (packet loss, reordering, corruption, latency, jitter, etc.), data transfer applications for high-speed networks, queueing delay and buffer size in routers and switches, active monitoring of multi-domain systems, high-performance cybersecurity tools such as Zeek’s intrusion detection systems, and others. The training platform has been also used by IT professionals from more than 30 states, for self-pace training. The material provides training on topics beyond general-purpose network, which are usually overlooked by practitioners and researchers. The virtual laboratories and companion material have also been used in workshops organized across the country. Workshops are co-organized with organizations that operate large backbone networks connecting research centers and national laboratories, and colleges and universities conducting teaching and research activities. 
    more » « less