<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Querying Container Provenance</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>04/30/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10416713</idno>
					<idno type="doi">10.1145/3543873.3587568</idno>
					<title level='j'>WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023</title>
					<author>Aniket Modi</author><author>Moaz Reyad</author><author>Tanu Malik</author><author>Ashish Gehani</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Containers are lightweight mechanisms for the isolation of operating system resources. They are realized by activating a set of namespaces. Since use of containers is rising in scientific computing, tracking and managing provenance within and across containers is required for debugging and reproducibility. In this work, we examine the properties of container provenance graphs that result from auditing containerized scientific computing experiments. We observe that the generated container provenance graphs are hypergraphs because one resource may belong to one or more namespaces. We examine the behavior of three namespaces, namely the PID, mount, and user namespaces, that are prominently used in scientific computing and show that operations over namespaces do not result in cycles in the resulting container provenance graphs. Thus we can identify container boundaries, distinguish container processes from host processes, and answer conjunctive lineage queries in polynomial time. We experiment with complex lineage queries on container provenance graphs and show the hypergraph formulation helps us answer those queries more efficiently than a non-hypergraph formulation.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">INTRODUCTION</head><p>The conduct of reproducible science improves when computations are both portable and evaluable. A container provides an isolated environment for running computations and is thus useful for porting applications to new machines. Managing an array of virtualized containers is becoming increasingly typical for data- and code-sharing platforms such as Binder <ref type="bibr">[2]</ref>, Hydroshare <ref type="bibr">[4]</ref>, and WholeTale <ref type="bibr">[8]</ref>, which enable users to port applications and execute them repeatedly on the platform.</p><p>Despite isolation, applications may fail to reproduce, especially as containerized applications are run repeatedly with different input datasets and parameters <ref type="bibr">[19]</ref>. Since application evaluation for reproducibility may happen at different points in time, it is essential to track the provenance of applications within containers to provide insights and comprehend the causes of failure <ref type="bibr">[13,</ref><ref type="bibr">23]</ref>. Tracking the provenance of containerized applications, however, raises some unique research challenges. Containers are ephemeral, with a limited lifetime <ref type="bibr">[15]</ref>. Once an execution completes, the container runtime frees up its resources. This necessitates that provenance records be archived on persistent storage so that we can reuse them during assessment and subsequent evaluations.</p><p>One possible design policy is to securely share these records with the shared-host substrate, which provides a centralized platform and is aware of the array of containers running on it. Consider a shared substrate that stores the system-level provenance graph of an application run at time t and then subsequently at time t′ (Figure <ref type="figure">1</ref>).
Resolving cross-container provenance records is challenging, as the same physical resource may appear differently within isolated contexts and at different points in time. As shown in Figure <ref type="figure">1</ref>, the same file at path /home/work/dataset/Jan.hdf5 is visible as /tmp/dataset/Jan.hdf5 the first time but is mounted as /dataset/Jan.hdf5 the next time. An alternative approach is for the shared substrate to be container-aware and collect records so that only the host's view (the top view in Figure <ref type="figure">1</ref>) is persisted. However, users of containerized applications are not aware of the resource specification from the host's view, which in the case of Figure <ref type="figure">1</ref> is the path /home/work/dataset/Jan.hdf5. Consequently, tracking records from both the host substrate and the container-specific execution becomes necessary. This also necessitates that the host substrate effectively maintain the mapping (grey lines) between the host view and the isolated contexts.</p><p>In this paper, we consider the issues in maintaining this mapping of cross-container records at the shared substrate for container provenance analysis. Container-awareness results in processes mapping to different isolated contexts, such as P1 mapping to P′1 and P′′1.
In addition, processes such as P2 and P3 in Figure <ref type="figure">1</ref> may unshare at a later point in time t′′ and form a new isolated context, but continue to read the file from the same path. Thus the resulting container provenance graph not only has pairwise edges between files and processes in the same isolated context, but must also maintain non-pairwise relationships between files and the process identifiers in different isolated contexts. We show that such higher-order relationships are easily modeled as hypergraphs at the shared substrate. Hypergraphs generalize graphs by allowing a single edge to connect more than two nodes. We consider different lineage queries on container hypergraphs. We show that despite being namespace-aware, the resulting container hypergraph is acyclic, and therefore conjunctive lineage querying on namespace-aware container records terminates in polynomial time. Our experiments show that our hypergraph formulation applies to records collected from Docker containers and related benchmarks, and that our queries terminate in reasonable time.</p><p>The rest of this paper is structured as follows. We provide a basic overview of namespaces, containers, and provenance tracking in containerized hosts in Section 2. We then show how resources across namespaces map to nodes in a hypergraph in Section 3. We formulate a directed hypergraph and show that forward lineage paths are acyclic in Section 3.1. Section 4 describes an efficient implementation of hypergraphs with experiments. We discuss related work in Section 5 and conclude in Section 6.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">BACKGROUND</head><p>We provide basic background information on Linux containers and namespaces, and their provenance tracking.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Namespaces</head><p>An operating system namespace provides a set of processes the illusion that they have complete control of a resource. The kernel ensures that different instances of the same namespace are isolated, allowing a global resource to be shared without any changes to the application's interfaces to the system. The Linux kernel wraps various global system resources such as PIDs, hostnames, mount points, user identifiers, time, network devices and ports, interprocess communication, and resource accounting information into namespaces. Each namespace provides an isolated view of a particular global resource to the set of processes that are members of that namespace. Figure <ref type="figure">2</ref> shows an example of the mount namespace. On a Linux operating system that has just been booted, every process runs in an initial mount namespace, accesses the same set of mount points, and has the same view of the filesystem. Once a new mount namespace is created, the processes inside it can mount and alter the filesystems on its mount points without affecting the filesystems in other mount namespaces. One of the significant uses of namespaces is to support the implementation of containers, a tool for lightweight virtualization. Within containers, our examples focus on PID and mount-point resources, since data flow tracking heavily relies on these resources, but our approach of modeling provenance graphs over namespaces applies to all kinds of system resources.</p></div>
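To make the mount-namespace example concrete, the following is a minimal illustrative model, not kernel code: the namespace names and mount tables are hypothetical, and only the file paths follow Figure 1. It shows how the same host file can appear under different paths in different mount namespaces.

```python
# Illustrative model of per-mount-namespace views (names are hypothetical).
# Each mount namespace has its own table: container mount point -> host directory.
mount_tables = {
    "mnt_ns_A": {"/tmp/dataset": "/home/work/dataset"},
    "mnt_ns_B": {"/dataset": "/home/work/dataset"},
}

def to_host_path(ns: str, path: str) -> str:
    """Resolve a path seen inside a mount namespace to the host's path."""
    for mnt, host_dir in mount_tables[ns].items():
        if path.startswith(mnt + "/"):
            # Rewrite the container-visible prefix with the host directory.
            return host_dir + path[len(mnt):]
    return path  # not under any translated mount point
```

Both container views resolve to the single host path /home/work/dataset/Jan.hdf5, which is exactly the mapping a shared substrate must maintain.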
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Containers</head><p>Linux containers may be viewed as a set of running processes that collectively share common namespaces and system setup. In practice, containers are usually created by a container engine using its container runtime. The container runtime specifies the namespaces to be shared among processes running inside the container. There are several runtimes, such as LXC <ref type="bibr">[6]</ref>, rkt <ref type="bibr">[7]</ref>, Mesos <ref type="bibr">[1]</ref>, Docker <ref type="bibr">[3]</ref>, Singularity <ref type="bibr">[17]</ref>, and Charliecloud <ref type="bibr">[21]</ref>. These runtimes differ in their application programming interfaces (APIs) and in how they manage the creation, destruction, and persistence of namespaces. Our treatment of provenance tracking is at the system level; thus, while we respect the container boundary that all engines recognize, our formalism is independent of the APIs used by any specific runtime.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Namespace and Container-awareness in Provenance Systems</head><p>Figure <ref type="figure">3</ref> shows a provenance graph of a containerized application running on a host system that is also executing the same application.</p><p>The graph is obtained from provenance systems that track data flows at the operating system level <ref type="bibr">[14,</ref><ref type="bibr">20]</ref>. We particularly note that Linux auditing mechanisms such as Linux Audit, Sysdig, and LTTng do not automatically generate such sound provenance graphs. Current provenance tracking systems rely on a combination of host-container mapping views and namespace-labeling approaches that disambiguate and map virtual nodes to host nodes on the provenance graph in order to generate sound provenance graphs. This soundness property is demonstrated in Figure <ref type="figure">3</ref>, which shows, for a process, its real identifier in the host namespace and its virtual identifiers in the containerized namespace. Similarly, the virtualized file path is different from the real file path even though the underlying inode is the same. From a querying perspective, however, the representation of namespace information within the audited provenance graph is sub-optimal. Consider the process id 3030, which is mentioned in namespace 4026532270 but is truly in namespace 4026531836. Thus queries such as "what are the processes running in namespace 4026531836?" will not return accurate results. While one may argue that all processes are in the host namespace and that such a query is easily modeled by querying all process nodes, the argument does not hold in the case of nested containers or shared mount trees, which are often created for performance purposes. Thus determining all the processes in a namespace will be incorrect, and so will be any query that determines the forward or backward lineage path on which such nodes lie. Similarly, the provenance semantics that distinct file paths in the graph may refer to the same physical file are not captured.</p></div>
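The host-container mapping just described can be sketched as a small lookup table. In the sketch below, only the namespace identifiers 4026531836 and 4026532270 come from the text; the container's host pid and the helper name are hypothetical.

```python
# Hedged sketch of the mapping a namespace-aware tracker maintains:
# (pid namespace id, virtual pid) -> real host pid.
HOST_PID_NS = 4026531836       # host PID namespace (from the text)
CONTAINER_PID_NS = 4026532270  # container PID namespace (from the text)

pid_map = {
    (HOST_PID_NS, 3030): 3030,    # host process: virtual id == real id
    (CONTAINER_PID_NS, 1): 3050,  # container init seen as pid 1 (host pid hypothetical)
}

def processes_in(ns: int):
    """Host pids of processes whose identifier is defined in `ns` --
    the query a flat node-property encoding answers incorrectly."""
    return sorted(real for (n, _), real in pid_map.items() if n == ns)
```

With the mapping made explicit, "what are the processes running in namespace 4026531836?" returns only the host process, rather than every process node.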
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">QUERYING CONTAINER PROVENANCE</head><p>Sound provenance records collected by provenance tracking systems are typically maintained at the host substrate. These records include pairwise edges between processes and files, but maintain namespace relationships as properties of the nodes and not as graph relationships. Consider a provenance query on container graphs such as: find which process identifiers wrote to a file visible across namespaces 1-3. This query will return all process identifiers across all namespaces, since identifiers in different namespaces are not separated. Figure <ref type="figure">4</ref>(a) distinguishes the returned query result from the expected one in Figure <ref type="figure">4</ref>(b). Similarly, consider another provenance query on container graphs such as: find all resources that were derived from each other in namespaces 1, 2, and 3. Figure <ref type="figure">4(c)</ref> shows the directed subgraph that is obtained as a result of this query. However, the directed graph represents derivation across all namespaces. It does not reflect the grouped derivation between individual namespaces as shown in Figure <ref type="figure">4(d)</ref>.</p><p>We observe that to answer the above queries correctly, simple graph edges do not capture the higher-order relation that connects these multiple objects; it is more appropriately captured by a hypergraph, a generalized graph data structure in which an edge can connect any number of vertices. In general, a hypergraph is a pair H = (V, E) consisting of a finite set V and a set E of non-empty subsets of V. The elements of V are called vertices and those of E are called hyperedges.
While a regular graph edge is a pair of nodes, a hyperedge e ∈ E connects a set of vertices {v} ⊆ V.</p><p>A primary concern with graph-based querying is ensuring that the underlying graphs are acyclic, so that conjunctive queries do not take exponential time. In non-container system graphs, this is obtained via versioning of process and file nodes: every write to a file after a close creates a new file version, and every read by a process creates a new version of the process node. With the process and file nodes that arise due to namespaces included as explicit nodes in the graph, we must ensure that the resulting graph remains acyclic. In the following subsection we define a path in a directed container hypergraph, and show that such a path can never be cyclic, based on the semantics of namespace system calls.</p></div>
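As an illustration of this definition, the sketch below models one physical file as a hyperedge grouping its identities across namespaces. The namespace labels and process identifier are hypothetical; only the file paths follow Figure 1.

```python
# Minimal hypergraph sketch: vertices are (namespace, resource) pairs,
# and a hyperedge collects every appearance of one physical resource.
hyperedges = {
    "file:Jan.hdf5": {("ns1", "/tmp/dataset/Jan.hdf5"),
                      ("ns2", "/dataset/Jan.hdf5"),
                      ("host", "/home/work/dataset/Jan.hdf5")},
}

# Pairwise write edges recorded inside each namespace: process -> file vertex.
writes = [(("ns1", "pid 7"), ("ns1", "/tmp/dataset/Jan.hdf5"))]

def writers_of(hyperedge: str):
    """Processes that wrote any identity of the physical file, reported
    together with the namespace in which they acted."""
    file_vertices = hyperedges[hyperedge]
    return [proc for proc, f in writes if f in file_vertices]
```

Because each result keeps its (namespace, pid) pairing, the answer can be grouped per namespace in the spirit of Figure 4(b), rather than lumping identifiers from all namespaces together.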
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Acyclicity in container provenance</head><p>We show that for the different namespaces such path cycles do not exist.</p><p>• PID namespace. Cycles do not occur in PID namespaces because, while processes may freely descend into child PID namespaces (e.g., using setns(2) with a PID namespace file descriptor), they may not move in the other direction. That is to say, processes may not enter any ancestor namespaces (parent, grandparent, etc.). Changing PID namespaces is a one-way operation. This remains true irrespective of the type of namespace call, such as clone, unshare, or setns. Thus a process's PID namespace membership is determined when the process is created and cannot be changed thereafter. This means that the parental relationship between processes mirrors the parental relationship between PID namespaces: the parent of a process is either in the same namespace or resides in the immediate parent PID namespace. • Mount namespace. Mount namespaces are not nested, and yet cycles do not occur, because the use of system calls such as chroot and pivot_root leads to the unmounting of the host filesystem, making it impossible to access any file within it from a child namespace.</p><p>This acyclicity holds irrespective of the mount flags used during propagation of mount points. • User, network, and UTS namespaces. These namespaces do not create cycles because they create a one-to-one mapping between resources in the parent and child namespaces. For example, cycles do not occur in the user namespace since uid and gid mappings are only set in the parent namespace for the child namespace. While the same user can be mapped to different identifiers in child namespaces, the mapping only leads to a hierarchical structure and thus avoids cycles.</p></div>
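Since the argument above guarantees acyclicity, a lineage traversal can verify it cheaply as a safety net. The sketch below uses a standard DFS back-edge check over an illustrative derivation graph; the node names are hypothetical.

```python
from collections import defaultdict

# Illustrative derivation edges: u -> v means "v was derived from u".
edges = defaultdict(list)
for u, v in [("file:in", "proc:P1"), ("proc:P1", "file:out"),
             ("file:out", "proc:P2")]:
    edges[u].append(v)

def is_acyclic(edges) -> bool:
    """DFS three-color cycle check: a GREY->GREY edge is a back edge."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)  # defaults to WHITE
    def dfs(u):
        color[u] = GREY
        for v in edges[u]:
            if color[v] == GREY:                     # back edge => cycle
                return False
            if color[v] == WHITE and not dfs(v):
                return False
        color[u] = BLACK
        return True
    return all(color[u] != WHITE or dfs(u) for u in list(edges))
```

On an acyclic graph such a traversal visits each node and edge once, which is what keeps conjunctive lineage queries polynomial.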
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">HYPERGRAPH IMPLEMENTATION AND EXPERIMENTS</head><p>We have mimicked the hyperedge structure using n-ary relationships in a Postgres database. Our basic objective was to identify hypergraph structure in available container provenance graphs. We store the incidence matrix of the hypergraph, which records the vertices that each hyperedge contains (rows correspond to vertices, columns correspond to hyperedges, and a nonzero entry (i, j) designates that hyperedge j contains vertex i). The incidence matrix allows us to quickly determine if two processes are in the same namespace. We used three container provenance graphs that were generated in <ref type="bibr">[9]</ref> on Docker benchmarks and Kubernetes CVEs. Table <ref type="table">1</ref> shows basic details about the container provenance graphs. In #processes and #files, the number outside the brackets is the total number, including all versions of all files/processes, and the number in brackets is the count ignoring versions. Table <ref type="table">2</ref> shows our results. The analysis ignores file versioning; if a file was introduced in multiple namespaces in a later version and pathnames do not exist for that version, it is not counted.</p></div>
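The same-namespace check described above can be sketched with a sparse form of the incidence matrix, storing for each vertex the set of hyperedges (here, namespaces) it is incident to. All identifiers below are illustrative, reusing the namespace ids mentioned earlier in the paper.

```python
# Sparse incidence structure: vertex -> set of incident hyperedges.
# A nonzero (i, j) in the incidence matrix becomes membership of
# hyperedge j in the set stored for vertex i.
incidence = {
    "proc:3030": {"ns:4026531836"},
    "proc:3041": {"ns:4026531836", "ns:4026532270"},  # pid visible in both
    "proc:3042": {"ns:4026532270"},
}

def share_namespace(p: str, q: str) -> bool:
    """True iff the two processes have at least one namespace in common."""
    return bool(incidence[p] & incidence[q])
```

The set intersection makes the check proportional to the number of namespaces a process belongs to, which is small in practice, rather than to the size of the graph.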
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">RELATED WORK</head><p>Containers are implemented with Linux namespaces <ref type="bibr">[16]</ref>. Both containers and namespaces create challenges for provenance collection. Clarion <ref type="bibr">[12]</ref> solves the provenance clarity and soundness challenges that exist in the Linux Audit framework <ref type="bibr">[5]</ref>. Tracing the execution provenance of containers has become an interesting problem in the security domain; there are systems that use provenance to solve security challenges such as container escape detection <ref type="bibr">[9]</ref>. PROV <ref type="bibr">[18]</ref> defines a provenance model and its serializations. The PROV data model (PROV-DM) does not define the concept of a container, or even a more generic concept like a context, that could be used to model containers. The closest concept is the collection: if we model computer resources (e.g., files, processes, users) as entities, we can add them to named collections through the hadMember relation. This is a very simple representation of a namespace as a collection whose members are the resources that belong to it. A more advanced form of context would have to be added to PROV in order to model containers. PROV can be serialized with RDF or OWL, both of which lack support for contexts. For RDF, PaCE <ref type="bibr">[22]</ref> aims to add context to provenance as a special entity. For OWL, C-OWL <ref type="bibr">[11]</ref> defines a context by its local contents, which are not shared; this is similar to namespaces' local IDs of resources. Hypergraphs <ref type="bibr">[10]</ref> model various types of objects and the relations between them. The RDF data model can be represented natively as a hypergraph in System Π <ref type="bibr">[24]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">CONCLUSIONS</head><p>The increasing interest in containers and their wide usage in numerous applications inspired a careful study of their provenance. We presented the problem of querying provenance hypergraphs in containerized applications. We formalized the definition of hypergraphs and identified hypernodes and hyperedges in real-world datasets. In the future, we plan to efficiently query large provenance hypergraphs.</p></div></body>
		</text>
</TEI>
