<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Prefetching and caching for minimizing service costs: Optimal and approximation strategies</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>01/01/2021</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10229901</idno>
					<idno type="doi">10.1016/j.peva.2020.102149</idno>
					<title level='j'>Performance Evaluation</title>
<idno>0166-5316</idno>
<biblScope unit="volume">145</biblScope>
<biblScope unit="issue">C</biblScope>					

					<author>Guocong Quan</author><author>Atilla Eryilmaz</author><author>Jian Tan</author><author>Ness Shroff</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Strategically prefetching data has been utilized in practice to improve caching performance. Apart from caching data items upon requests, they can be prefetched into the cache before requests actually occur. The caching and prefetching operations compete for the limited cache space, whose size is typically much smaller than the number of data items. A key question is how to design an optimal prefetching and caching policy, assuming that the future requests can be predicted to certain extend. This question is non-trivial even under an idealized assumption that the future requests are precisely known.To investigate this problem, we propose a cost-based service model. The objective is to find the optimal offline prefetching and caching policy that minimizes the accumulated cost for a given request sequence. By casting it as a min-cost flow problem, we are able to find the optimal policy for a data trace of length N in expected time O(N 3/2 ) via flow-based algorithms. However, this requires the entire trace for each request and cannot be applied in real time. To this end, we analytically characterize the optimal policy by obtaining an optimal cache eviction mechanism. We derive conditions under which proactive prefetching is a better choice than passive caching. Based on these insights, we propose a lightweight approximation policy that only exploits predictions in the near future. Moreover, the approximation policy can be applied in real time and processes the entire trace in O(N) expected time. We prove that the competitive ratio of the approximation policy is less than √ 2. Extensive simulations verify its near-optimal performance, for both heavy and light-tailed popularity distributions.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Proactively prefetching data items instead of passively caching them has been utilized in practice to accelerate data access, e.g, for content data networks <ref type="bibr">[1,</ref><ref type="bibr">2]</ref>. This strategy becomes even more appealing given that the advances in learning techniques provide effective tools to predict various data request patterns <ref type="bibr">[3,</ref><ref type="bibr">4,</ref><ref type="bibr">5,</ref><ref type="bibr">6]</ref>. For certain applications, the prediction can be reasonably accurate <ref type="bibr">[7,</ref><ref type="bibr">8]</ref>. To design an optimal strategy that combines prefetching and caching demands careful investigation. Proactively prefetching data brings the data into the cache before the actual requests occur. Passively caching data, on the other hand, only fetches the missed data from the backend storage after the requests arrive.</p><p>There is a trade-off between prefetching and caching. Due to competing the limited cache space, loading a prefetched data item into the cache typically has to trigger cache evictions, which may potentially introduce more cache misses for future requests. Although great efforts have been put to approximate the short-term and long-term data statistics to prefetch the most popular data <ref type="bibr">[9,</ref><ref type="bibr">10,</ref><ref type="bibr">11,</ref><ref type="bibr">12,</ref><ref type="bibr">13,</ref><ref type="bibr">14]</ref>, a fundamental question remains to be answered: even with a perfect knowledge of future requests, how to optimally prefetch data items beforehand instead of caching them upon requests?</p><p>Knowing the entire request sequence defines an offline algorithm, which nevertheless can help an online deployment. Using predictions has successfully made optimal offline policies practical in real applications <ref type="bibr">[4,</ref><ref type="bibr">5,</ref><ref type="bibr">6]</ref>. Theoretically, the optimal offline policy provides an effective performance bound for online cases <ref type="bibr">[15]</ref>. It can also be used to guide the online design. For example, by leveraging machine learning, an optimal offline policy can be used to train an online decision model using history information <ref type="bibr">[16]</ref>.</p><p>Previous studies on optimal offline prefetching and caching policies mainly focus on file systems <ref type="bibr">[17,</ref><ref type="bibr">18,</ref><ref type="bibr">19,</ref><ref type="bibr">20,</ref><ref type="bibr">21,</ref><ref type="bibr">22]</ref>, where prefetching and caching are not clearly distinguished. However, in many other important scenarios, e.g., for CDNs, prefetching and caching have significant differences. First, prefetching and caching can design separate cache update rules. For prefetching, the prefetched data remain in the cache until the future requests arrive. For caching, when a miss occurs, the requested data item is directly fetched from the backend, but whether to put it into the cache or not is up to the cache policy <ref type="bibr">[15]</ref>. Second, prefetching incurs lower costs than caching. If fetching a missed data item is scheduled after the arrival of a request, it must be performed urgently to satisfy delay requirements. Prefetching, on the other hand, is performed beforehand and is not time-sensitive. It can be performed at a lower rate and avoid congestions, which allows more flexibility for scheduling. For example, data can be prefetched at off-peak times to reduce costs <ref type="bibr">[23,</ref><ref type="bibr">24]</ref>. Third, prefetching can achieve a lower service delay by loading the data into the cache in advance. Therefore, the existing analysis on prefetching and caching for file systems does not directly apply in the above-mentioned scenarios.</p><p>To address this issue, we propose a cost-based service model to jointly optimize prefetching and caching, by assuming that prefetching incurs a lower cost than caching, due to less I/O consumption, more flexibility in scheduling and lower service delays. If the requested data is not cached, it can be served by fetching the data from the backend data storage after the request arrives, by paying a fetching cost. And the fetched data can be either loaded into the cache, or discarded to save space. Based on the predictions, we can prefetch the data items before they are requested. The prefetched data items need to remain in the cache until the requests arrive. Otherwise, they should not be prefetched at the first place, assuming that we know the future requests. With the goal to minimize the accumulated cost, we decide whether to cache a missed data item when a request occurs or prefetch it before the request arrives. We propose flowbased algorithms to find the optimal offline policy, as well as a lightweight "look-ahead" approximation policy that only knows the request information in the near future. These new designs not only reveal the fundamental trade-off between prefetching and caching, but also provide useful insights to improve real applications.</p><p>Our contributions are summarized as follows.</p><p>&#8226; We propose a cost-based caching model where different costs will be incurred depending on whether a missed data item is prefetched or fetched. With the objective to understand the fundamental trade-off between prefetching and caching, we investigate the optimal offline policy that minimizes the accumulated cost (see Section 2).</p><p>&#8226; We reformulate the optimal prefetching and caching problem as a min-cost flow problem. For a given request sequence of length N, the optimal policy can be obtained by flow-based algorithms in O(N 3/2 ) expected time (see Section 4).</p><p>&#8226; We analytically characterize the optimal policy by providing sufficient conditions under which prefetching the missed data is the optimal choice (see <ref type="bibr">Section 5)</ref>. Moreover, we prove that consistently prefetching is not always optimal, with a competitive ratio as high as 2, depending on the future requests and the prefetching cost (see <ref type="bibr">Section 6)</ref>.</p><p>&#8226; We propose a lightweight "look-ahead" approximation policy based on the insights revealed by the characteristics of the optimal policy. The approximation policy can be executed in real time and processes the entire trace in O(N) expected time. Performance guarantees are provided by deriving the competitive ratio (see Section 6).</p><p>&#8226; We conduct extensive experiments using real CDN traces and synthetic data requests that are generated from both heavy and light-tailed popularity distributions. The approximation policy always achieves near-optimal average performance (see <ref type="bibr">Section 8)</ref>.</p><p>Related Works: Caching algorithms have been extensively studied. It is known that Belady's algorithm <ref type="bibr">[25]</ref> is an optimal offline eviction policy that minimizes the number of misses, assuming that the data items have identical sizes.</p><p>Specifically, it evicts the data item that is requested farthest in the future. When the data sizes are not identical, Belady's algorithm is no longer optimal, and finding the optimal offline policy is NP-hard. A few approximation policies have been proposed with different complexities and performance bounds <ref type="bibr">[26,</ref><ref type="bibr">27,</ref><ref type="bibr">28,</ref><ref type="bibr">20]</ref>. One recent work <ref type="bibr">[15]</ref> provides an asymptotically optimal solution and practical approximation algorithms with tight performance bounds for real traces. It leverages a flow-based representation, and shows that the optimal offline caching policy can be obtained by solving a min-cost flow problem. These offline policies have successfully guided the design of online algorithms <ref type="bibr">[16]</ref>.</p><p>Prefetching strategies, together with caching algorithms, have been widely explored in real applications, including processor architectures <ref type="bibr">[29,</ref><ref type="bibr">30]</ref>, file systems <ref type="bibr">[31,</ref><ref type="bibr">32]</ref> and networks <ref type="bibr">[33,</ref><ref type="bibr">34,</ref><ref type="bibr">35,</ref><ref type="bibr">24,</ref><ref type="bibr">36]</ref>. The offline optimal strategies have been studied for disk systems with an objective to minimize the stall time <ref type="bibr">[17,</ref><ref type="bibr">18,</ref><ref type="bibr">19,</ref><ref type="bibr">20,</ref><ref type="bibr">21,</ref><ref type="bibr">22]</ref>. It is shown in <ref type="bibr">[37]</ref> that the optimal offline solution can be found in polynomial time for single-disk systems. This problem is reformulated as a min-cost multi-commodity flow problem in <ref type="bibr">[21]</ref>. For disk systems, the existing work does not distinguish the costs caused by caching and prefetching, which makes proactive prefetching almost always a better choice than passive caching. In this paper, we consider the scenarios where prefetching and caching can have different costs. Interestingly, we show that consistently prefetching is not always optimal. The new insights can be used to further improve the design of prefetching and caching mechanisms.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Problem Formulation</head><p>Consider a set of data items D = {d i : 1 &#8804; i &#8804; M} of unit sizes, and a sequence of data requests that arrive at the time points {&#964; n , 1 &#8804; n &#8804; N}. Let R n (R n &#8712; D) denote the data item that is requested at time &#964; n and {R n } N n=1 denote the entire request sequence. We assume that {R n } N n=1 is known. If the requested data item is already in the cache, then the request can be served without paying a cost. However, if it is not cached, we have two options to serve the corresponding data request at different costs. The first option is to fetch the data from the backend after the request arrives, paying a fetching cost 1. We can decide whether to load the fetched item into the cache or not. The second option is to prefetch the data item before it is requested, paying a prefetching cost c, 0 &#8804; c &#8804; 1. Note that the prefetched data item has to be loaded into the cache. If the cache space is full, other items must be evicted before storing a new one. For ease of analysis, we first assume a best-case scenario where the prefetched data item is loaded into the cache right before it is requested. In Section 7, we will show that this assumption could be waived to some extent.</p><p>We observe that the optimal offline policy shall satisfy the following two properties. Property 1 (Interval caching decisions): As illustrated in <ref type="bibr">[15]</ref>, the optimal offline policy will not evict a cached data item d i between two requests for d i . Consider an example where d i is initially cached and requested at &#964; 1 and &#964; 5 . It is suboptimal to evict d i at some time (e.g., &#964; 3 ) between &#964; 1 and &#964; 5 , because storing d i in (&#964; 1 , &#964; 3 ] does not serve any requests and is a waste of caching resource. A wiser decision will be evicting d i right after serving R 1 or caching it at least until R 5 arrives. Therefore, we focus on caching policies that only evict a cached data item immediately after serving a request for it. Property 2 (Fetching without caching): When the optimal policy fetches a missed request, it must not load the fetched data into the cache and trigger evictions, because otherwise prefetching will be a better option. Therefore, it is sufficient to consider the policies such that data items will only be loaded into the cache by prefetching.</p><p>In the rest of the paper, we restrict the design space by only considering the policies that satisfy these two properties, and call them feasible policies. An optimal offline policy is guaranteed to exist in this design space.</p><p>Let l n = max {i &lt; n : R i = R n }, which indicates that &#964; l n is the most recent time when R n is requested. If R n is the first request for that item, then set l n = 0. Formally, we define three decision variables for each request R n , 1 &#8804; n &#8804; N:</p><p>x n 1(Store R n in the cache during (&#964; l n , &#964; n ]),</p><p>where 1(E) is an indicator function that takes value 1 if the event E occurs, and 0 otherwise. The optimal offline policy for a cache of size b is the solution of the following optimization problem.</p><p>x n + f n + p n &#8805; 1 for &#8704;n</p><p>x n , f n , p n &#8712; {0, 1} for &#8704;n (4)</p><p>If x n = 0, i.e., R n is not stored in cache at &#964; n , then a cost f n + cp n will be induced depending on whether R n is fetched or prefetched as shown in the objective function <ref type="bibr">(1)</ref>. The cache capacity constraint is described by <ref type="bibr">(2)</ref>. If R n is prefetched (i.e., p n = 1), then the cache should have an available space to accommodate the prefetched data at &#964; n . Furthermore, Constraints (3) and (4) guarantee that the request R n must be either directly served from the cache or prefetched/fetched from the backend data storage.</p><p>Solving the optimal solution is equivalent to answering the following two questions: Q1: If a request is not cached, should we prefetch it beforehand or fetch it upon the request? Q2: If a data item is prefetched and the cache is full, which item should be evicted?</p><p>We will analytically answer these two questions in Section 5. And here are the short answers: A1: Prefetching the requested data is a better choice, if</p><p>&#8226; there exist requests for popular items in the near future, and the popular items are not cached currently, or</p><p>&#8226; the prefetching cost is sufficiently low.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A2:</head><p>The farthest-in-future data item should be evicted, if we choose to prefetch an item and the cache is full.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Motivating Example</head><p>In this Section, we will introduce a motivating example to show that 1) prefetching is not always beneficial; 2) the optimal policy is non-trivial and depends on the prefetching cost. We always fetch the missed data after the request arrives, and make caching decisions based on Belady's algorithm which is optimal for caching without prefetching <ref type="bibr">[38]</ref>. For this specific example, Belady's algorithm will never update the cache content. As shown in Figure <ref type="figure">2</ref>, a total cost 3 will be incurred for the given sequence. Strategy 3 (Combination of fetching and prefetching): As shown in Figure <ref type="figure">3</ref>, the first request is not stored in the cache and the requested data d 3 is fetched. Then, R 4 , R 5 , R 7 are prefetched with d 1 , d 4 , d 5 being evicted, respectively. A total cost 3c + 1 is incurred by this strategy.  We have the following two observations from this motivating example. Observation 1: Prefetching is not always beneficial. If c &gt; 3/5, Strategy 1 (always prefetching) will even incur a larger accumulated cost than Strategy 2 (always fetching). If the prefetching cost c is considerably small, then prefetching the data item before the request will be a better choice than fetching it upon the request. Observation 2: The optimal policy is non-trivial and highly depends on the prefetching costs. Actually, all the three strategies described above are optimal policies for the given trace and some specific c values. Specifically, Strategies 1, 2, 3 are optimal for c &#8712; (0, 1/2], (2/3, 1], (1/2, 2/3], respectively.</p><p>Therefore, the optimal prefetching and caching decision depends on the joint effect of future requests and the prefetching cost c. Even for the same given trace, a fixed policy cannot work uniformly well for different c values. When the trace is long, the design space can be considerably large, since the possible combinations of fetching and prefetching will increase exponentially. How to efficiently find the optimal policy is a challenging task.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Optimal Policy via Min-Cost Flow</head><p>Instead of directly exploring the optimization problem (1), we leverage the underlying structure of prefetching and caching, and reformulate it as a min-cost flow problem which aims to send a certain amount of flow through a flow network at a smallest cost. We will show that the optimal prefetching and caching policy can be constructed from the min-cost flow.</p><p>In this paper, we use the flow notations that are shown in Figure <ref type="figure">4</ref>. A flow network is represented by a directed graph where each edge is associated with a parameter tuple (capacity, cost). The total amount of flow going through an edge must not exceed its capacity. And a cost per unit flow will be charged for the flow going through the edge.</p><p>The node i is associated with a number &#946; i representing its surplus/demand. If &#946; i &gt; 0, then the node is a source node. If &#946; i &lt; 0, then the node is a sink node. We will not label &#946; i in the graph if it is zero.</p><p>In <ref type="bibr">[15]</ref>, the min-cost flow representation is used to solve optimal offline caching without prefetching, where each request is represented by a node. By constructing a proper flow network, the optimal offline caching can be constructed from the min-cost flow solution. However, the flow network constructed in <ref type="bibr">[15]</ref> does not support prefetching operations, and therefore cannot be applied to our settings. To this end, we propose a more general min-cost flow representation, which supports both proactive prefetching and passive caching. In Section 4.1, we will show how to construct a flow network for a given sequence of requests. Then, we will prove that the optimal prefetching and caching policy can be obtained by finding the min-cost flow in Section 4.2.</p><p>Trace: </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Flow Network Construction</head><p>For a given sequence of requests, a corresponding flow network can be constructed, where each request is represented by four or five nodes and the nodes are connected by five types of edges including caching edges, fetching edges, prefetching edges, eviction edges and auxiliary edges. Detailed construction steps are presented as follows. The result of each step is illustrated in Figure <ref type="figure">6</ref> for the request sequence shown in Table <ref type="table">1</ref> and a cache of size 2.</p><p>Step 1 (Generate nodes): For each request in the trace, if it is the first time to request the data, then generate five nodes where three of them are placed in the first row and two in the second, as shown in Figure <ref type="figure">5</ref>. To facilitate the description of the following steps, we denote these nodes by n 0 , n 1 , n 2 , n 3 and n 4 . If the request is not the first request for the data, then we will only generate n 1 , n 2 , n 3 and n 4 nodes for that request. Moreover, for each data item, let the n 0 node of its first request be the source node, and the n 2 node of its last request be the sink node. These nodes are added for the example trace in Figure <ref type="figure">6a</ref>. Step 2 (Add caching edges): As shown in Figure <ref type="figure">6a</ref>. link the nodes in the second row by edges with capacity = b, cost = 0, where b is the cache size. These edges are named caching edges. A flow going through these edges represents that the corresponding request is stored in the cache.</p><p>Step 3 (Add prefetching edges): For each request, add an edge that is directed from the n 2 node of the last request for the same data and to the n 3 node of the request. If the request is the first request for that data, then add an edge directed from its source node n 0 to the n 3 node. The capacity of the edge is 1 and the cost is c. The flow going through these edges means that the corresponding requested data is prefetched. We add these prefetching edges in Figure <ref type="figure">6b</ref>.</p><p>Step 4 (Add eviction edges): For each request, add an edge directed from its n 4 node to its n 1 node with capacity = 1 and cost = 0, as shown in Figure <ref type="figure">6c</ref>. The flow going through these edges indicates that the corresponding requested data is evicted.</p><p>Step 5 (Add fetching edges): For each request, if it is not the last request for that data item, add an edge directed from the n 2 node of the request and to the n 1 node of the next request for the same data item, as shown in Figure <ref type="figure">6d</ref>. Moreover, if it is the first request for the corresponding data item, add an edge directed from its n 0 to its n 1 node. The parameter tuple (capacity, cost) for these edges is set to be (1, 1). The flow going through these edges indicates that the corresponding request is a miss and the requested data is fetched.</p><p>Step 6 (Add auxiliary edges): For each request, add an auxiliary edge directed from its n 1 node to the n 2 node, as shown in Figure <ref type="figure">6e</ref>. The parameter tuple (capacity, cost) is set to be (1, 0). The capacity of the auxiliary edge guarantees that the amount of flow going through the prefetching and the fetching edges must not exceed 1. The function of these auxiliary edges is to ensure that an integer flow routing solution can correspond to a feasible prefetching and caching policy (see Section 4.2).</p><p>According to the proposed six steps, a flow network can be constructed for a given data trace. See Figure <ref type="figure">6e</ref> for the flow network constructed for the example trace in Table <ref type="table">1</ref>. In Section 4.2, we will demonstrate how the constructed flow network can be leveraged to solve the caching problem.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Optimal Prefetching and Caching Policy</head><p>In this section, we will leverage the constructed flow network to find the optimal offline prefetching and caching policy. Specifically, we will show that there is an one-to-one correspondence between feasible prefetching and caching policies and integer flow routing solutions in the flow network.</p><p>Theorem 1. For a given data trace, a feasible prefetching and caching policy corresponds to an integer flow routing solution in the constructed flow network, and vice versa. An optimal prefetching and caching policy corresponds to an integer min-cost flow, and vice versa.  The proof of Theorem 1 is straightforward. For a feasible prefetching and caching policy, we can construct an integer flow routing solution in the constructed flow network where the amount of flow going through each edge is an integer. Specifically, any cache operation (i.e., fetching, prefetching or eviction) corresponds to an edge (i.e., fetching, prefetching or eviction edge) in the flow network. Based on how the request is served, we can route the flow through the corresponding edges. Similarly, given an integer flow routing solution, a corresponding feasible prefetching and caching policy can be constructed from it.</p><p>Furthermore, the cost achieved by a prefetching and caching policy is the same as the cost of its corresponding flow routing solution. Therefore, the optimal prefetching and caching policy can be obtained by finding the integer min-cost flow solution.</p><p>Theorem 1 shows that an optimal prefetching and caching policy can be obtained by finding an integer min-cost flow. Formally, we define the flow-based optimal offline policy as follows. Flow-based optimal offline policy (&#960; OPT ): Given a sequence of data requests, solve the integer min-cost flow for the flow network constructed according to the steps introduced in Section 4.1. Then prefetch, fetch or evict the data item if the min-cost flow is routed through the corresponding prefetching, fetching or eviction edges, respectively.</p><p>Note that the integer min-cost flow always exists, since the capacities, surpluses and demands are all integers in the flow network <ref type="bibr">[39]</ref>. Moreover, the integer min-cost flow can be efficiently solved, if the prefetching cost c is assumed to be a rational number. Given a trace with N requests, there are at most 5N nodes and 6N -1 edges in the corresponding flow network. For rational prefetching costs, the problem is solvable in O(N 3/2 ) expected time <ref type="bibr">[40,</ref><ref type="bibr">41,</ref><ref type="bibr">42]</ref>.</p><p>Although the proposed flow-based policy &#960; OPT is offline optimal, the problem is not satisfactorily solved for the following reasons:</p><p>&#8226; The flow-based algorithm cannot reveal the underlying insights of the optimal decision. It does not provide analytical answers to the two questions proposed in Section 2, i.e., whether to prefetch or fetch the missed data, and which data item to evict.</p><p>&#8226; The flow-based algorithm requires the knowledge about all future requests to find the optimal policy. Moreover, the policy cannot be executed in real time, since the optimal decision for a request is unavailable unless the process for the entire data trace is completed. In real practice, the request sequence could be too long to make this method practical <ref type="bibr">[15]</ref>.</p><p>These unanswered questions motivate us to</p><p>&#8226; analytically characterize the properties of the optimal policy and explicitly answer the questions proposed in Section 2 (see Section 5);</p><p>&#8226; design a lightweight approximation policy that requires only near-future information, can be executed in real time and achieves near-optimal performance (see Section 6).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Characteristics of Optimal Policy</head><p>In this section, we will analyze the characteristics of the optimal policy and explicitly answer the two questions proposed in Section 2. Notably, the questions will be addressed in reverse order. We will first show that evicting the farthest-in-future item is optimal when prefetching. Then, we will provide sufficient conditions under which prefetching is a better choice than fetching.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Optimal Eviction Mechanism</head><p>The farthest-in-future item is defined as the data item that is stored in the cache and will be reused after the other cached items. It is known that evicting the farthest-in-future item minimizes the number of misses for caching without prefetching <ref type="bibr">[38]</ref>. In this section, we will generalize this result and show that evicting the farthest-in-future item is the optimal choice to minimize the accumulated costs for caching with prefetching. We start by proving the following lemma.</p><p>Lemma 1. Assume that a policy &#960; n evicts the farthest-in-future items for the first n prefetching operations. Then, there exists a policy &#960; n+1 that evicts the farthest-in-future items for the first n + 1 prefetching operations and does not incur a larger cost than &#960; n .</p><p>Proof. Assume that &#964; m 1 is the first time when &#960; n prefetches a request (i.e., R m 1 ) but does not evict the farthest-in-future item. Let d i be the item that is evicted by &#960; n when prefetching R m 1 . We will construct a policy &#960; n+1 that prefetches R m 1 and evicts the farthest-in-future item without incurring a larger cost than &#960; n . The idea is to let &#960; n+1 eventually have the same cache content as &#960; n at some time point without introducing additional costs before that.</p><p>Assume that the next request for the farthest-in-future item arrives at &#964; m 2 , m 2 &gt; m 1 . We may interchangeably use R m 2 to denote the farthest-in-future item and the request for it. Next, we will prove the lemma by considering two possible cases. Case 1: Consider the case where R m 2 is evicted by &#960; n before &#964; m 2 . For each request, let &#960; n+1 make the same prefetching/fetching decision as &#960; n if the data is not cached by &#960; n+1 , and evict the same data as &#960; n if the data is stored in the cache. Then, there will be at most one item that is cached by &#960; n+1 but not cached by &#960; n . When &#960; n evicts R m 2 , which is not cached by &#960; n+1 according to the described update rule, let &#960; n+1 evict the only item that is not cached by &#960; n . Then the two policies lead to the same cache content, and no additional cost is introduced by &#960; n+1 . Case 2: Consider the case where &#960; n does not evict R m 2 before &#964; m 2 . Similar to Case 1, let &#960; n+1 always make the same decisions as &#960; n when possible. There is only one item that is cached by &#960; n+1 but not cached by &#960; n before &#964; m 2 . Then, at &#964; m 2 , let &#960; n+1 prefetch R m 2 and evict the item that is not cached by &#960; n . So far, the two policies will have the same cache content. Next, we will show that this prefetching operation by &#960; n+1 will not result in a larger accumulated cost. Since R m 2 is the farthest-in-future item at &#964; m 1 , there must exist a request for d i between R m 1 and R m 2 . Note that d i should be prefetched or fetched by &#960; n , but can be directly served by &#960; n+1 from the cache. This prefetching/fetching operation by &#960; n compensates for the prefetching operation performed by &#960; n+1 at &#964; m 2 . Therefore, &#960; n+1 will not incur a larger cost than &#960; n .</p><p>Next, we will prove the optimal eviction mechanism by leveraging Lemma 1.</p><p>Theorem 2. There exists an optimal policy that evicts the farthest-in-future item for all prefetching operations.</p><p>Proof. Let &#960; * be an optimal prefetching and caching policy. Assume that &#960; * evicts the farthest-in-future items for the first n prefetch operations. Applying Lemma 1, we can construct a new policy which evicts the farthest-in-future items for the first n + 1 prefetching operations without increasing the accumulated cost. Using an induction argument, we can conclude that there must exists an optimal policy that evicts the farthest-in-future item for all prefetching operations.</p><p>Theorem 2 shows that evicting the farthest-in-future item is optimal when a data item is prefetched and the cache is full, which provides an explicit answer to the question Q2 proposed in Section 2. In the rest of the paper, we always follow the farthest-in-future eviction principle unless other specific mechanisms are stated.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Optimal Conditions for Prefetching</head><p>In this section, we will analytically answer the question: whether we should prefetch or fetch an item if it is not stored in cache, assuming that the cache content is updated according to the farthest-in-future principle. In particular, we will provide sufficient conditions, under which prefetching is the optimal choice.</p><p>Let S n denote the set of cached data items before serving R n . If R n S n , then R n should be prefetched or fetched. Without loss of generality, assume that the cache is initially full and let S 1 denote the initial cache content. It suffices to analyze whether R 1 should be prefetched or fetched given that R 1 S 1 . Define</p><p>R &#963; is the farthest-in-future item, and &#964; &#963; is the first time when R &#963; is requested after &#964; 1 . Define &#969; = min{n &gt; 1 : R n &#8712; S 1 and R n is not requested in (&#964; n , &#964; &#963; )}.</p><p>Note that R &#963; is always the farthest-in-future item in the time interval [&#964; 1 , &#964; &#969; ], if no prefetching operation is performed.</p><p>We start by proving the following lemma.</p><p>Lemma 2. Assume c &lt; 1. For any optimal policy &#960; * , if &#960; * decides to fetch R 1 , then it must also fetch all requests R n ,</p><p>Proof. Suppose for the sake of contradiction that there exists an optimal policy &#960; * which fetches R 1 and prefetches R n , 1 &#8804; n &#8804; &#969;. Assume without loss of generality that R n is the first prefetched item after R 1 . Therefore, we have S 1 = S i for all 1 &#8804; i &#8804; n. Moreover, the farthest-in-future eviction principle yields S n+1 = S n &#8746; {R n }\{R &#963; }. To introduce a contradiction, we will design a new policy &#960; &#8226; which achieves the same cache state S n+1 as &#960; * , but incurs a lower cost.</p><p>For the same request sequence and initial cache state S 1 , let &#960; &#8226; prefetches R 1 and evicts R &#963; at &#964; 1 . For the requests arrive in the interval (&#964; 1 , &#964; n ), let &#960; &#8226; make the same caching decision as &#960; * . We have S n = S 1 &#8746; {R 1 }\{R &#963; }, since there is no prefetching operations by &#960; &#8226; in (&#964; 1 , &#964; n ). Then, let &#960; &#8226; prefetch R n and evict R 1 . Consequently, it achieves the same cache state S n+1 as &#960; * . Notably, the caching decisions of &#960; * and &#960; &#8226; in [&#964; 1 , &#964; n ] are all identical, except that &#960; * fetches R 1 but &#960; &#8226; prefetches R 1 . As a result, the cost of &#960; &#8226; is lower than the cost of &#960; * since c &lt; 1, which contradicts to our assumption that &#960; * is the optimal policy. Therefore, we prove the lemma.</p><p>Leveraging the property of the optimal policy illustrated in Lemma 2, we will show that prefetching before the request is the optimal choice under some insightful conditions. Theorem 3. Assume that the upcoming request R 1 is not stored in the cache. Prefetching R 1 is the optimal choice, if any of the following two conditions is satisfied:</p><p>&#8226; C1: There is a request R n , 1 &#8804; n &#8804; &#969;, such that R n S 1 and R n is requested at least twice in the time interval</p><p>[&#964; 1 , &#964; &#963; ];</p><p>&#8226; C2: The prefetching cost c satisfies c &#8804; L/(L + 1), where L = &#969; n=1 1(R n S 1 ) is the number of requests that arrive in the time interval [&#964; 1 , &#964; &#969; ] but do not belong to S 1 .</p><p>Proof. Assuming that the upcoming request R 1 is a miss, we will prove that under any of the proposed conditions, prefetching R 1 is the optimal choice to minimize the accumulated cost.</p><p>First, we will show that under Condition C1, prefetching R 1 is a better choice than fetching. Assume that R 1 is requested at least twice in the time interval [&#964; 1 , &#964; &#969; ]. Suppose for the sake of contradiction that the optimal policy &#960; * fetches R 1 . Next, we will construct a new policy &#960; &#8226; that prefetches R 1 and incurs a lower accumulated cost than &#960; * .</p><p>Consider the case where &#960; * evicts R &#963; when prefetching some missed request R k for 1 &lt; k &lt; &#963;. Then let &#960; &#8226; prefetches R 1 and evict R &#963; , and then prefetch R k and evict R 1 . Furthermore, let &#960; &#8226; perform the same operation as &#960; * for other requests. Then, &#960; &#8226; will reduce the cost of &#960; * by 1c.</p><p>Then, consider the case where &#960; * keeps R &#963; in the cache until &#964; &#963; . Let R n 1 = R 1 for some n 1 &#8712; (1, &#963;). If R 1 is fetched by &#960; * , then let &#960; &#8226; prefetches R 1 and evict R &#963; , and then prefetch R s igma and evict R 1 . For other requests, let &#960; &#8226; perform the same operation as &#960; * . Note that R n 1 will be a hit for &#960; &#8226; , and therefore, &#960; &#8226; reduces the accumulated cost of &#960; * by 2 -2c. Instead, if R n 1 is prefetched by &#960; * and meanwhile some cached data d i is evicted, then the next request for d i must arrive after &#964; &#963; due to the farthest-in-future eviction principle. Let &#960; &#8226; prefetches R 1 and evict R &#963; , and then prefetch R &#963; and evict d i . For other requests, let &#960; &#8226; make the same decisions as &#960; * . Then, &#960; &#8226; reduces the cost of &#960; * by 1c. Therefore, we prove that prefetching is the optimal choice if R 1 is requested twice before &#964; &#963; .</p><p>Using a similar argument, we can prove that if there is a request R n , 1 &lt; n &lt; &#969;, such that Condition C1 holds, then prefetching R n is the optimal choice. Moreover, since n &lt; &#969;, we can conclude that prefetching R 1 is also the optimal by applying Lemma 2.</p><p>Next, we will show that if Condition C2 holds, then prefetching R 1 is optimal. Let R n i , 1</p><p>denote the L requests that are not in the set S 1 . Suppose for the sake of contradiction that the optimal policy &#960; * fetches R 1 . Then, Lemma 2 indicates that &#960; * must also fetch all the L missed data items before &#964; &#969; . Let &#960; &#8226; prefetch R 1 and evict R &#963; , prefetch R n i+1 and evict R n i for 1 &#8804; i &#8804; L -1, and then prefetch R &#963; and evict R n L . For these requests, &#960; &#8226; pays a total cost (L + 1)c and &#960; * pay a total cost L. For other requests, let &#960; &#8226; make the same decisions as &#960; * . Since c &#8804; L/(L + 1), &#960; &#8226; induces a lower cost than &#960; * . Therefore, we prove that prefetching R 1 is the optimal choice if Condition C2 holds.</p><p>Theorem 3 provides sufficient conditions under which prefetching is the optimal choice. And these conditions reveal the following useful insights that can be leveraged to guide practical designs:</p><p>&#8226; Prefetching the upcoming request is optimal, if there exist popular items that will be requested in the near future and are not stored in the cache currently. Note that the upcoming request may not necessarily be popular. The reason is that the popular data will be prefetched and trigger evictions. Thus, evicting these items earlier can be even more beneficial, because more prefetching opportunities (e.g., prefetching the upcoming request) will be provided. This insight is characterized by Condition C1.</p><p>&#8226; Prefetching is optimal if the prefetching cost c is sufficiently low. This insight is straightforward. Condition C2 characterizes the critical value of c to make prefetching a better choice than fetching. Note that, since 1/2 &#8804; L/(L + 1) for L &#8805; 1, prefetching is always optimal for any data traces if c &#8804; 1/2.</p><p>Note that the proposed conditions only depend on the current cache content S 1 and the request information between R 1 and R &#963; . No information after &#964; &#963; or before &#964; 1 is required. In Section 6, we will leverage this nice property to propose an approximation policy that only requires near-future information. However, if neither C1 nor C2 is satisfied, the optimal decision will depend on the future requests after R &#963; .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Approximation Using Near-Future Information</head><p>In this section, we propose an approximation policy using near-future information, and show that it is close to optimal by deriving the competitive ratio.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.">Lightweight Approximation Policy</head><p>Based on the analytical results obtained in Section 5, we propose an approximation policy as follows. Approximation Policy (&#960; A ): Prefetch the missed request and evict the farthest-in-future item, if c &#8804; &#8730; 2/2, or any of the conditions C1 and C2 introduced in Theorem 3 is satisfied. Otherwise, fetch the missed item but do not store it into the cache.</p><p>Applying Theorem 3, we know that, for c</p><p>, &#960; A prefetches a data item only when prefetching is the optimal choice. The threshold &#8730; 2/2 is chosen to achieve a better competitive ratio, as shown in Section 6.2. Notably, the proposed approximation policy makes caching and prefetching decisions merely based on the request information before &#964; &#963; . If the data requests are generated independently from a popularity distribution (e.g., Zipf's distribution as observed in real practice <ref type="bibr">[43]</ref>), then &#964; &#963; is independent of the trace length N. Although &#964; &#963; can depend on the cache size b, considering the fact that the trace length is typically far larger than the cache size in real practice, N is the dominant term in the time complexity and the impact of b is negligible. Therefore, &#960; A has an expected time complexity O(1) to make decisions for a single request and O(N) to process the entire trace. In Section 8, we verify through simulations that the required information for &#960; A to make decisions for a single request does not scale with the trace length. In summary, unlike the flow-based algorithm that need all future requests to make optimal decisions, &#960; A is lightweight and practical since</p><p>&#8226; it only exploits near-future information to make decisions;</p><p>&#8226; it does not have to process the entire trace to find the optimal decision for a single request and can be executed in real time.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.">Competitive Ratio Analysis</head><p>In this section, we will show that the proposed approximation policy achieves near-optimal performance by characterizing its competitive ratio.</p><p>We introduce two additional policies (i.e., always prefetching policy and always fetching policy) as benchmarks. Always Prefetching Policy (&#960; P ): Always prefetch the missed item. If the cache is full, evict the farthest-in-future item. Always Fetching Policy (&#960; F ): Always fetch the missed item. If the next request for the fetched item arrives before the farthest-in-future item, then evict the farthest-in-future item and store the fetched one in cache. Otherwise, do not load the fetched item into the cache.</p><p>&#960; P and &#960; F represent two extreme policies that always prefetch or fetch the missed data. To emphasize different consequences of fetching and prefetching, we eliminate the impact of eviction by applying the optimal eviction to both &#960; F and &#960; P . Notably, &#960; F is also known as the Belady's algorithm <ref type="bibr">[38,</ref><ref type="bibr">44]</ref>, which is the optimal caching policy when prefetching is disabled and data sizes are all identical. We will use &#960; F and &#960; P as benchmarks to show that a wise combination of prefetching and fetching (e.g., policy &#960; A ) can yield better performance. Specifically, we will prove that &#960; A always achieves the smallest competitive ratio among the three policies.</p><p>Let cost(&#960;, {R n } N n=1 ) denote the accumulated cost achieved by a policy &#960; for a given trace {R n } N n=1 . The competitive ratio for a given prefetching policy &#960; is defined as</p><p>, where &#960; * represents the optimal policy. The competitive ratio evaluates the worst-case performance of a policy. From the definition, we have r &#960; &#8805; 1 for any policy &#960;. Let r A , r P , r F denote the competitive ratio of the proposed approximation policy, always prefetching policy and always fetching policy. Theorem 4. Given the prefetching cost c, the competitive ratios of &#960; F , &#960; P and &#960; A can be computed as r F = 1/c for c &#8712; (0, 1],</p><p>In addition, for c &#8712; (</p><p>where b is the cache size.</p><p>Proof. For compactness, we omit the term {R n } N n=1 in the cost expression cost(&#960;, {R n } N n=1 ). The competitive ratio results for &#960; F , &#960; P and &#960; A are proven as follows.</p><p>Competitive ratio for &#960; F : For the &#960; F policy, we will first show that its competitive ratio is upper bounded by 1/c. Consider a new setting where fetching cost is the same as the prefetching cost c, 0 &#8804; c &#8804; 1. Let &#960; &#8226; and cost &#8226; denote the optimal policy and the minimum cost under the new setting. Then, for any request sequence, we have cost &#8226; &#8804; cost(&#960; * ). Note that there is no additional benefits to prefetch in the new scenario. Therefore, there must exist a &#960; &#8226; that always makes the same decision as &#960; F for the same request sequence. As a result, we have, for any request sequence</p><p>Next, we will show that for any &gt; 0, &#960; F can achieve a competitive ratio larger than 1/c -. For a given cache size b, assume without loss of generality that the set of items {d i : 1 &#8804; i &#8804; b} is initially stored in the cache. Consider the request sequence</p><p>and achieves a cost k. In contrast, if we choose to prefetch every missed item and evict the farthest-in-future item, a cost (k + 1)c will be introduced. Thus, we have</p><p>.</p><p>By choosing the k &#8805; 1/( c), a lower bound 1/ccan be achieved for &#8704; &gt; 0. Combining the upper and lower bounds, we prove the tight competitive ratio for &#960; F .</p><p>Competitive ratio for &#960; P : For c &#8712; [0, 1/2], we will show that &#960; P is the optimal policy. Assume that R 1 is a miss. A cost 1 will be induced if R 1 is fetched. However, the request can be served at a lower cost 2c, if we choose to prefetch R 1 and evict some cached data saying d i , and then after serving R 1 , prefetch d i back to the cache and evict R 1 . Therefore, when c &#8712; [0, 1/2], the optimal policy always choose to prefetch the miss requests. Moreover, according to Theorem 2, we can conclude that &#960; P is the optimal policy for c &#8712; [0, 1/2]. For c &#8712; (1/2, 1], we will first show that r P is upper bounded by 2c. We know that &#960; P is the optimal policy for c = 1/2. For a given request sequence, let cost &#8226; denote the cost achieved by &#960; P with c = 1/2. Then, we have, for the same request sequence and c &gt; 1/2 cost &#8226; &#8804; cost(&#960; * ) &#8804; cost(&#960; P ), and therefore</p><p>Next, we will show that there exists a request sequence such that a competitive ratio 2c is achieved, if M &#8805; b + 2, where M is the total number of data items and b is the cache size. Assume without loss of generality that d i , 1 &#8804; i &#8804; b, are initially stored in the cache. Consider a periodic request sequence that repeats the request pattern</p><p>. &#960; P will induce a cost 4c in each period. And the optimal policy is always fetching the missed data, which induces a cost 2 in each period. Therefore, a competitive ratio 2c is achieved and we can conclude that r P = 2c. Competitive ratio for &#960; A : For c &#8804; &#8730; 2/2, &#960; A always makes the same decision as &#960; P , and therefore, achieve the same competitive ratio. We have</p><p>For &#8730; 2/2 &lt; c &#8804; 1, we will first show that r A is upper bounded by 1/c. According to Theorem 3, we know that if &#960; A decides to prefetch the missed item, then prefetching must be the optimal choice. Therefore, &#960; A only yields worse performance than &#960; * , if &#960; A decides to fetch the missed item while &#960; * decides to prefetch.</p><p>We will introduce a new policy &#960; &#8226; to bound the competitive ratio. Given a request sequence, let &#960; &#8226; make the same decision as &#960; A if the decision is optimal. However, when the decision of &#960; A is not optimal (i.e., when &#960; A decides to fetch, while &#960; * decides to prefetch), let &#960; &#8226; prefetch the missed data without evicting any cached items. Then, after serving the request, let &#960; &#8226; evict the prefetched data. Note that &#960; &#8226; breaks the cache capacity constraint and is not a feasible solution. We will use &#960; &#8226; to provide a lower bound for the cost achieved by the optimal policy &#960; * . Specifically, for a given request sequence, let cost &#8226; denote the cost induced by &#960; &#8226; . We will show cost &#8226; &#8804; cost(&#960; * ) for any given request sequence.</p><p>Assume without loss of generality that R 1 is a miss and &#960; A decides to fetch R 1 , while the optimal choice is to prefetch. According to Theorem 2, the optimal policy &#960; * will evict the farthest-in-future item R &#963; and serve R 1 at a cost c. In contrast, the policy &#960; &#8226; will keep R &#963; in cache and serve R 0 at the same cost c. Since &#960; A chooses to fetch R 1 , according to Condition C1, the next request for R 1 must arrive after the farthest-in-future item R &#963; . Based on the farthest-in-future principle, the cache content of &#960; &#8226; is more beneficial than that of &#960; * . Therefore, we have cost &#8226; &#8804; cost(&#960; * ). In addition, since &#960; A and &#960; &#8226; always have the same cache content before serving each request, we have cost(&#960; A )/cost(&#960; &#8226; ) &#8804; 1/c, which yields</p><p>for any request sequence. Next, we will show the lower bound for r A when &#8730; 2/2 &lt; c &#8804; 1. Assume without loss of generality that {d i , 1 &#8804; i &#8804; b} are stored in the cache initially. The lower bound can be achieved by a periodic trace that repeats the request pattern {d b+1 , d 1 , d 2 , &#8226; &#8226; &#8226; , d b }. &#960; A will always choose to fetch the missed item. And the cache content under &#960; A is also updated periodically with period b(b + 1). For the first b(b + 1) requests, the approximation policy achieves a total cost b, and the optimal policy is to always prefetch the missed item which yields a cost (b + 1)c. Therefore, we have</p><p>In Theorem 4, we provide upper and lower bounds for the competitive ratio achieved by &#960; A , as well as the exact competitive ratios for &#960; F and &#960; P . Moreover, the upper and lower bounds for r A are asymptotically tight as the cache size goes to infinity. We plot r F , r P and the upper bound of r A in Figure <ref type="figure">7</ref>. It is easy to conclude from Theorem 4 and observe in Figure <ref type="figure">7</ref> that</p><p>for &#8704;c &#8712; [0, 1]. The proposed approximation policy always achieves the smallest competitive ratio which is at most &#8730; 2 &#8776; 1.414. Therefore, &#960; A is near optimal in terms of the worst-case performance. Furthermore, in Section 8, we will verify that &#960; A also achieves near-optimal average performance for both synthetic and real data traces.</p><p>Note that applying &#960; P for c &#8804; &#8730; 2/2 and &#960; F for c &gt; &#8730; 2/2 can achieve the same competitive ratio as &#960; A . However, experiments in Section 8 show that simply switching &#960; F and &#960; P at the threshold c = &#8730; 2/2 will incur considerably larger average costs than the proposed approximation policy &#960; A . </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Discussion and Generalization</head><p>In this section, we discuss the possibility of waiving the perfect prefetching time assumption, and the generalization of the proposed min-cost flow representation to support heterogeneous prefetching and fetching costs and variable data sizes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.1.">Imperfect Prefetching Time</head><p>In our previous model, a perfect prefetching time is assumed to simplify the analysis. Specifically, we assume that the prefetched data item is loaded into the cache right before it is requested. In this section, we will argue that all the previous results will still hold, even when the prefetched data is loaded some time ahead of the request.</p><p>Recall that R 1 is the upcoming request and R &#963; is the farthest-in-future item. Let {R -n } n&#8805;1 be the sequence of historical requests as shown in Figure <ref type="figure">8</ref>. Define</p><p>R -&#952; is the most recent request that is identical to R &#963; . Assume that the policy decides to prefetch R 1 . We claim that, it is equivalent to load R 1 into the cache at any time point in the interval (&#964; -&#952; , &#964; 1 ), as long as the farthest-in-future eviction policy is adopted. According to the definition of &#952;, the farthest-in-future item is not requested in the time interval (&#964; -&#952; , &#964; 1 ). Consequently, evicting R &#963; (i.e., R -&#952; ) and loading R 1 at any time between &#964; -&#952; and &#964; 1 does not impact the cost or the future decisions. In contrast, if R 1 is prefetched and loaded into the cache before &#964; -&#952; with R &#963; evicted, then R -&#952; will be a miss and can incur additional costs.</p><p>Since the pre-mentioned policies (including &#960; OPT , &#960; A and &#960; P ) all adopt the farthest-in-future eviction, the main results of this paper still hold, if the prefetched data is loaded into the cache in the time interval (&#964; -&#952; , &#964; 1 ). Specifically,</p><p>&#8226; The proposed policies (including &#960; OPT , &#960; A and &#960; P ) still work. In particular, &#960; OPT , &#960; A and &#960; P will first make the decision of whether a data item should be prefetched or not. Then, we can find the &#964; -&#952; for each prefetched item and schedule the prefetching time accordingly.</p><p>&#8226; The &#960; OPT is still optimal, and the competitive ratio analysis for &#960; A , &#960; P and &#960; F still hold, since the optimal decision and the incurred cost will not be impacted if the prefetched data is loaded between &#964; -&#952; and &#964; 1 .</p><p>The value of &#952; indicates the flexibility of choosing the prefetching time. When the requests are independently generated from a popularity distribution, &#952; is a random variable that depends on the cache size and is independent of the trace length and the prefetching cost. In Section 8, we will show that &#952; could be considerably large through simulations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.2.">Heterogeneous Costs</head><p>In the previous setting, we assume that the prefetching/fetching costs are identical for every requests. However, in real practice, the cost to fetch a data item can depend on the traffic load and be time-varying. Similarly, the prefetching costs may also take different values to support more flexible model settings. Our flow-based method can be easily generalized to heterogeneous prefetching and fetching costs by setting the costs for prefetching and fetching edges in the flow network as the corresponding values. The cycle-canceling algorithm can still find the optimal solution as long as the costs are rational numbers.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.3.">Variable Data Sizes</head><p>For real CDN traces, data sizes can take disparate values ranging from a few bytes to gigabytes <ref type="bibr">[45,</ref><ref type="bibr">46]</ref>. The constructed flow network can be modified to accommodate variable data sizes. Specifically, for data item, we can set its surplus/demand and the capacities of prefetching, fetching, eviction and auxiliary edges as the data size. However, we may not be able to find the optimal policy via the min-cost flow, since it is possible that the min-cost flow prefetches/fetches a fraction of the data item, which is not feasible for a caching policy. Instead, by leverage the min-cost flow, we can construct the upper and lower bounds for the performance of the optimal offline policy.</p><p>The cost achieved by the min-cost flow is a lower bound for the cost achieved by the optimal policy, since fractional solutions are allowed for the min-cost problem. Additionally, we can construct a feasible caching policy from the mincost flow by rounding up the fractional solution. Specifically, if a fraction of some request goes through the fetching edge in the flow network, then we will fetch the whole item. Otherwise, we will perform the same operation as the min-cost flow. The rounded solution is a feasible prefetching and caching policy and therefore provides an upper bound for the performance of the optimal policy. Notably, if the lower and upper bounds coincide, the rounded solution becomes optimal. It is shown in <ref type="bibr">[15]</ref> that for caching without prefetching, the bounds are asymptotically tight under mild assumptions. Whether similar asymptotic results hold for prefetching deserves future investigations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">Evaluation</head><p>In this section, we evaluate the average performance for various policies of interest using both synthetic and real data traces. Specifically, in Experiment 1, we evaluate the policies for data requests generated from light-tailed and heavy-tailed popularity distributions. Real CDN traces are used for evaluation in Experiment 2. Moreover, in Experiment 3, we illustrate the amount of future information that is required by the approximation policy.</p><p>In addition to the pre-mentioned policies (&#960; OPT , &#960; A , &#960; P and &#960; F ), we also simulate the optimal static policy (denoted by &#960; S ). The optimal static policy stores the most popular data in the cache, and will neither update the cache content nor prefetch future requests. When the requests are generated independently from a popularity distribution, the optimal static policy can provide a lower bound for the costs incurred by a bunch of statistic-based policies (e.g., LRU, LFU) that only exploit data statistics and are unaware of the exact request sequence. In the following experiments, the cache is initially empty and the data items are prefetched, fetched or evicted based on specific policies.</p><p>Experiment 1. In this experiment, we compare the average performance of the proposed policies under both light-tailed and heavy-tailed data popularity distributions. In particular, we consider three popularity distributions with different tails. The first one is a light-tailed exponential distribution with P</p><p>, where c 1 = 1/ 10 6  i=1 exp(-0.3i) &#8776; 0.3499 is a normalization factor. The second popularity distribution is a heavy-tailed Weibull distribution with P[R n = d i ] = c 2 &#8226; exp(-i 0.6 ), 1 &#8804; n &#8804; 10 5 , 1 &#8804; i &#8804; 10 6 , where c 2 &#8776; 0.8671. The third one is a heavy-tailed Zipf's distribution with P[R n = d i ] = c 3 /i 2 , 1 &#8804; n &#8804; 10 5 , 1 &#8804; i &#8804; 10 6 , c 3 &#8776; 0.6079. For each popularity distribution, we generate a sequence of 10 5 requests independently and test the proposed policies for a cache of size 20. The average cost incurred by a request is plotted in Figure <ref type="figure">9</ref>.</p><p>It can be observed from Figure <ref type="figure">9a</ref> that the flow-based optimal offline policy &#960; OPT always incurs the smallest cost, and the proposed approximation policy &#960; A achieves near-optimal performance. The always-prefetching policy &#960; P 0.5 0.6 0.7 0.8 0.9  achieves the optimal performance when c = 0.5. For c &gt; 0.5, &#960; P always incurs larger costs than &#960; OPT . Note that when c is close to 1, always prefetching can incur as high costs as the static optimal policy &#960; S , which loses the advantage of the sample path. Moreover, the always-fetching policy &#960; F (a.k.a. the Belady's algorithm <ref type="bibr">[38]</ref>) only achieves the optimal performance when the prefetching cost is 1. Although &#960; F and &#960; A have almost the same competitive ratios for c &#8712; [ &#8730; 2/2, 1], &#960; A achieves significantly better average performance when c &lt; 1. Therefore, simply switching &#960; F and &#960; P at the threshold &#8730; 2/2 (&#8776; 0.71) can incur much larger average costs than &#960; A and &#960; OPT . The same trend can be observed for heavy-tailed Weibull distributions in Figure <ref type="figure">9b</ref>. However, the performance difference between &#960; A and &#960; P are not as large as those for exponential distributions. Moreover, when the distribution tail is even heavier (e.g., for Zipf's popularities), both &#960; A and &#960; P achieve almost optimal performance as shown in Figure <ref type="figure">9c</ref>. An intuitive explanation is that when the popularity is heavy-tailed, the requests will be less concentrated. As a result, it is more likely to have an unpopular data item stored in the cache, which provides a harmless eviction opportunity for prefetching. In such scenarios, prefetching is almost always the optimal choice, and therefore, &#960; A , &#960; P and &#960; OPT could achieve similar performance.  Experiment 2. In this experiment, we test the proposed policies using a real CDN data trace <ref type="foot">1</ref> . The trace contains a million requests for 449380 distinct data items where the data sizes are set to be 1. In Figure <ref type="figure">10</ref>, we plot the empirical data popularities which can be approximated by a Zipf's distribution with P[R n = d i ] = 0.0313/i 0.88 , 1 &#8804; i &#8804; 449380, 1 &#8804; n &#8804; 10 6 . We simulate the proposed policies for a cache of size 20000 using the request sequence in the CDN trace. The average costs are plotted in Figure <ref type="figure">11</ref>. It can be observed that &#960; A and &#960; P achieve almost the same performance as &#960; OPT , which coincides with the observation in Experiment 1 for Zipf's popularities. Experiment 3. In this experiment, we will verify that the approximation policy &#960; A only requires near-future information to make decisions for a single request. Let T denote the number of future requests required for making decision for a request. We will evaluate the statistics of T under different trace lengths, prefetching costs and cache sizes. Generate data requests from the Zipf's popularity distribution considered in Experiment 1. Note that we omit the experiments for exponential and Weibull distributions since the obtained results and the revealed insights are similar. First, set c = 0.85, b = 50. We collect the statistics of T for multiple traces with different lengths. The results are presented as box plots in Figure <ref type="figure">12a</ref>, where the central red bar and the green "+" sign represent the median and the mean, respectively. The top and bottom edges of the box indicate the 75 th and 25 th percentiles. The whiskers extend to the extreme values within 1.5 &#215; IQR (interquartile range) <ref type="bibr">[47]</ref>. All these statistics of T do not scale with the trace length N, which verifies the analysis in Section 6.  Furthermore, we investigate how the cache size and the prefetching cost can impact T . Set N = 5 &#215; 10 5 . For a fixed b = 50, we collect the statistics of T under different prefetching costs. The results are plotted in Figure <ref type="figure">12b</ref>. For a fixed c = 0.85 and repeat the experiments for different cache sizes, and present the results in Figure <ref type="figure">12c</ref>. As we expect, the required information will increase with the prefetching cost, since Condition C2 of &#960; A becomes stricter for a larger c. Moreover, T can scale considerably with the cache size, because &#964; &#963; and &#964; &#969; will take larger values, and the miss ratio will decrease. Consequently, more future observations are required to check whether Conditions C1 and C2 hold. For a typical scenario where the policy is applied for a fixed cache size and processes requests that keep arriving, the future information required by &#960; A will not scale up.</p><p>Experiment 4. It is shown in Section 7 that the main results of this paper will still hold as long as the prefetched data is loaded into the cache in the time interval (&#964; -&#952; , &#964; 1 ) for R 1 . And the value of &#952; could represent the flexibility of choosing the prefetching time. In this experiment, we show that &#952; could be considerably large by evaluating its statistics for the Zipf's popularity distribution defined in Experiment 1. Specifically, we first set c = 0.85 and b = 50, and conduct simulations for different trace lengths. Then, we repeat the experiment for N = 5 &#215; 10 5 , b = 50 and different prefetching costs. Next, we conduct simulations for N = 5 &#215; 10 5 , c = 0.85 and different cache sizes. The results are shown as the box plots in Figure <ref type="figure">13</ref>, where the symbols represent the same statistics as those in Experiment 3. It can be observed from Figure <ref type="figure">13a</ref> and 13b that &#952; will not be impacted by the trace length or the prefetching cost. The mean of &#952; is round 250, which is considerably large. Moreover, as shown in Figure <ref type="figure">13c</ref>, &#952; will increase with the cache size. For b = 250, the mean of &#952; could be as large as 5907, which brings even more flexibility to choose the prefetching time. The intuition is that, for large cache sizes, the farthest-in-future item will be less popular. And therefore, the most recent time when the farthest-in-future item was requested (i.e., &#964; -&#952; ) is far in the past.  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="9.">Conclusion</head><p>To characterize the fundamental trade-off between prefetching and caching, we developed a cost-based service model and investigated the optimal offline policy, assuming that the entire request sequence is known. We casted it as a min-cost flow problem, and found the optimal policy for a data trace of length N via flow-based algorithms in O(N 3/2 ) expected time. To apply this offline algorithm without the precise knowledge on future requests, we utilized the characteristics of the optimal solution and derived non-trivial conditions for an optimal prefetching and eviction policy. Based on these insights, we proposed a lightweight approximation policy using the predicted requests in the near future. The approximation policy can be applied in real time and process the entire trace in O(N) expected time, with a competitive ratio &#8730; 2 &#8776; 1.4. Extensive experiments verified that it achieves near-optimal average performance for both light and heavy-tailed popularity distributions.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0"><p>The trace is originally used for evaluation and labeled as "cdn1" in<ref type="bibr">[15]</ref>. We use the first one million requests in the trace.</p></note>
		</body>
		</text>
</TEI>
