<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Stochastic Online Metric Matching</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>2019 3rd Quarter (CY)</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10121532</idno>
					<idno type="doi">10.4230/LIPIcs.ICALP.2019.67</idno>
					<title level='j'>International Colloquium on Automata, Languages, and Programming</title>

					<author>Anupam Gupta</author><author>Guru Guruganesh</author><author>Binghui Peng</author><author>David Wajc</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[We study the minimum-cost metric perfect matching problem under online i.i.d. arrivals. We are given a fixed metric with a server at each of the points, and then requests arrive online, each drawn independently from a known probability distribution over the points. Each request has to be matched to a free server, with cost equal to the distance. The goal is to minimize the expected total cost of the matching. Such stochastic arrival models have been widely studied for the maximization variants of the online matching problem; however, the only known result for the minimization problem is a tight O(log n)-competitiveness for the random-order arrival model. This is in contrast with the adversarial model, where an optimal competitive ratio of O(log n) has long been conjectured and remains a tantalizing open question. In this paper, we show that the i.i.d. model admits substantially better algorithms: our main result is an O((log log log n)^2)-competitive algorithm in this model, implying a strict separation between the i.i.d. model and the adversarial and random-order models. Along the way we give a 9-competitive algorithm for line and tree metrics - the first O(1)-competitive algorithm for any non-trivial arrival model for these much-studied metrics.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>We study the minimum-cost metric (perfect) matching problem under online i.i.d. arrivals.</p><p>In this problem, we are given a fixed metric (S, d) with a server at each of the n = |S| points.</p><p>Then n requests arrive online, where each request is at a location that is drawn independently from a known probability distribution D over the points. Each such arriving request has to be matched immediately and irrevocably to a free server, whereupon it incurs a cost equal to the distance from its location to this server. The goal is to minimize the total expected cost.</p><p>The minimization version of online matching was first considered in the standard adversarial setting by Khuller et al. <ref type="bibr">[25]</ref> and Kalyanasundaram and Pruhs <ref type="bibr">[22]</ref>; both papers showed (2n - 1)-competitive deterministic algorithms, and proved that this guarantee is tight for, say, the star metric. About a decade later, a randomized algorithm with O(log^3 n)-competitiveness was given by Meyerson et al. <ref type="bibr">[30]</ref>; this was improved to O(log^2 n) by Bansal et al. <ref type="bibr">[4]</ref>, which remains the best bound known. (The maximization versions of online matching have been very widely studied, but they mostly use unrelated techniques.)</p><p>The competitive ratio model with adversarial online arrivals is often considered too pessimistic, since it assumes an all-powerful adversary. One way to level the playing field, and perhaps to bring the model closer to practice, is to restrict the adversary's power. Two models have been popular here: the random-order arrivals (or secretary) model, and the i.i.d. model defined above. The random-order model is a semi-random model, in which the worst-case input is subjected to random perturbations. Specifically, the adversary chooses a set of requests, which are then presented to the algorithm in a uniformly random order. 
The min-cost online matching problem in this random-order model was studied by Raghvendra, who gave a tight O(log n)-competitive algorithm <ref type="bibr">[35]</ref>. Since the random-order model also captures the i.i.d. setting, the natural goal is to obtain a better algorithm for the i.i.d. model. Indeed, our main result gives exactly such an improvement: &#9710; Theorem 1.1 (Main Theorem). There is an O((log log log n)^2)-competitive algorithm for online minimum-cost metric perfect matching in the i.i.d. setting.</p><p>Observe that the competitiveness here is better than the lower bounds of &#8486;(log n) known for the worst-case and random-order models.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Matching on the Line and Trees.</head><p>There has also been much interest in solving the problem on the line metric. However, getting better results for the line than for general metrics has been elusive: an O(log n)-competitive randomized algorithm for line metrics (and for doubling metrics) was given by <ref type="bibr">[18]</ref>. In the deterministic setting, Nayyar and Raghvendra <ref type="bibr">[34]</ref> recently gave an O(log^2 n)-competitive algorithm, whose competitive ratio was subsequently shown to be O(log n) by Raghvendra <ref type="bibr">[36]</ref>, improving on the o(n)-competitive algorithm of Antoniadis et al. <ref type="bibr">[2]</ref>. To the best of our knowledge, nothing better is known for tree metrics than for general metrics in either the adversarial or the random-order model. Our second result is a constant-competitive algorithm for tree metrics in the i.i.d. model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>&#9710; Theorem 1.2 (Algorithm for Trees).</head><p>There is a 9-competitive algorithm for online minimum-cost metric perfect matching on tree metrics in the i.i.d. setting.</p><p>Max-Weight Perfect Matching. Recently, Chang et al. <ref type="bibr">[7]</ref> presented a 1/2-competitive algorithm for the maximum-weight perfect matching problem in the i.i.d. setting. We show that our algorithm is versatile: a small change to it yields a maximization variant matching this factor of 1/2. Our approach differs from that of <ref type="bibr">[7]</ref> in that we match an arriving request based on the realization of free servers, while they do so based on the "expected realization". See the full version for details.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1">Our Techniques</head><p>Both Theorems 1.1 and 1.2 are achieved by the same algorithm. The first observation guiding this algorithm is that we may assume that the distribution D of request locations is simply the uniform distribution over the server locations. (In the full version we show how this assumption can be removed with a constant-factor loss in the competitiveness.) Our algorithm is inspired by the following two complementary consequences of the uniformity of D.</p><p>Firstly, each of the n - t + 1 free servers' locations at time t is equally likely to receive a request in the future, and as such they should be left unmatched with equal probability. Put otherwise, we should match each of them with equal probability 1/(n - t + 1). However, matching any arriving request to a uniformly random free server is easily shown to be a bad choice.</p><p>So instead, we rely on the second observation: the t-th request is equally likely to arrive at each of the n server locations. This means we can couple the matching of free server locations with the location of the next request, to guarantee a marginal probability of 1/(n - t + 1) for each free server to be matched at time t. Indeed, the constraints that each location is matched at time t with probability 1/n (i.e., if a request arrives there) and that each free server is matched with marginal probability 1/(n - t + 1) can be expressed as a bipartite flow instance, which guides the coupling used by the algorithm. Loosely speaking, our algorithm is fairly intuitive: it finds a min-cost fractional matching between the current free server locations and the expected arrivals, and uses it to match new requests. 
The challenge is to bound the competitive ratio: in contrast to previously used approaches (for the maximization version of the problem), the algorithm does not match vertices using a fixed template of choices, but rather dynamically recomputes a template after each arrival.</p><p>A major advantage of this approach is that we understand the distribution of the free servers. We maintain the invariant that after t steps, the set of free servers forms a uniformly random (n - t)-subset of [n], the randomness being over our choices and over the randomness of the input. This allows us to relate the cost of the algorithm in the t-th step to the expected cost of the optimal flow between the original n points and a uniformly random subset of n - t of these points. The latter expected cost is just a statistic of the metric, and does not depend on our algorithm's past choices. For paths and trees, we bound this quantity explicitly by considering the variance across edge-cuts in the tree; this gives us the proof of Theorem 1.2.</p><p>Since general metrics do not have any usable cut structure, we need a different idea for Theorem 1.1. We show that tree-embedding results can be used either explicitly in the algorithm or just implicitly in the proof, but both approaches lose an O(log n) factor. To avoid this loss, we use a different balls-and-bins argument to improve our algorithm's competitiveness to O((log log n)^2). In particular, we provide better bounds on our algorithm's per-step cost in terms of E[OPT] and the expected load of the k most loaded bins in a balls-and-bins process, corresponding to the number of requests at the k most frequently-requested servers. Specifically, we show that E[OPT] is bounded in terms of the expected imbalance between the number of requests and servers at these top k server locations. 
Coupling this latter uniform k-tuple with the uniform k-tuple of free servers left by our algorithm, we obtain improved bounds on the per-step cost of our algorithm in terms of E[OPT] and these bins' loads, yielding an improved O((log log n)^2) competitive ratio. Interestingly, combining the balls-and-bins and tree-embedding bounds on the per-step cost of step k (appealing to different bounds for different ranges of k) gives a further improvement: we prove that our algorithm is O((log log log n)^2)-competitive. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.2">Further Related Work</head><p>I.i.d. stochastic arrivals have been studied for various online problems, e.g., Steiner tree/forest <ref type="bibr">[15]</ref>, set cover <ref type="bibr">[17]</ref>, and k-server <ref type="bibr">[9]</ref>. Closer to our work, stochastic arrivals have been widely studied in the online matching literature, though so far mostly for maximization variants. Much of this work was motivated by applications to online advertising, for which the worst-case optimal (1 - 1/e)-competitive ratios <ref type="bibr">[24,</ref><ref type="bibr">29,</ref><ref type="bibr">1]</ref> seem particularly pessimistic, given the financial incentives involved and the time-learned information about the distribution of requests. Consequently, many stochastic arrival models have been studied and shown to admit better-than-(1 - 1/e) competitive guarantees. The stochastic models studied for online matching and related problems, in increasing order of attainable competitive ratios, include random order (e.g., <ref type="bibr">[16,</ref><ref type="bibr">23,</ref><ref type="bibr">27]</ref>), unknown i.i.d., where the request distribution is unknown (e.g., <ref type="bibr">[10,</ref><ref type="bibr">31]</ref>), and known i.i.d. (e.g., <ref type="bibr">[13,</ref><ref type="bibr">3,</ref><ref type="bibr">6]</ref>). Additional work has focused on interpolating between adversarial and stochastic input (e.g., <ref type="bibr">[11,</ref><ref type="bibr">26]</ref>). See Mehta's survey <ref type="bibr">[28]</ref> and recent work <ref type="bibr">[8,</ref><ref type="bibr">19,</ref><ref type="bibr">21,</ref><ref type="bibr">20,</ref><ref type="bibr">14,</ref><ref type="bibr">33]</ref> for more details. 
This long line of work on online matching, under both adversarial and stochastic arrivals, has yielded a slew of algorithm design ideas, which unfortunately do not seem to carry over to minimization problems, nor to perfect matching problems.</p><p>As mentioned above, the only prior work for stochastic online matching with a minimization objective is the random-order arrival result of Raghvendra <ref type="bibr">[35]</ref>. We are hopeful that our work will spur further research in online minimum-cost perfect matching under stochastic arrivals, and close the gap between our upper bounds and the (trivial) lower bounds for the problem.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Our Algorithm</head><p>In this section we present our main algorithm, together with some of its basic properties.</p><p>Throughout the paper we assume that the distribution over request locations is uniform over the n servers' locations. We show in the full version that this assumption is essentially without loss of generality: it increases the competitive ratio by at most a constant factor. In particular, we show the following.</p><p>&#9710; Lemma 2.1. Given an &#945;-competitive algorithm ALG_U for the uniform distribution U over server locations, we can construct a (2&#945; + 1)-competitive algorithm ALG_D for any distribution D.</p><p>Focusing on the uniform distribution over server locations, our algorithm is loosely the following: in each round, we compute an optimal fractional matching between the remaining free servers and the remaining requests (in expectation). When a new request arrives, we match it according to this fractional matching.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Notation</head><p>Our analysis will consider k-samples from the set S = [n] both with and without replacement. We set up the following notation to distinguish them: Let I_k be the distribution over k-sub-multisets of S = [n] obtained by taking k i.i.d. samples from the uniform distribution over S. (E.g., I_n is the request set's distribution.) Let U_k be the distribution obtained by picking a uniformly random k-subset of S. In other words, I_k is the distribution obtained by picking k elements of S uniformly with replacement, whereas U_k is the same without replacement.</p><p>For a sub-(multi)set T &#8838; S of servers, let M(T) denote the optimal fractional min-cost b-matching in the bipartite graph induced between T and the set of all locations S, with overall unit capacity on either side. That is, the capacity of each node in T is 1/|T| and the capacity of each node in S is 1/n. So, denoting by d_{i,j} the distance between locations i and j, we let M(T) correspond to the following linear program:</p><p>minimize &#8721;_{i &#8712; T, j &#8712; S} d_{i,j} &#183; x_{i,j}, subject to &#8721;_{j &#8712; S} x_{i,j} = 1/|T| for all i &#8712; T; &#8721;_{i &#8712; T} x_{i,j} = 1/n for all j &#8712; S; and x_{i,j} &#8805; 0.</p><p>We emphasize that in the above LP, several servers in S (and likewise in T) may happen to be at the same point in the metric space, and hence there is a separate constraint for each such point j (and likewise i). Slightly abusing notation, we let M(T) denote both the LP and its optimal value when there is no scope for confusion.</p></div>
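As a concrete illustration, the transportation LP above can be solved with any off-the-shelf LP solver. The sketch below is our own illustrative code, not from the paper; the helper name is made up, and SciPy's linprog is assumed to be available. It sets up exactly the constraints just described, with capacity 1/|T| per server in T and 1/n per location:

```python
import numpy as np
from scipy.optimize import linprog

def matching_lp_value(dist, T):
    """Optimal value of the fractional b-matching M(T): each server in T
    has capacity 1/|T|, each of the n locations has capacity 1/n."""
    n = dist.shape[0]
    k = len(T)
    # Variables x[i][j] for i in T, j in S, flattened row-major.
    c = np.array([dist[i][j] for i in T for j in range(n)])
    rows, rhs = [], []
    for a in range(k):  # server a in T is matched to total extent 1/k
        row = np.zeros(k * n)
        row[a * n:(a + 1) * n] = 1.0
        rows.append(row); rhs.append(1.0 / k)
    for j in range(n):  # location j is matched to total extent 1/n
        row = np.zeros(k * n)
        row[j::n] = 1.0
        rows.append(row); rhs.append(1.0 / n)
    res = linprog(c, A_eq=np.array(rows), b_eq=rhs, bounds=(0, None))
    return res.fun, res.x.reshape(k, n)
```

For instance, on the path metric 0-1-2-3 with unit edges and T = {0, 1}, the optimal value is 1.0: each server keeps its own location, and the remaining mass at locations 2 and 3 is routed to the nearer servers.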
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Algorithm Description</head><p>The algorithm works as follows: at each time k, if S_k &#8838; S is the current set of free servers, we compute the fractional assignment M(S_k) and assign the next request randomly according to it. As argued above, since each free server location is equally likely to receive a request later (and is therefore equally worth leaving unmatched), it seems fair to leave each free server unmatched with equal probability. Put otherwise, it is only fair to match each of these servers with equal probability. Of course, matching any arriving request to a free server chosen uniformly at random can be a terrible strategy; in particular, it is easily shown to be &#8486;(&#8730;n)-competitive for n servers equally partitioned between the two points of a two-point metric. Therefore, to obtain good expected matching cost, we should bias the servers' matching probabilities according to the arrived request, and in particular according to M(S_k). This intuition guides our algorithm fair-bias, and also inspires its name.</p><p>Algorithm 1 fair-bias. For k = n, n - 1, ..., 1: compute the optimal fractional matching M(S_k), denoted by x^{S_k}; upon arrival of request r, randomly choose a server s from S_k, where s_i is chosen with probability n &#183; x^{S_k}_{s_i, r}, and assign r to s.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Then set S_{k-1} &#8592; S_k \ {s} and continue to the next arrival.</p><p>A crucial property of our algorithm is that the set S_k of free servers at each time k is a uniformly random k-subset of S. Recall that fair-bias assigns each arriving request according to the assignment M(S_k). This means that to analyze the algorithm, it suffices to relate the optimal assignment cost OPT to the optimal assignment costs of uniformly random subsets S_k, as follows.</p><p>ICALP 2019</p></div>
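This uniformity property is easy to check empirically. The sketch below is our own illustration (function names are made up; SciPy's linprog is assumed): it runs fair-bias on a small line metric and verifies that each server remains free with the marginal probability a uniformly random free set would give. The key point is that any optimal x for M(S_k) has row sums 1/k, so each free server is matched with marginal probability exactly 1/k per step:

```python
import numpy as np
from scipy.optimize import linprog

def lp_matching(dist, free):
    # Fractional matching between the free servers (capacity 1/k each)
    # and all n locations (capacity 1/n each); returns a k-by-n matrix.
    n = dist.shape[0]
    k = len(free)
    c = dist[free, :].ravel()
    rows, rhs = [], []
    for a in range(k):  # each free server is matched to total extent 1/k
        row = np.zeros(k * n)
        row[a * n:(a + 1) * n] = 1.0
        rows.append(row); rhs.append(1.0 / k)
    for j in range(n):  # each location receives total extent 1/n
        row = np.zeros(k * n)
        row[j::n] = 1.0
        rows.append(row); rhs.append(1.0 / n)
    res = linprog(c, A_eq=np.array(rows), b_eq=rhs, bounds=(0, None))
    return res.x.reshape(k, n)

def fair_bias(dist, arrivals, rng):
    # Match each arrival r to free server a with probability n * x[a, r].
    n = dist.shape[0]
    free = list(range(n))
    cost = 0.0
    for r in arrivals:
        x = lp_matching(dist, free)
        probs = np.clip(n * x[:, r], 0.0, None)
        probs = probs / probs.sum()  # guard against solver round-off; column sums to 1/n
        a = rng.choice(len(free), p=probs)
        cost += dist[free[a], r]
        free.pop(a)
    return cost, free
```

Averaging over many runs with uniform arrivals, each server ends up free with frequency close to k/n after n - k arrivals, matching the structure lemma below.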
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Stochastic Online Metric Matching</head><p>&#9710; Lemma 2.2 (Structure Lemma). For each time k, the set S_k is a uniformly-drawn k-subset of S; i.e., S_k &#8764; U_k. Consequently, the algorithm's cost is E[ALG] = &#8721;_{k=1}^{n} E_{S_k &#8764; U_k}[M(S_k)].</p><p>Proof. The proof of the first claim is a simple induction from n down to 1. The base case of S_n is trivial. For any k-subset T &#8838; S, we have Pr[S_k = T] = &#8721;_{s &#8712; S \ T} Pr[S_{k+1} = T &#8746; {s}] &#183; Pr[s is matched | S_{k+1} = T &#8746; {s}] = (n - k) &#183; (1/binom(n, k+1)) &#183; (1/(k+1)) = 1/binom(n, k), where the second equality follows from induction and the fact that, conditioned on S_{k+1}, each free server is matched with marginal probability exactly 1/(k+1).</p><p>To compute the algorithm's cost, consider some set S_k = T of k free servers. Since each request location r is realized with probability 1/n, following which we match it to free server s &#8712; S_k with probability n &#183; x^{S_k}_{s,r}, the next edge matched by the algorithm has expected cost &#8721;_{r &#8712; S} (1/n) &#8721;_{s &#8712; S_k} n &#183; x^{S_k}_{s,r} &#183; d_{s,r} = &#8721;_{s,r} x^{S_k}_{s,r} &#183; d_{s,r} = M(S_k).</p><p>Therefore, the expected cost of the algorithm is indeed E[ALG] = &#8721;_{k=1}^{n} E_{S_k &#8764; U_k}[M(S_k)].</p><p>The structure lemma implies that we may assume from now on that the set of free servers S_k is drawn from U_k; in what follows, unless stated otherwise, we have S_k &#8764; U_k. More importantly, Lemma 2.2 implies that to bound our algorithm's competitive ratio by &#945;, it suffices to show that &#8721;_{k=1}^{n} E_{S_k &#8764; U_k}[M(S_k)] &#8804; &#945; &#183; E[OPT]. This is exactly the approach we use in the following sections.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Bounds for General Metrics</head><p>In Section 4 we will show that algorithm fair-bias is O(1)-competitive for line metrics (and more generally tree metrics), by relying on variance bounds on the number of matches across tree edges in OPT and M(S_k), our algorithm's guiding LP. For general metrics, if we first embed the metric in a low-stretch tree metric <ref type="bibr">[12]</ref> (blowing up E[OPT] by an O(log n) factor) and run fair-bias on the resulting metric, we immediately obtain an O(log n)-competitive algorithm. In fact, explicitly embedding the input metric into a tree metric is not necessary to obtain this result: by relying on an implicit tree embedding, we obtain the following lemma (mirroring the variance-based bound underlying our result for tree metrics). Its proof is deferred to the full version.</p><p>&#9710; Lemma 3.1. For every k &#8712; [n], E_{S_k &#8764; U_k}[M(S_k)] &#8804; O(log n / n) &#183; E[OPT].</p><p>Summing over all values of k &#8712; [n], we find that fair-bias is O(log n)-competitive on general metrics. While this bound is no better than that of Raghvendra's t-net algorithm for random-order arrivals <ref type="bibr">[35]</ref> (and therefore for i.i.d. arrivals), it will prove useful in our overall bound. In Sections 3.1 and 3.2, we use a different balls-and-bins argument to decrease our bound on the algorithm's competitive ratio considerably, to O((log log n)^2), by considering the imbalance between the number of requests and servers at the top k most-requested locations. (The former quantity corresponds to the load of the k most loaded bins in a balls-and-bins process, motivating our interest in this process.) Finally, in Section 3.3, we combine this improved bound with Lemma 3.1, summing different bounds over different ranges of k, to prove our main result: an O((log log log n)^2) bound on our algorithm's competitive ratio.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Balls and Bins: The Poisson Paradigm</head><p>For our results, we need some technical facts about the classical balls-and-bins process.</p><p>The following standard lemma, from [32, Theorem 5.10], allows us to use the Poisson distribution to approximate monotone functions of the bin loads. For i &#8712; [n], let X^m_i be a random variable denoting the number of balls that fall into the i-th bin when we throw m balls into n bins, and let Y^m_i be independent draws from the Poisson distribution with mean m/n.</p><p>&#9710; Lemma 3.2 ([32, Theorem 5.10]). Let f(x_1, ..., x_n) be a nonnegative function such that E[f(Y^m_1, ..., Y^m_n)] is monotonically increasing or decreasing in m. Then E[f(X^m_1, ..., X^m_n)] &#8804; 2 &#183; E[f(Y^m_1, ..., Y^m_n)].</p><p>A classic result states that for m = n balls, the maximum bin load is &#920;(log n / log log n) w.h.p. (see e.g., <ref type="bibr">[32]</ref>). The following lemma is a partial generalization of this result. Its proof, which relies on the Poisson approximation of Lemma 3.2, is deferred to the full version. &#9710; Lemma 3.3. Let n balls be thrown into n bins, each ball thrown independently and uniformly at random. Let L_j be the load of the j-th heaviest bin, and let N_k := &#8721;_{j &#8804; k} L_j be the number of balls in the k most loaded bins. There exists a constant C_0 &gt; 0 such that for any k &#8804; C_0 n,</p><p>In the next lemma, whose proof is likewise deferred to the full version, we rely on a simple Chernoff bound to give a weaker lower bound on E[N_k] that holds for all k &#8804; n/2.</p></div>
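The quantity N_k is easy to explore by simulation. The following sketch is our own illustration (not code from the paper); it throws n balls into n bins and returns the occupancy of the k most loaded bins:

```python
import random
from collections import Counter

def top_k_occupancy(n, k, rng):
    """Throw n balls into n bins u.a.r. and return N_k, the total number
    of balls landing in the k most loaded bins."""
    loads = Counter(rng.randrange(n) for _ in range(n))
    heaviest = sorted(loads.values(), reverse=True)
    return sum(heaviest[:k])
```

Averaging over trials approximates E[N_k]. Since the top-k average load is at least the global average of one, N_k is always at least k; for n = 1000 and small k it is markedly larger, which is exactly the imbalance exploited in Section 3.2.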
<div xmlns="http://www.tei-c.org/ns/1.0"><head>&#9710; Lemma 3.4. For sufficiently large n and any</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Relating Balls and Bins to Stochastic Metric Matching</head><p>We now bound the expected cost incurred by fair-bias at time k by appealing to the above balls-and-bins bounds; this will give us our stronger bound of O((log log n)^2). Specifically, we will derive another lower bound on E[OPT] in terms of E_{S_k &#8764; U_k}[M(S_k)]. In our bounds, we partition the probability space I_n (corresponding to n i.i.d. requests) into disjoint parts, based on T_k, the set of the top k most frequently requested locations (with ties broken uniformly at random). By symmetry, Pr[T_k = T] = 1/binom(n, k) for every k-subset T of S. By coupling T_k with U_k, we will lower-bound E[OPT] in terms of E_{S_k &#8764; U_k}[M(S_k)] and the expected imbalance between the number of requests and servers in T_k. Here E[N_k] is the expected occupancy of the k most loaded bins in the balls-and-bins process discussed in Section 3.1.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>To relate E[OPT | T_k = S_k] to M(S_k), we bound both quantities by the cost of a min-cost perfect b-matching between S_k and S \ S_k; i.e., each vertex v has some (possibly fractional) demand b_v, which is the extent to which it must be matched. To this end, we need the following simple lemma, which asserts that for any min-cost metric b-matching instance, there exists an optimal solution that matches co-located servers and requests maximally. We defer the lemma's proof, which follows from a local-change argument and the triangle inequality, to the full version.</p><p>&#9710; Lemma 3.5. Let I be a fractional min-cost bipartite metric b-matching instance, with demands &#8467;_i and r_i for the servers and requests at location i. Then, there exists an optimal solution x for I with x_{ii} = min{&#8467;_i, r_i} for every point i in the metric.</p><p>We are now ready to prove our main technical lemma, lower-bounding E[OPT | T_k = S_k] in terms of M(S_k) and the imbalance between the number of requests at the k most requested locations, N_k, and the number of servers at those locations.</p><p>Proof. Applying Lemma 3.5 to M(S_k), we find that the optimal value of M(S_k) equals that of a min-cost bipartite perfect b-matching instance with left vertices associated with S_k, each with demand 1/k - 1/n, and right vertices associated with S \ S_k, each with demand 1/n; we note that this implies our desired bound, as</p><p>It remains to lower-bound E[OPT | T_k = S_k] in terms of such a biregular b-matching instance.</p><p>For the remainder of this proof, for notational simplicity we denote by &#8486; the probability space induced by conditioning on the event T_k = S_k. To lower-bound E_&#8486;[OPT], we will provide a fractional perfect matching x of the expected instance (in &#8486;), and show that</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>With this, we will find that the min-cost biregular bipartite perfect b-matching above lower-bounds E_&#8486;[OPT]. We now turn to producing an x satisfying our desired properties.</p><p>For any two locations i, j &#8712; S, we let (i, j) &#8712; OPT indicate that a request at location i is served by the server at location j. Let p_{ij} := Pr_&#8486;[(i, j) &#8712; OPT]. We will show how small modifications to p yield a fractional perfect matching x as discussed in the previous paragraph. Let Y_i be the number of requests at server i. By Lemma 3.5, we know that (i, i) &#8712; OPT to extent min{Y_i, 1}. Consequently, if we let &#8710;_in(j) := &#8721;_{j' &#8712; S \ {j}} p_{j'j} and &#8710;_out(j) := &#8721;_{j' &#8712; S \ {j}} p_{jj'}, we have by Lemma 3.5 (conditioning on the complementary event is similar) that p_{ji} = 0 for all i &#8712; S_k and j &#8712; S \ {i}. Moreover, by symmetry, &#8710;_out(i) = (E[N_k] - k)/k for all k locations i &#8712; S_k. We now show how to obtain from p a fractional matching x between S_k and S \ S_k of no greater cost than p, such that x_{jj'} = 0 for all j &#8800; j' &#8712; S \ S_k, and such that the values &#8710;_in(j) - &#8710;_out(j) are unchanged for all j &#8712; S. Consequently, all (simple) edges in the support of x go between S_k and S \ S_k, and &#8710;_in(j) - &#8710;_out(j) is preserved for all j &#8712; S \ S_k, yielding our desired lower bound on E_&#8486;[OPT] in terms of a biregular bipartite b-matching instance.</p><p>We start by setting x &#8592; p. While there exists a pair j &#8800; j' &#8712; S \ S_k with x_{j'j} &gt; 0, we pick such a pair. As &#8710;_in(j) - &#8710;_out(j) &#8805; 0, there must also be some flow coming into j. We follow a sequence of edges j_1 &#8592; j_2 &#8592; j_3 &#8592; ..., with each j_r &#8712; S \ S_k and with x_{j_r j_{r-1}} &gt; 0, until we either repeat some j_r &#8712; S \ S_k or reach some j_r with x_{i j_r} &gt; 0 for some i &#8712; S_k. (Note that one of these cases must happen, as &#8710;_in(j) - &#8710;_out(j) &#8805; 0 for all j &#8712; S \ S_k.) If we repeat a vertex j_r, we only consider the sequence of nodes given by the obtained cycle. Let &#949; = min_r x_{j_r j_{r-1}} be the smallest x-value in our trail. If we repeated a vertex, we found a cycle, and we decrease x_{jj'} by &#949; for all consecutive j, j' in the cycle. If we found some i &#8712; S_k with x_{i j_r} &gt; 0, we decrease all x-values along the path (including x_{i j_r}) by &#949; and increase x_{i j_1} by &#949;. In both cases, we only decrease the cost of x (either trivially, or by the triangle inequality), and we do not change &#8710;_in(j) - &#8710;_out(j) for any j &#8712; S, while decreasing &#8721;_{j &#8800; j' &#8712; S \ S_k} x_{jj'}. As the initial x-values are all rational, repeating the above terminates with this sum equal to zero, which yields a biregular fractional solution x as required. The lemma follows. &#9709;</p><p>Coupling the distribution of T_k with the set of k free servers, we obtain the following.</p><p>Proof. Taking expectations over S_k &#8764; U_k, we obtain our claimed bound.</p><p>Plugging the lower bounds of Lemmas 3.3 and 3.4 on the top k most loaded bins' occupancy, E[N_k], into this bound, we obtain the following bounds on fair-bias's per-step cost in terms of E[OPT].</p><p>&#9710; Lemma 3.8. For C_0 the constant of Lemma 3.3, there exists a constant C such that</p><p>The following lemma allows us to leverage Lemma 3.8, as it allows us to focus on E_{S_k &#8764; U_k}[M(S_k)] for k &#8804; n/2. Its proof relies on our characterization of M(S_k) in terms of a balanced b-matching instance between S_k and S \ S_k, as in the proof of Lemma 3.6, which implies that M(S_k) &#8804; M(S_{n-k}) for all k &#8804; n/2. 
Its proof is deferred to the full version.</p><p>&#9710; Lemma 3.9.</p><p>Using the upper bound on E_{S_k &#8764; U_k}[M(S_k)] of Lemma 3.8 and summing over the two ranges of k &#8804; n/2 in Lemma 3.9, we find that fair-bias is O((log log n)^2)-competitive. We do not elaborate on this here, as we obtain an even better bound in the following section. </p></div>
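The characterization of M(S_k) as a balanced b-matching between S_k and S \ S_k, used throughout this section, can be sanity-checked numerically. The sketch below is our own illustration (helper names are made up; SciPy's linprog is assumed): on a small line metric it compares M(S_k) against the reduced instance with demand 1/k - 1/n on each server of S_k and 1/n on each location of S \ S_k, after co-located pairs are matched at zero cost as in Lemma 3.5:

```python
import numpy as np
from scipy.optimize import linprog

def transport(cost, supply, demand):
    # Min-cost transportation LP: ship 'supply' at rows to 'demand' at columns.
    m, n = cost.shape
    rows, rhs = [], []
    for i in range(m):
        row = np.zeros(m * n)
        row[i * n:(i + 1) * n] = 1.0
        rows.append(row); rhs.append(supply[i])
    for j in range(n):
        row = np.zeros(m * n)
        row[j::n] = 1.0
        rows.append(row); rhs.append(demand[j])
    res = linprog(cost.ravel(), A_eq=np.array(rows), b_eq=rhs, bounds=(0, None))
    return res.fun

# Line metric on n = 4 points; free servers S_k = {0, 1}, so k = 2.
n, S_k = 4, [0, 1]
k = len(S_k)
rest = [j for j in range(n) if j not in S_k]
dist = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)

# M(S_k): servers in S_k (capacity 1/k each) vs. all n locations (1/n each).
full = transport(dist[S_k, :], [1 / k] * k, [1 / n] * n)

# Reduced b-matching: each server first absorbs its own location's 1/n mass
# at zero cost, leaving demand 1/k - 1/n on S_k and 1/n on S \ S_k.
reduced = transport(dist[np.ix_(S_k, rest)], [1 / k - 1 / n] * k, [1 / n] * len(rest))
```

The two values coincide, as the proof of Lemma 3.6 asserts.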
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Our Main Result</head><p>We are now ready to prove our main result, by combining the per-step cost bounds given by our balls-and-bins argument (Lemma 3.8) and our implicit tree-embedding argument (Lemma 3.1).</p><p>&#9710; Theorem 3.10. Algorithm fair-bias is O((log log log n)^2)-competitive for the online bipartite metric matching problem under i.i.d. arrivals on general metrics.</p><p>Proof. By the structure lemma (Lemma 2.2) and Lemma 3.9, we have that</p><p>We use the three bounds from Lemma 3.1 and Lemma 3.8 for different ranges of k to bound the above sum. Specifically, relying on Lemma 3.1 for k &#8804; n/log^2 n, we have that</p><p>Next, by the first bound of Lemma 3.8, applied to k &#8712; [n/log^2 n, C_0 n], we have that</p><p>Finally, by the second bound of Lemma 3.8, applied to k &#8805; C_0 n, we have that</p><p>Combining all three bounds with Equation (1), the theorem follows. &#9709;</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">A Simple O(1) Bound for Tree Metrics</head><p>In this section we show the power of the structure lemma, by analyzing fair-bias on tree metrics. Recall that a tree metric is defined by shortest-path distances in a tree T = (V, E) with edge lengths d_e. By adding zero-length edges, we may assume that the tree has n leaves and that the servers are at the leaves of the tree. For any edge e in the tree, deleting this edge creates two components; let T_1(e) denote the component with fewer leaves, say n_e &#8804; n/2 of them. Our starting point is the observation that sampling without replacement is no costlier, in expectation, than sampling with replacement: E_{S_k &#8764; U_k}[M(S_k)] &#8804; E_{S_k &#8764; I_k}[M(S_k)].</p><p>We defer the proof of this inequality to the full version, where we prove a more general statement regarding stochastic convex optimization with constraints and coefficients determined by elements of a set chosen uniformly with and without replacement. Armed with this, it suffices to bound E_{S_k &#8764; I_k}[M(S_k)] from above, which we do in the following.</p><p>&#9710; Lemma 4.5. E_{S_k &#8764; I_k}[M(S_k)] &#8804; &#8721;_e d_e &#183; &#8730;((n_e/n)(1 - n_e/n)/k).</p><p>Proof. Fix some edge e and let T_1(e) be its smaller subtree, containing n_e &#8804; n/2 leaves. Let X_e &#8764; Bin(k, n_e/n) be the random variable denoting the number of servers in T_1(e) chosen in k i.i.d. samples from S. For any given realization of S_k (and therefore of X_e), the fractional solution to M(S_k) uses the edges between the two subtrees of e to extent exactly |X_e/k - n_e/n|. Since this is a tree metric, we have M(S_k) = &#8721;_e d_e &#183; |X_e/k - n_e/n|.</p><p>Taking expectations over S_k, and using the fact that the mean absolute deviation is always upper-bounded by the standard deviation (by Jensen's inequality), we find that indeed E_{S_k &#8764; I_k}[M(S_k)] &#8804; &#8721;_e d_e &#183; &#8730;((n_e/n)(1 - n_e/n)/k).</p><p>We can now prove our simple result for tree metrics.</p><p>&#9710; Theorem 4.6 (Tree Bound). Algorithm fair-bias is 4-competitive on tree metrics with n &#8805; 2 nodes, if the requests are drawn from the uniform distribution.</p><p>Proof. We have by the structure lemma (Lemma 2.2) and Lemma 4.5 that</p><p>The above bound holds for all n &#8805; 2 (for n = 1, any algorithm is trivially 1-competitive). 
For large n, however, our proof yields an improved asymptotic bound of <ref type="formula">1</ref>. Combining Theorem 4.6 with our transshipment argument (Lemma 2.1), we obtain a 9-competitive algorithm under any i.i.d. distribution on tree metrics with n &#8805; 2 nodes, and even better guarantees for large enough n.</p></div>
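The edge-cut deviation bound used in this section is easy to verify numerically. The sketch below is our own illustration (not code from the paper): it estimates E|X_e/k - n_e/n| for a single tree edge by simulation and compares it against the standard-deviation bound from Jensen's inequality:

```python
import math
import random

def edge_deviation(n, k, n_e, trials, rng):
    """Estimate E|X_e/k - n_e/n| for X_e ~ Bin(k, n_e/n), and return it
    together with the bound sqrt(p(1 - p)/k) from Jensen's inequality."""
    p = n_e / n
    total = 0.0
    for _ in range(trials):
        x_e = sum(1 for _ in range(k) if p > rng.random())  # one Bin(k, p) sample
        total += abs(x_e / k - p)
    mad = total / trials
    return mad, math.sqrt(p * (1 - p) / k)
```

Multiplying each edge's bound by d_e and summing over the tree's edges reproduces the upper bound on E_{S_k ~ I_k}[M(S_k)] used above; empirically the mean absolute deviation sits noticeably below the standard-deviation bound.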
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Open Questions</head><p>In this work, we presented algorithm fair-bias and proved that it is O((log log log n)^2)-competitive for general metrics, and 9-competitive for tree metrics. Perhaps the first question is whether our algorithm (or indeed any algorithm) is O(1)-competitive for (known or unknown) i.i.d. arrivals on general metrics. Indeed, we do not know of any instance where fair-bias's performance is worse than O(1)-competitive; however, it is not clear how to extend our proofs to establish an O(1) competitive ratio. Another question is the relationship between the known and unknown i.i.d. models and the random-order model. The optimal competitive ratios for the various arrival models for online problems can be sorted as follows: C.R.(Adversarial) &#8805; C.R.(Random Order) &#8805; C.R.(Unknown IID) &#8805; C.R.(Known IID). For the online metric matching problem, the best bounds known for the above are, respectively, O(log^2 n) <ref type="bibr">[4]</ref>, &#920;(log n), O(log n) (both <ref type="bibr">[35]</ref>), and O((log log log n)^2) (this work). Given the lower bound of <ref type="bibr">[35]</ref>, our work implies that one or both of the inequalities in C.R.(Random Order) &#8805; C.R.(Unknown IID) &#8805; C.R.(Known IID) is strict (and asymptotically so). It would be interesting to determine which of these inequalities is strict, by either presenting an o(log n)-competitive algorithm for unknown i.i.d. arrivals or proving an &#969;((log log log n)^2) lower bound for this model. For the line metric, we give the first constant-competitive algorithm for this well-studied metric under any non-trivial arrival model. Extending this result, and more generally understanding the exact relationships between these arrival models for this simple metric, may prove useful in understanding the relationships between the different stochastic arrival models more broadly. 
Moreover, it would be interesting to study these questions for other combinatorial optimization problems with online stochastic arrivals.</p></div></body>
		</text>
</TEI>
