<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>SRPT Scheduling Discipline in Many-Server Queues with Impatient Customers</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>10/21/2021</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10317892</idno>
					<idno type="doi">10.1287/mnsc.2021.4110</idno>
					<title level='j'>Management Science</title>
<idno>0025-1909</idno>
<biblScope unit="volume">67</biblScope>
<biblScope unit="issue">12</biblScope>					

					<author>Jing Dong</author><author>Rouba Ibrahim</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[The shortest-remaining-processing-time (SRPT) scheduling policy has been extensively studied, for more than 50 years, in single-server queues with infinitely patient jobs. Yet, much less is known about its performance in multiserver queues. In this paper, we present the first theoretical analysis of SRPT in multiserver queues with abandonment. In particular, we consider the M/GI/s+GI queue and demonstrate that, in the many-sever overloaded regime, performance in the SRPT queue is equivalent, asymptotically in steady state, to a preemptive two-class priority queue where customers with short service times (below a threshold) are served without wait, and customers with long service times (above a threshold) eventually abandon without service. We prove that the SRPT discipline maximizes, asymptotically, the system throughput, among all scheduling disciplines. We also compare the performance of the SRPT policy to blind policies and study the effects of the patience-time and service-time distributions.      This paper was accepted by Baris Ata, stochastic models & simulation.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In this paper, we study scheduling decisions in multiserver queues with processing-time information.</p><p>In particular, we assume that the service requirements of individual jobs (customers) are known upon arrival and may be used to determine the order in which these jobs are processed. When processing times are known, it seems natural to give priority to the jobs with the shorter remaining processing times in order to minimize the mean sojourn time, i.e., the time from arrival until departure from the system. Indeed, a large body of literature shows that the shortest-remainingprocessing-time (SRPT) policy has, in general, superior performance; see, for example, <ref type="bibr">Schrage and Miller (1966)</ref>, <ref type="bibr">Lin et al. (2011)</ref>, and <ref type="bibr">Banerjee et al. (2020)</ref>, to name a few. The SRPT policy has been extensively studied for over 50 years, yet exclusively in single-server queues with infinitely patient jobs. There is only one exception: The recent paper by <ref type="bibr">Grosof et al. (2018)</ref> studies the performance of the SRPT policy in a multiserver queueing system with Poisson arrivals, general service times, and no abandonment. <ref type="bibr">Grosof et al. (2018)</ref> prove that the SRPT policy achieves an asymptotically optimal mean sojourn time in the conventional heavy traffic regime, i.e., where the number of servers in a sequence of multiserver queues is held fixed, while the arrival rates along that sequence approach the total service capacity.</p><p>In settings where jobs are human customers to be scheduled, e.g., in service systems, it is important to account for finite customer patience times. That is, customers do not wait indefinitely for service, and they abandon the queue if their waiting time exceeds their patience time. For example, there is empirical evidence substantiating finite customer patience in emergency departments <ref type="bibr">(Batt and Terwiesch 2015)</ref> and call centers <ref type="bibr">(Brown et al. 2005)</ref>. Moreover, it is well known that incorporating customer impatience strongly affects performance in the system <ref type="bibr">(Garnett et al. 2002)</ref>.</p><p>Thus, there is a need to investigate whether the superior performance of SRPT continues to hold in multiserver queues where patience times are finite. This investigation is the focus of our paper.</p><p>What is this paper about? To the best of our knowledge, the performance of SRPT in multiserver queues with abandonment is entirely open. In this paper, we take a step towards filling that gap in the literature, by studying the steady-state performance of SRPT in the M/GI/s + GI queue. We adopt a many-server asymptotic mode of analysis and focus on the overloaded regime, which is also known as the efficiency-driven regime. This regime is appropriate because queueing times are negligible, in large systems with abandonment, under moderate or light load (i.e., critically loaded or underloaded regimes) <ref type="bibr">(Garnett et al. 2002)</ref>. In the many-server overloaded regime, a non-negligible proportion of customers abandon the queue. Thus, carefully designing the scheduling policy to optimize the throughput is crucial in this setting.</p><p>For multiserver queues with abandonment, under SRPT scheduling, we demonstrate a statespace collapse in the many-server overloaded limit. In particular, we prove that only customers with long service times (above a threshold) wait in the queue, and eventually abandon, whereas customers with short service times are immediately served. We also derive closed-form expressions for key steady-state performance measures in the limit. We prove that, asymptotically, among all scheduling policies, SRPT maximizes the throughput in the system, minimizes the expected waiting time conditional on being served, and maximizes the expected waiting time conditional on abandoning. We focus on such measures, rather than the mean sojourn time as is common in the extant literature, because not all customers are served in queues with abandonment. We also show that performance in the SRPT queue is, asymptotically, insensitive to the patience-time distribution beyond its mean, which is unlike performance under first-come-first-served (FCFS).</p><p>We compare SRPT to blind policies that do not use the processing-time information, such as FCFS and last-come-first-served (LCFS). In addition to throughput, we also compare the expected waiting times. Based on fluid approximations, we show that, when the patience-time distribution has a non-decreasing failure rate, SRPT yields a smaller expected waiting time than any blind policy. On the other hand, when the patience-time distribution has a decreasing failure rate, SRPT yields a smaller expected waiting time than LCFS, but it may lead to longer expected waiting time than FCFS. Either way, even when SRPT beats blind policies, it does not offer any order of magnitude improvement in waiting times over those policies. This lies in contrast to the asymptotic system performance in the conventional heavy-traffic regime, as the traffic intensity increases <ref type="bibr">(Lin et al. 2011</ref><ref type="bibr">, Puha et al. 2015</ref><ref type="bibr">, Chen and Dong 2020)</ref>.</p><p>Why is this problem difficult? In general, analyzing SRPT is complicated because it requires keeping track of the remaining processing time of each customer in the system. Even in single-server queues, where closed-form expressions have been known for a while, comparing SRPT to other scheduling disciplines is difficult because those closed-form expressions, e.g., for the mean sojourn time, have complex forms and involve nested integrals. Asymptotic analysis, e.g., under heavy traffic, generally allows for simpler descriptions of the system. However, the asymptotic analysis of SRPT involves studying suitably scaled measure-valued system state descriptors, which imposes substantial technical challenges <ref type="bibr">(Banerjee et al. 2020)</ref>.</p><p>Even without abandonment, when moving from a single server to multiple servers, there is a main challenge in extending the existing single-server arguments. The difficulty arises from the fact that multiserver queues are not work-conserving. Specifically, this makes the analysis of busy periods and steady-state workload, both of which are central to the "tagged job approach" of analysing SRPT single-server queues <ref type="bibr">(Schrage and Miller 1966)</ref>, difficult to extend to a multiserver setting; see section 4.2 in <ref type="bibr">Grosof et al. (2018)</ref>.</p><p>In this paper, we allow for multiple servers, general service times, and general patience times, which complicates the analysis even more. Scheduling decisions in systems with abandonment is notoriously difficult, because the optimal scheduling policy can be complex and depends on the patience-time distribution <ref type="bibr">(Puha and Ward 2019)</ref>. For example, when the system is critically loaded, the optimal diffusion control may no longer follow a simple fixed priority rule <ref type="bibr">(Kim and</ref><ref type="bibr">Ward 2013, Kim et al. 2018)</ref>. In this paper, we circumvent the difficulty of doing direct analysis on the SRPT queue by relying on a properly coupled loss queueing system; see section 3 for details.</p><p>Literature review. Because of its optimality properties, the study of SRPT in single-server queues has been the topic of hundreds of papers. Given the richness of that literature, we do not attempt to be comprehensive in our review, and only mention a few key references instead. <ref type="bibr">Schrage and Miller (1966)</ref> and <ref type="bibr">Schrage (1968)</ref> demonstrate optimality properties of SRPT in the M/G/1 system. There is a notable stream of works that studies SRPT under heavy-traffic <ref type="bibr">(Down et al. 2009</ref><ref type="bibr">, Gromoll et al. 2011</ref><ref type="bibr">, Puha et al. 2015)</ref>. <ref type="bibr">Scully et al. (2018)</ref> develops a unified framework to analyze several age-based scheduling policies in the M/G/1 queue.</p><p>Recently, <ref type="bibr">Chen and Dong (2020)</ref> demonstrate that in the GI/GI/1 queue, under heavy traffic, a preemptive two-class priority rule achieves asymptotically comparable performance to the SRPT policy. In the two-class priority rule, customers whose service times are shorter than a certain threshold are given preemptive priority over customers whose service times are above that threshold. Chen and Dong (2020) establishes state-space collapse, under which only the low-priority customers (with long service times) occupy the queue in the heavy-traffic limit. Similar results are common in scheduling multi-class priority queues; e.g., see <ref type="bibr">Reiman (1984)</ref>, <ref type="bibr">Bramson (1998)</ref>, and <ref type="bibr">Dai and Tezcan (2011)</ref>.</p><p>In stark contrast to the single-server setting, much less is known about the performance of SRPT in a multiserver queueing model. With multiple servers, SRPT is not necessarily optimal <ref type="bibr">(Leonardi and Raz 2007)</ref>. However, <ref type="bibr">Grosof et al. (2018)</ref> recently show that it is asymptotically optimal under heavy load. In <ref type="bibr">Grosof et al. (2018)</ref>, jobs are assumed to be infinitely patient. In this paper, we consider finite patience times. To the best of our knowledge, we are the first to derive theoretical results on the performance of SRPT in multiserver queues with abandonment.</p><p>2. Modeling Setup: The SRPT M/GI/s + GI Queue</p><p>In this section, we set the stage for our subsequent theoretical development by describing our modeling framework and defining our many-server asymptotic mode of analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Model Description</head><p>We consider the M/GI/s + GI queue in steady state, i.e., we assume that the arrival process is Poisson with rate &#955;, service times are independent and identically distributed (i.i.d.) with a general cumulative distribution function (cdf) G and mean 1/&#181;, and times to abandon are i.i.d. with cdf F and mean 1/&#952;. Let S denote a generic service time and T a generic patience time. In addition, let S i and T i denote the service time and patience time of the i-th arriving customer. We assume that the service-time distribution and the patience-time distribution are continuous with probability density functions g and f , respectively. There are s homogeneous servers working in parallel.</p><p>We consider the SRPT queueing discipline. Specifically, a customer who arrives to find an empty server goes to service immediately upon arrival. If all servers are busy at the arrival epoch, and there exists at least one customer in service whose remaining processing time is longer than the new arriving customer's, then the customer in service with the longest remaining processing time is preempted and joins the queue (this preempted customer may resume service later with the corresponding remaining processing time). Otherwise, the new arrival joins the queue directly. Customers have finite patience times, generated at the arrival epoch of the customer. If the cumulative amount of the time that the customer spends in the queue exceeds her patience time, then the customer abandons the system. In particular, if a customer enters service and is later preempted back to queue, then we assume that her initial patience time (which had not fully elapsed since she did not abandon previously) continues to elapse, i.e., we do not generate a new patience time for the preempted customer at every preemption epoch. We assume that the arrival, service, and abandonment processes are mutually independent. We define the traffic intensity &#961; &#8801; &#955;/s&#181;.</p><p>Because abandonment is allowed in the system, it is not necessary to assume &#961; &lt; 1 for the system to reach a steady state. To elaborate, with general service-time or patience-time distributions, there is no finite-dimensional Markovian representation of the queue <ref type="bibr">(Dai and He 2013)</ref>. Indeed, a Markovian description of the state of the system would require keeping track of the remaining or elapsed patience times and the remaining or elapsed service times of each customer present in the system (in service or in queue). We now present an infinite-dimensional state representation which leads to a Markovian description of the dynamics of the system (with respect to a suitable filtration). For t &#8805; 0, let X(t) &#8712; N 0 := {0, 1, . . . } denote the number of customers in the system and R(t) &#8712; R 3&#215;&#8734; denote the remaining service times, remaining patience times, and initial service times of customers in the system. Let R j (t) be the j-th column of R(t). When X(t) &gt; 0, for j &#8712; {1, . . . , X(t)}, R j (t) is a column vector whose first element is the remaining service time, the second element is the remaining patience time, and the third element is the initial service time of the j-th earliest arriving customer, among all customers currently in the system. For j &gt; X(t), R j (t) is a column vector with zero entries. Then, the process R(t) is a Markov process which describes system dynamics. Note that, in order to describe how the system evolves, we only need to know the first two elements in R j (t), i.e., the remaining service time and the remaining patience time of each customer. We add a third element, the initial service time, to facilitate the development of the state-space collapse result. The invariant measure of the Markov process R(t) is referred to as the steady-state distribution of the system.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Many-Server Overloaded Regime</head><p>To derive theoretical insights on the performance of SRPT, we consider a sequence of M/GI/s &#955; + GI queues, indexed by the arrival rate &#955;. We fix the traffic intensity in system &#955; to &#961; &#955; = &#955;/(s &#955; &#181;) &#8801; &#961; &gt; 1, i.e., we consider an overloaded setting. We hold the service-time and patience-time distributions fixed, independently of &#955;, and let &#955; and s &#955; increase without bound.</p><p>Define the threshold, &#964; , satisfying</p><p>where 1(&#8226;) denotes the indicator function. That is, we choose &#964; such that the total workload of customers with service times smaller than or equal to &#964; matches the service capacity of the system.</p><p>In the following section, we prove state-space collapse, i.e., that the SRPT queue is asymptotically equivalent to a two-class priority queue. The high-priority class, defined as jobs whose service times are smaller than or equal to &#964; , has preemptive priority over the low-priority class, defined as jobs whose service times are larger than &#964; . We emphasize that &#964; , as defined in (1), does not depend on &#955; since s &#955; /&#955; = 1/(&#961;&#181;) is held fixed under our scaling.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Asymptotic Analysis</head><p>Direct analysis on the SRPT M/GI/s &#955; + GI queue is complicated, which partly explains why it has eluded theoretical analysis for decades. Here, we overcome the technical challenges by proposing a coupling argument. Specifically, we derive asymptotic results quantifying performance in the SRPT queue by constructing a coupled two-class preemptive priority M/GI/s &#955; /s &#955; (loss) queue, with the same arrival process and service-time distribution. Under the coupling, both systems see the same arriving customers, i.e., with the same arrival times and service requirements. However, the service disciplines in the two systems are different. In contrast to multiserver queues with abandonment under SRPT, much more is known about loss queues.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">A Sequence of Coupled Loss Systems</head><p>The coupled M/GI/s &#955; /s &#955; loss queue operates under the following preemptive two-class priority rule. Recall the threshold &#964; defined in (1). In the loss queue, customers whose service times are less than or equal to &#964; are grouped into the high-priority class, which we refer to as class 1. The remaining customers are grouped into the low-priority class, which we refer to as class 2. There are s &#955; servers, and no waiting room.</p><p>Upon arrival, a high-priority, class 1, customer enters service immediately if there is an empty server, or if there is at least one low-priority, class 2, customer in service (the class 2 customer with the longest remaining processing time will be preempted). Otherwise, i.e., if all servers are busy with class 1 customers, the newly arriving customer is lost. A class 2 customer enters service only if there is an idle server upon arrival and, if all servers are busy, the class 2 customer is lost. A preempted class 2 customer is also lost as there is no waiting room in the system. Because of preemption, the high-priority customers do not "see" the low-priority customers: For class 1 customers, the system behaves as a single-class M/GI/s &#955; /s &#955; queue with arrival rate &#955;G(&#964; ), resulting from thinned Poisson arrivals, and service time distribution</p><p>In what follows, we refer to the M/GI/s &#955; + GI SRPT queue as system O, where O stands for original, and the two-class priority M/GI/s &#955; /s &#955; loss queue as system L, where L stands for loss.</p><p>By a slight abuse of notation, in both systems, we refer to customers whose service times are less than or equal to &#964; as class 1 customers, and customers whose service times are longer than &#964; as class 2 customers.</p><p>For M &#8712; {L, O}, i = 1, 2, and t &#8805; 0, we let N M,i (t) denote the number of class i customers served (have successfully finished service) in system M by time t. We also define</p><p>Th M,i is the long-run departure rate (throughput) of class i customers in system M. Let Th O = Th O,1 + Th O,2 to be the total throughput in system O.</p><p>The following proposition builds the foundation of our coupling argument and is the key to deriving limiting performance measures in section 3.2.</p><p>Proposition 1. For the coupled SRPT and loss queues, and the throughput defined as in (2):</p><p>Proof. We consider two coupled systems which are empty initially and see exactly the same customers. We next introduce a mechanism to match each class 1 customer served in system L to a customer in system O who finishes service no later than the customer in system L. In particular, we will show that, at any time t, for each class 1 customer who is in service in system L, there is a matched customer who is either in service with an equal or shorter remaining processing time in system O, or has already finished service in system O. In addition, each customer in system O is matched with at most one class 1 customer in system L. We prove this claim and construct the matching by induction on the consecutive arrival epochs of class 1 customers.</p><p>The above claim is trivially true before the arrival of the first class 1 customer. For the inductive step, we suppose that it is true before the arrival of the k-th class 1 customer, and proceed to show that it holds after that arrival. We refer to the k-th class 1 customer as customer k. If customer k is lost in system L, then the claim is trivially true. If customer k enters service in system L (by either joining an empty server or preempting a class 2 customer in service), then we consider the following three scenarios that can happen in system O upon the arrival of customer k.</p><p>Case I. In system O, customer k joins an empty server or preempts a customer who is not matched with any customer in system L. Then, we can match customer k in system L with customer k in system O.</p><p>Case II. In system O, customer k preempts a customer who is already matched with a class 1 customer in system L. We refer to this preempted customer in system O and the matched customer in system L as customer k O and customer k L respectively. Note that in this case, before the arrival of customer k, system L has strictly fewer than s &#955; class 1 customers in service, while there are s &#955; customers in service in system O. Based on the inductive assumption, in system O, there must be a customer in service that has not been matched with any class 1 customer in system L yet.</p><p>We refer to this customer as customer k O . As customer k O is preempted instead of customer k O , customer k O must have a shorter remaining processing time than customer k O . First, we rematch customer k L in system L with customer k O in system O. Note that as the remaining processing time of customer k L is larger than customer k O based on our inductive assumption, customer k O will finish service before customer k L . Then, we match customer k in system L with customer k in system O.</p><p>Case III. In system O, customer k waits in the queue. Similar to Case II, in this case, before the arrival of customer k, system L has strictly fewer than s &#955; class 1 customers in service, while there are s &#955; customers in service in system O. Based on the inductive assumption, in system O, there must be a customer who has not been matched with any class 1 customer in system L yet. We refer to this customer as customer k O . In addition, this customer must have a shorter remaining processing time than customer k. In this case, we match customer k in system L with customer</p><p>Under the matching mechanism described above, each class 1 customer who gets served in system L is matched with a customer who gets served in system O, and this matched customer in system O is not matched with any other customers in system L. Thus, N L,1 (t) &#8804; N O,1 (t) + N O,2 (t) for each t &#8805; 0, sample path by sample path. Then, we have</p><p>i.e., Th L,1 &#8804; Th O , as desired.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Asymptotic Performance in the SRPT M/GI/s + GI Queue</head><p>In what follows, we consider a particular "tagged" customer c arriving to a random system state drawn from the system's steady-state distribution. We denote by "Serv c " the event that customer c is served and "Aband c " the event that customer c abandons the system. We also write V c as the virtual waiting time assuming customer c has infinite patience, and W c as her actual waiting time, i.e., W c = min{V c , T c }, where we recall that T c is the patience time of customer c.</p><p>In Theorem 1, we derive limits for several key performance measures of the M/GI/s &#955; + GI SRPT queue in steady state. The main observation is the state-space collapse. In particular, in the many-server limit, all class 1 customers are served immediately upon arrival and no class 2 customers are served, i.e., class 2 customers all abandon the queue.</p><p>Theorem 1. For the sequence of M/GI/s &#955; + GI queues under SRPT with &#961; &#955; = &#955;/s &#955; &#181; &gt; 1 held fixed and the threshold, &#964; , as defined in (1):</p><p>The proof of Theorem 1 can be found in Appendix A. We note from the theorem that the steadystate probability of abandonment and various expected waiting-time measures are insensitive to the patience-time distribution beyond its mean, and depend solely on the service-time distribution.</p><p>This lies in contrast to performance in the M/GI/s + GI queue under FCFS, where the system's performance depends on the patience-time distribution beyond its mean <ref type="bibr">(Whitt 2005</ref><ref type="bibr">(Whitt , 2006))</ref>.</p><p>The following corollary follows directly from Theorem 1 parts (b) and (c). It demonstrates the desirable performance of SRPT.</p><p>Corollary 1. For the sequence of M/GI/s &#955; + GI queues, SRPT asymptotically minimizes the steady-state waiting time conditional on being served, and asymptotically maximizes the steady-state waiting time conditional on abandoning</p><p>Let Th &#955; M denote the maximum throughput of the M/GI/s &#955; + GI queue. While the maximum throughput is not necessarily achieved by SRPT, we show in the following proposition that SRPT maximizes the throughput asymptotically.</p><p>Proposition 2. For the sequence of M/GI/s &#955; + GI queues,</p><p>i.e., SRPT asymptotically maximizes the throughput among all service disciplines.</p><p>The proof of Proposition 2 can be found in Appendix B. We note from the proposition and the definition of &#964; in (1) that for a fixed value of &#961;, the asymptotically maximal throughput depends on the service-time distribution. This is in contrast to the throughput of the M/GI/s &#955; + GI queue under FCFS (or LCFS) where, for a fixed value of &#961; &gt; 1, the throughput scaled by &#955; is equal to 1/&#961; <ref type="bibr">(Whitt 2006)</ref>. Furthermore, from the definition of &#964; in (1), we have</p><p>which implies that</p><p>. Then, we must have the inequality:</p><p>i.e., the throughput of SRPT is strictly larger than the throughput of FCFS in the many-server overloaded regime.</p><p>We will discuss the effect of the service-time distribution on the throughput of SRPT in more detail in section 4.2: We will illustrate through numerical examples that for fixed values of E[S]</p><p>and &#961;, the heavier the tail of the service-time distribution, the larger the throughput that SRPT can achieve.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Comparison to Blind Policies</head><p>In this section, we compare the performance of SRPT to blind policies that do not use the servicetime information, such as FCFS and LCFS. We focus on the effect of patience-time and service-time distributions on system performance. In addition to throughput, we also consider the steady-state expected waiting time. The waiting time is defined as the cumulative amount of time that the customer spends in queue until departure, either due to service or abandonment. We do so because</p><p>(1) waiting time measures are generally of interest in the management of service systems, and (2)</p><p>while SRPT asymptotically maximizes the throughput in the system (Proposition 2), it does not necessarily minimize waiting times, and it is important to shed further light on this point.</p><p>To compare the performance of SRPT to blind policies, we rely on steady-state fluid approximations for systems under blind policies as described in <ref type="bibr">Whitt (2006)</ref>. Fluid approximations are known to be remarkably accurate in large-scale overloaded systems <ref type="bibr">(Kang et al. 2010)</ref>, which is the regime that we consider (section 2.2). Throughout this section, we fix the traffic intensity &#961; = 1.4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">The Effect of the Patience-Time Distribution</head><p>We consider the asymptotic throughput, scaled by &#955;, under the many-server overloaded scaling. In this case, the patience-time distribution has no effect on the throughput under any of the scheduling disciplines. For the SRPT queue, the throughput is equal to G(&#964; ) by Proposition 2. For all blind policies (including FCFS and LCFS), it is equal to 1/&#961; <ref type="bibr">(Whitt 2006)</ref>.</p><p>As for the steady-state waiting time, even though the patience-time distribution beyond its mean has no effect on the limiting steady-state waiting time in the SRPT queue (part (d) in Theorem 1), it plays a central role in the performance of the FCFS queue <ref type="bibr">(Whitt 2006)</ref>. In particular, in the many-server overloaded limit, the steady-state fluid waiting time of the FCFS queue is</p><p>We first consider patience-time distributions with strictly increasing-hazard-rate (IHR) or strictly decreasing-hazard-rate (DHR). The hazard rate of the patience-time distribution is defined as</p><p>. The distribution has IHR (DHR) if h is monotonically increasing (decreasing) in x on (0, &#8734;). For blind policies, if h is decreasing in x (i.e., DHR), waiting customers become increasingly patient with time. To minimize the waiting time, we should process customers who waited more first. Analogously, if h(x) is increasing in x (i.e., IHR), then we should process customers who waited less first.</p><p>For IHR or exponential patience-time distributions (which has a constant hazard rate), LCFS minimizes the steady-state fluid waiting among all blind policies (Proposition 3 in <ref type="bibr">Bassamboo and Randhawa (2015)</ref>). The fluid LCFS queue is described by two classes: The high-priority class is entirely served and does not wait for service, whereas the low priority class abandons entirely. Thus, the steady-state fluid waiting time under LCFS is given by (1 -1/&#961;)&#952;. On the other hand, Theorem 1 shows that the limiting steady-state expected waiting time under SRPT is (1 -G(&#964; ))/&#952;. Because</p><p>This implies that SRPT outperforms all blind policies for overloaded systems with IHR or exponential patience-time distributions, when the system is large enough. For DHR patience-time distributions, FCFS minimizes the steady-state fluid waiting time among all blind policies (Proposition 3 in <ref type="bibr">Bassamboo and Randhawa (2015)</ref>).</p><p>In this case, SRPT may lead to a larger expected waiting time than FCFS.</p><p>In Figure <ref type="figure">1</ref>, we compare the performance of FCFS, LCFS, and SRPT under Weibull (left-side figure) or Pareto (right-side figure) patience-time distributions. For both distributions, we vary the shape parameter and adjust the scale parameter accordingly so that the mean time to abandon is equal to 1. We fix the service-time distribution to be exponential with mean equal to 1 as well. For LCFS and FCFS, we present the fluid limit. For SRPT, we present (1 -G(&#964; ))/&#952;.</p><p>For the Weilbull distribution, F (x) = (1 -exp(-(x/m) &#945; ))1(x &#8805; 0), where &#945; &gt; 0 is referred to as the shape parameter and m &gt; 0 is the scale parameter. When the shape parameter is smaller than 1, it has DHR; when the shape parameter is equal to 1, it is an exponential distribution; when the shape parameter is larger than 1, it has IHR. We observe from the right plot in Figure <ref type="figure">1</ref> that for small enough values of the shape parameter (i.e., &lt; 0.6), FCFS can achieve a shorter expected waiting time than SRPT.</p><p>For the Pareto distribution, F (x) = (x/m) &#945; 1(x &#8805; m), where &#945; &gt; 1 is referred to as the shape parameter and m &gt; 0 is the scale parameter. The hazard rate function is no longer monotone on (0, &#8734;). In particular, h(x) = 0 for x &lt; m, and h(x) &gt; 0 and is decreasing in x for x &#8805; m. In this case, FCFS may not be optimal among blind policies (Proposition 5 in <ref type="bibr">Bassamboo and Randhawa (2015)</ref>). We observe from the right plot in Figure <ref type="figure">1</ref> that FCFS only achieves a shorter expected waiting time than LCFS for small enough values of the shape parameter (i.e., &lt; 1.3). Moreover, SRPT leads to shorter expected waiting times than both FCFS and LCFS in this case. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">The Effect of the Service-Time Distribution</head><p>It has been observed that in single-server queues, the advantage of SRPT is especially pronounced with a heavy tailed service-time distribution; see, for example, <ref type="bibr">Lin et al. (2011)</ref> and <ref type="bibr">Chen and Dong (2020)</ref>. We next investigate whether the same holds in the multiserver setting with abandonment. For service times, we consider the family of Weibull distributions and the family of Pareto distributions, both with varying shape parameters. We set the patience-time distribution to be Weibull. Note that the service-time distribution has no effect on the throughput and steady-state fluid waiting times for FCFS and LCFS queues, but it plays an important role in the performance of the SRPT queue.</p><p>In Figure <ref type="figure">2</ref> we compare the throughput for SRPT, FCFS, and LCFS when service times have Weibull (left-side figure) or Pareto (right-side figure) distributions. We vary the value of the shape parameter in the Weibull and Pareto service-time distributions and adjust the scale parameter accordingly so that the mean service time is fixed at 1. Note that as the shape parameter decreases, the tails of the Weibull or Pareto distributions become heavier (i.e., 1 -F (x) decays to zero at a slower rate as x increases). The patience-time distribution is fixed as a Weibull distribution with shape 0.4 and mean 1. We observe from the figure that SRPT always yields the highest throughput (as we proved in Proposition 2). More importantly, the throughput of SRPT decreases as the shape  i.e., in stationarity, the rate at which customers finish service is the same as the rate at which they enter service. In addition,</p><p>i.e., the stationary number of customers in service is less than the service capacity. Lastly, define:</p><p>From ( <ref type="formula">5</ref>), and defining s &#8801; (&#961;&#181;) -1 , we have that:</p><p>We will need the following two lemmas, Lemma 1 and Lemma 2, which we state and prove.</p><p>Lemma 1. For the sequence of two-class M/GI/s &#955; /s &#955; queues under the preemptive priority rule,</p><p>Proof. We first show that for a fixed &#955;,</p><p>.</p><p>As class 1 customers have preemptive priority over class 2 customers: For a class 1 customer, the system operates like a single-class loss queue with arrival rate &#955;G(&#964; ) and service-time distribution [S|S &#8804; &#964; ]. As the steady-state blocking probability depends on the service-time distribution only through its mean, and r = &#955;G(&#964; )E[S|S &#8804; &#964; ] = s &#955; , we have the steady-state blocking probability:</p><p>.</p><p>Next, class 1 customers enter service in the loss queue at rate &#955;G(&#964; )(1 -Pb &#955; ). By rate conservation, we have</p><p>Lemma 2. For the sequence of M/GI/s &#955; + GI queues under SRPT, &#947;(x) = 1{x &#8804; &#964; }.</p><p>Proof. We first note that Lemma 1 implies that</p><p>Second, consider the optimization problem</p><p>The constraint</p><p>For the objective function, we note that We have thus proved part (a) in Theorem 1.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A.2. Other Performance Measures</head><p>We now turn to proving parts (b) -(d) in the theorem.</p><p>Proof. For customers with service time less than or equal to &#964; , by part (a) in Theorem 1,</p><p>We note from the proof of part (a) that the convergence holds regardless of patience time distribution. This implies that &#947; * is an optimal solution to (8) with the optimal objective value &#955;G(&#964; ). Thus, Th &#955; M &#8804; &#955;G(&#964; ).</p><p>Because lim &#955;&#8594;&#8734; Th &#955; O /&#955; = G(&#964; ), SRPT maximizes the throughput asymptotically.</p></div></body>
		</text>
</TEI>
