Problem definition: We study scheduling multi-class impatient customers in parallel server queueing systems. At the time of arrival, customers are identified as being one of many classes, and the class represents the service and patience time distributions as well as cost characteristics. From the system’s perspective, customers of the same class at time of arrival get differentiated on their residual patience time as they wait in queue. We leverage this property and propose two novel and easy-to-implement multi-class scheduling policies. Academic/practical relevance: Scheduling multi-class impatient customers is an important and challenging topic, especially when customers’ patience times are nonexponential. In these contexts, even for customers of the same class, processing them under the first-come, first-served (FCFS) policy is suboptimal. This is because, at time of arrival, the system only knows the overall patience distribution from which a customer’s patience value is drawn, and as time elapses, the estimate of the customer’s residual patience time can be further updated. For nonexponential patience distributions, such an update indeed reveals additional information, and using this information to implement within-class prioritization can lead to additional benefits relative to the FCFS policy. Methodology: We use fluid approximations to analyze the multi-class scheduling problem with ideas borrowed from convex optimization. These approximations are known to perform well for large systems, and we use simulations to validate our proposed policies for small systems. Results: We propose a multi-class time-in-queue policy that prioritizes both across customer classes and within each class using a simple rule and further show that most of the gains of such a policy can be achieved by deviating from within-class FCFS for at most one customer class. In addition, for systems with exponential patience times, our policy reduces to a simple priority-based policy, which we prove is asymptotically optimal for Markovian systems with an optimality gap that does not grow with system scale. Managerial implications: Our work provides managers ways of improving quality of service to manage parallel server queueing systems. We propose easy-to-implement policies that perform well relative to reasonable benchmarks. Our work also adds to the academic literature on multi-class queueing systems by demonstrating the joint benefits of cross- and within-class prioritization. Funding: A. Bassamboo received financial support from the National Science Foundation [Grant CMMI 2006350]. C. (A.) Wu received financial support from the Hong Kong General Research Fund [Early Career Scheme, Project 26206419]. Supplemental Material: The online appendix is available at https://doi.org/10.1287/msom.2023.1190 .
more »
« less
When Service Times Depend on Customers’ Delays: A Relationship Between Two Models of Dependence
As empirically observed in restaurants, call centers, and intensive care units, service times needed by customers are often related to the delay they experience in queue. Two forms of dependence mechanisms in service systems with customer abandonment immediately come to mind: First, the service requirement of a customer may evolve while waiting in queue, in which case the service time of each customer is endogenously determined by the system’s dynamics. Second, customers may arrive (exogenously) to the system with a service and patience time that are stochastically dependent, so that the service-time distribution of the customers that end up in service is different than that of the entire customer population. We refer to the former type of dependence as endogenous and to the latter as exogenous. Because either dependence mechanism can have significant impacts on a system’s performance, it should be identified and taken into consideration for performance-evaluation and decision-making purposes. However, identifying the source of dependence from observed data is hard because both the service times and patience times are censored due to customer abandonment. Further, even if the dependence is known to be exogenous, there remains the difficult problem of fitting a joint service-patience times distribution to the censored data. We address these two problems and provide a solution to the corresponding statistical challenges by proving that both problems can be avoided. We show that, for any exogenous dependence, there exists a corresponding endogenous dependence, such that the queuing dynamics under either dependence have the same law. We also prove that there exist endogenous dependencies for which no equivalent exogenous dependence exists. Therefore, the endogenous dependence can be considered as a generalization of the exogenous dependence. As a result, if dependence is observed in data, one can always consider the system as having an endogenous dependence, regardless of the true underlying dependence mechanism. Because estimating the structure of an endogenous dependence is substantially easier than estimating a joint service-patience distribution from censored data, our approach facilitates statistical estimations considerably. Funding: C. A. Wu received financial support from the Hong Kong Research Grant Council [Early Career Scheme, Project 26206419]. A. Bassamboo and O. Perry received partial financial support from the National Science Foundation [Grant CMMI 2006350].
more »
« less
- Award ID(s):
- 2006350
- PAR ID:
- 10464337
- Date Published:
- Journal Name:
- Operations Research
- Volume:
- 70
- Issue:
- 6
- ISSN:
- 0030-364X
- Page Range / eLocation ID:
- 3345 to 3354
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
The shortest-remaining-processing-time (SRPT) scheduling policy has been extensively studied, for more than 50 years, in single-server queues with infinitely patient jobs. Yet, much less is known about its performance in multiserver queues. In this paper, we present the first theoretical analysis of SRPT in multiserver queues with abandonment. In particular, we consider the M/GI/s+GI queue and demonstrate that, in the many-sever overloaded regime, performance in the SRPT queue is equivalent, asymptotically in steady state, to a preemptive two-class priority queue where customers with short service times (below a threshold) are served without wait, and customers with long service times (above a threshold) eventually abandon without service. We prove that the SRPT discipline maximizes, asymptotically, the system throughput, among all scheduling disciplines. We also compare the performance of the SRPT policy to blind policies and study the effects of the patience-time and service-time distributions. This paper was accepted by Baris Ata, stochastic models & simulation.more » « less
-
We study matching queues with abandonment. The simplest of these is the two-sided queue with servers on one side and customers on the other, both arriving dynamically over time and abandoning if not matched by the time their patience elapses. We identify nonasymptotic and universal scaling laws for the matching loss due to abandonment, which we refer to as the “cost of impatience.” The scaling laws characterize the way in which this cost depends on the arrival rates and the (possibly different) mean patience of servers and customers. Our characterization reveals four operating regimes identified by an operational measure of patience that brings together mean patience and utilization. The four regimes subsume the regimes that arise in asymptotic (heavy-traffic) approximations. The scaling laws, specialized to each regime, reveal the fundamental structure of the cost of impatience and show that its order of magnitude is fully determined by (i) a “winner-take-all” competition between customer impatience and utilization, and (ii) the ability to accumulate inventory on the server side. Practically important is that when servers are impatient, the cost of impatience is, up to an order of magnitude, given by an insightful expression where only the minimum of the two patience rates appears. Considering the trade-off between abandonment and capacity costs, we characterize the scaling of the optimal safety capacity as a function of costs, arrival rates, and patience parameters. We prove that the ability to hold inventory of servers means that the optimal safety capacity grows logarithmically in abandonment cost and, in turn, slower than the square-root growth in the single-sided queue. This paper was accepted by Baris Ata, stochastic models and simulation. Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2023.01513 .more » « less
-
We characterize heavy-traffic process and steady-state limits for systems staffed according to the square-root safety rule, when the service requirements of the customers are perfectly correlated with their individual patience for waiting in queue. Under the usual many-server diffusion scaling, we show that the system is asymptotically equivalent to a system with no abandonment. In particular, the limit is the Halfin-Whitt diffusion for the [Formula: see text] queue when the traffic intensity approaches its critical value 1 from below, and is otherwise a transient diffusion, despite the fact that the prelimit is positive recurrent. To obtain a refined measure of the congestion due to the correlation, we characterize a lower-order fluid (LOF) limit for the case in which the diffusion limit is transient, demonstrating that the queue in this case scales like [Formula: see text]. Under both the diffusion and LOF scalings, we show that the stationary distributions converge weakly to the time-limiting behavior of the corresponding process limit. Funding: This work was supported by the National Natural Science Foundation of China [Grant 72188101] and the Division of Civil, Mechanical and Manufacturing Innovation [Grants 1763100 and 2006350].more » « less
-
We study the problem of optimal information sharing in the context of a service system. In particular, we consider an unobservable single server queue offering a service at a fixed price to a Poisson arrival of delay-sensitive customers. The service provider can observe the queue, and may share information about the state of the queue with each arriving customer. The customers are Bayesian and strategic, and incorporate any information provided by the service provider into their prior beliefs about the queue length before making the decision whether to join the queue or leave without obtaining service. We pose the following question: which signaling mechanism and what price should the service provider select to maximize her revenue? We formulate this problem as an instance of Bayesian persuasion in dynamic settings. The underlying dynamics make the problem more difficult because, in contrast to static settings, the signaling mechanism adopted by the service provider affects the customers' prior beliefs about the queue (given by the steady state distribution of the queue length in equilibrium). The core contribution of this work is in characterizing the structure of the optimal signaling mechanism. We summarize our main results as follows. (1) Structural characterization: Using a revelation-principle style argument, we find that it suffices to consider signaling mechanisms where the service provider sends a binary signal of "join" or "leave", and under which the equilibrium strategy of a customer is to follow the service provider's recommended action. (2) Optimality of threshold policies: For a given fixed price for service, we use the structural characterization to show that the optimal signaling mechanism can be obtained as a solution to a linear program with a countable number of variables and constraints. Under some mild technical conditions on the waiting costs, we establish that there exists an optimal signaling mechanism with a threshold structure, where service provider sends the "join" signal if the queue length is below a threshold, and "leave" otherwise. (In addition, at the threshold, the service provider randomizes.) For the special case of linear waiting costs, we derive an analytical expression for the optimal threshold i terms of the two branches of the Lambert-W function. (3) Revenue comparison: Finally, we show that with the optimal choice of the fixed price and using the corresponding optimal signaling mechanism, the service provider can achieve the same revenue as with the optimal state-dependent pricing mechanism in a fully-observable queue. This implies that in settings where state-dependent pricing is not feasible, the service provider can effectively use optimal signaling (with the optimal fixed price) to achieve the same revenue.more » « less
An official website of the United States government

