skip to main content

Title: Optimally Scheduling Heterogeneous Impatient Customers

Problem definition: We study scheduling multi-class impatient customers in parallel server queueing systems. At the time of arrival, customers are identified as being one of many classes, and the class represents the service and patience time distributions as well as cost characteristics. From the system’s perspective, customers of the same class at time of arrival get differentiated on their residual patience time as they wait in queue. We leverage this property and propose two novel and easy-to-implement multi-class scheduling policies. Academic/practical relevance: Scheduling multi-class impatient customers is an important and challenging topic, especially when customers’ patience times are nonexponential. In these contexts, even for customers of the same class, processing them under the first-come, first-served (FCFS) policy is suboptimal. This is because, at time of arrival, the system only knows the overall patience distribution from which a customer’s patience value is drawn, and as time elapses, the estimate of the customer’s residual patience time can be further updated. For nonexponential patience distributions, such an update indeed reveals additional information, and using this information to implement within-class prioritization can lead to additional benefits relative to the FCFS policy. Methodology: We use fluid approximations to analyze the multi-class scheduling problem with ideas borrowed from convex optimization. These approximations are known to perform well for large systems, and we use simulations to validate our proposed policies for small systems. Results: We propose a multi-class time-in-queue policy that prioritizes both across customer classes and within each class using a simple rule and further show that most of the gains of such a policy can be achieved by deviating from within-class FCFS for at most one customer class. In addition, for systems with exponential patience times, our policy reduces to a simple priority-based policy, which we prove is asymptotically optimal for Markovian systems with an optimality gap that does not grow with system scale. Managerial implications: Our work provides managers ways of improving quality of service to manage parallel server queueing systems. We propose easy-to-implement policies that perform well relative to reasonable benchmarks. Our work also adds to the academic literature on multi-class queueing systems by demonstrating the joint benefits of cross- and within-class prioritization.

Funding: A. Bassamboo received financial support from the National Science Foundation [Grant CMMI 2006350]. C. (A.) Wu received financial support from the Hong Kong General Research Fund [Early Career Scheme, Project 26206419].

Supplemental Material: The online appendix is available at .

more » « less
Award ID(s):
Author(s) / Creator(s):
; ;  
Publisher / Repository:
Date Published:
Journal Name:
Manufacturing & Service Operations Management
Page Range / eLocation ID:
1066 to 1080
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The shortest-remaining-processing-time (SRPT) scheduling policy has been extensively studied, for more than 50 years, in single-server queues with infinitely patient jobs. Yet, much less is known about its performance in multiserver queues. In this paper, we present the first theoretical analysis of SRPT in multiserver queues with abandonment. In particular, we consider the M/GI/s+GI queue and demonstrate that, in the many-sever overloaded regime, performance in the SRPT queue is equivalent, asymptotically in steady state, to a preemptive two-class priority queue where customers with short service times (below a threshold) are served without wait, and customers with long service times (above a threshold) eventually abandon without service. We prove that the SRPT discipline maximizes, asymptotically, the system throughput, among all scheduling disciplines. We also compare the performance of the SRPT policy to blind policies and study the effects of the patience-time and service-time distributions. This paper was accepted by Baris Ata, stochastic models & simulation. 
    more » « less
  2. Abstract

    We develop a robust queueing network analyzer algorithm to approximate the steady‐state performance of a single‐class open queueing network of single‐server queues with Markovian routing. The algorithm allows nonrenewal external arrival processes, general service‐time distributions and customer feedback. The algorithm is based on a decomposition approximation, where each flow is partially characterized by its rate and a continuous function that measures the stochastic variability over time. This function is a scaled version of the variance‐time curve, called the index of dispersion for counts (IDC). The required IDC functions for the external arrival processes can be calculated from the model primitives or estimated from data. Approximations for the IDC functions of the internal flows are calculated by solving a set of linear equations. The theoretical basis is provided by heavy‐traffic limits for the flows established in our previous papers. A robust queueing technique is used to generate approximations of the mean steady‐state performance at each queue from the IDC of the total arrival flow and the service specification at that queue. The algorithm's effectiveness is supported by extensive simulation studies.

    more » « less
  3. Abstract

    Optimal scheduling in single-server queueing systems is a classic problem in queueing theory. The Gittins index policy is known to be the optimal nonanticipating policy minimizing the mean delay in the M/G/1 queue. While the Gittins index is thoroughly characterized for ordinary jobs whose state is described by the attained service, it is not at all the case with jobs that have more complex structure. Recently, a class of such jobs, multistage jobs, were introduced, and it was shown that the computation of Gittins index of a multistage job decomposes into separable computations for the individual stages. The characterization is, however, indirect in the sense that it relies on the recursion for an auxiliary function (the so-called SJP—single-job profit—function) and not for the Gittins index itself. In this paper, we focus on sequential multistage jobs, which have a fixed sequence of stages, and prove that, for them, it is possible to compute the Gittins index directly by recursively combining the Gittins indices of its individual stages. In addition, we give sufficient conditions for the optimality of the FCFS and SERPT disciplines for scheduling sequential multistage jobs. On the other hand, we demonstrate that, for nonsequential multistage jobs, it is better to compute the Gittins index by utilizing the SJP functions.

    more » « less
  4. We study the problem of maximizing payoff generated over a period of time in a general class of closed queueing networks with a finite, fixed number of supply units that circulate in the system. Demand arrives stochastically, and serving a demand unit (customer) causes a supply unit to relocate from the “origin” to the “destination” of the customer. The key challenge is to manage the distribution of supply in the network. We consider general controls including customer entry control, pricing, and assignment. Motivating applications include shared transportation platforms and scrip systems. Inspired by the mirror descent algorithm for optimization and the backpressure policy for network control, we introduce a rich family of mirror backpressure (MBP) control policies. The MBP policies are simple and practical and crucially do not need any statistical knowledge of the demand (customer) arrival rates (these rates are permitted to vary in time). Under mild conditions, we propose MBP policies that are provably near optimal. Specifically, our policies lose at most [Formula: see text] payoff per customer relative to the optimal policy that knows the demand arrival rates, where K is the number of supply units, T is the total number of customers over the time horizon, and η is the demand process’ average rate of change per customer arrival. An adaptation of MBP is found to perform well in numerical experiments based on data from NYC Cab.

    This paper was accepted by Gabriel Weintraub, revenue management and market analytics.

    Funding: Y. Kanoria was supported by the National Science Foundation’s Division of Civil, Mechanical, and Manufacturing Innovation [Grant CMMI-1653477].

    Supplemental Material: The data files and online appendices are available at .

    more » « less
  5. We study the problem of learning, from observational data, fair and interpretable policies that effectively match heterogeneous individuals to scarce resources of different types. We model this problem as a multi-class multi-server queuing system where both individuals and resources arrive stochastically over time. Each individual, upon arrival, is assigned to a queue where they wait to be matched to a resource. The resources are assigned in a first come first served (FCFS) fashion according to an eligibility structure that encodes the resource types that serve each queue. We propose a methodology based on techniques in modern causal inference to construct the individual queues as well as learn the matching outcomes and provide a mixed-integer optimization (MIO) formulation to optimize the eligibility structure. The MIO problem maximizes policy outcome subject to wait time and fairness constraints. It is very flexible, allowing for additional linear domain constraints. We conduct extensive analyses using synthetic and real-world data. In particular, we evaluate our framework using data from the U.S. Homeless Management Information System (HMIS). We obtain wait times as low as an FCFS policy while improving the rate of exit from homelessness for underserved or vulnerable groups (7% higher for the Black individuals and 15% higher for those below 17 years old) and overall. 
    more » « less