skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Fluid limit for a multi-server, multiclass random order of service queue with reneging and tracking of residual patience times
Abstract In this paper, we consider a multi-server, multiclass queue with reneging operating under the random order of service discipline. Interarrival times, service times, and patience times are assumed to be generally distributed. Under mild conditions, we establish a fluid limit theorem for a measure-valued process that keeps track of the remaining patience time for each job in the queue, when the number of servers and classes is held fixed. We prove uniqueness for fluid model solutions in all but one case. We characterize the unique invariant state for the fluid model and prove that fluid model solutions converge to the invariant state as time goes to infinity, uniformly for suitable initial conditions.  more » « less
Award ID(s):
2153866
PAR ID:
10578623
Author(s) / Creator(s):
;
Publisher / Repository:
Springer Science + Business Media
Date Published:
Journal Name:
Queueing Systems
Volume:
109
Issue:
2
ISSN:
0257-0130
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The shortest-remaining-processing-time (SRPT) scheduling policy has been extensively studied, for more than 50 years, in single-server queues with infinitely patient jobs. Yet, much less is known about its performance in multiserver queues. In this paper, we present the first theoretical analysis of SRPT in multiserver queues with abandonment. In particular, we consider the M/GI/s+GI queue and demonstrate that, in the many-sever overloaded regime, performance in the SRPT queue is equivalent, asymptotically in steady state, to a preemptive two-class priority queue where customers with short service times (below a threshold) are served without wait, and customers with long service times (above a threshold) eventually abandon without service. We prove that the SRPT discipline maximizes, asymptotically, the system throughput, among all scheduling disciplines. We also compare the performance of the SRPT policy to blind policies and study the effects of the patience-time and service-time distributions. This paper was accepted by Baris Ata, stochastic models & simulation. 
    more » « less
  2. Motivated by a service platform, we study a two-sided network where heterogeneous demand (customers) and heterogeneous supply (workers) arrive randomly over time to get matched. Customers and workers arrive with a randomly sampled patience time (also known as reneging time in the literature) and are lost if forced to wait longer than that time to be matched. The system dynamics depend on the matching policy, which determines when to match a particular customer class with a particular worker class. Matches between classes use the head-of-line customer and worker from each class. Since customer and worker arrival processes can be very general counting processes, and the reneging times can be sampled from any finite mean distribution that is absolutely continuous, the state descriptor must track the age-in-system for every customer and worker waiting in order to be Markovian, as well as the time elapsed since the last arrival for every class. We develop a measure-valued fluid model that approximates the evolution of the discrete-event stochastic matching model and prove its solution is unique under a fixed matching policy. For a sequence of matching models, we establish a tightness result for the associated sequence of fluid-scaled state descriptors and show that any distributional limit point is a fluid model solution almost surely. When arrival rates are constant, we characterize the invariant states of the fluid model solution and show convergence to these invariant states as time becomes large. Finally, again when arrival rates are constant, we establish another tightness result for the sequence of fluid-scaled state descriptors distributed according to a stationary distribution and show that any subsequence converges to an invariant state. As a consequence, the fluid and time limits can be interchanged, which justifies regarding invariant states as first order approximations to stationary distributions. 
    more » « less
  3. Problem definition: We study scheduling multi-class impatient customers in parallel server queueing systems. At the time of arrival, customers are identified as being one of many classes, and the class represents the service and patience time distributions as well as cost characteristics. From the system’s perspective, customers of the same class at time of arrival get differentiated on their residual patience time as they wait in queue. We leverage this property and propose two novel and easy-to-implement multi-class scheduling policies. Academic/practical relevance: Scheduling multi-class impatient customers is an important and challenging topic, especially when customers’ patience times are nonexponential. In these contexts, even for customers of the same class, processing them under the first-come, first-served (FCFS) policy is suboptimal. This is because, at time of arrival, the system only knows the overall patience distribution from which a customer’s patience value is drawn, and as time elapses, the estimate of the customer’s residual patience time can be further updated. For nonexponential patience distributions, such an update indeed reveals additional information, and using this information to implement within-class prioritization can lead to additional benefits relative to the FCFS policy. Methodology: We use fluid approximations to analyze the multi-class scheduling problem with ideas borrowed from convex optimization. These approximations are known to perform well for large systems, and we use simulations to validate our proposed policies for small systems. Results: We propose a multi-class time-in-queue policy that prioritizes both across customer classes and within each class using a simple rule and further show that most of the gains of such a policy can be achieved by deviating from within-class FCFS for at most one customer class. In addition, for systems with exponential patience times, our policy reduces to a simple priority-based policy, which we prove is asymptotically optimal for Markovian systems with an optimality gap that does not grow with system scale. Managerial implications: Our work provides managers ways of improving quality of service to manage parallel server queueing systems. We propose easy-to-implement policies that perform well relative to reasonable benchmarks. Our work also adds to the academic literature on multi-class queueing systems by demonstrating the joint benefits of cross- and within-class prioritization. Funding: A. Bassamboo received financial support from the National Science Foundation [Grant CMMI 2006350]. C. (A.) Wu received financial support from the Hong Kong General Research Fund [Early Career Scheme, Project 26206419]. Supplemental Material: The online appendix is available at https://doi.org/10.1287/msom.2023.1190 . 
    more » « less
  4. As empirically observed in restaurants, call centers, and intensive care units, service times needed by customers are often related to the delay they experience in queue. Two forms of dependence mechanisms in service systems with customer abandonment immediately come to mind: First, the service requirement of a customer may evolve while waiting in queue, in which case the service time of each customer is endogenously determined by the system’s dynamics. Second, customers may arrive (exogenously) to the system with a service and patience time that are stochastically dependent, so that the service-time distribution of the customers that end up in service is different than that of the entire customer population. We refer to the former type of dependence as endogenous and to the latter as exogenous. Because either dependence mechanism can have significant impacts on a system’s performance, it should be identified and taken into consideration for performance-evaluation and decision-making purposes. However, identifying the source of dependence from observed data is hard because both the service times and patience times are censored due to customer abandonment. Further, even if the dependence is known to be exogenous, there remains the difficult problem of fitting a joint service-patience times distribution to the censored data. We address these two problems and provide a solution to the corresponding statistical challenges by proving that both problems can be avoided. We show that, for any exogenous dependence, there exists a corresponding endogenous dependence, such that the queuing dynamics under either dependence have the same law. We also prove that there exist endogenous dependencies for which no equivalent exogenous dependence exists. Therefore, the endogenous dependence can be considered as a generalization of the exogenous dependence. As a result, if dependence is observed in data, one can always consider the system as having an endogenous dependence, regardless of the true underlying dependence mechanism. Because estimating the structure of an endogenous dependence is substantially easier than estimating a joint service-patience distribution from censored data, our approach facilitates statistical estimations considerably. Funding: C. A. Wu received financial support from the Hong Kong Research Grant Council [Early Career Scheme, Project 26206419]. A. Bassamboo and O. Perry received partial financial support from the National Science Foundation [Grant CMMI 2006350]. 
    more » « less
  5. Randomized load-balancing algorithms play an important role in improving performance in large-scale networks at relatively low computational cost. A common model of such a system is a network of N parallel queues in which incoming jobs with independent and identically distributed service times are routed on arrival using the join-the-shortest-of-d-queues routing algorithm. Under fairly general conditions, it was shown by Aghajani and Ramanan that as the size of the system goes to infinity, the state dynamics converge to the unique solution of a countable system of coupled deterministic measure-valued equations called the hydrodynamic equations. In this article, a characterization of invariant states of these hydrodynamic equations is obtained and, when d=2, used to construct a numerical algorithm to compute the queue length distribution and mean virtual waiting time in the invariant state. Additionally, it is also shown that under a suitable tail condition on the service distribution, the queue length distribution of the invariant state exhibits a doubly exponential tail decay, thus demonstrating a vast improvement in performance over the case [Formula: see text], which corresponds to random routing, when the tail decay could even be polynomial. Furthermore, numerical evidence is provided to support the conjecture that the invariant state is the limit of the steady-state distributions of the N-server models. The proof methodology, which entails analysis of a coupled system of measure-valued equations, can potentially be applied to other many-server systems with general service distributions, where measure-valued representations are useful. 
    more » « less