skip to main content


Title: The M/M/k with Deterministic Setup Times
Capacity management, whether it involves servers in a data center, or human staff in a call center, or doctors in a hospital, is largely about balancing a resource-delay tradeoff. On the one hand, one would like to turn off servers when not in use (or send home staff that are idle) to save on resources. On the other hand, one wants to avoid the considerable setup time required to turn an ''off'' server back ''on.'' This paper aims to understand the delay component of this tradeoff, namely, what is the effect of setup time on average delay in a multi-server system? Surprisingly little is known about the effect of setup times on delay. While there has been some work on studying the M/M/k with Exponentially-distributed setup times, these works provide only iterative methods for computing mean delay, giving little insight as to how delay is affected by k , by load, and by the setup time. Furthermore, setup time in practice is much better modeled by a Deterministic random variable, and, as this paper shows, the scaling effect of a Deterministic setup time is nothing like that of an Exponentially-distributed setup time. This paper provides the first analysis of the M/M/k with Deterministic setup times. We prove a lower bound on the effect of setup on delay, where our bound is highly accurate for the common case where the setup time is much higher than the job service time. Our result is a relatively simple algebraic formula which provides insights on how delay scales with the input parameters. Our proof uses a combination of renewal theory, martingale arguments and novel probabilistic arguments, providing strong intuition on the transient behavior of a system that turns servers on and off.  more » « less
Award ID(s):
1938909 1763701 2007733 2145713
NSF-PAR ID:
10394264
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the ACM on Measurement and Analysis of Computing Systems
Volume:
6
Issue:
3
ISSN:
2476-1249
Page Range / eLocation ID:
1 to 45
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The Age-of-Information (AoI) has recently been proposed as an important metric for investigating the timeliness performance in information-update systems. Prior studies on AoI optimization often consider a Push model, which is concerned about when and how to "push" (i.e., generate and transmit) the updated information to the user. In stark contrast, in this paper we introduce a new Pull model, which is more relevant for certain applications (such as the real-time stock quotes service), where a user sends requests to the servers to proactively "pull" the information of interest. Moreover, we propose to employ request replication to reduce the AoI. Interestingly, we find that under this new Pull model, replication schemes capture a novel tradeoff between different levels of information freshness and different response times across the servers, which can be exploited to minimize the expected AoI at the user's side. Specifically, assuming Poisson updating process at the servers and exponentially distributed response time, we derive a closed-form formula for computing the expected AoI and obtain the optimal number of responses to wait for to minimize the expected AoI. Finally, we conduct numerical simulations to elucidate our theoretical results. Our findings show that waiting for more than one response can significantly reduce the AoI in most scenarios. 
    more » « less
  2. Introduced by Sadeh et al., the K-star-graph private information retrieval (PIR) problem, so-labeled because the storage graph is a star-graph with K leaf nodes, is comprised of K messages that are stored separately (one-each) at K dedicated servers, and a universal server that stores all K messages, for a total of K + 1 servers. While it is one of the simplest PIR settings to describe, the capacity CK of K-star-graph PIR is open for K ≥ 4. We study the critical K = 4 setting, for which prior work establishes the bounds 2/5 ≤ C4 ≤ 3/7. As our main contribution, we characterize the exact capacity of 4-star-graph PIR as C4 = 5/12, thus improving upon both the prior lower- bound as well as the prior upper-bound. The main technical challenge resides in the new converse bound, whose non-trivial structure is deduced indirectly from the achievable schemes that emerge from the study of a finer tradeoff between the download costs from the dedicated servers versus the universal server. A sharp characterization of this tradeoff is also obtained for K = 4. 
    more » « less
  3. We study the problem of scheduling VMs (Virtual Machines) in a distributed server platform, motivated by cloud computing applications. The VMs arrive dynamically over time to the system, and require a certain amount of resources (e.g. memory, CPU, etc) for the duration of their service. To avoid costly preemptions, we consider non-preemptive scheduling: Each VM has to be assigned to a server which has enough residual capacity to accommodate it, and once a VM is assigned to a server, its service cannot be disrupted (preempted). Prior approaches to this problem either have high complexity, require synchronization among the servers, or yield queue sizes/delays which are excessively large. We propose a non-preemptive scheduling algorithm that resolves these issues. In general, given an approximation algorithm to Knapsack with approximation ratio r , our scheduling algorithm can provide rβ fraction of the throughput region for β < r. In the special case of a greedy approximation algorithm to Knapsack, we further show that this condition can be relaxed to β<1. The parameters β and r can be tuned to provide a tradeoff between achievable throughput, delay, and computational complexity of the scheduling algorithm. Finally extensive simulation results using both synthetic and real traffic traces are presented to verify the performance of our algorithm. 
    more » « less
  4. The operation of today’s data centers increasingly relies on environmental data collection and analysis to operate the cooling infrastructure as efficiently as possible and to maintain the reliability of IT equipment. This in turn emphasizes the importance of the quality of the data collected and their relevance to the overall operation of the data center. This study presents an experimentally based analysis and comparison between two different approaches for environmental data collection; one using a discrete sensor network, and another using available data from installed IT equipment through their Intelligent Platform Management Interface (IPMI). The comparison considers the quality and relevance of the data collected and investigates their effect on key performance and operational metrics. The results have shown the large variation of server inlet temperatures provided by the IPMI interface. On the other hand, the discrete sensor measurements showed much more reliable results where the server inlet temperatures had minimal variation inside the cold aisle. These results highlight the potential difficulty in using IPMI inlet temperature data to evaluate the thermal environment inside the contained cold aisle. The study also focuses on how industry common methods for cooling efficiency management and control can be affected by the data collection approach. Results have shown that using preheated IPMI inlet temperature data can lead to unnecessarily lower cooling set points, which in turn minimizes the potential cooling energy savings. It was shown in one case that using discrete sensor data for control provides 20% more energy savings than using IPMI inlet temperature data. 
    more » « less
  5. In the parallel paging problem, there are p processors that share a cache of size k. The goal is to partition the cache among the processors over time in order to minimize their average completion time. For this long-standing open problem, we give tight upper and lower bounds of ⇥(log p) on the competitive ratio with O(1) resource augmentation. A key idea in both our algorithms and lower bounds is to relate the problem of parallel paging to the seemingly unrelated problem of green paging. In green paging, there is an energy-optimized processor that can temporarily turn off one or more of its cache banks (thereby reducing power consumption), so that the cache size varies between a maximum size k and a minimum size k/p. The goal is to minimize the total energy consumed by the computation, which is proportional to the integral of the cache size over time. We show that any efficient solution to green paging can be converted into an efficient solution to parallel paging, and that any lower bound for green paging can be converted into a lower bound for parallel paging, in both cases in a black-box fashion. We then show that, with O(1) resource augmentation, the optimal competitive ratio for deterministic online green paging is ⇥(log p), which, in turn, implies the same bounds for deterministic online parallel paging. 
    more » « less