

Search for: All records

Award ID contains: 2113027


  1. Free, publicly-accessible full text available June 19, 2024
  2. We examine the problem of smoothed online optimization, where a decision maker must sequentially choose points in a normed vector space to minimize the sum of per-round, non-convex hitting costs and the costs of switching decisions between rounds. The decision maker has access to a black-box oracle, such as a machine learning model, that provides untrusted and potentially inaccurate predictions of the optimal decision in each round. The goal of the decision maker is to exploit the predictions if they are accurate, while guaranteeing performance that is not much worse than the hindsight optimal sequence of decisions, even when predictions are inaccurate. We impose the standard assumption that hitting costs are globally α-polyhedral. We propose a novel algorithm, Adaptive Online Switching (AOS), and prove that, for a large set of feasible δ > 0, it is (1+δ)-competitive if predictions are perfect, while also maintaining a uniformly bounded competitive ratio of 2^{Õ(1/(αδ))} even when predictions are adversarial. Further, we prove that this trade-off is necessary and nearly optimal in the sense that any deterministic algorithm which is (1+δ)-competitive if predictions are perfect must be at least 2^{Ω̃(1/(αδ))}-competitive when predictions are inaccurate. In fact, we observe a unique threshold-type behavior in this trade-off: if δ is not in the set of feasible options, then no algorithm is simultaneously (1+δ)-competitive if predictions are perfect and ζ-competitive when predictions are inaccurate for any ζ < ∞. Furthermore, we show that memory is crucial in AOS by proving that any algorithm that does not use memory cannot benefit from predictions. We complement our theoretical results with a numerical study on a microgrid application. 
  3. Modern data centers suffer from immense power consumption. As a result, data center operators have heavily invested in capacity-scaling solutions, which dynamically deactivate servers if the demand is low and activate them again when the workload increases. We analyze a continuous-time model for capacity scaling, where the goal is to minimize the weighted sum of flow time, switching cost, and power consumption in an online fashion. We propose a novel algorithm, called adaptive balanced capacity scaling (ABCS), that has access to black-box machine learning predictions. ABCS aims to adapt to the predictions and is also robust against unpredictable surges in the workload. In particular, we prove that ABCS is [Formula: see text] competitive if the predictions are accurate, and yet, it has a uniformly bounded competitive ratio even if the predictions are completely inaccurate. Finally, we investigate the performance of this algorithm on a real-world data set and carry out extensive numerical experiments, which positively support the theoretical results. Funding: This work was partially supported by the Division of Computing and Communication Foundations [Grant 2113027]. The authors also acknowledge financial support for this project from the Algorithm and Randomness Center–Transdisciplinary Research Institute for Advancing Data Science Fellowship at Georgia Tech. 
  4. Consider a system with N identical single-server queues and a number of task types, where each server is able to process only a small subset of possible task types. Arriving tasks select [Formula: see text] random compatible servers and join the shortest queue among them. The compatibility constraints are captured by a fixed bipartite graph between the servers and the task types. When the graph is complete bipartite, the mean-field approximation is accurate. However, such dense compatibility graphs are infeasible for large-scale implementation. We characterize a class of sparse compatibility graphs for which the mean-field approximation remains valid. For this, we introduce a novel notion, called proportional sparsity, and establish that systems with proportionally sparse compatibility graphs asymptotically match the performance of a fully flexible system. Furthermore, we show that proportionally sparse random compatibility graphs can be constructed, which reduce the server degree almost by a factor [Formula: see text] compared with the complete bipartite compatibility graph. 
  5. We consider a large-scale service system where incoming tasks have to be instantaneously dispatched to one out of many parallel server pools. The user-perceived performance degrades with the number of concurrent tasks and the dispatcher aims at maximizing the overall quality of service by balancing the load through a simple threshold policy. We demonstrate that such a policy is optimal on the fluid and diffusion scales, while only involving a small communication overhead, which is crucial for large-scale deployments. In order to set the threshold optimally, it is important, however, to learn the load of the system, which may be unknown. For that purpose, we design a control rule for tuning the threshold in an online manner. We derive conditions that guarantee that this adaptive threshold settles at the optimal value, along with estimates for the time until this happens. In addition, we provide numerical experiments that support the theoretical results and further indicate that our policy copes effectively with time-varying demand patterns. Summary of Contribution: Data centers and cloud computing platforms are the digital factories of the world, and managing resources and workloads in these systems involves operations research challenges of an unprecedented scale. Due to the massive size, complex dynamics, and wide range of time scales, the design and implementation of optimal resource-allocation strategies is prohibitively demanding from a computation and communication perspective. These resource-allocation strategies are essential for certain interactive applications, for which the available computing resources need to be distributed optimally among users in order to provide the best overall experienced performance. This is the subject of the present article, which considers the problem of distributing tasks among the various server pools of a large-scale service system, with the objective of optimizing the overall quality of service provided to users. 
A solution to this load-balancing problem cannot rely on maintaining complete state information at the gateway of the system, since this is computationally infeasible given the magnitude and complexity of modern data centers and cloud computing platforms. Therefore, we examine a computationally light load-balancing algorithm that is nevertheless asymptotically optimal in a regime where the size of the system approaches infinity. The analysis is based on a Markovian stochastic model, which is studied through fluid and diffusion limits in the aforementioned large-scale regime. The article analyzes the load-balancing algorithm theoretically and provides numerical experiments that support and extend the theoretical results. 
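The dispatching rule stated in the abstract on sparse compatibility graphs (an arriving task samples a few random compatible servers and joins the shortest queue among them) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `dispatch`, the dictionary `compat` encoding the bipartite compatibility graph, the parameter `d` for the number of sampled servers, and the toy instance are all hypothetical. The sketch also assumes sampling without replacement and first-sampled tie-breaking, neither of which the abstract specifies.

```python
import random

def dispatch(task_type, compat, queue_len, d, rng=random):
    """Route one task to the shortest of d sampled compatible servers.

    compat maps each task type to the list of servers able to process it
    (an assumed encoding of the bipartite compatibility graph).
    """
    servers = compat[task_type]
    # Assumption: sample d distinct compatible servers, or all of them
    # when fewer than d are compatible with this task type.
    candidates = rng.sample(servers, min(d, len(servers)))
    # Join the shortest queue among the candidates; ties go to the
    # candidate that was sampled first.
    chosen = min(candidates, key=lambda s: queue_len[s])
    queue_len[chosen] += 1
    return chosen

# Hypothetical toy instance: 4 servers, 2 task types, 3 compatible servers each.
compat = {"a": [0, 1, 2], "b": [1, 2, 3]}
queue_len = [0, 0, 0, 0]
rng = random.Random(0)
for _ in range(100):
    dispatch(rng.choice(["a", "b"]), compat, queue_len, d=2, rng=rng)
```

The sketch only counts arrivals and models no service completions, so it shows the routing decision rather than the queueing dynamics analyzed in the article.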