Display Ads and the generalized assignment problem are two well-studied online packing problems with important applications in ad allocation and other areas. In both problems, ad impressions arrive online and have to be allocated immediately to budget-constrained advertisers. Worst-case algorithms that achieve the ideal competitive ratio are known for both problems, but might act overly conservative given the predictable and usually tame nature of real-world input. Given this discrepancy, we develop an algorithm for both problems that incorporate machine-learned predictions and can thus improve the performance beyond the worst-case. Our algorithm is based on the work of Feldman et al. (2009) and similar in nature to Mahdian et al. (2007) who were the first to develop a learning-augmented algorithm for the related, but more structured Ad Words problem. We use a novel analysis to show that our algorithm is able to capitalize on a good prediction, while being robust against poor predictions. We experimentally evaluate our algorithm on synthetic and real-world data on a wide range of predictions. Our algorithm is consistently outperforming the worst-case algorithm without predictions.
more »
« less
Online Allocation with Replenishable Budgets: Worst Case and Beyond
This paper studies online resource allocation with replenishable budgets, where budgets can be replenished on top of the initial budget and an agent sequentially chooses online allocation decisions without violating the available budget constraint at each round. We propose a novel online algorithm, called OACP (Opportunistic Allocation with Conservative Pricing), that conservatively adjusts dual variables while opportunistically utilizing available resources. OACP achieves a bounded asymptotic competitive ratio in adversarial settings as the number of decision rounds T gets large. Importantly, the asymptotic competitive ratio of OACP is optimal in the absence of additional assumptions on budget replenishment. To further improve the competitive ratio, we make a mild assumption that there is budget replenishment every T* ≥ 1 decision rounds and propose OACP+ to dynamically adjust the total budget assignment for online allocation. Next, we move beyond the worst-case and propose LA-OACP (Learning-Augmented OACP/OACP+), a novel learning-augmented algorithm for online allocation with replenishable budgets. We prove that LA-OACP can improve the average utility compared to OACP/OACP+ when the ML predictor is properly trained, while still offering worst-case utility guarantees when the ML predictions are arbitrarily wrong. Finally, we run simulation studies of sustainable AI inference powered by renewables, validating our analysis and demonstrating the empirical benefits of LA-OACP.
more »
« less
- PAR ID:
- 10568132
- Publisher / Repository:
- ACM
- Date Published:
- Journal Name:
- Proceedings of the ACM on Measurement and Analysis of Computing Systems
- Volume:
- 8
- Issue:
- 1
- ISSN:
- 2476-1249
- Page Range / eLocation ID:
- 1 to 34
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
We study online convex optimization with switching costs, a practically important but also extremely challenging problem due to the lack of complete offline information. By tapping into the power of machine learning (ML) based optimizers, ML-augmented online algorithms (also referred to as expert calibration in this paper) have been emerging as state of the art, with provable worst-case performance guarantees. Nonetheless, by using the standard practice of training an ML model as a standalone optimizer and plugging it into an ML-augmented algorithm, the average cost performance can be highly unsatisfactory. In order to address the "how to learn" challenge, we propose EC-L2O (expert-calibrated learning to optimize), which trains an ML-based optimizer by explicitly taking into account the downstream expert calibrator. To accomplish this, we propose a new differentiable expert calibrator that generalizes regularized online balanced descent and offers a provably better competitive ratio than pure ML predictions when the prediction error is large. For training, our loss function is a weighted sum of two different losses --- one minimizing the average ML prediction error for better robustness, and the other one minimizing the post-calibration average cost. We also provide theoretical analysis for EC-L2O, highlighting that expert calibration can be even beneficial for the average cost performance and that the high-percentile tail ratio of the cost achieved by EC-L2O to that of the offline optimal oracle (i.e., tail cost ratio) can be bounded. Finally, we test EC-L2O by running simulations for sustainable datacenter demand response. Our results demonstrate that EC-L2O can empirically achieve a lower average cost as well as a lower competitive ratio than the existing baseline algorithms.more » « less
-
Many problems, such as online ad display, can be formulated as online bipartite matching. The crucial challenge lies in the nature of sequentially-revealed online item information, based on which we make irreversible matching decisions at each step. While numerous expert online algorithms have been proposed with bounded worst-case competitive ratios, they may not offer satisfactory performance in average cases. On the other hand, reinforcement learning (RL) has been applied to improve the average performance, but it lacks robustness and can perform arbitrarily poorly. In this paper, we propose a novel RL-based approach to edge-weighted online bipartite matching with robustness guarantees (LOMAR), achieving both good average-case and worst-case performance. The key novelty of LOMAR is a new online switching operation which, based on a judicious condition to hedge against future uncertainties, decides whether to follow the expert’s decision or the RL decision for each online item. We prove that for any rho in [0,1], LOMAR is rho-competitive against any given expert online algorithm. To improve the average performance, we train the RL policy by explicitly considering the online switching operation. Finally, we run empirical experiments to demonstrate the advantages of LOMAR compared to existing baselines.more » « less
-
Etessami, Kousha ; Feige, Uriel ; Puppis, Gabriele (Ed.)Online allocation is a broad class of problems where items arriving online have to be allocated to agents who have a fixed utility/cost for each assigned item so to maximize/minimize some objective. This framework captures a broad range of fundamental problems such as the Santa Claus problem (maximizing minimum utility), Nash welfare maximization (maximizing geometric mean of utilities), makespan minimization (minimizing maximum cost), minimization of 𝓁_p-norms, and so on. We focus on divisible items (i.e., fractional allocations) in this paper. Even for divisible items, these problems are characterized by strong super-constant lower bounds in the classical worst-case online model. In this paper, we study online allocations in the learning-augmented setting, i.e., where the algorithm has access to some additional (machine-learned) information about the problem instance. We introduce a general algorithmic framework for learning-augmented online allocation that produces nearly optimal solutions for this broad range of maximization and minimization objectives using only a single learned parameter for every agent. As corollaries of our general framework, we improve prior results of Lattanzi et al. (SODA 2020) and Li and Xian (ICML 2021) for learning-augmented makespan minimization, and obtain the first learning-augmented nearly-optimal algorithms for the other objectives such as Santa Claus, Nash welfare, 𝓁_p-minimization, etc. We also give tight bounds on the resilience of our algorithms to errors in the learned parameters, and study the learnability of these parameters.more » « less
-
We consider a discrete-time system where a resource-constrained source (e.g., a small sensor) transmits its time-sensitive data to a destination over a time-varying wireless channel. Each transmission incurs a fixed transmission cost (e.g., energy cost), and no transmission results in a staleness cost represented by the Age-of-Information. The source must balance the tradeoff between transmission and staleness costs. To address this challenge, we develop a robust online algorithm to minimize the sum of transmission and staleness costs, ensuring a worst-case performance guarantee. While online algorithms are robust, they are usually overly conservative and may have a poor average performance in typical scenarios. In contrast, by leveraging historical data and prediction models, machine learning (ML) algorithms perform well in average cases. However, they typically lack worst-case performance guarantees. To achieve the best of both worlds, we design a learning-augmented online algorithm that exhibits two desired properties: (i) consistency: closely approximating the optimal offline algorithm when the ML prediction is accurate and trusted; (ii) robustness: ensuring worst case performance guarantee even ML predictions are inaccurate. Finally, we perform extensive simulations to show that our online algorithm performs well empirically and that our learning augmented algorithm achieves both consistency and robustness.more » « less