Display Ads and the generalized assignment problem are two well-studied online packing problems with important applications in ad allocation and other areas. In both problems, ad impressions arrive online and have to be allocated immediately to budget-constrained advertisers. Worst-case algorithms that achieve the ideal competitive ratio are known for both problems, but they may act overly conservatively given the predictable and usually tame nature of real-world input. Given this discrepancy, we develop an algorithm for both problems that incorporates machine-learned predictions and can thus improve performance beyond the worst case. Our algorithm is based on the work of Feldman et al. (2009) and is similar in nature to that of Mahdian et al. (2007), who were the first to develop a learning-augmented algorithm for the related but more structured AdWords problem. We use a novel analysis to show that our algorithm is able to capitalize on a good prediction while being robust against poor predictions. We experimentally evaluate our algorithm on synthetic and real-world data across a wide range of predictions, and it consistently outperforms the worst-case algorithm without predictions.
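The abstract above stays at a high level; purely as an illustration of how an online allocation rule can be biased toward a machine-learned assignment, here is a minimal Python sketch. The greedy scoring rule, the `trust` knob, and all names are hypothetical stand-ins, not the algorithm of the paper (which builds on Feldman et al. (2009)).

```python
def allocate_with_prediction(impressions, budgets, values, predicted, trust=0.5):
    """Toy online allocator: greedily assigns each arriving impression to the
    highest-value advertiser with remaining budget, boosting the score of the
    advertiser suggested by the ML prediction by a factor (1 + trust).
    Setting trust = 0 recovers a plain prediction-free greedy rule."""
    assignment = {}
    for imp in impressions:
        best, best_score = None, 0.0
        for adv, slots in budgets.items():
            if slots <= 0:
                continue  # advertiser's budget is exhausted
            score = values.get((imp, adv), 0.0)
            if predicted.get(imp) == adv:
                score *= 1.0 + trust  # lean toward the predicted advertiser
            if score > best_score:
                best, best_score = adv, score
        if best is not None:
            assignment[imp] = best
            budgets[best] -= 1  # irrevocably consume one budget slot
    return assignment


# Tiny example: the prediction tips impression i1 toward advertiser b.
print(allocate_with_prediction(
    impressions=["i1", "i2"],
    budgets={"a": 1, "b": 1},
    values={("i1", "a"): 1.0, ("i1", "b"): 0.9, ("i2", "a"): 0.8},
    predicted={"i1": "b"},
))  # -> {'i1': 'b', 'i2': 'a'}
```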
Online Allocation with Replenishable Budgets: Worst Case and Beyond
This paper studies online resource allocation with replenishable budgets, where budgets can be replenished on top of the initial budget and an agent sequentially makes online allocation decisions without violating the available budget constraint at each round. We propose a novel online algorithm, called OACP (Opportunistic Allocation with Conservative Pricing), that conservatively adjusts dual variables while opportunistically utilizing available resources. OACP achieves a bounded asymptotic competitive ratio in adversarial settings as the number of decision rounds T grows large. Importantly, the asymptotic competitive ratio of OACP is optimal in the absence of additional assumptions on budget replenishment. To further improve the competitive ratio, we make the mild assumption that the budget is replenished every T* ≥ 1 decision rounds and propose OACP+ to dynamically adjust the total budget assignment for online allocation. Next, we move beyond the worst case and propose LA-OACP (Learning-Augmented OACP/OACP+), a novel learning-augmented algorithm for online allocation with replenishable budgets. We prove that LA-OACP can improve the average utility compared to OACP/OACP+ when the ML predictor is properly trained, while still offering worst-case utility guarantees when the ML predictions are arbitrarily wrong. Finally, we run simulation studies of sustainable AI inference powered by renewables, validating our analysis and demonstrating the empirical benefits of LA-OACP.
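The abstract describes OACP only at a high level: conservative dual-variable (pricing) updates combined with opportunistic use of whatever budget is available. The following minimal Python sketch shows the generic dual-price pattern for online allocation with replenishable budgets; the acceptance test, the multiplicative price updates, and all parameter names (`eta`, `lam0`) are illustrative assumptions rather than the paper's OACP.

```python
def dual_price_allocation(utilities, demands, replenish, initial_budget,
                          lam0=1.0, eta=0.05):
    """Generic dual-price loop (illustration only, not OACP).

    utilities[t]  - utility gained if the round-t request is accepted
    demands[t]    - budget consumed by the round-t request
    replenish[t]  - budget added at the start of round t
    """
    budget, lam, total_utility = initial_budget, lam0, 0.0
    for u, d, r in zip(utilities, demands, replenish):
        budget += r                       # budget replenishment this round
        if d <= budget and u >= lam * d:  # accept only if utility beats the price
            budget -= d
            total_utility += u
            lam *= 1.0 + eta              # raise the price after spending budget
        else:
            lam = max(lam0, lam * (1.0 - eta))  # relax the price conservatively
    return total_utility


# Example: one unit of budget is replenished in rounds 2 and 3.
print(dual_price_allocation(utilities=[2.0, 0.5, 3.0], demands=[1.0, 1.0, 1.0],
                            replenish=[0.0, 1.0, 1.0], initial_budget=1.0))  # -> 5.0
```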
- PAR ID: 10568132
- Publisher / Repository: ACM
- Date Published:
- Journal Name: Proceedings of the ACM on Measurement and Analysis of Computing Systems
- Volume: 8
- Issue: 1
- ISSN: 2476-1249
- Page Range / eLocation ID: 1 to 34
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
This paper studies learning-augmented decentralized online convex optimization in a networked multi-agent system, a challenging setting that has remained under-explored. We first consider a linear learning-augmented decentralized online algorithm (LADO-Lin) that combines a machine learning (ML) policy with a baseline expert policy in a linear manner. We show that, while LADO-Lin can exploit the potential of ML predictions to improve the average cost performance, it cannot offer guaranteed worst-case performance. To address this limitation, we propose a novel online algorithm (LADO) that adaptively combines the ML policy and the expert policy to safeguard the ML predictions and achieve strong competitiveness guarantees. We also prove an average cost bound for LADO, revealing the tradeoff between average performance and worst-case robustness and demonstrating the advantage of training the ML policy by explicitly considering the robustness requirement. Finally, we run an experiment on decentralized battery management. Our results highlight the potential of ML augmentation to improve both the average performance and the guaranteed worst-case performance of LADO.
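As a rough illustration of how an ML policy can be combined with an expert policy while preserving a worst-case guarantee, the sketch below follows the ML action only when the cumulative cost stays within a (1 + λ) factor of the expert's cumulative cost, assuming nonnegative per-step costs. The scalar-action setting, the constraint form, and all names are assumptions for illustration; this is not LADO's actual combination rule, which handles the decentralized, networked setting.

```python
def combine_ml_and_expert(ml_actions, expert_actions, cost, lam=0.1):
    """Follow the ML action whenever it keeps the cumulative cost within a
    (1 + lam) factor of the expert's cumulative cost; otherwise copy the
    expert. With nonnegative costs this invariant holds at every step."""
    combined_cost, expert_cost, chosen = 0.0, 0.0, []
    for a_ml, a_ex in zip(ml_actions, expert_actions):
        expert_cost += cost(a_ex)
        if combined_cost + cost(a_ml) <= (1.0 + lam) * expert_cost:
            action = a_ml   # safe to trust the ML prediction this step
        else:
            action = a_ex   # fall back to the expert to stay competitive
        combined_cost += cost(action)
        chosen.append(action)
    return chosen, combined_cost


# Example with a simple absolute-value cost.
print(combine_ml_and_expert([0.0, 2.0], [1.0, 1.0], cost=lambda a: abs(a)))
# -> ([0.0, 2.0], 2.0)
```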
Motivated by the emerging paradigm of resource allocation that integrates classical objectives, such as cost minimization, with societal objectives, such as carbon awareness, this paper proposes a general framework for the online fair allocation of reusable resources. Within this framework, an online decision-maker seeks to allocate a finite resource with capacity C to a sequence of requests arriving with unknown distributions of types, utilities, and resource usage durations. To accommodate diverse objectives, the framework supports multiple actions and utility types, and the goal is to achieve max-min fairness among utilities, i.e., to maximize the minimum time-averaged utility across all utility types. Our performance metric is an (α, β)-competitive guarantee of the form ALG ≥ α · OPT* − O(T^(β−1)), where α, β ∈ (0, 1] and OPT* and ALG are the time-averaged optimum and the objective value achieved by the decision maker, respectively. We propose a novel algorithm that achieves a competitive guarantee of (1 − O(√(log C / C)), 2/3) under bandit feedback. As the resource capacity increases, the multiplicative competitive ratio term 1 − O(√(log C / C)) asymptotically approaches optimality. Notably, when the resource capacity exceeds a certain threshold, our algorithm achieves an improved competitive guarantee of (1, 2/3). Our algorithm employs an optimistic penalty-weight mechanism coupled with a dual exploration-discarding strategy to balance resource feasibility, exploration, and fairness among utilities.
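A standard way to chase a max-min fairness objective online is to keep one weight per utility type, increase the weights of types that are lagging, and pick each action against the weighted sum of utilities. The sketch below shows only that generic pattern; it ignores resource capacity, usage durations, and bandit feedback, and it is not the paper's optimistic penalty-weight mechanism or exploration-discarding strategy. All names and the update rule are assumptions.

```python
import math

def pick_action(actions, utility_fns, weights):
    """Choose the action maximizing the weighted sum of per-type utilities."""
    return max(actions,
               key=lambda a: sum(w * f(a) for w, f in zip(weights, utility_fns)))

def update_weights(weights, realized_utils, lr=0.1):
    """Multiplicative-weights style update: utility types that received little
    utility this round get a larger weight next round."""
    new = [w * math.exp(-lr * u) for w, u in zip(weights, realized_utils)]
    total = sum(new)
    return [w / total for w in new]


# Example: two actions, two utility types that prefer opposite actions; the
# weights alternate so that neither type's time-averaged utility lags behind.
utility_fns = [lambda a: a, lambda a: 1 - a]
weights = [0.5, 0.5]
for _ in range(3):
    a = pick_action([0, 1], utility_fns, weights)
    weights = update_weights(weights, [f(a) for f in utility_fns])
```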
We study online convex optimization with switching costs, a practically important but also extremely challenging problem due to the lack of complete offline information. By tapping into the power of machine learning (ML) based optimizers, ML-augmented online algorithms (also referred to as expert calibration in this paper) have been emerging as the state of the art, with provable worst-case performance guarantees. Nonetheless, with the standard practice of training an ML model as a standalone optimizer and plugging it into an ML-augmented algorithm, the average cost performance can be highly unsatisfactory. To address this "how to learn" challenge, we propose EC-L2O (expert-calibrated learning to optimize), which trains an ML-based optimizer by explicitly taking into account the downstream expert calibrator. To accomplish this, we propose a new differentiable expert calibrator that generalizes regularized online balanced descent and offers a provably better competitive ratio than pure ML predictions when the prediction error is large. For training, our loss function is a weighted sum of two different losses: one minimizing the average ML prediction error for better robustness, and the other minimizing the post-calibration average cost. We also provide theoretical analysis for EC-L2O, highlighting that expert calibration can even be beneficial for the average cost performance and that the high-percentile tail ratio of the cost achieved by EC-L2O to that of the offline optimal oracle (i.e., the tail cost ratio) can be bounded. Finally, we test EC-L2O by running simulations for sustainable datacenter demand response. Our results demonstrate that EC-L2O can empirically achieve both a lower average cost and a lower competitive ratio than existing baseline algorithms.
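To make the two-term training objective concrete, here is a minimal sketch of a loss that mixes an average prediction-error term with a post-calibration cost term, as the abstract describes. The squared error, the weight `alpha`, and all names are placeholders; in EC-L2O itself the post-calibration cost comes from running the differentiable expert calibrator, which is not reproduced here.

```python
def two_term_loss(pred_actions, target_actions, post_calibration_cost, alpha=0.5):
    """Weighted sum of (i) the average ML prediction error, for robustness,
    and (ii) the average cost incurred after expert calibration."""
    n = max(len(pred_actions), 1)
    pred_error = sum((p - t) ** 2
                     for p, t in zip(pred_actions, target_actions)) / n
    return alpha * pred_error + (1.0 - alpha) * post_calibration_cost


# Example: small prediction error but a moderate post-calibration cost.
print(two_term_loss([1.0, 2.0], [1.1, 1.9], post_calibration_cost=0.4))  # ≈ 0.205
```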
Many problems, such as online ad display, can be formulated as online bipartite matching. The crucial challenge lies in the nature of sequentially revealed online item information, based on which we make irreversible matching decisions at each step. While numerous expert online algorithms have been proposed with bounded worst-case competitive ratios, they may not offer satisfactory performance in average cases. On the other hand, reinforcement learning (RL) has been applied to improve the average performance, but it lacks robustness and can perform arbitrarily poorly. In this paper, we propose a novel RL-based approach to edge-weighted online bipartite matching with robustness guarantees (LOMAR), achieving both good average-case and worst-case performance. The key novelty of LOMAR is a new online switching operation which, based on a judicious condition to hedge against future uncertainties, decides whether to follow the expert's decision or the RL decision for each online item. We prove that for any ρ ∈ [0, 1], LOMAR is ρ-competitive against any given expert online algorithm. To improve the average performance, we train the RL policy by explicitly considering the online switching operation. Finally, we run empirical experiments to demonstrate the advantages of LOMAR compared to existing baselines.
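As a toy illustration of an online switching rule of the kind described above, the sketch below follows the RL decision only when the cumulative reward remains at least ρ times the expert's cumulative reward, and otherwise copies the expert. It assumes nonnegative rewards and ignores the matching feasibility bookkeeping that the real LOMAR condition must handle, so it is a simplified stand-in rather than the paper's algorithm.

```python
def switch_between_rl_and_expert(rl_rewards, expert_rewards, rho=0.8):
    """At each online item, follow RL only if the cumulative reward stays at
    least rho times the expert's cumulative reward; with nonnegative rewards
    and rho in [0, 1], this keeps the run rho-competitive against the expert."""
    alg_total, expert_total, decisions = 0.0, 0.0, []
    for r_rl, r_ex in zip(rl_rewards, expert_rewards):
        expert_total += r_ex
        if alg_total + r_rl >= rho * expert_total:
            alg_total += r_rl          # RL decision keeps the guarantee intact
            decisions.append("RL")
        else:
            alg_total += r_ex          # hedge: copy the expert's decision
            decisions.append("expert")
    return decisions, alg_total


# Example: the RL policy is good on item 1 but poor on item 2.
print(switch_between_rl_and_expert([1.5, 0.0], [1.0, 1.0]))
# -> (['RL', 'expert'], 2.5)
```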