skip to main content


Search for: All records

Award ID contains: 2106403

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. We introduce and study the online pause and resume problem. In this problem, a player attempts to find the k lowest (alternatively, highest) prices in a sequence of fixed length T, which is revealed sequentially. At each time step, the player is presented with a price and decides whether to accept or reject it. The player incurs aswitching cost whenever their decision changes in consecutive time steps, i.e., whenever they pause or resume purchasing. This online problem is motivated by the goal of carbon-aware load shifting, where a workload may be paused during periods of high carbon intensity and resumed during periods of low carbon intensity and incurs a cost when saving or restoring its state. It has strong connections to existing problems studied in the literature on online optimization, though it introduces unique technical challenges that prevent the direct application of existing algorithms. Extending prior work on threshold-based algorithms, we introducedouble-threshold algorithms for both the minimization and maximization variants of this problem. We further show that the competitive ratios achieved by these algorithms are the best achievable by any deterministic online algorithm. Finally, we empirically validate our proposed algorithm through case studies on the application of carbon-aware load shifting using real carbon trace data and existing baseline algorithms.

     
    more » « less
    Free, publicly-accessible full text available December 7, 2024
  2. Free, publicly-accessible full text available November 1, 2024
  3. We consider a minimization variant on the classical prophet inequality with monomial cost functions. A firm would like to procure some fixed amount of a divisible commodity from sellers that arrive sequentially. Whenever a seller arrives, the seller’s cost function is revealed, and the firm chooses how much of the commodity to buy. We first show that if one restricts the set of distributions for the coefficients to a family of natural distributions that include, for example, the uniform and truncated normal distributions, then there is a thresholding policy that is asymptotically optimal in the number of sellers. We then compare two scenarios based on whether the firm has in-house production capabilities or not. We precisely compute the optimal algorithm’s competitive ratio when in-house production capabilities exist and for a special case when they do not. We show that the main advantage of the ability to produce the commodity in house is that it shields the firm from price spikes in worst-case scenarios.

    Funding: This work was supported by NSF Grants [CNS-2146814, CPS-2136197, CNS-2106403, NGSDI-2105648].

     
    more » « less
    Free, publicly-accessible full text available June 7, 2024
  4. We examine the problem of designing learning-augmented algorithms for metrical task systems (MTS) that exploit machine-learned advice while maintaining rigorous, worst-case guarantees on performance. We propose an algorithm, DART, that achieves this dual objective, providing cost within a multiplicative factor (1+ϵ) of the machine-learned advice (i.e., consistency) while ensuring cost within a multiplicative factor 2O(1/ϵ) of a baseline robust algorithm (i.e., robustness) for any ϵ>0 . We show that this exponential tradeoff between consistency and robustness is unavoidable in general, but that in important subclasses of MTS, such as when the metric space has bounded diameter and in the k -server problem, our algorithm achieves improved, polynomial tradeoffs between consistency and robustness. 
    more » « less
  5. The online knapsack problem is a classic online resource allocation problem in networking and operations research. Its basic version studies how to pack online arriving items of different sizes and values into a capacity-limited knapsack. In this paper, we study a general version that includes item departures, while also considering multiple knapsacks and multi-dimensional item sizes. We design a threshold-based online algorithm and prove that the algorithm can achieve order-optimal competitive ratios. Beyond worst-case performance guarantees, we also aim to achieve near-optimal average performance under typical instances. Towards this goal, we propose a data-driven online algorithm that learns within a policy-class that guarantees a worst-case performance bound. In trace-driven experiments, we show that our data-driven algorithm outperforms other benchmark algorithms in an application of online knapsack to job scheduling for cloud computing. 
    more » « less
  6. We study a variant of online optimization in which the learner receives k-rounddelayed feedback about hitting cost and there is a multi-step nonlinear switching cost, i.e., costs depend on multiple previous actions in a nonlinear manner. Our main result shows that a novel Iterative Regularized Online Balanced Descent (iROBD) algorithm has a constant, dimension-free competitive ratio that is $O(L^2k )$, where L is the Lipschitz constant of the switching cost. Additionally, we provide lower bounds that illustrate the Lipschitz condition is required and the dependencies on k and L are tight. Finally, via reductions, we show that this setting is closely related to online control problems with delay, nonlinear dynamics, and adversarial disturbances, where iROBD directly offers constant-competitive online policies. 
    more » « less
  7. We study reinforcement learning (RL) in a setting with a network of agents whose states and actions interact in a local manner where the objective is to find localized policies such that the (discounted) global reward is maximized. A fundamental challenge in this setting is that the state-action space size scales exponentially in the number of agents, rendering the problem intractable for large networks. In this paper, we propose a scalable actor critic (SAC) framework that exploits the network structure and finds a localized policy that is an [Formula: see text]-approximation of a stationary point of the objective for some [Formula: see text], with complexity that scales with the local state-action space size of the largest [Formula: see text]-hop neighborhood of the network. We illustrate our model and approach using examples from wireless communication, epidemics, and traffic. 
    more » « less