Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

arXiv:2402.05300v2 (Ed.)This paper considers a multiplayer resourcesharing game with a fair reward allocation model. Multiple players choose from a collection of resources. Each resource brings a random reward equally divided among the players who choose it. We consider two settings. The first setting is a oneslot game where the mean rewards of the resources are known to all the players, and the objective of player 1 is to maximize their worstcase expected utility. Certain special cases of this setting have explicit solutions. These cases provide interesting yet nonintuitive insights into the problem. The second setting is an online setting, where the game is played over a finite time horizon, where the mean rewards are unknown to the first player. Instead, the first player receives, as feedback, the rewards of the resources they chose after the action. We develop a novel Upper Confidence Bound (UCB) algorithm that minimizes the worstcase regret of the first player using the feedback received.more » « lessFree, publiclyaccessible full text available March 4, 2025

arXiv:2401.07170v1 (Ed.)This paper considers online optimization for a system that performs a sequence of backtoback tasks. Each task can be processed in one of multiple processing modes that affect the duration of the task, the reward earned, and an additional vector of penalties (such as energy or cost). Let A[k] be a random matrix of parameters that specifies the duration, reward, and penalty vector under each processing option for task k. The goal is to observe A[k] at the start of each new task k and then choose a processing mode for the task so that, over time, time average reward is maximized subject to time average penalty constraints. This is a renewal optimization problem and is challenging because the probability distribution for the A[k] sequence is unknown. Prior work shows that any algorithm that comes within ϵ of optimality must have (1/ϵ^2) convergence time. The only known algorithm that can meet this bound operates without time average penalty constraints and uses a diminishing stepsize that cannot adapt when probabilities change. This paper develops a new algorithm that is adaptive and comes within O(ϵ) of optimality for any interval of (1/ϵ^2) tasks over which probabilities are held fixed, regardless of probabilities before the start of the interval.more » « lessFree, publiclyaccessible full text available January 13, 2025

arXiv:2311.11180v1 (Ed.)This paper presents a subgradientbased algorithm for constrained nonsmooth convex optimization that does not require projections onto the feasible set. While the wellestablished FrankWolfe algorithm and its variants already avoid projections, they are primarily designed for smooth objective functions. In con trast, our proposed algorithm can handle nonsmooth problems with general convex functional inequality constraints. It achieves an ϵsuboptimal solution in O(ϵ^−2) iterations, with each iteration requiring only a single (potentially inexact) Linear Minimization Oracle (LMO) call and a (possibly inexact) subgra dient computation. This performance is consistent with existing lower bounds. Similar performance is observed when deterministic subgradients are replaced with stochastic subgradients. In the special case where there are no functional inequality constraints, our algorithm competes favorably with a recent nonsmooth projectionfree method designed for constraintfree problems. Our approach uti lizes a simple separation scheme in conjunction with a new Lagrange multiplier update rule.more » « lessFree, publiclyaccessible full text available November 18, 2024