%AM. Wijewardena
%AM. J. Neely
%D2024
%IarXiv:2402.05300v2
%Kasymmetric information; bandits; optimization; multiple channels
%MOSTI ID: 10494717
%PMedium: X
%TMulti-Player Resource-Sharing Games with Fair Reward Allocation
%XThis paper considers a multi-player resource-sharing game with a fair reward allocation model.
Multiple players choose from a collection of resources. Each resource brings a random reward equally
divided among the players who choose it. We consider two settings. The first setting is a one-slot game
where the mean rewards of the resources are known to all the players, and the objective of the first player is to
maximize their worst-case expected utility. Certain special cases of this setting have explicit solutions.
These cases provide interesting yet non-intuitive insights into the problem. The second setting is an online
setting in which the game is played over a finite time horizon and the mean rewards are unknown to
the first player. Instead, after each action, the first player receives as feedback the rewards of the
resources they chose. We develop a novel Upper Confidence Bound (UCB) algorithm that minimizes the
worst-case regret of the first player using the feedback received.