NSF PAR Search | NSF Public Access Repository


Decentralized Upper Confidence Bound Algorithms for Homogeneous Multi-agent Multi-armed Bandits

https://doi.org/10.1109/TAC.2024.3525417

Zhu, Jingxuan; Mulle, Ethan; Smith, Christopher S; Koppel, Alec; Liu, Ji (July 2025, IEEE Transactions on Automatic Control)

Free, publicly-accessible full text available July 1, 2026
Decentralized Multi-Armed Bandit Can Outperform Classic Upper Confidence Bound: A Homogeneous Case over Strongly Connected Graphs

https://doi.org/10.1109/CDC56724.2024.10886699

Zhu, Jingxuan; Liu, Ji (December 2024, IEEE)

Free, publicly-accessible full text available December 16, 2025
Byzantine-resilient decentralized multi-armed bandits

Zhu, Jingxuan; Koppel, Alec; Velasquez, Alvaro; Liu, Ji (August 2024, Transactions on machine learning research)

Full Text Available
A Resilient Distributed Algorithm for Solving Linear Equations

https://doi.org/10.1109/CDC49753.2023.10383841

Zhu, Jingxuan; Velasquez, Alvaro; Liu, Ji (December 2023, IEEE)

Full Text Available
Distributed Multiarmed Bandits

https://doi.org/10.1109/TAC.2023.3247982

Zhu, Jingxuan; Liu, Ji (May 2023, IEEE Transactions on Automatic Control)

Full Text Available
Resilient Distributed Optimization

https://doi.org/10.23919/ACC55779.2023.10156564

Zhu, Jingxuan; Lin, Yixuan; Velasquez, Alvaro; Liu, Ji (May 2023, 2023 American Control Conference)

Full Text Available
Reaching a consensus with limited information

https://doi.org/10.1016/j.sysconle.2023.105524

Zhu, Jingxuan; Lin, Yixuan; Liu, Ji; Morse, A. Stephen (June 2023, Systems & Control Letters)

Full Text Available
Reaching a Consensus with Limited Information

https://doi.org/10.1109/CDC51059.2022.9993039

Zhu, Jingxuan; Lin, Yixuan; Liu, Ji; Morse, A. Stephen (December 2022, 61st IEEE Conference on Decision and Control)

Full Text Available
Federated Bandit: A Gossiping Approach

https://doi.org/10.1145/3447380

Zhu, Zhaowei; Zhu, Jingxuan; Liu, Ji; Liu, Yang (February 2021, Proceedings of the ACM on Measurement and Analysis of Computing Systems)

In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a set of $N$ agents, who can only communicate their local data with neighbors described by a connected graph $G$. Each agent makes a sequence of decisions on selecting an arm from $M$ candidates, yet they only have access to local and potentially biased feedback/evaluation of the true reward for each action taken. Learning only locally will lead agents to sub-optimal actions while converging to a no-regret strategy requires a collection of distributed data. Motivated by the proposal of federated learning, we aim for a solution with which agents will never share their local observations with a central entity, and will be allowed to only share a private copy of his/her own information with their neighbors. We first propose a decentralized bandit algorithm Gossip_UCB, which is a coupling of variants of both the classical gossiping algorithm and the celebrated Upper Confidence Bound (UCB) bandit algorithm. We show that Gossip_UCB successfully adapts local bandit learning into a global gossiping process for sharing information among connected agents, and achieves guaranteed regret at the order of $O(\max\{\mathrm{poly}(N,M)\log T,\ \mathrm{poly}(N,M)\log_{\lambda_2^{-1}} N\})$ for all $N$ agents, where $\lambda_2\in(0,1)$ is the second largest eigenvalue of the expected gossip matrix, which is a function of $G$. We then propose Fed_UCB, a differentially private version of Gossip_UCB, in which the agents preserve $\varepsilon$-differential privacy of their local data while achieving $O(\max\{\frac{\mathrm{poly}(N,M)}{\varepsilon}\log^{2.5} T,\ \mathrm{poly}(N,M)(\log_{\lambda_2^{-1}} N + \log T)\})$ regret.
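
The coupling of gossiping and UCB described in this abstract can be illustrated with a minimal sketch. It is not the paper's Gossip_UCB: the update rules, the confidence radius, the Gaussian noise model, and every name below (`gossip_ucb_sketch`, `shared`, `biases`, `edges`) are assumptions made for illustration. Each agent runs a UCB1-style index on a shared estimate, and after each round one randomly chosen edge averages the estimates of its two endpoints (pairwise randomized gossip).

```python
import math
import random


def gossip_ucb_sketch(true_means, biases, edges, horizon, seed=0):
    """Toy gossip + UCB loop (illustrative assumption, not the paper's algorithm).

    true_means[k]  : true mean reward of arm k
    biases[i][k]   : agent i's local bias on arm k (assumed to average to 0 over agents)
    edges          : list of (i, j) pairs of neighboring agents
    """
    rng = random.Random(seed)
    n_agents, n_arms = len(biases), len(true_means)
    counts = [[0] * n_arms for _ in range(n_agents)]         # local pull counts
    local_mean = [[0.0] * n_arms for _ in range(n_agents)]   # local sample means
    shared = [[0.0] * n_arms for _ in range(n_agents)]       # gossiped estimates

    for t in range(1, horizon + 1):
        for i in range(n_agents):
            untried = [k for k in range(n_arms) if counts[i][k] == 0]
            if untried:
                arm = untried[0]  # pull every arm once before using the index
            else:
                # UCB1-style index built on the gossiped estimate (assumed form)
                arm = max(
                    range(n_arms),
                    key=lambda k: shared[i][k]
                    + math.sqrt(2.0 * math.log(t) / counts[i][k]),
                )
            # Locally biased, noisy feedback (noise level 0.1 is an assumption)
            reward = true_means[arm] + biases[i][arm] + rng.gauss(0.0, 0.1)
            counts[i][arm] += 1
            old = local_mean[i][arm]
            local_mean[i][arm] = old + (reward - old) / counts[i][arm]
            # Inject the change in the local mean so that the network-wide
            # sum of shared estimates tracks the sum of local means.
            shared[i][arm] += local_mean[i][arm] - old

        # Pairwise gossip: one random edge averages its endpoints' estimates.
        a, b = rng.choice(edges)
        for k in range(n_arms):
            avg = 0.5 * (shared[a][k] + shared[b][k])
            shared[a][k] = shared[b][k] = avg

    return counts, local_mean, shared


# Hypothetical example: 4 agents on a ring, 3 arms, biases summing to zero per arm.
counts, local_mean, shared = gossip_ucb_sketch(
    true_means=[0.2, 0.5, 0.8],
    biases=[[0.1, -0.1, 0.0], [-0.1, 0.1, 0.0], [0.05, 0.0, -0.05], [-0.05, 0.0, 0.05]],
    edges=[(0, 1), (1, 2), (2, 3), (3, 0)],
    horizon=200,
)
```

In this sketch agents exchange only their running estimates with neighbors, never raw observations with a central entity, which mirrors the communication constraint stated in the abstract. The differentially private variant (Fed_UCB) is not modeled here.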