We consider the problem where N agents collaboratively interact with an instance of a stochastic K arm bandit problem for K N. The agents aim to simultaneously minimize the cumulative regret over all the agents for a total of T time steps, the number of communication rounds, and the number of bits in each communication round. We present Limited Communication Collaboration  Upper Confidence Bound (LCCUCB), a doublingepoch based algorithm where each agent communicates only after the end of the epoch and shares the index of the best arm it knows. With our algorithm, LCCUCB, each agent enjoys a regret of O√(K/N + N)T, communicates for O(log T) steps and broadcasts O(log K) bits in each communication step. We extend the work to sparse graphs with maximum degree KG and diameter D to propose LCCUCBGRAPH which enjoys a regret bound of O√(D(K/N + KG)DT). Finally, we empirically show that the LCCUCB and the LCCUCBGRAPH algorithms perform well and outperform strategies that communicate through a central node.
more »
« less
Graph Exploration by EnergySharing Mobile Agents
We consider the problem of collective exploration of a known n
node edgeweighted graph by k mobile agents that have limited energy but are
capable of energy transfers. The agents are initially placed at an arbitrary subset
of nodes in the graph, and each agent has an initial, possibly different, amount
of energy. The goal of the exploration problem is for every edge in the graph to
be traversed by at least one agent. The amount of energy used by an agent to
travel distance x is proportional to x. In our model, the agents can share energy
when colocated: when two agents meet, one can transfer part of its energy to
the other.
For an nnode path, we give an O(n+k) time algorithm that either nds an
exploration strategy, or reports that one does not exist. For an nnode tree with
l leaves, we give an O(n+lk^2) algorithm that finds an exploration strategy if one
exists. Finally, for the general graph case, we show that the problem of deciding
if exploration is possible by energysharing agents is NPhard, even for 3regular
graphs. In addition, we show that it is always possible to find an exploration
strategy if the total energy of the agents is at least twice the total weight of the
edges; moreover, this is asymptotically optimal.
more »
« less
 Award ID(s):
 1813940
 NSFPAR ID:
 10231280
 Date Published:
 Journal Name:
 28th International Colloquium on Structural Information and Communication Complexity (SIROCCO)
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


Tauman_Kalai, Yael (Ed.)Consider an agent exploring an unknown graph in search of some goal state. As it walks around the graph, it learns the nodes and their neighbors. The agent only knows where the goal state is when it reaches it. How do we reach this goal while moving only a small distance? This problem seems hopeless, even on trees of bounded degree, unless we give the agent some help. This setting with "help" often arises in exploring large search spaces (e.g., huge game trees) where we assume access to some score/quality function for each node, which we use to guide us towards the goal. In our case, we assume the help comes in the form of distance predictions: each node v provides a prediction f(v) of its distance to the goal vertex. Naturally if these predictions are correct, we can reach the goal along a shortest path. What if the predictions are unreliable and some of them are erroneous? Can we get an algorithm whose performance relates to the error of the predictions? In this work, we consider the problem on trees and give deterministic algorithms whose total movement cost is only O(OPT + Δ ⋅ ERR), where OPT is the distance from the start to the goal vertex, Δ the maximum degree, and the ERR is the total number of vertices whose predictions are erroneous. We show this guarantee is optimal. We then consider a "planning" version of the problem where the graph and predictions are known at the beginning, so the agent can use this global information to devise a search strategy of low cost. For this planning version, we go beyond trees and give an algorithms which gets good performance on (weighted) graphs with bounded doubling dimension.more » « less

We consider a multiagent multiarmed bandit setting in which n honest agents collaborate over a network to minimize regret but m malicious agents can disrupt learning arbitrarily. Assuming the network is the complete graph, existing algorithms incur O((m + K/n) łog (T) / Δ ) regret in this setting, where K is the number of arms and Δ is the arm gap. For m łl K, this improves over the singleagent baseline regret of O(Kłog(T)/Δ). In this work, we show the situation is murkier beyond the case of a complete graph. In particular, we prove that if the stateoftheart algorithm is used on the undirected line graph, honest agents can suffer (nearly) linear regret until time is doubly exponential in K and n . In light of this negative result, we propose a new algorithm for which the i th agent has regret O(( dmal (i) + K/n) łog(T)/Δ) on any connected and undirected graph, where dmal(i) is the number of i 's neighbors who are malicious. Thus, we generalize existing regret bounds beyond the complete graph (where dmal(i) = m), and show the effect of malicious agents is entirely local (in the sense that only the dmal (i) malicious agents directly connected to i affect its longterm regret).more » « less

Given a directed acyclic graph (DAG) G=(V,E), we say that G is (e,d)depthrobust (resp. (e,d)edgedepthrobust) if for any set S⊆V (resp. S⊆E) of at most S≤e nodes (resp. edges) the graph G−S contains a directed path of length d. While edgedepthrobust graphs are potentially easier to construct, many applications in cryptography require node depthrobust graphs with small indegree. We create a graph reduction that transforms an (e,d)edgedepthrobust graph with m edges into a (e/2,d)depthrobust graph with O(m) nodes and constant indegree. One immediate consequence of this result is the first construction of a provably (nloglognlogn,nlogn(logn)loglogn)depthrobust graph with constant indegree. Our reduction crucially relies on STrobust graphs, a new graph property we introduce which may be of independent interest. We say that a directed, acyclic graph with n inputs and n outputs is (k1,k2)STrobust if we can remove any k1 nodes and there exists a subgraph containing at least k2 inputs and k2 outputs such that each of the k2 inputs is connected to all of the k2 outputs. If the graph if (k1,n−k1)STrobust for all k1≤n we say that the graph is maximally STrobust. We show how to construct maximally STrobust graphs with constant indegree and O(n) nodes. Given a family M of STrobust graphs and an arbitrary (e,d)edgedepthrobust graph G we construct a new constantindegree graph Reduce(G,M) by replacing each node in G with an STrobust graph from M. We also show that STrobust graphs can be used to construct (tight) proofsofspace and (asymptotically) improved wideblock labeling functions.more » « less

null (Ed.)Motivated by applications in gerrymandering detection, we study a reconfiguration problem on connected partitions of a connected graph G. A partition of V(G) is connected if every part induces a connected subgraph. In many applications, it is desirable to obtain parts of roughly the same size, possibly with some slack s. A Balanced Connected kPartition with slack s, denoted (k, s)BCP, is a partition of V(G) into k nonempty subsets, of sizes n1,…,nk with ni−n/k≤s , each of which induces a connected subgraph (when s=0 , the k parts are perfectly balanced, and we call it kBCP for short). A recombination is an operation that takes a (k, s)BCP of a graph G and produces another by merging two adjacent subgraphs and repartitioning them. Given two kBCPs, A and B, of G and a slack s≥0 , we wish to determine whether there exists a sequence of recombinations that transform A into B via (k, s)BCPs. We obtain four results related to this problem: (1) When s is unbounded, the transformation is always possible using at most 6(k−1) recombinations. (2) If G is Hamiltonian, the transformation is possible using O(kn) recombinations for any s≥n/k , and (3) we provide negative instances for s≤n/(3k) . (4) We show that the problem is PSPACEcomplete when k∈O(nε) and s∈O(n1−ε) , for any constant 0<ε≤1 , even for restricted settings such as when G is an edgemaximal planar graph or when k≥3 and G is planar.more » « less