In traditional models of opinion dynamics, each agent in a network has an opinion and changes in opinions arise from pairwise (i.e., dyadic) interactions between agents. However, in many situations, groups of individuals possess a collective opinion that can differ from the opinions of their constituent individuals. In this paper, we study the effects of group opinions on opinion dynamics. We formulate a hypergraph model in which both individual agents and groups of three agents have opinions, and we examine how opinions evolve through both dyadic interactions and group memberships. We find for some parameter values that the presence of group opinions can lead to oscillatory and excitable opinion dynamics. In the oscillatory regime, the mean opinion of the agents in a network has self-sustained oscillations. In the excitable regime, finite-size effects create large but short-lived opinion swings (as in social fads). We develop a mean-field approximation of our model and obtain good agreement with direct numerical simulations. We also show—both numerically and via our mean-field description—that oscillatory dynamics occur only when the numbers of dyadic and polyadic interactions of the agents are not completely correlated. Our results illustrate how polyadic structures, such as groups of agents, can have important effects on collective opinion dynamics.
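The abstract above does not spell out the model's update rules, so the following Python sketch is only a minimal illustration of the ingredients it describes: individual opinions coupled both through dyadic edges and through the opinions of 3-agent groups (hyperedges). The update rules and the coupling strengths EPS_DYAD and EPS_GROUP are assumptions for illustration; this simple relaxation dynamic shows the bookkeeping, not the paper's oscillatory or excitable regimes.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M, STEPS = 100, 60, 5000
EPS_DYAD = 0.05    # assumed strength of dyadic pulls
EPS_GROUP = 0.10   # assumed strength of group-opinion pulls

# Random dyadic edges and random 3-agent groups (hyperedges),
# sampled without repeated endpoints/members.
edges = np.array([rng.choice(N, size=2, replace=False) for _ in range(300)])
groups = np.array([rng.choice(N, size=3, replace=False) for _ in range(M)])

x = rng.uniform(-1.0, 1.0, size=N)              # individual opinions
g = np.array([x[m].mean() for m in groups])     # initial group opinions

mean_opinion = []
for _ in range(STEPS):
    # Dyadic interaction: a random edge's endpoints move toward each other.
    i, j = edges[rng.integers(len(edges))]
    xi, xj = x[i], x[j]
    x[i] += EPS_DYAD * (xj - xi)
    x[j] += EPS_DYAD * (xi - xj)

    # Group interaction: members are pulled toward the group opinion,
    # and the group opinion relaxes toward its members' mean.
    k = rng.integers(M)
    m = groups[k]
    x[m] += EPS_GROUP * (g[k] - x[m])
    g[k] += EPS_GROUP * (x[m].mean() - g[k])

    mean_opinion.append(x.mean())

print(f"final mean opinion: {np.mean(mean_opinion[-100:]):.3f}")
```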
Collaboratively Learning the Best Option on Graphs, Using Bounded Local Memory
We consider multi-armed bandit problems in social groups wherein each individual has bounded memory and shares the common goal of learning the best arm/option. We say an individual learns the best option if eventually (as t → ∞) it pulls only the arm with the highest expected reward. While this goal is provably impossible for an isolated individual due to bounded memory, we show that, in social groups, this goal can be achieved easily with the aid of social persuasion (i.e., communication) as long as the communication networks/graphs satisfy some mild conditions. In this work, we model and analyze a type of learning dynamics that is commonly observed in social groups. Specifically, under the learning dynamics of interest, an individual sequentially decides which arm to pull next based not only on its private reward feedback but also on the suggestion provided by a randomly chosen neighbor. To deal with the interplay between the randomness in the rewards and in the social interaction, we employ the mean-field approximation method. Considering the possibility that the individuals in the networks may not be exchangeable when the communication networks are not cliques, we go beyond the classic mean-field techniques and apply a refined version of mean-field approximation. Notably, our results hold even if the communication graphs are highly sparse.
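The exact decision rule is specified in the paper rather than the abstract, so the sketch below is one plausible instantiation of the dynamics described: each agent's bounded memory is just its current arm, and after an unrewarded pull it adopts the arm suggested by a randomly chosen neighbor. The keep-on-success/imitate-on-failure rule, the ring graph, and the Bernoulli rewards are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

K = 3
MU = np.array([0.3, 0.5, 0.7])   # Bernoulli arm means; arm 2 is best
N, T = 50, 200000

# A sparse communication graph: a ring, where each agent has two neighbors.
neighbors = [((i - 1) % N, (i + 1) % N) for i in range(N)]

# Bounded local memory: each agent remembers only the arm it currently pulls.
arm = rng.integers(0, K, size=N)

for _ in range(T):
    i = rng.integers(N)                          # a random agent acts
    reward = rng.random() < MU[arm[i]]           # private reward feedback
    suggestion = arm[rng.choice(neighbors[i])]   # a random neighbor's arm
    if not reward:
        arm[i] = suggestion   # on failure, adopt the social suggestion

print(f"fraction of agents on the best arm: {np.mean(arm == MU.argmax()):.2f}")
```

Because the best arm fails least often, it tends to spread through the population under such a rule, which conveys the intuition for how a group can collectively learn the best option despite each member's bounded memory.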
- Award ID(s): 1810758
- PAR ID: 10161907
- Date Published:
- Journal Name: ACM SIGMETRICS
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
We study the multi-player stochastic multi-armed bandit (MAB) problem in an abruptly changing environment. We consider a collision model in which a player receives a reward at an arm only if it is the only player to select that arm. We design two novel algorithms, namely, the Round-Robin Sliding-Window Upper Confidence Bound# (RR-SW-UCB#) and the Sliding-Window Distributed Learning with Prioritization (SW-DLP). We rigorously analyze these algorithms and show that their expected cumulative group regret is upper bounded by sublinear functions of time, i.e., the time average of the regret asymptotically converges to zero. We complement our analytic results with numerical illustrations. A generic sliding-window UCB sketch appears after this list.
-
We study a hinted heterogeneous multi-agent multi-armed bandits problem (HMA2B), where agents can query low-cost observations (hints) in addition to pulling arms. In this framework, each of the M agents has a unique reward distribution over K arms, and in T rounds, they can observe the reward of the arm they pull only if no other agent pulls that arm. The goal is to maximize the total utility by querying the minimal necessary hints without pulling arms, achieving time-independent regret. We study HMA2B in both centralized and decentralized setups. Our main centralized algorithm, GP-HCLA, which is an extension of HCLA, uses a central decision-maker for arm-pulling and hint queries, achieving O(M^4 K) regret with O(M K log T) adaptive hints. In decentralized setups, we propose two algorithms, HD-ETC and EBHD-ETC, that allow agents to choose actions independently through collision-based communication and query hints uniformly until stopping, yielding O(M^3 K^2) regret with O(M^3 K log T) hints, where the former requires knowledge of the minimum gap and the latter does not. Finally, we establish lower bounds to prove the optimality of our results and verify them through numerical simulations.
-
In constrained reinforcement learning (RL), a learning agent seeks not only to optimize the overall reward but also to satisfy additional safety, diversity, or budget constraints. Consequently, existing constrained RL solutions require several new algorithmic ingredients that are notably different from standard RL. On the other hand, reward-free RL was developed independently in the unconstrained literature; it learns the transition dynamics without using reward information and is thus naturally capable of addressing RL with multiple objectives under common dynamics. This paper bridges reward-free RL and constrained RL. In particular, we propose a simple meta-algorithm such that, given any reward-free RL oracle, the approachability and constrained RL problems can be solved directly with negligible overhead in sample complexity. Using existing reward-free RL solvers, our framework provides sharp sample complexity results for constrained RL in the tabular MDP setting, matching the best existing results up to a factor of horizon dependence. Our framework also extends directly to tabular two-player Markov games and yields a new result for constrained RL with linear function approximation. A toy sketch of this explore-then-plan reduction appears after this list.
-
Mean-field games (MFGs) model the decision-making processes of large numbers of interacting agents in multi-agent systems. This paper studies mean-field games on graphs (G-MFGs). The equilibria of G-MFGs, namely mean-field equilibria (MFE), are challenging to solve because of their high-dimensional action space: each agent has to make decisions both at junction nodes and on edges. Furthermore, when the initial population state on the graph varies, the MFE must be recomputed, which can be computationally challenging and memory-demanding. To improve scalability and avoid repeatedly solving G-MFGs every time the initial state changes, this paper proposes physics-informed graph neural operators (PIGNO). The PIGNO uses a graph neural operator to generate population dynamics given an initial population distribution. To better train the neural operator, it leverages physics knowledge to propagate population state transitions on graphs. A learning algorithm is developed, and its performance is evaluated on autonomous driving games on road networks. The results demonstrate that the PIGNO is scalable and generalizes to unseen initial conditions.
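The first item above relies on sliding-window confidence bounds to track an abruptly changing environment. As a hedged illustration of that shared ingredient, here is a generic single-player sliding-window UCB sketch; the window length TAU, the confidence radius, and the mid-run change point are assumptions, and the paper's RR-SW-UCB# additionally handles multiple players via round-robin arm assignments.

```python
from collections import deque
import numpy as np

rng = np.random.default_rng(2)

K, T, TAU = 3, 5000, 500          # arms, horizon, window length (assumed)
MU = np.array([0.4, 0.6, 0.5])    # Bernoulli means; flipped mid-run below

history = deque(maxlen=TAU)        # only the last TAU rounds are remembered

for t in range(1, T + 1):
    counts, sums = np.zeros(K), np.zeros(K)
    for a, r in history:
        counts[a] += 1
        sums[a] += r
    if np.any(counts == 0):
        arm = int(np.argmin(counts))   # try arms absent from the window first
    else:
        # UCB index computed from windowed statistics only, so estimates
        # track the post-change means after roughly TAU rounds.
        ucb = sums / counts + np.sqrt(2 * np.log(min(t, TAU)) / counts)
        arm = int(np.argmax(ucb))
    reward = float(rng.random() < MU[arm])
    history.append((arm, reward))
    if t == T // 2:
        MU = MU[::-1].copy()           # abrupt change in the environment
```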
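For the constrained-RL item above, the meta-algorithm is oracle-based and more general than anything shown here; the toy below only illustrates the decoupling it exploits: explore once without any reward signal to estimate the dynamics, then plan for an arbitrary (reward, constraint) pair under the learned model. The tabular setup, uniform exploration, and brute-force constrained planner are stand-ins for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

S, A, H = 4, 2, 5                                  # tiny tabular MDP
P_true = rng.dirichlet(np.ones(S), size=(S, A))    # unknown dynamics

def reward_free_explore(n_episodes):
    """Reward-free oracle (stand-in): estimate dynamics by uniform play."""
    counts = np.ones((S, A, S))                    # +1 smoothing
    for _ in range(n_episodes):
        s = 0
        for _ in range(H):
            a = rng.integers(A)
            s2 = rng.choice(S, p=P_true[s, a])
            counts[s, a, s2] += 1
            s = s2
    return counts / counts.sum(axis=2, keepdims=True)

def evaluate(P_hat, pi, r):
    """Expected H-step total of r from state 0 under the learned model."""
    v = np.zeros(S)
    for _ in range(H):
        v = np.array([r[s, pi[s]] + P_hat[s, pi[s]] @ v for s in range(S)])
    return v[0]

def constrained_plan(P_hat, reward, cost, budget):
    """Brute force over deterministic policies: max reward s.t. cost <= budget."""
    best_val, best_pi = -np.inf, None
    for idx in range(A ** S):
        pi = [(idx // A ** s) % A for s in range(S)]
        if evaluate(P_hat, pi, cost) <= budget:
            val = evaluate(P_hat, pi, reward)
            if val > best_val:
                best_val, best_pi = val, pi
    return best_pi, best_val

P_hat = reward_free_explore(2000)      # exploration uses no reward signal
reward, cost = rng.random((S, A)), rng.random((S, A))
pi, val = constrained_plan(P_hat, reward, cost, budget=H * cost.mean())
print("policy:", pi, "value:", val)
```

The same learned model P_hat can be reused for any other (reward, cost, budget) triple without collecting new samples, which is the "negligible overhead" the abstract refers to.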