We introduce a sequential Bayesian binary hypothesis testing problem under social learning, termed selfish learning, where agents work to maximize their individual rewards. In particular, each agent receives a private signal and is aware of decisions made by earlier-acting agents. Besides inferring the underlying hypothesis, agents also decide whether to stop and declare, or pass the inference to the next agent. The employer rewards only correct responses, and the reward per worker decreases with the number of employees used for decision making. We characterize decision regions of agents in the infinite and finite horizon. In particular, we show that the decision boundaries in the infinite horizon are the solutions to a Markov Decision Process with discounted costs, and can be solved using value iteration. In the finite horizon, we show that team performance is enhanced upon appropriate incentivization when compared to sequential social learning.
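As a rough illustration of the value-iteration step mentioned in the abstract above, the following sketch solves a generic discounted-cost Markov Decision Process. The two-state "stop or pass" model, the arrays `P` and `C`, and the function name `value_iteration` are hypothetical and chosen only for illustration; they are not the model or notation of the paper.

```python
def value_iteration(P, C, gamma, tol=1e-10, max_iter=10_000):
    """Generic discounted-cost value iteration (illustrative sketch).

    P[a][s][t] : probability of moving from state s to state t under action a
    C[a][s]    : immediate cost of taking action a in state s
    gamma      : discount factor in [0, 1)
    Returns the approximate optimal cost-to-go V and a greedy policy.
    """
    n_actions, n_states = len(P), len(P[0])

    def q_values(V):
        # Q[s][a] = immediate cost + discounted expected cost-to-go
        return [[C[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
                 for a in range(n_actions)]
                for s in range(n_states)]

    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = [min(q) for q in q_values(V)]  # Bellman update (minimize cost)
        gap = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if gap < tol:
            break

    Q = q_values(V)
    policy = [min(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)]
    return V, policy


# Hypothetical toy model: state 0 = "undecided", state 1 = "done" (absorbing).
# Action 0 = stop (cost 1.0, moves to "done"); action 1 = pass (cost 0.2, stays).
P = [
    [[0.0, 1.0], [0.0, 1.0]],  # action 0: always end in state 1
    [[1.0, 0.0], [0.0, 1.0]],  # action 1: remain in the current state
]
C = [
    [1.0, 0.0],  # cost of action 0 in states 0 and 1
    [0.2, 0.0],  # cost of action 1 in states 0 and 1
]

V_low, policy_low = value_iteration(P, C, gamma=0.5)
V_high, policy_high = value_iteration(P, C, gamma=0.9)
```

In this toy model, passing forever costs 0.2 / (1 - gamma) in total, so passing is preferred in state 0 when gamma = 0.5 (total 0.4, below the stopping cost of 1.0) and stopping is preferred when gamma = 0.9 (total 2.0). The switch between the two actions is a simple analogue of a decision boundary.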
Probability Reweighting in Social Learning: Optimality and Suboptimality
This work explores sequential Bayesian binary hypothesis testing in the social learning setup under expertise diversity. We consider a two-agent (say, advisor-learner) sequential binary hypothesis test where the learner infers the hypothesis based on the decision of the advisor, a prior private signal, and individual belief. In addition, the agents have varying expertise, in terms of the noise variance in the private signal. Under such a setting, we first investigate the behavior of optimal agent beliefs and observe that the nature of optimal agents could be inverted depending on expertise levels. We also discuss suboptimality of the Prelec reweighting function under diverse expertise. Next, we consider an advisor selection problem wherein the belief of the learner is fixed and the advisor is to be chosen for a given prior. We characterize the decision region for choosing such an advisor and argue that a learner with beliefs varying from the true prior often ends up selecting a suboptimal advisor.
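For readers unfamiliar with the Prelec reweighting function named in the abstract, a minimal sketch follows. It uses the commonly cited two-parameter form w(p) = exp(-beta * (-ln p)^alpha). Whether the paper uses exactly this parameterization is an assumption, and the function name `prelec` and the default parameter values are illustrative only.

```python
import math


def prelec(p, alpha=0.65, beta=1.0):
    """Prelec probability weighting: w(p) = exp(-beta * (-ln p) ** alpha).

    alpha and beta are assumed shape parameters; alpha = beta = 1 gives w(p) = p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 0.0:
        return 0.0  # limit of the formula as p -> 0
    if p == 1.0:
        return 1.0  # -ln(1) = 0, so the formula evaluates to exp(0) = 1
    return math.exp(-beta * (-math.log(p)) ** alpha)


# Reweighted versions of a small, a moderate, and a large prior probability.
small, moderate, large = prelec(0.01), prelec(0.5), prelec(0.99)
```

With alpha below 1 and beta equal to 1, this form inflates small probabilities, deflates large ones, and leaves p = 1/e unchanged, which is the kind of distorted prior belief the abstract refers to.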
 Award ID(s):
 1717530
 Publication Date:
 NSFPAR ID:
 10059999
 Journal Name:
 Proceedings of the 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
 Page Range or eLocationID:
 6966-6970
 Sponsoring Org:
 National Science Foundation
More Like this


We consider sequential stochastic decision problems in which, at each time instant, an agent optimizes its local utility by solving a stochastic program and, subsequently, announces its decision to the world. Given this action, we study the problem of estimating the agent’s private belief (i.e., its posterior distribution over the set of states of nature based on its private observations). We demonstrate that it is possible to determine the set of private beliefs that are consistent with public data by leveraging techniques from inverse optimization. We further give a number of useful characterizations of this set; for example, tight bounds by solving a set of linear programs (under concave utility). As an illustrative example, we consider estimating the private belief of an investor in regime-switching portfolio allocation. Finally, our theoretical results are illustrated and evaluated in numerical simulations.

We study the problem of optimal information sharing in the context of a service system. In particular, we consider an unobservable single server queue offering a service at a fixed price to a Poisson arrival of delay-sensitive customers. The service provider can observe the queue, and may share information about the state of the queue with each arriving customer. The customers are Bayesian and strategic, and incorporate any information provided by the service provider into their prior beliefs about the queue length before making the decision whether to join the queue or leave without obtaining service. We pose the following question: which signaling mechanism and what price should the service provider select to maximize her revenue? We formulate this problem as an instance of Bayesian persuasion in dynamic settings. The underlying dynamics make the problem more difficult because, in contrast to static settings, the signaling mechanism adopted by the service provider affects the customers' prior beliefs about the queue (given by the steady state distribution of the queue length in equilibrium). The core contribution of this work is in characterizing the structure of the optimal signaling mechanism. We summarize our main results as follows. (1) Structural characterization: Using a revelation-principle…

We consider the problem of decentralized sequential active hypothesis testing (DSAHT), where two transmitting agents, each possessing a private message, are actively helping a third agent–and each other–to learn the message pair over a discrete memoryless multiple access channel (DM-MAC). The third agent (receiver) observes the noisy channel output, which is also available to the transmitting agents via noiseless feedback. We formulate this problem as a decentralized dynamic team, show that optimal transmission policies have a time-invariant domain, and characterize the solution through a dynamic program. Several alternative formulations are discussed involving time-homogeneous cost functions and/or variable-length codes, resulting in solutions described through fixed-point, Bellman-type equations. Subsequently, we make connections with the problem of simplifying the multi-letter capacity expressions for the noiseless feedback capacity of the DM-MAC. We show that restricting attention to distributions induced by optimal transmission schemes for the DSAHT problem, without loss of optimality, transforms the capacity expression, so that it can be thought of as the average reward received by an appropriately defined stochastic dynamical system with time-invariant state space.

We consider information design in spatial resource competition, motivated by ride-sharing platforms sharing information with drivers about rider demand. Each of N co-located agents (drivers) decides whether to move to another location with an uncertain and possibly higher resource level (rider demand), where the utility for moving increases in the resource level and decreases in the number of other agents that move. A principal who can observe the resource level wishes to share this information in a way that ensures a welfare-maximizing number of agents move. Analyzing the principal’s information design problem using the Bayesian persuasion framework, we study both private signaling mechanisms, where the principal sends personalized signals to each agent, and public signaling mechanisms, where the principal sends the same information to all agents. We show: 1) For private signaling, computing the optimal mechanism using the standard approach leads to a linear program with 2^N variables, rendering the computation challenging. We instead describe a computationally efficient two-step approach to finding the optimal private signaling mechanism. First, we perform a change of variables to solve a linear program with O(N^2) variables that provides the marginal probabilities of recommending each agent move. Second, we describe an efficient sampling…