
Title: Probability Reweighting in Social Learning: Optimality and Suboptimality
This work explores sequential Bayesian binary hypothesis testing in the social learning setup under expertise diversity. We consider a two-agent (say advisor-learner) sequential binary hypothesis test where the learner infers the hypothesis based on the decision of the advisor, a prior private signal, and individual belief. In addition, the agents have varying expertise, in terms of the noise variance in the private signal. Under such a setting, we first investigate the behavior of optimal agent beliefs and observe that the nature of optimal agents could be inverted depending on expertise levels. We also discuss suboptimality of the Prelec reweighting function under diverse expertise. Next, we consider an advisor selection problem wherein the belief of the learner is fixed and the advisor is to be chosen for a given prior. We characterize the decision region for choosing such an advisor and argue that a learner with beliefs varying from the true prior often ends up selecting a suboptimal advisor.
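The abstract names the Prelec reweighting function without defining it. As a minimal sketch, assuming the standard two-parameter form w(p) = exp(-beta * (-ln p)^alpha) from the prospect-theory literature, it can be written as follows. The function name, parameter names, and default values are illustrative assumptions and are not taken from the paper.

```python
import math


def prelec_weight(p: float, alpha: float = 0.65, beta: float = 1.0) -> float:
    """Prelec reweighting w(p) = exp(-beta * (-ln p)**alpha), for p in [0, 1].

    alpha and beta are hypothetical defaults chosen only for illustration.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 0.0:
        return 0.0  # limit of w(p) as p -> 0
    if p == 1.0:
        return 1.0  # w(1) = exp(0) = 1
    return math.exp(-beta * (-math.log(p)) ** alpha)


if __name__ == "__main__":
    # With alpha < 1 and beta = 1, small probabilities are inflated and
    # large ones deflated; alpha = beta = 1 recovers the identity map.
    for p in (0.01, 0.25, 0.5, 0.75, 0.99):
        print(p, round(prelec_weight(p), 4), round(prelec_weight(p, 1.0, 1.0), 4))
```

Under this form, an agent whose belief is a reweighted version of the true prior would use `prelec_weight(prior)` in place of `prior`; how the paper parameterizes the agents' beliefs is not stated in the abstract.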
Award ID(s):
1717530
NSF-PAR ID:
10059999
Journal Name:
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Page Range or eLocation-ID:
6966-6970
Sponsoring Org:
National Science Foundation
More Like this
  1. We introduce a sequential Bayesian binary hypothesis testing problem under social learning, termed selfish learning, where agents work to maximize their individual rewards. In particular, each agent receives a private signal and is aware of decisions made by earlier-acting agents. Besides inferring the underlying hypothesis, agents also decide whether to stop and declare, or pass the inference to the next agent. The employer rewards only correct responses, and the reward per worker decreases with the number of employees used for decision making. We characterize decision regions of agents in the infinite and finite horizon. In particular, we show that the decision boundaries in the infinite horizon are the solutions to a Markov Decision Process with discounted costs, and can be solved using value iteration. In the finite horizon, we show that team performance is enhanced upon appropriate incentivization when compared to sequential social learning.
  2. We consider sequential stochastic decision problems in which, at each time instant, an agent optimizes its local utility by solving a stochastic program and, subsequently, announces its decision to the world. Given this action, we study the problem of estimating the agent’s private belief (i.e., its posterior distribution over the set of states of nature based on its private observations). We demonstrate that it is possible to determine the set of private beliefs that are consistent with public data by leveraging techniques from inverse optimization. We further give a number of useful characterizations of this set; for example, tight bounds by solving a set of linear programs (under concave utility). As an illustrative example, we consider estimating the private belief of an investor in regime-switching portfolio allocation. Finally, our theoretical results are illustrated and evaluated in numerical simulations.
  3. We study the problem of optimal information sharing in the context of a service system. In particular, we consider an unobservable single server queue offering a service at a fixed price to a Poisson arrival of delay-sensitive customers. The service provider can observe the queue, and may share information about the state of the queue with each arriving customer. The customers are Bayesian and strategic, and incorporate any information provided by the service provider into their prior beliefs about the queue length before making the decision whether to join the queue or leave without obtaining service. We pose the following question: which signaling mechanism and what price should the service provider select to maximize her revenue? We formulate this problem as an instance of Bayesian persuasion in dynamic settings. The underlying dynamics make the problem more difficult because, in contrast to static settings, the signaling mechanism adopted by the service provider affects the customers' prior beliefs about the queue (given by the steady state distribution of the queue length in equilibrium). The core contribution of this work is in characterizing the structure of the optimal signaling mechanism. We summarize our main results as follows. (1) Structural characterization: Using a revelation-principle style argument, we find that it suffices to consider signaling mechanisms where the service provider sends a binary signal of "join" or "leave", and under which the equilibrium strategy of a customer is to follow the service provider's recommended action. (2) Optimality of threshold policies: For a given fixed price for service, we use the structural characterization to show that the optimal signaling mechanism can be obtained as a solution to a linear program with a countable number of variables and constraints. Under some mild technical conditions on the waiting costs, we establish that there exists an optimal signaling mechanism with a threshold structure, where the service provider sends the "join" signal if the queue length is below a threshold, and "leave" otherwise. (In addition, at the threshold, the service provider randomizes.) For the special case of linear waiting costs, we derive an analytical expression for the optimal threshold in terms of the two branches of the Lambert-W function. (3) Revenue comparison: Finally, we show that with the optimal choice of the fixed price and using the corresponding optimal signaling mechanism, the service provider can achieve the same revenue as with the optimal state-dependent pricing mechanism in a fully-observable queue. This implies that in settings where state-dependent pricing is not feasible, the service provider can effectively use optimal signaling (with the optimal fixed price) to achieve the same revenue.
  4. We consider the problem of decentralized sequential active hypothesis testing (DSAHT), where two transmitting agents, each possessing a private message, are actively helping a third agent, and each other, to learn the message pair over a discrete memoryless multiple access channel (DM-MAC). The third agent (receiver) observes the noisy channel output, which is also available to the transmitting agents via noiseless feedback. We formulate this problem as a decentralized dynamic team, show that optimal transmission policies have a time-invariant domain, and characterize the solution through a dynamic program. Several alternative formulations are discussed involving time-homogeneous cost functions and/or variable-length codes, resulting in solutions described through fixed-point, Bellman-type equations. Subsequently, we make connections with the problem of simplifying the multi-letter capacity expressions for the noiseless feedback capacity of the DM-MAC. We show that restricting attention to distributions induced by optimal transmission schemes for the DSAHT problem, without loss of optimality, transforms the capacity expression, so that it can be thought of as the average reward received by an appropriately defined stochastic dynamical system with time-invariant state space.
  5. We consider information design in spatial resource competition, motivated by ride sharing platforms sharing information with drivers about rider demand. Each of N co-located agents (drivers) decides whether to move to another location with an uncertain and possibly higher resource level (rider demand), where the utility for moving increases in the resource level and decreases in the number of other agents that move. A principal who can observe the resource level wishes to share this information in a way that ensures a welfare-maximizing number of agents move. Analyzing the principal’s information design problem using the Bayesian persuasion framework, we study both private signaling mechanisms, where the principal sends personalized signals to each agent, and public signaling mechanisms, where the principal sends the same information to all agents. We show: 1) For private signaling, computing the optimal mechanism using the standard approach leads to a linear program with 2^N variables, rendering the computation challenging. We instead describe a computationally efficient two-step approach to finding the optimal private signaling mechanism. First, we perform a change of variables to solve a linear program with O(N^2) variables that provides the marginal probabilities of recommending each agent move. Second, we describe an efficient sampling procedure over sets of agents consistent with these optimal marginal probabilities; the optimal private mechanism then asks the sampled set of agents to move and the rest to stay. 2) For public signaling, we first show the welfare-maximizing equilibrium given any common belief has a threshold structure. Using this, we show that the optimal public mechanism with respect to the sender-preferred equilibrium can be computed in polynomial time. 3) We support our analytical results with numerical computations that show the optimal private and public signaling mechanisms achieve substantially higher social welfare when compared with no-information and full-information benchmarks.