Observational learning models seek to understand how distributed agents learn from observing the actions of others. In the basic model, agents choose between two alternatives whose underlying value is the same for every agent. Agents do not know this value; they only observe a noisy signal of it and make their decision based on this signal and on observations of other agents' actions. Here, we instead consider a scenario in which the choices faced by an agent exhibit a negative externality, so that the value of a choice may decrease depending on the history of other agents selecting that choice. We study the learning behavior of Bayesian agents with such an externality and show that it can lead to very different outcomes compared to models without such an externality.
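The dynamic described above can be made concrete with a toy simulation. The sketch below is not the paper's model: it assumes binary private signals of known accuracy q, a linear congestion cost c per earlier adopter, a common value v for the correct choice, and myopic agents whose decision rule observers can invert exactly; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(T=200, q=0.7, c=0.02, v=1.0):
    """Stylized sequential choice with a congestion (negative) externality.

    theta is the identity of the better alternative; each agent sees a
    private signal that matches theta with probability q, plus the full
    history of earlier choices. Choosing alternative x pays v if x == theta,
    minus a congestion cost c * (number of earlier agents who chose x).
    """
    theta = rng.integers(2)              # true best alternative (0 or 1)
    step = np.log(q / (1 - q))           # log-likelihood step of one signal
    L = 0.0                              # public log-likelihood ratio for theta = 1
    counts = np.zeros(2)                 # how many earlier agents chose each option
    choices = []

    def best_action(llr):
        p1 = 1.0 / (1.0 + np.exp(-llr))  # posterior P(theta = 1)
        u1 = p1 * v - c * counts[1]
        u0 = (1 - p1) * v - c * counts[0]
        return int(u1 > u0)

    for _ in range(T):
        s = theta if rng.random() < q else 1 - theta      # private signal
        action = best_action(L + (step if s == 1 else -step))

        # Observers invert the decision rule: the action is informative only
        # if the two possible signals would have led to different actions.
        if best_action(L + step) != best_action(L - step):
            L += step if action == 1 else -step

        counts[action] += 1
        choices.append(action)
    return theta, choices

theta, choices = simulate()
print("true best option:", theta, "| share choosing it:", np.mean(np.array(choices) == theta))
```

In this toy version, the congestion term eventually pushes agents off the crowded alternative, so actions can keep revealing private signals rather than locking permanently into a cascade, one illustration of how outcomes can differ from the standard externality-free model.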
Using Response Times to Infer Others’ Private Information: An Application to Information Cascades
The standard assumption in social learning environments is that agents learn from others through choice outcomes. We argue that in many settings, agents can also infer information from others' response times (RTs), which can increase efficiency. To investigate this, we conduct a standard information cascade experiment and find that RTs do contain information that is not revealed by choice outcomes alone. When RTs are observable, subjects extract this private information and are more likely to break from incorrect cascades. Our results suggest that in environments where RTs are publicly available, the information structure may be richer than previously thought.
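A stylized numerical example, not the paper's model or data, of why observable RTs can enrich the information structure: assume the predecessor's private signal is either "strong" or "weak", that strong signals are both more accurate and (by assumption) faster, and that the predecessor simply follows her signal. An observer who sees the choice together with a coarse fast/slow RT then updates more sharply after fast choices.

```python
import numpy as np

# Toy information structure (all numbers are assumptions for illustration):
# the predecessor's private signal is 'strong' or 'weak'; strong signals
# match the true state more often and -- the key assumption -- tend to
# produce faster responses.
p_strong = 0.5
p_correct = {"strong": 0.9, "weak": 0.6}     # P(signal points to truth | type)
p_fast = {"strong": 0.8, "weak": 0.3}        # P(fast response | type)

def posterior_A(choice, rt_bucket, prior_A=0.5):
    """P(state = A) after seeing one predecessor's choice and coarse RT."""
    def likelihood(state):
        total = 0.0
        for sig_type, p_t in [("strong", p_strong), ("weak", 1 - p_strong)]:
            # the predecessor follows her signal, so P(choice | state, type)
            # is the probability the signal pointed at `choice`
            p_choice = p_correct[sig_type] if choice == state else 1 - p_correct[sig_type]
            p_rt = p_fast[sig_type] if rt_bucket == "fast" else 1 - p_fast[sig_type]
            total += p_t * p_choice * p_rt
        return total
    num = prior_A * likelihood("A")
    return num / (num + (1 - prior_A) * likelihood("B"))

print("choice A, fast RT:", round(posterior_A("A", "fast"), 3))
print("choice A, slow RT:", round(posterior_A("A", "slow"), 3))
# A fast choice of A is more diagnostic than a slow one, so observable RTs
# add information beyond the choice itself.
```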
- Award ID(s): 1749824
- PAR ID: 10415778
- Date Published:
- Journal Name: Management Science
- Volume: 68
- Issue: 4
- ISSN: 2745-9934
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Revealed preference is the dominant approach for inferring preferences, but it is limited in that it relies solely on discrete choice data. When a person chooses one alternative over another, we cannot infer the strength of their preference or predict how likely they will be to make the same choice again. However, the choice process also produces response times (RTs), which are continuous and easily observable. It has been shown that RTs often decrease with strength-of-preference. This is a basic property of sequential sampling models such as the drift diffusion model. What remains unclear is whether this relationship is sufficiently strong, relative to the other factors that affect RTs, to allow us to reliably infer strength-of-preference across individuals. Using several experiments, we show that even when every subject chooses the same alternative, we can still rank them based on their RTs and predict their behavior on other choice problems. We can also use RTs to predict whether a subject will repeat or reverse their decision when presented with the same choice problem a second time. Finally, as a proof-of-concept, we demonstrate that it is also possible to recover individual preference parameters from RTs alone. These results demonstrate that it is indeed possible to use RTs to infer preferences. (An illustrative drift-diffusion sketch appears after this list.)
- Standard methods for synthesis of control policies in Markov decision processes with unknown transition probabilities largely rely on a combination of exploration and exploitation. While these methods often offer theoretical guarantees on system performance, the number of time steps and samples needed to initially explore the environment before synthesizing a well-performing control policy is impractically large. This paper partially alleviates such a burden by incorporating a priori existing knowledge into learning, when such knowledge is available. Based on prior information about bounds on the differences between the transition probabilities at different states, we propose a learning approach where the transition probabilities at a given state are not only learned from outcomes of repeatedly performing a certain action at that state, but also from outcomes of performing actions at states that are known to have similar transition probabilities. Since directly obtained information is more reliable for determining transition probabilities than second-hand information, i.e., information obtained from similar but potentially slightly different states, samples obtained indirectly are weighted with respect to the known bounds on the differences of transition probabilities. While the proposed strategy can naturally lead to errors in learned transition probabilities, we show that, by proper choice of the weights, such errors can be reduced, and the number of steps needed to form a near-optimal control policy in the Bayesian sense can be significantly decreased. (A toy weighted-estimation sketch appears after this list.)
- We present an approach to analyse learning outcomes in a broad class of misspecified environments, spanning both single-agent and social learning. We introduce a novel "prediction accuracy" order over subjective models and observe that this makes it possible to partially restore standard martingale convergence arguments that apply under correctly specified learning. Based on this, we derive general conditions to determine when beliefs in a given environment converge to some long-run belief either locally or globally (i.e. from some or all initial beliefs). We show that these conditions can be applied, first, to unify and generalize various convergence results in previously studied settings. Second, they enable us to analyse environments where learning is "slow", such as costly information acquisition and sequential social learning. In such environments, we illustrate that even if agents learn the truth when they are correctly specified, vanishingly small amounts of misspecification can generate extreme failures of learning. (A toy example of misspecified Bayesian updating appears after this list.)
- The measurement of individual differences in specific cognitive functions has been an important area of study for decades. Often the goal of such studies is to determine whether there are cognitive deficits or enhancements associated with, for example, a specific population, psychological disorder, health status, or age group. The inherent difficulty, however, is that most cognitive functions are not directly observable, so researchers rely on indirect measures to infer an individual's functioning. One of the most common approaches is to use a task that is designed to tap into a specific function and to use behavioral measures, such as reaction times (RTs), to assess performance on that task. Although this approach is widespread, it unfortunately is subject to a problem of reverse inference: differences in a given cognitive function can be manifest as differences in RTs, but that does not guarantee that differences in RTs imply differences in that cognitive function. We illustrate this inference problem with data from a study on aging and lexical processing, highlighting how RTs can lead to erroneous conclusions about processing. We then discuss how employing choice-RT models to analyze data can improve inference and highlight practical approaches to improving the models and incorporating them into one's analysis pipeline. (A sketch of the reverse-inference problem with simulated RTs appears after this list.)
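For the first related record above (response times and strength-of-preference), the toy simulation below treats drift magnitude as strength-of-preference in a drift diffusion model and shows mean RT falling as preferences strengthen. The boundary, noise level, and drift values are illustrative assumptions, not estimates from that study.

```python
import numpy as np

rng = np.random.default_rng(1)

def ddm_trial(drift, boundary=1.0, noise=1.0, dt=0.002, max_t=10.0):
    """Simulate one drift-diffusion trial; return (choice, response time)."""
    x, t = 0.0, 0.0
    while abs(x) < boundary and t < max_t:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return (1 if x > 0 else 0), t

for strength in [0.2, 0.8, 1.6, 3.2]:      # drift stands in for strength-of-preference
    rts = [ddm_trial(strength)[1] for _ in range(400)]
    print(f"drift {strength:3.1f}: mean RT = {np.mean(rts):.2f} s")
```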
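For the second related record (learning transition probabilities with similarity bounds), here is one simple way to blend direct and second-hand samples, down-weighting samples from states with larger known difference bounds. The specific weighting rule and the toy numbers are assumptions for illustration, not the scheme derived in that paper.

```python
import numpy as np

def weighted_estimate(counts, bounds, own_state):
    """Blend transition-frequency estimates across similar states.

    counts[s] is a vector of observed next-state counts for a fixed
    state-action pair at state s; bounds[s] is a known upper bound on the
    difference between the dynamics at s and at own_state. Samples from
    other states are down-weighted as the bound grows -- one simple choice
    of weights, not the one derived in the paper.
    """
    est = np.zeros_like(counts[own_state], dtype=float)
    total_w = 0.0
    for s, c in counts.items():
        n = c.sum()
        if n == 0:
            continue
        w = n / (1.0 + n * bounds[s] ** 2)   # trade off sample size vs. bias bound
        est += w * (c / n)
        total_w += w
    return est / total_w

# Toy example: three states with similar (but not identical) dynamics.
counts = {0: np.array([8, 2, 0]),    # own state: few direct samples
          1: np.array([70, 25, 5]),  # similar state: many samples, small bound
          2: np.array([10, 10, 80])} # dissimilar state: large bound
bounds = {0: 0.0, 1: 0.05, 2: 0.6}
print(weighted_estimate(counts, bounds, own_state=0))
```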
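For the third related record (misspecified learning), a textbook-style toy run of misspecified Bayesian updating shows beliefs settling on the subjective model closest to the truth in KL divergence, even though no candidate model is correct. This is only a minimal illustration of the phenomenon, not the paper's framework; the biases and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

true_p = 0.7                    # actual probability of heads
models = np.array([0.4, 0.55])  # the agent's (misspecified) candidate models
log_post = np.log(np.array([0.5, 0.5]))   # uniform prior over models

for _ in range(5000):
    heads = rng.random() < true_p
    log_post += np.log(models if heads else 1 - models)

post = np.exp(log_post - log_post.max())
post /= post.sum()
print("long-run belief over models:", dict(zip(models, post.round(3))))
# Belief concentrates on 0.55, the candidate closest to the truth in KL
# divergence, even though neither candidate model is correct.
```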
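For the fourth related record (reverse inference from RTs), the sketch below illustrates the problem: two hypothetical groups can produce similarly slowed mean RTs, one through slower evidence accumulation and one through a longer non-decision component alone. All parameter values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def mean_rt(drift, ndt, boundary=1.0, dt=0.002, n=1000):
    """Mean RT of a simple drift-diffusion process plus non-decision time."""
    rts = []
    for _ in range(n):
        x, t = 0.0, 0.0
        while abs(x) < boundary:
            x += drift * dt + np.sqrt(dt) * rng.standard_normal()
            t += dt
        rts.append(t + ndt)
    return np.mean(rts)

print("baseline group            :", round(mean_rt(drift=1.5, ndt=0.30), 2))
print("slower evidence uptake    :", round(mean_rt(drift=1.0, ndt=0.30), 2))
print("slower motor/encoding only:", round(mean_rt(drift=1.5, ndt=0.46), 2))
# The last two groups can produce similar mean RTs for very different reasons,
# which is exactly the reverse-inference problem raised above.
```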