Can deep convolutional neural networks (CNNs) for image classification be interpreted as utility maximizers with information costs? By performing set-valued system identification for Bayesian decision systems, we demonstrate that deep CNNs behave equivalently (in terms of necessary and sufficient conditions) to rationally inattentive Bayesian utility maximizers, a generative model used extensively in economics for human decision-making. Our claim is based on approximately 500 numerical experiments on 5 widely used neural network architectures. The parameters of the resulting interpretable model are computed efficiently via convex feasibility algorithms. As a practical application, we also illustrate how the reconstructed interpretable model can predict the classification performance of deep CNNs with high accuracy. The theoretical foundation of our approach lies in Bayesian revealed preference studied in microeconomics. All our results are on GitHub and completely reproducible.
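To make the convex-feasibility idea concrete, here is a minimal sketch (not the authors' implementation) of a Bayesian revealed preference check: given empirical posteriors conditioned on each chosen action, it uses a linear program to test whether some utility matrix makes every chosen action optimal under its own posterior (a NIAS-style condition). The function name, variable names, and toy data are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def nias_feasible(posteriors, eps=1e-9):
    """Check a NIAS-style feasibility condition from Bayesian revealed preference.

    posteriors[a] is the empirical posterior p(state | action a) observed when
    the decision maker chose action a.  We search for a utility matrix
    u[state, action] such that each chosen action is optimal under its own
    posterior:  sum_x p(x|a) * (u[x,a] - u[x,b]) >= eps  for all b != a.
    This is a linear feasibility problem, solved here with a zero objective.
    """
    A, X = len(posteriors), len(posteriors[0])     # number of actions, states
    n = X * A                                      # decision variables u[x,a]
    idx = lambda x, a: x * A + a                   # flatten (state, action) index

    rows, rhs = [], []
    for a in range(A):
        for b in range(A):
            if b == a:
                continue
            row = np.zeros(n)
            for x in range(X):
                row[idx(x, a)] -= posteriors[a][x]   # -p(x|a) * u[x,a]
                row[idx(x, b)] += posteriors[a][x]   # +p(x|a) * u[x,b]
            rows.append(row)
            rhs.append(-eps)                          # enforce the >= eps margin
    res = linprog(c=np.zeros(n), A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=[(-1.0, 1.0)] * n, method="highs")
    return res.success

# Toy example: two actions, two states, posteriors concentrated on the "right" state.
print(nias_feasible([[0.9, 0.1], [0.2, 0.8]]))   # expected: True (rationalizable)
```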
Rationally Inattentive Inverse Reinforcement Learning Explains YouTube Commenting Behavior
We consider a novel application of inverse reinforcement learning with behavioral economics constraints to model, learn, and predict the commenting behavior of YouTube viewers. Each group of users is modeled as a rationally inattentive Bayesian agent that solves a contextual bandit problem. Our methodology integrates three key components. First, to identify distinct commenting patterns, we use deep embedded clustering to estimate framing information (essential extrinsic features) that clusters users into distinct groups. Second, we present an inverse reinforcement learning algorithm that uses Bayesian revealed preferences to test for rationality: does there exist a utility function that rationalizes the given data, and if yes, can it be used to predict commenting behavior? Finally, we impose behavioral economics constraints stemming from rational inattention to characterize the attention span of groups of users. The test imposes a Rényi mutual information cost constraint that impacts how the agent can select attention strategies to maximize its expected utility. After a careful analysis of a massive YouTube dataset, our surprising result is that in most YouTube user groups, the commenting behavior is consistent with optimizing a Bayesian utility with rationally inattentive constraints. The paper also highlights how the rational inattention model can accurately predict commenting behavior. The massive YouTube dataset and analysis used in this paper are available on GitHub and completely reproducible.
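As a rough illustration of the information cost in the rational inattention constraint, the sketch below computes a plug-in Rényi mutual information (using one common convention: the Rényi divergence of order alpha between the joint distribution and the product of its marginals) and plugs it into a utility-minus-cost trade-off. The joint distribution, utility matrix, and weight `lam` are made-up illustrative values, not quantities from the paper.

```python
import numpy as np

def renyi_mutual_information(joint, alpha=0.5):
    """Plug-in Renyi mutual information I_alpha(X;A) for a joint pmf p(x,a).

    Convention assumed here: the Renyi divergence of order alpha between the
    joint distribution and the product of its marginals.  In the rational
    inattention model, a cost of this form limits how sharply the agent's
    attention strategy can depend on the underlying state.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()                 # normalize to a pmf
    px = joint.sum(axis=1, keepdims=True)       # marginal over states
    pa = joint.sum(axis=0, keepdims=True)       # marginal over actions
    prod = px * pa
    mask = joint > 0
    s = np.sum(joint[mask] ** alpha * prod[mask] ** (1.0 - alpha))
    return np.log(s) / (alpha - 1.0)

# Toy rationally inattentive trade-off: expected utility minus lam * information cost.
joint = np.array([[0.40, 0.10],
                  [0.05, 0.45]])               # p(state, comment-action)
utility = np.array([[1.0, 0.0],
                    [0.0, 1.0]])               # reward for matching the state
lam = 0.3
value = np.sum(joint * utility) - lam * renyi_mutual_information(joint)
print(round(value, 4))
```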
- Award ID(s): 1714180
- PAR ID: 10227155
- Date Published:
- Journal Name: Journal of Machine Learning Research
- Volume: 21
- ISSN: 1532-4435
- Page Range / eLocation ID: 1-39
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
This paper discusses the theory and algorithms for interacting large language model agents (LLMAs) using methods from statistical signal processing and microeconomics. While both fields are mature, their application to decision-making involving interacting LLMAs remains unexplored. Motivated by Bayesian sentiment analysis on online platforms, we construct interpretable models and stochastic control algorithms that enable LLMAs to interact and perform Bayesian inference. Because interacting LLMAs learn from both prior decisions and external inputs, they can exhibit bias and herding behavior. Thus, developing interpretable models and stochastic control algorithms is essential to understand and mitigate these behaviors. This paper has three main results. First, we show using Bayesian revealed preferences from microeconomics that an individual LLMA satisfies the necessary and sufficient conditions for rationally inattentive (bounded rationality) Bayesian utility maximization and, given an observation, the LLMA chooses an action that maximizes a regularized utility. Second, we utilize Bayesian social learning to construct interpretable models for LLMAs that interact sequentially with each other and the environment while performing Bayesian inference. Our proposed models capture the herding behavior exhibited by interacting LLMAs. Third, we propose a stochastic control framework to delay herding and improve state estimation accuracy under two settings: 1) centrally controlled LLMAs and 2) autonomous LLMAs with incentives. Throughout the paper, we numerically demonstrate the effectiveness of our methods on real datasets for hate speech classification and product quality assessment, using open-source models like LLaMA and Mistral and closed-source models like ChatGPT. The main takeaway of this paper, based on substantial empirical analysis and mathematical formalism, is that LLMAs act as rationally bounded Bayesian agents that exhibit social learning when interacting. Traditionally, such models are used in economics to study interacting human decision-makers.
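A minimal sketch of the Bayesian social learning mechanism described above (a deliberate simplification: each agent's posterior is passed on as the next agent's public belief, rather than being inferred from actions alone) shows how a sharpening shared belief increasingly overrides individual contrarian signals. All names and the toy numbers below are assumptions for illustration.

```python
import numpy as np

def social_learning_sequence(prior, likelihood, private_obs, action_utility):
    """Simplified Bayesian social learning sketch (assumed, illustrative setup).

    Agents act in sequence.  Each agent fuses the shared public belief with its
    own private observation via Bayes' rule and picks the utility-maximizing
    action; its posterior then becomes the next agent's public belief.  As the
    shared belief sharpens, lone dissenting signals stop changing the chosen
    action, which is the herding effect interacting agents can exhibit.
    """
    public_belief = np.asarray(prior, dtype=float)
    actions = []
    for obs in private_obs:
        posterior = public_belief * likelihood[:, obs]   # Bayes update on the signal
        posterior /= posterior.sum()
        actions.append(int(np.argmax(posterior @ action_utility)))
        public_belief = posterior                        # handed to the next agent
    return actions

# Two hidden states, two signals; p(obs | state) is noisy but informative.
likelihood = np.array([[0.7, 0.3],
                       [0.3, 0.7]])
print(social_learning_sequence([0.5, 0.5], likelihood, [0, 0, 1, 1, 1], np.eye(2)))
# Early agreeing signals entrench the belief; later dissenting signals flip the
# chosen action only after several repetitions.
```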
Effective coordination of design teams must account for the influence of costs incurred while searching for the best design solutions. This article introduces a cost-aware multi-agent system (MAS), a theoretical model to (1) explain how individuals in a team should search, assuming that they are all rational utility-maximizing decision-makers and (2) study the impact of cost on the search performance of both individual agents and the system. First, we develop a new multi-agent Bayesian optimization framework accounting for information exchange among agents to support their decisions on where to sample in search. Second, we employ a reinforcement learning approach based on the multi-agent deep deterministic policy gradient for training MAS to identify where agents cannot sample due to design constraints. Third, we propose a new cost-aware stopping criterion for each agent to determine when costs outweigh potential gains in search as a criterion to stop. Our results indicate that cost has a more significant impact on MAS communication in complex design problems than in simple ones. For example, when searching in complex design spaces, some agents could initially have low-performance gains, thus stopping prematurely due to negative payoffs, even if those agents could perform better in the later stage of the search. Therefore, global-local communication becomes more critical in such situations for the entire system to converge. The proposed model can serve as a benchmark for empirical studies to quantitatively gauge how humans would rationally make design decisions in a team.
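The cost-aware stopping idea can be illustrated with a short sketch (an assumed expected-improvement form, not the article's exact criterion): an agent running Bayesian optimization stops sampling once the expected improvement of its most promising candidate no longer covers the cost of drawing one more sample. The numbers in the usage lines are hypothetical.

```python
from scipy.stats import norm

def should_stop(mu, sigma, best_so_far, cost_per_sample):
    """Cost-aware stopping rule sketch (assumed form, not the paper's exact rule).

    Given the surrogate's posterior mean `mu` and standard deviation `sigma` at
    the best unexplored candidate, stop when the expected improvement over the
    current best no longer exceeds the cost of one more sample, i.e. further
    search has negative expected payoff.
    """
    improvement = mu - best_so_far
    z = improvement / sigma
    expected_improvement = improvement * norm.cdf(z) + sigma * norm.pdf(z)
    return expected_improvement < cost_per_sample

# Low posterior uncertainty: the small expected gain no longer justifies the cost.
print(should_stop(mu=0.82, sigma=0.05, best_so_far=0.80, cost_per_sample=0.05))  # True
# High posterior uncertainty: potential gains still outweigh the sampling cost.
print(should_stop(mu=0.82, sigma=0.20, best_so_far=0.80, cost_per_sample=0.05))  # False
```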
The human ability to deceive others and detect deception has long been tied to theory of mind. We make a stronger argument: in order to be adept liars – to balance gain (i.e. maximizing their own reward) and plausibility (i.e. maintaining a realistic lie) – humans calibrate their lies under the assumption that their partner is a rational, utility-maximizing agent. We develop an adversarial recursive Bayesian model that aims to formalize the behaviors of liars and lie detectors. We compare this model to (1) a model that does not perform theory of mind computations and (2) a model that has perfect knowledge of the opponent’s behavior. To test these models, we introduce a novel dyadic, stochastic game, allowing for quantitative measures of lies and lie detection. In a second experiment, we vary the ground truth probability. We find that our rational models qualitatively predict human lying and lie detecting behavior better than the non-rational model. Our findings suggest that humans control for the extremeness of their lies in a manner reflective of rational social inference. These findings provide a new paradigm and formal framework for nuanced quantitative analysis of the role of rationality and theory of mind in lying and lie detecting behavior.
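A toy version of the gain-versus-plausibility trade-off (an assumed payoff structure, not the paper's recursive model) can be sketched as follows: the liar exaggerates only up to the point where the listener's credibility weight, taken from the known ground-truth distribution, stops compensating for the extra gain. The function and its parameters are hypothetical.

```python
import numpy as np
from scipy.stats import binom

def calibrated_lie(true_k, n, p, reward_per_unit=1.0):
    """Toy liar sketch balancing gain against plausibility (assumed payoff form).

    The liar observed `true_k` successes out of `n` trials with known success
    probability `p` and considers reporting any k' >= true_k.  A rational
    Bayesian listener finds a report credible roughly in proportion to its
    probability under the ground-truth binomial model, so the liar maximizes
    (gain from exaggeration) x (plausibility to the listener).
    """
    reports = np.arange(true_k, n + 1)
    gain = reward_per_unit * (reports - true_k)        # payoff grows with exaggeration
    plausibility = binom.pmf(reports, n, p)            # listener's credibility weight
    return int(reports[np.argmax(gain * plausibility)])

# With 4 true successes out of 10 fair trials, the payoff-maximizing lie is a
# moderate exaggeration (6), not the maximal claim of 10.
print(calibrated_lie(true_k=4, n=10, p=0.5))
```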
While traditional economics assumes that humans are fully rational agents who always maximize their expected utility, in practice, we constantly observe apparently irrational behavior. One explanation is that people have limited computational power, so that they are, quite rationally, making the best decisions they can, given their computational limitations. To test this hypothesis, we consider the multi-armed bandit (MAB) problem. We examine a simple strategy for playing an MAB that can be implemented easily by a probabilistic finite automaton (PFA). Roughly speaking, the PFA sets certain expectations, and plays an arm as long as it meets them. If the PFA has sufficiently many states, it performs near-optimally. Its performance degrades gracefully as the number of states decreases. Moreover, the PFA acts in a "human-like" way, exhibiting a number of standard human biases, like an optimism bias and a negativity bias.
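A rough sketch of such a finite-state strategy (not the paper's exact automaton, and deterministic rather than probabilistic in its transitions): the agent's only memory is a small counter, successes push it up, failures push it down, and the arm is abandoned when the counter hits zero, so more states mean more patience. The parameters and toy arms below are assumptions.

```python
import random

def finite_state_bandit(arm_probs, n_states=4, horizon=1000, seed=0):
    """Finite-memory bandit strategy sketch (illustrative, simplified automaton).

    The agent's state is a counter with `n_states` levels for the arm it is
    currently playing.  A success ("meeting expectations") moves the counter up,
    a failure moves it down, and when the counter hits zero the agent gives up
    on the arm and switches to the next one.  More states give better average
    reward; fewer states degrade performance gracefully.
    """
    rng = random.Random(seed)
    arm, counter, total = 0, n_states // 2, 0
    for _ in range(horizon):
        success = rng.random() < arm_probs[arm]
        total += success
        counter = min(n_states - 1, counter + 1) if success else counter - 1
        if counter == 0:                       # expectations no longer met: switch arm
            arm = (arm + 1) % len(arm_probs)
            counter = n_states // 2
    return total / horizon

# A patient automaton (more states) spends most of its time on the 0.8 arm.
print(finite_state_bandit([0.2, 0.8], n_states=8))
print(finite_state_bandit([0.2, 0.8], n_states=2))
```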