Can deep convolutional neural networks (CNNs) for image classification be interpreted as utility maximizers with information costs? By performing set-valued system identification for Bayesian decision systems, we demonstrate that deep CNNs behave equivalently (in terms of necessary and sufficient conditions) to rationally inattentive Bayesian utility maximizers, a generative model used extensively in economics for human decision-making. Our claim is based on approximately 500 numerical experiments on 5 widely used neural network architectures. The parameters of the resulting interpretable model are computed efficiently via convex feasibility algorithms. As a practical application, we also illustrate how the reconstructed interpretable model can predict the classification performance of deep CNNs with high accuracy. The theoretical foundation of our approach lies in Bayesian revealed preference studied in micro-economics. All our results are on GitHub and completely reproducible.
This content will become publicly available on January 1, 2026
Interacting Large Language Model Agents. Bayesian Social Learning Based Interpretable Models
This paper discusses the theory and algorithms for interacting large language model agents (LLMAs) using methods from statistical signal processing and microeconomics. While both fields are mature, their application to decision-making involving interacting LLMAs remains unexplored. Motivated by Bayesian sentiment analysis on online platforms, we construct interpretable models and stochastic control algorithms that enable LLMAs to interact and perform Bayesian inference. Because interacting LLMAs learn from both prior decisions and external inputs, they can exhibit bias and herding behavior. Thus, developing interpretable models and stochastic control algorithms is essential to understand and mitigate these behaviors. This paper has three main results. First, we show using Bayesian revealed preferences from microeconomics that an individual LLMA satisfies the necessary and sufficient conditions for rationally inattentive (bounded rationality) Bayesian utility maximization and, given an observation, the LLMA chooses an action that maximizes a regularized utility. Second, we utilize Bayesian social learning to construct interpretable models for LLMAs that interact sequentially with each other and the environment while performing Bayesian inference. Our proposed models capture the herding behavior exhibited by interacting LLMAs. Third, we propose a stochastic control framework to delay herding and improve state estimation accuracy under two settings: 1) centrally controlled LLMAs and 2) autonomous LLMAs with incentives. Throughout the paper, we numerically demonstrate the effectiveness of our methods on real datasets for hate speech classification and product quality assessment, using open-source models like LLaMA and Mistral and closed-source models like ChatGPT. The main takeaway of this paper, based on substantial empirical analysis and mathematical formalism, is that LLMAs act as rationally bounded Bayesian agents that exhibit social learning when interacting. 
Traditionally, such models are used in economics to study interacting human decision-makers.
- Award ID(s): 2112457
- PAR ID: 10607957
- Publisher / Repository: IEEE
- Date Published:
- Journal Name: IEEE Access
- Volume: 13
- ISSN: 2169-3536
- Page Range / eLocation ID: 25465 to 25504
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
We describe a general approach to modeling rational decision-making agents who adopt either quantum or classical mechanics based on the Quantum Bayesian (QBist) approach to quantum theory. With the additional ingredient of a scheme by which the properties of one agent may influence another, we arrive at a flexible framework for treating multiple interacting quantum and classical Bayesian agents. We present simulations in several settings to illustrate our construction: quantum and classical agents receiving signals from an exogenous source, two interacting classical agents, two interacting quantum agents, and interactions between classical and quantum agents. A consistent treatment of multiple interacting users of quantum theory may allow us to properly interpret existing multi-agent protocols and could suggest new approaches in other areas such as quantum algorithm design.
In this article, we consider the problem of stabilizing a class of degenerate stochastic processes, which are constrained to a bounded Euclidean domain or a compact smooth manifold, to a given target probability density. This stabilization problem arises in the field of swarm robotics, for example, in applications where a swarm of robots is required to cover an area according to a target probability density. Most existing works on modeling and control of robotic swarms that use partial differential equation (PDE) models assume that the robots' dynamics are holonomic and, hence, the associated stochastic processes have generators that are elliptic. We relax this assumption on the ellipticity of the generator of the stochastic processes, and consider the more practical case of the stabilization problem for a swarm of agents whose dynamics are given by a controllable driftless control-affine system. We construct state-feedback control laws that exponentially stabilize a swarm of nonholonomic agents to a target probability density that is sufficiently regular. State-feedback laws can stabilize a swarm only to target probability densities that are positive everywhere. To stabilize the swarm to probability densities that possibly have disconnected supports, we introduce a semilinear PDE model of a collection of interacting agents governed by a hybrid switching diffusion process. The interaction between the agents is modeled using a (mean-field) feedback law that is a function of the local density of the swarm, with the switching parameters as the control inputs. We show that under the action of this feedback law, the semilinear PDE system is globally asymptotically stable about the given target probability density. 
The stabilization strategies with and without agent interactions are verified numerically for agents that evolve according to the Brockett integrator; the strategy with interactions is additionally verified for agents that evolve according to an underactuated s...
Camps-Valls, Gustau; Ruiz, Francisco J.; Valera, Isabel (Ed.) Bayesian Networks are useful for analyzing the properties of systems with large populations of interacting agents (e.g., in social modeling applications and distributed service applications). These networks typically have large Conditional Probability Tables (CPTs), making exact inference intractable. However, these models often have additive symmetry. In this paper we show how summation-based CPTs, especially in the presence of symmetry, can be computed efficiently using the Fast Fourier Transform (FFT). In particular, we propose an efficient FFT-based method for reducing the size of CPTs in Bayesian Networks with summation-based causal independence (CI). We show how to apply it directly towards the acceleration of Bucket Elimination, and we subsequently provide experimental results demonstrating the computational speedup provided by our method.
Stochastic gradient Langevin dynamics (SGLD) and stochastic gradient Hamiltonian Monte Carlo (SGHMC) are two popular Markov Chain Monte Carlo (MCMC) algorithms for Bayesian inference that can scale to large datasets, allowing one to sample from the posterior distribution of the parameters of a statistical model given the input data and the prior distribution over the model parameters. However, these algorithms do not apply to the decentralized learning setting, in which a network of agents works collaboratively to learn the parameters of a statistical model without sharing their individual data due to privacy reasons or communication constraints. We study two algorithms, Decentralized SGLD (DE-SGLD) and Decentralized SGHMC (DE-SGHMC), which are adaptations of the SGLD and SGHMC methods that allow scalable Bayesian inference in the decentralized setting for large datasets. We show that when the posterior distribution is strongly log-concave and smooth, the iterates of these algorithms converge linearly to a neighborhood of the target distribution in the 2-Wasserstein distance if their parameters are selected appropriately. We illustrate the efficiency of our algorithms on decentralized Bayesian linear regression and Bayesian logistic regression problems.
