Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees

Zeng, Siliang; Hong, Mingyi; Garcia, Alfredo

doi:10.1287/opre.2022.0511

Researchers have introduced a new algorithm for estimating structural models of dynamic decisions made by human agents, a task known for its high computational complexity. Traditionally, the estimation problem has a nested structure: an inner problem identifies an optimal policy, and an outer problem maximizes a measure of fit. Previous methods for easing this computational burden have struggled with large discrete state spaces or high-dimensional continuous state spaces, often at the cost of reward estimation accuracy. The new approach is a single-loop algorithm: each iteration pairs a policy improvement step with a stochastic gradient step for likelihood maximization, so the reward estimate is updated without first solving the inner problem to optimality. Designed to handle high-dimensional state spaces without sacrificing reward estimation accuracy, the algorithm converges to a stationary solution with a finite-time guarantee. When the reward is linearly parameterized, it approximates the maximum likelihood estimator at a sublinear rate.
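The single-loop idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a small tabular, entropy-regularized MDP, a linear reward r(s, a) = phi(s, a) · theta, and a hypothetical expert policy that is sampled on the fly in place of a fixed dataset of demonstrations. All names (soft_policy, sample_features, single_loop_estimate) and all parameter values are illustrative assumptions.

```python
import numpy as np

def soft_policy(q):
    """Softmax over actions: the entropy-regularized improvement step."""
    z = np.exp(q - q.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)

def sample_features(policy, P, phi, gamma, horizon, rng):
    """Discounted feature sum along one trajectory sampled from `policy`."""
    n_states, n_actions = policy.shape
    s = rng.integers(n_states)  # assumption: uniform initial state
    total = np.zeros(phi.shape[-1])
    for t in range(horizon):
        a = rng.choice(n_actions, p=policy[s])
        total += (gamma ** t) * phi[s, a]
        s = rng.choice(n_states, p=P[s, a])
    return total

def single_loop_estimate(expert_policy, P, phi, gamma=0.9, iters=200,
                         step=0.05, horizon=30, seed=0):
    """Alternate ONE policy improvement step with ONE stochastic gradient step.

    P has shape (S, A, S); phi has shape (S, A, d). The inner problem is
    never solved to convergence, which is what makes this a single loop.
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions, dim = phi.shape
    theta = np.zeros(dim)
    q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        # Policy improvement: a single soft Bellman backup under the
        # current reward estimate, followed by a softmax policy.
        m = q.max(axis=1)
        v = m + np.log(np.exp(q - m[:, None]).sum(axis=1))  # soft value
        q = phi @ theta + gamma * (P @ v)
        policy = soft_policy(q)
        # Stochastic gradient of the log-likelihood for a linear reward:
        # expert feature sum minus feature sum under the current policy.
        grad = (sample_features(expert_policy, P, phi, gamma, horizon, rng)
                - sample_features(policy, P, phi, gamma, horizon, rng))
        theta += step * grad
    return theta, policy
```

In a realistic setting the expert feature sums would come from recorded demonstrations rather than from a known expert policy, and the tabular Q-table would be replaced by a function approximator for high-dimensional state spaces. The sketch only shows how the two updates interleave within one loop.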

Award ID(s): 1910385 2414372 2426064

PAR ID: 10546631

Author(s) / Creator(s): Zeng, Siliang; Hong, Mingyi; Garcia, Alfredo

Publisher / Repository: INFORMS

Date Published: 2024-09-19

Journal Name: Operations Research

ISSN: 0030-364X

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1287/opre.2022.0511
