Inference of Utilities and Time Preference in Sequential Decision-Making

Cao, Haoyang; Wu, Zhengqi; Xu, Renyuan

doi:10.1007/s00245-025-10318-7

This paper introduces a novel stochastic control framework to enhance the capabilities of automated investment managers, or robo-advisors, by accurately inferring clients’ investment preferences from past activities. Our approach leverages a continuous-time model that incorporates utility functions and a generic discounting scheme with a time-varying rate, which can be tailored to each client’s risk tolerance, valuation of daily consumption, and significant life goals; this general discounting scheme is also referred to as the time preference. Through state augmentation, the new stochastic control framework allows for the joint inference of utilities and time preferences. We establish the corresponding dynamic programming principle and the verification theorem. Additionally, we provide sufficient conditions for the identifiability of client investment preferences. To complement our theoretical developments, we propose a learning algorithm based on maximum likelihood estimation within a discrete-time Markov Decision Process framework, augmented with entropy regularization. We prove that the Hessian matrix of the log-likelihood function is negative semi-definite at the client’s investment parameters, facilitating fast convergence of our proposed algorithm. Practical effectiveness and efficiency are showcased through two numerical examples, including Merton’s problem and an investment problem with unhedgeable risks. Our proposed framework not only advances financial technology by improving personalized investment advice but also contributes broadly to other fields such as healthcare, economics, and artificial intelligence, where understanding individual preferences is crucial.

More Like this