Hidden Markov Model Estimation-Based Q-learning for Partially Observable Markov Decision Process

Yoon, Hyung-Jin; Lee, Donghwan; Hovakimyan, Naira

doi:10.23919/ACC.2019.8814849

In this paper, we introduce Max Markov Chain (MMC), a novel model for sequential data with sparse correlations among the state variables. It may also be viewed as a special class of approximate models for High-order Markov Chains (HMCs). MMC is desirable for domains where the sparse correlations are long-term and vary in their temporal stretches. Although generally intractable, parameter optimization for MMC can be solved analytically. However, based on this result, we derive an approximate solution that is highly efficient empirically. When compared with HMC and approximate HMC models, MMC combines better sample efficiency, model parsimony, and an outstanding computational advantage. Such a quality allows MMC to scale to large domains where the competing models would struggle to perform. We compare MMC with several baselines on synthetic and real-world datasets to demonstrate MMC as a valuable alternative for stochastic modeling.
