skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Hidden Markov Model Estimation-Based Q-learning for Partially Observable Markov Decision Process
Award ID(s):
1830639
PAR ID:
10175172
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of American Control Conference
Page Range / eLocation ID:
2366 to 2371
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we introduce Max Markov Chain (MMC), a novel model for sequential data with sparse correlations among the state variables.It may also be viewed as a special class of approximate models for High-order Markov Chains (HMCs).MMC is desirable for domains where the sparse correlations are long-term and vary in their temporal stretches.Although generally intractable, parameter optimization for MMC can be solved analytically.However, based on this result,we derive an approximate solution that is highly efficient empirically.When compared with HMC and approximate HMC models, MMCcombines better sample efficiency, model parsimony, and an outstanding computational advantage.Such a quality allows MMC to scale to large domainswhere the competing models would struggle to perform.We compare MMC with several baselines with synthetic and real-world datasets to demonstrate MMC as a valuable alternative for stochastic modeling. 
    more » « less
  2. We introduce the space of virtual Markov chains (VMCs) as a projective limit of the spaces of all finite state space Markov chains (MCs), in the same way that the space of virtual permutations is the projective limit of the spaces of all permutations of finite sets.We introduce the notions of virtual initial distribution (VID) and a virtual transition matrix (VTM), and we show that the law of any VMC is uniquely characterized by a pair of a VID and VTM which have to satisfy a certain compatibility condition.Lastly, we study various properties of compact convex sets associated to the theory of VMCs, including that the Birkhoff-von Neumann theorem fails in the virtual setting. 
    more » « less
  3. null (Ed.)
  4. Abstract Let G be a finite group. Let $H, K$ be subgroups of G and $$H \backslash G / K$$ the double coset space. If Q is a probability on G which is constant on conjugacy classes ( $$Q(s^{-1} t s) = Q(t)$$ ), then the random walk driven by Q on G projects to a Markov chain on $$H \backslash G /K$$ . This allows analysis of the lumped chain using the representation theory of G . Examples include coagulation-fragmentation processes and natural Markov chains on contingency tables. Our main example projects the random transvections walk on $$GL_n(q)$$ onto a Markov chain on $$S_n$$ via the Bruhat decomposition. The chain on $$S_n$$ has a Mallows stationary distribution and interesting mixing time behavior. The projection illuminates the combinatorics of Gaussian elimination. Along the way, we give a representation of the sum of transvections in the Hecke algebra of double cosets, which describes the Markov chain as a mixture of Metropolis chains. Some extensions and examples of double coset Markov chains with G a compact group are discussed. 
    more » « less