Title: The Expected-Length Model of Options
Effective options can make reinforcement learning easier by enhancing an agent's ability to both explore in a targeted manner and plan further into the future. However, learning an appropriate model of an option's dynamics is hard, because it requires estimating a highly parameterized probability distribution. This paper introduces and motivates the Expected-Length Model (ELM) for options, an alternative model for transition dynamics. We prove ELM is a (biased) estimator of the traditional Multi-Time Model (MTM), but provide a non-vacuous bound on their deviation. We further prove that, in stochastic shortest path problems, ELM induces a value function that is sufficiently similar to the one induced by MTM, and is thus capable of supporting near-optimal behavior. We explore the practical utility of this option model experimentally, finding consistent support for the thesis that ELM is a suitable replacement for MTM. In some cases, we find ELM leads to more sample-efficient learning, especially when options are arranged in a hierarchy.
Award ID(s):
1637937
PAR ID:
10113326
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
IJCAI
ISSN:
1045-0823
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Effective options can make reinforcement learning easier by enhancing an agent’s ability to both explore in a targeted manner and plan further into the future. However, learning an appropriate model of an option’s dynamics is hard, because it requires estimating a highly parameterized probability distribution. This paper introduces and motivates the Expected-Length Model (ELM) for options, an alternative model for transition dynamics. We prove ELM is a (biased) estimator of the traditional Multi-Time Model (MTM), but provide a non-vacuous bound on their deviation. We further prove that, in stochastic shortest path problems, ELM induces a value function that is sufficiently similar to the one induced by MTM, and is thus capable of supporting near-optimal behavior. We explore the practical utility of this option model experimentally, finding consistent support for the thesis that ELM is a suitable replacement for MTM. In some cases, we find ELM leads to more sample-efficient learning, especially when options are arranged in a hierarchy.
  2. The multiple-try Metropolis method is an interesting extension of the classical Metropolis–Hastings algorithm. However, theoretical understanding of its usefulness and convergence behavior is still lacking. We here derive the exact convergence rate for the multiple-try Metropolis Independent sampler (MTM-IS) via an explicit eigen analysis. As a by-product, we prove that a naive application of the MTM-IS is less efficient than the simpler “thinned” independent Metropolis–Hastings method at the same computational cost. We further explore more variants and find it possible to design more efficient algorithms by applying MTM to part of the target distribution or creating correlated multiple trials.
  3. The advancement of future large‐scale wireless networks necessitates the development of cost‐effective and scalable security solutions. Specifically, physical layer (PHY) security has been put forth as a cost‐effective alternative to cryptographic mechanisms that can circumvent the need for explicit key exchange between communication devices. Herein, a space–time‐modulated digitally‐coded metamaterial (MTM) leaky wave antenna (LWA) is proposed that can enable PHY security by achieving the functionalities of directional modulation (DM) using a machine learning‐aided branch‐and‐bound (B&B) optimized coding sequence. Theoretically, it is first shown that the proposed space–time MTM antenna can achieve DM through both the spatial and spectral manipulation of the orthogonal frequency division multiplexing signal. Simulation results are then provided as proof‐of‐principle, demonstrating the applicability of the approach for achieving DM in various communication settings. Furthermore, a prototype of the proposed architecture controlled by a field‐programmable gate array is realized, which achieves DM via an optimized coding sequence carried out by the learning‐aided B&B algorithm corresponding to the states of the MTM LWA's unit cells. Experimental results confirm the theory behind the space–time‐modulated MTM LWA in achieving DM, which is observed via both the spectral harmonic patterns and bit error rate measurements. 
  4. Electromagnetic compatibility (EMC) is a key requirement for electronic system design. Meeting the EMC regulations becomes more challenging as the component density increases and operation frequencies spread to multiple bands. Coupling between transmission lines is a common manifestation of electromagnetic interference (EMI). In this work, we present a novel method to suppress the noise between two transmission lines by using a metamaterial (MTM) structure. This MTM design helps to mitigate the coupling between the two transmission lines where one acts as an aggressor and the other as the victim. This approach helps miniaturize noise-mitigation solutions such as shielding or filtering. MTM provides good protection in terms of EMI isolation, is inexpensive, and has a smaller footprint compared to traditional EMC solutions. The second part of this article studies the impact of the relative permittivity (ε_r) of the MTM structure. Changing ε_r modifies the transmission and absorption bands, which can help modulate the operation of the MTM through appropriate design. The MTM designs used in this work enhanced the isolation between the victim and aggressor by 1–13.5 dB across 1–5 GHz.
  5. To guide the selection of probabilistic solar power forecasting methods for day-ahead power grid operations, the performance of four methods, i.e., Bayesian model averaging (BMA), Analog ensemble (AnEn), ensemble learning method (ELM), and persistence ensemble (PerEn) is compared in this paper. A real-world hourly solar generation dataset from a rooftop solar plant is used to train and validate the methods under clear, partially cloudy, and overcast weather conditions. Comparisons have been made on a one-year testing set using popular performance metrics for probabilistic forecasts. It is found that the ELM method outperforms other methods by offering better reliability, higher resolution, and narrower prediction interval width under all weather conditions with a slight compromise in accuracy. The BMA method performs well under overcast and partially cloudy weather conditions, although it is outperformed by the ELM method under clear conditions. 