NSF PAR Search | NSF Public Access Repository

More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling

Haque, Ishfaq; Tan, Yixin; Yang, Yu; Lan, Qingfeng; Lu, Jianfeng; Mahmood, A Rupam; Precup, Doina; Xu, Pan (August 2024, Reinforcement Learning Journal)

Thompson sampling (TS) is one of the most popular exploration techniques in reinforcement learning (RL). However, most TS algorithms with theoretical guarantees are difficult to implement and do not generalize to deep RL. While the emerging approximate sampling-based exploration schemes are promising, most existing algorithms are specific to linear Markov Decision Processes (MDPs) with suboptimal regret bounds, or only use the most basic samplers such as Langevin Monte Carlo. In this work, we propose an algorithmic framework that incorporates different approximate sampling methods with the recently proposed Feel-Good Thompson Sampling (FGTS) approach (Zhang, 2022; Dann et al., 2021), which was previously known to be computationally intractable in general. When applied to linear MDPs, our regret analysis yields the best known dependency of regret on dimensionality, surpassing existing randomized algorithms. Additionally, we provide explicit sampling complexity for each employed sampler. Empirically, we show that in tasks where deep exploration is necessary, our proposed algorithms that combine FGTS and approximate sampling perform significantly better than other strong baselines. On several challenging games from the Atari 57 suite, our algorithms achieve performance that is either better than or on par with other strong baselines from the deep RL literature.
Full Text Available
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo

Haque, Ishfaq; Lan, Qingfeng; Xu, Pan; Mahmood, A Rupam; Precup, Doina; Anandkumar, Anima; Azizzadenesheli, Kamyar (January 2024, The Twelfth International Conference on Learning Representations)

We present a scalable and effective exploration strategy based on Thompson sampling for reinforcement learning (RL). One of the key shortcomings of existing Thompson sampling algorithms is the need to perform a Gaussian approximation of the posterior distribution, which is not a good surrogate in most practical settings. We instead directly sample the Q function from its posterior distribution by using Langevin Monte Carlo, an efficient type of Markov Chain Monte Carlo (MCMC) method. Our method only needs to perform noisy gradient descent updates to learn the exact posterior distribution of the Q function, which makes our approach easy to deploy in deep RL. We provide a rigorous theoretical analysis for the proposed method and demonstrate that, in the linear Markov decision process (linear MDP) setting, it has a regret bound of $\tilde{O}(d^{3/2}H^{3/2}\sqrt{T})$, where $d$ is the dimension of the feature mapping, $H$ is the planning horizon, and $T$ is the total number of steps. We apply this approach to deep RL by using the Adam optimizer to perform gradient updates. Our approach achieves better or similar results compared with state-of-the-art deep RL algorithms on several challenging exploration tasks from the Atari57 suite. Our code is available at https://github.com/hmishfaq/LMC-LSVI.
Full Text Available
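
The "noisy gradient descent" idea in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, assuming a ridge-regularized linear regression loss rather than the Q-function loss used in the paper; it is not the authors' LMC-LSVI implementation. The function name `lmc_sample` and the parameters `lam`, `beta`, and `step` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def lmc_sample(X, y, lam=1.0, beta=1.0, step=1e-3, n_steps=5000, rng=None):
    """Langevin Monte Carlo on a toy ridge-regression loss (illustrative only).

    Targets the density proportional to exp(-beta * L(w)), where
    L(w) = 0.5 * ||X w - y||^2 + 0.5 * lam * ||w||^2.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = np.zeros(X.shape[1])
    samples = []
    for _ in range(n_steps):
        # Gradient of the regularized squared loss.
        grad = X.T @ (X @ w - y) + lam * w
        # Gradient step plus Gaussian noise scaled by sqrt(2 * step / beta).
        noise = rng.standard_normal(w.shape)
        w = w - step * grad + np.sqrt(2.0 * step / beta) * noise
        samples.append(w.copy())
    return np.array(samples)

# Synthetic data (assumed, for illustration).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.standard_normal(200)

samples = lmc_sample(X, y, rng=rng)

# For this Gaussian target, the posterior mean equals the ridge solution.
w_ridge = np.linalg.solve(X.T @ X + np.eye(3), X.T @ y)
posterior_mean = samples[1000:].mean(axis=0)  # discard burn-in iterates
```

After burn-in, the average of the iterates should sit close to the ridge solution, while individual iterates scatter around it; that scatter is what a Thompson-sampling-style method would use for exploration.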
Measurement of spin-density matrix elements in Δ++(1232) photoproduction

https://doi.org/10.1016/j.physletb.2025.139368

Afzal, F; Akondi, CS; Albrecht, M; Amaryan, M; Arrigo, S; Arroyave, V; Asaturyan, A; Austregesilo, A; Baldwin, Z; Barbosa, F; et al (April 2025, Physics Letters B)

Free, publicly-accessible full text available April 1, 2026
First measurement of $a_{2}^{0} (1320)$ polarized photoproduction cross section

https://doi.org/10.1103/jfzb-rfl4

Afzal, F.; Akondi, C. S.; Albrecht, M.; Amaryan, M.; Arrigo, S.; Arroyave, V.; Asaturyan, A.; Austregesilo, A.; Baldwin, Z.; Barbosa, F.; et al (July 2025, Physical Review C)
Upper Limit on the Photoproduction Cross Section of the Spin-Exotic $π_{1} (1600)$

https://doi.org/10.1103/PhysRevLett.133.261903

Afzal, F.; Akondi, C. S.; Albrecht, M.; Amaryan, M.; Arrigo, S.; Arroyave, V.; Asaturyan, A.; Austregesilo, A.; Baldwin, Z.; Barbosa, F.; et al (December 2024, Physical Review Letters)
Space-charge limited conduction in epitaxial chromia films grown on elemental and oxide-based metallic substrates

https://doi.org/10.1063/1.5087832

Kwan, C-P; Street, M.; Mahmood, A.; Echtenkamp, W.; Randle, M.; He, K.; Nathawat, J.; Arabchigavkani, N.; Barut, B.; Yin, S.; et al (May 2019, AIP Advances)

We study temperature-dependent (200–400 K) dielectric current leakage in high-quality, epitaxial chromia films, synthesized on various conductive substrates (Pd, Pt and V2O3). We find that trap-assisted space-charge limited conduction is the dominant source of electrical leakage in the films, and that the density and distribution of charge traps within them are strongly dependent upon the choice of the underlying substrate. Pd-based chromia is found to exhibit leakage consistent with the presence of deep, discrete traps, a characteristic that is related to the known properties of twinning defects in the material. The Pt- and V2O3-based films, in contrast, show behavior typical of insulators with shallow, exponentially distributed traps. The highest resistivity is obtained for chromia fabricated on V2O3 substrates, consistent with a lower total trap density in these films. Our studies suggest that chromia thin films formed on V2O3 substrates are a promising candidate for next-generation spintronics.
Measurement of the $J/ψ$ photoproduction cross section over the full near-threshold kinematic region

https://doi.org/10.1103/PhysRevC.108.025201

Adhikari, S.; Afzal, F.; Akondi, C. S.; Albrecht, M.; Amaryan, M.; Arroyave, V.; Asaturyan, A.; Austregesilo, A.; Baldwin, Z.; Barbosa, F.; et al (August 2023, Physical Review C)

Full Text Available
Measurement of spin-density matrix elements in $ρ(770)$ production with a linearly polarized photon beam at $E_{γ} = 8.2$–$8.8$ GeV

https://doi.org/10.1103/PhysRevC.108.055204

Adhikari, S.; Afzal, F.; Akondi, C. S.; Albrecht, M.; Amaryan, M.; Arroyave, V.; Asaturyan, A.; Austregesilo, A.; Baldwin, Z.; Barbosa, F.; et al (November 2023, Physical Review C)
Measurement of spin density matrix elements in $Λ (1520)$ photoproduction at 8.2–8.8 GeV

https://doi.org/10.1103/PhysRevC.105.035201

Adhikari, S.; Akondi, C. S.; Albrecht, M.; Ali, A.; Amaryan, M.; Asaturyan, A.; Austregesilo, A.; Baldwin, Z.; Barbosa, F.; Barlow, J.; et al (March 2022, Physical Review C)

Full Text Available
