NSF PAR Search | NSF Public Access Repository

Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

https://doi.org/10.1609/aaai.v36i5.20478

Ma, Yecheng Jason; Shen, Andrew; Bastani, Osbert; Jayaraman, Dinesh (June 2023, Proceedings of the AAAI Conference on Artificial Intelligence)

Full Text Available
Counterfactual Explanations for Natural Language Interfaces

https://doi.org/10.18653/v1/2022.acl-short.14

Tolkachev, George; Mell, Stephen; Zdancewic, Stephan; Bastani, Osbert (July 2022, 60th Annual Meeting of the Association for Computational Linguistics)

Full Text Available
On the (In)Tractability of Reinforcement Learning for LTL Objectives

https://doi.org/10.24963/ijcai.2022/507

Yang, Cambridge; Littman, Michael L.; Carbin, Michael (July 2022, Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22))

Full Text Available
Distributed learning of fully connected neural networks using independent subnet training

https://doi.org/10.14778/3529337.3529343

Yuan, Binhang; Wolfe, Cameron R.; Dun, Chen; Tang, Yuxin; Kyrillidis, Anastasios; Jermaine, Chris (April 2022, Proceedings of the VLDB Endowment)

Full Text Available
LEMMA: Bootstrapping High-Level Mathematical Reasoning with Learned Symbolic Abstractions

Zhening Li; Gabriel Poesia; Omar Costilla-Reyes; Noah Goodman; Armando Solar-Lezama (January 2022, NeurIPS 2022)

Humans tame the complexity of mathematical reasoning by developing hierarchies of abstractions. With proper abstractions, solutions to hard problems can be expressed concisely, thus making them more likely to be found. In this paper, we propose Learning Mathematical Abstractions (LEMMA): an algorithm that implements this idea for reinforcement learning agents in mathematical domains. LEMMA augments Expert Iteration with an abstraction step, where solutions found so far are revisited and rewritten in terms of new higher-level actions, which then become available to solve new problems. We evaluate LEMMA on two mathematical reasoning tasks--equation solving and fraction simplification--in a step-by-step fashion. In these two domains, LEMMA improves the ability of an existing agent, both solving more problems and generalizing more effectively to harder problems than those seen during training.
Full Text Available
Offline Goal-Conditioned Reinforcement Learning via f-Advantage Regression

Yecheng Jason Ma; Jason Yan; Dinesh Jayaraman; Osbert Bastani (January 2022, 36th Conference on Neural Information Processing Systems (NeurIPS 2022))

Offline goal-conditioned reinforcement learning (GCRL) promises general-purpose skill learning in the form of reaching diverse goals from purely offline datasets. We propose Goal-conditioned f-Advantage Regression (GoFAR), a novel regression-based offline GCRL algorithm derived from a state-occupancy matching perspective; the key intuition is that the goal-reaching task can be formulated as a state-occupancy matching problem between a dynamics-abiding imitator agent and an expert agent that directly teleports to the goal. In contrast to prior approaches, GoFAR does not require any hindsight relabeling and enjoys uninterleaved optimization for its value and policy networks. These distinct features give GoFAR much better offline performance and stability, as well as a statistical performance guarantee that is unattainable for prior methods. Furthermore, we demonstrate that GoFAR's training objectives can be re-purposed to learn an agent-independent goal-conditioned planner from purely offline source-domain data, which enables zero-shot transfer to new target domains. Through extensive experiments, we validate GoFAR's effectiveness in various problem settings and tasks, significantly outperforming the prior state of the art. Notably, on a real robotic dexterous manipulation task, while no other method makes meaningful progress, GoFAR acquires complex manipulation behavior that successfully accomplishes diverse goals.
Full Text Available
Neurosymbolic Programming for Science

Jennifer J. Sun; Megan Tjandrasuwita; Atharva Sehgal; Armando Solar-Lezama; Swarat Chaudhuri; Yisong Yue; Omar Costilla-Reyes (January 2022, NeurIPS 2022)

Full Text Available
DreamCoder: bootstrapping inductive program synthesis with wake-sleep library learning

https://doi.org/10.1145/3453483.3454080

Ellis, Kevin; Wong, Catherine; Nye, Maxwell; Sablé-Meyer, Mathias; Morales, Lucas; Hewitt, Luke; Cary, Luc; Solar-Lezama, Armando; Tenenbaum, Joshua B. (June 2021, International Conference on Programming Language Design and Implementation)

Full Text Available
Pragmatic Code Autocomplete

https://doi.org/10.1609/aaai.v35i1.16121

Poesia, Gabriel; Goodman, Noah (May 2021, Proceedings of the AAAI Conference on Artificial Intelligence)

Full Text Available
GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles

Ganea, Octavian-Eugen; Pattanaik, Lagnajit; Coley, Connor W.; Barzilay, Regina; Jensen, Klavs F.; Green, William H.; Jaakkola, Tommi S. (January 2021, Advances in Neural Information Processing Systems)

Full Text Available
