NSF PAR Search | NSF Public Access Repository

Relationship design for socially-aware behavior in static games

https://doi.org/10.1007/s10458-025-09699-4

Chen, Shenghui; Bayiz, Yigit E; Fridovich-Keil, David; Topcu, Ufuk (June 2025, Autonomous Agents and Multi-Agent Systems)

Free, publicly-accessible full text available June 1, 2026
Dense Dynamics-Aware Reward Synthesis: Integrating Prior Experience with Demonstrations

Koprulu, Cevahir; Li, Po-Han; Qiu, Tianyu; Zhao, Ruihan; Westenbroek, Tyler; Fridovich-Keil, David; Chinchali, Sandeep; Topcu, Ufuk (June 2025, 7th Annual Learning for Dynamics & Control Conference)

Free, publicly-accessible full text available June 6, 2026
Non-Parametric Neuro-Adaptive Formation Control

https://doi.org/10.1109/TASE.2025.3528501

Verginis, Christos K; Xu, Zhe; Topcu, Ufuk (January 2025, IEEE Transactions on Automation Science and Engineering)

Free, publicly-accessible full text available January 1, 2026
Basis-to-Basis Operator Learning Using Function Encoders

https://doi.org/10.1016/j.cma.2024.117646

Ingebrand, Tyler; Thorpe, Adam J; Goswami, Somdatta; Kumar, Krishna; Topcu, Ufuk (February 2025, Computer Methods in Applied Mechanics and Engineering)

We present Basis-to-Basis (B2B) operator learning, a novel approach for learning operators on Hilbert spaces of functions based on the foundational ideas of function encoders. We decompose the task of learning operators into two parts: learning sets of basis functions for both the input and output spaces, and learning a potentially nonlinear mapping between the coefficients of the basis functions. B2B operator learning circumvents many challenges of prior works, such as requiring data to be at fixed locations, by leveraging classic techniques such as least squares to compute the coefficients. It is especially potent for linear operators, where we compute a mapping between bases as a single matrix transformation with a closed-form solution. Furthermore, with minimal modifications and using the deep theoretical connections between function encoders and functional analysis, we derive operator learning algorithms that are directly analogous to eigen-decomposition and singular value decomposition. We empirically validate B2B operator learning on seven benchmark operator learning tasks and show that it improves accuracy by two orders of magnitude over existing approaches on several of them.
Free, publicly-accessible full text available February 1, 2026
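
The linear-operator case described in the Basis-to-Basis abstract above can be illustrated with a small sketch. It is not the authors' implementation: the fixed basis {sin x, cos x} stands in for learned basis functions, the derivative is an assumed example operator, and the names `basis`, `fit_coefficients`, `fit_linear_operator`, and `make_demo_pairs` are hypothetical. The sketch only shows how least squares can recover coefficients from samples at arbitrary locations and then a coefficient-to-coefficient matrix.

```python
import numpy as np


def basis(x):
    # Assumption: a fixed basis {sin, cos} replaces the learned basis functions.
    return np.stack([np.sin(x), np.cos(x)], axis=-1)


def fit_coefficients(x, y):
    # Least-squares coefficients for samples y taken at arbitrary locations x.
    coeffs, *_ = np.linalg.lstsq(basis(x), y, rcond=None)
    return coeffs


def fit_linear_operator(pairs):
    # pairs: list of ((x_in, f_in), (x_out, f_out)) sampled function pairs.
    c_in = np.array([fit_coefficients(*src) for src, _ in pairs])
    c_out = np.array([fit_coefficients(*dst) for _, dst in pairs])
    # Solve c_in @ A.T ~= c_out for the matrix A by least squares.
    a_transposed, *_ = np.linalg.lstsq(c_in, c_out, rcond=None)
    return a_transposed.T


def make_demo_pairs(n_functions=20, n_points=30, seed=0):
    # Assumed example operator: d/dx applied to a*sin(x) + b*cos(x).
    rng = np.random.default_rng(seed)
    pairs = []
    for _ in range(n_functions):
        a, b = rng.normal(size=2)
        # Input and output samples sit at different, irregular locations.
        x_in = rng.uniform(0.0, 2.0 * np.pi, n_points)
        x_out = rng.uniform(0.0, 2.0 * np.pi, n_points)
        f_in = a * np.sin(x_in) + b * np.cos(x_in)
        f_out = a * np.cos(x_out) - b * np.sin(x_out)
        pairs.append(((x_in, f_in), (x_out, f_out)))
    return pairs


A = fit_linear_operator(make_demo_pairs())
```

In this toy setting the recovered matrix maps the coefficients (a, b) to (-b, a), which is the derivative expressed in the chosen basis.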
Zero-Shot Transfer of Neural ODEs

Ingebrand, Tyler; Thorpe, Adam J; Topcu, Ufuk (December 2024, 38th Conference on Neural Information Processing Systems (NeurIPS 2024))

Free, publicly-accessible full text available December 9, 2025
Auto-Encoding Bayesian Inverse Games

Liu, Xinjie; Peters, Lasse; Alonso-Mora, Javier; Topcu, Ufuk; Fridovich-Keil, David (October 2024, The 16th International Workshop on the Algorithmic Foundations of Robotics)

Full Text Available
Human-Agent Cooperation in Games under Incomplete Information through Natural Language Communication

https://doi.org/10.24963/ijcai.2024/867

Chen, Shenghui; Fried, Daniel; Topcu, Ufuk (August 2024, International Joint Conferences on Artificial Intelligence Organization)

Developing autonomous agents that can strategize and cooperate with humans under information asymmetry is challenging without effective communication in natural language. We introduce a shared-control game, where two players collectively control a token in alternating turns to achieve a common objective under incomplete information. We formulate a policy synthesis problem for an autonomous agent in this game with a human as the other player. To solve this problem, we propose a communication-based approach comprising a language module and a planning module. The language module translates natural language messages into and from a finite set of flags, a compact representation defined to capture player intents. The planning module leverages these flags to compute a policy using an algorithm that we present: asymmetric information-set Monte Carlo tree search with flag exchange. We evaluate the effectiveness of this approach in a testbed based on Gnomes at Night, a search-and-find maze board game. Results of human subject experiments show that communication narrows the information gap between players and makes human-agent cooperation more efficient, requiring fewer turns.
Full Text Available
Zero-Shot Reinforcement Learning via Function Encoders

Ingebrand, Tyler; Zhang, Amy; Topcu, Ufuk (July 2024, PMLR)

Full Text Available
Joint learning of reward machines and policies in environments with partially known semantics

https://doi.org/10.1016/j.artint.2024.104146

Verginis, Christos K; Koprulu, Cevahir; Chinchali, Sandeep; Topcu, Ufuk (August 2024, Artificial Intelligence)

We study the problem of reinforcement learning for a task encoded by a reward machine. The task is defined over a set of properties in the environment, called atomic propositions, and represented by Boolean variables. One unrealistic assumption commonly used in the literature is that the truth values of these propositions are accurately known. In real situations, however, these truth values are uncertain since they come from sensors that suffer from imperfections. At the same time, reward machines can be difficult to model explicitly, especially when they encode complicated tasks. We develop a reinforcement-learning algorithm that infers a reward machine that encodes the underlying task while learning how to execute it, despite the uncertainties of the propositions’ truth values. To address such uncertainties, the algorithm maintains a probabilistic estimate of the truth values of the atomic propositions; it updates this estimate according to new sensory measurements that arrive from exploration of the environment. Additionally, the algorithm maintains a hypothesis reward machine, which acts as an estimate of the reward machine that encodes the task to be learned. As the agent explores the environment, the algorithm updates the hypothesis reward machine according to the obtained rewards and the estimate of the atomic propositions’ truth values. Finally, the algorithm uses a Q-learning procedure for the states of the hypothesis reward machine to determine an optimal policy that accomplishes the task. We prove that the algorithm successfully infers the reward machine and asymptotically learns a policy that accomplishes the respective task.
Full Text Available
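
The reward-machine abstract above mentions a probabilistic estimate of each atomic proposition's truth value that is updated from noisy sensor measurements. One plausible way to picture such an update is a single-proposition Bayes rule, sketched below. The abstract does not specify the update, so this is an assumption: the function name `update_belief` and the sensor rates `true_positive_rate` and `false_positive_rate` are hypothetical illustration values, not taken from the paper.

```python
def update_belief(prior, observation, true_positive_rate=0.9, false_positive_rate=0.2):
    """Bayes update of P(proposition is true) after one Boolean sensor reading.

    Assumption: the sensor reports True with probability true_positive_rate
    when the proposition holds and with probability false_positive_rate
    when it does not.
    """
    # Likelihood of the reading under each hypothesis.
    like_true = true_positive_rate if observation else 1.0 - true_positive_rate
    like_false = false_positive_rate if observation else 1.0 - false_positive_rate
    evidence = like_true * prior + like_false * (1.0 - prior)
    return like_true * prior / evidence


# Repeated positive readings push the belief toward 1.
belief = 0.5
history = []
for reading in [True, True, True]:
    belief = update_belief(belief, reading)
    history.append(belief)
```

A belief of this kind could then be thresholded or sampled to decide which transition of a hypothesis reward machine to follow; that coupling is left out of the sketch.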
Encouraging Inferable Behavior for Autonomy: Repeated Bimatrix Stackelberg Games with Observations

https://doi.org/10.23919/ACC60939.2024.10644936

Karabag, Mustafa O; Smith, Sophia; Fridovich-Keil, David; Topcu, Ufuk (July 2024, Proceedings of the American Control Conference)

When interacting with other non-competitive decision-making agents, it is critical for an autonomous agent to have inferable behavior: its actions must convey its intention and strategy. For example, an autonomous car's strategy must be inferable by the pedestrians interacting with the car. We model the inferability problem using a repeated bimatrix Stackelberg game with observations where a leader and a follower repeatedly interact. During the interactions, the leader uses a fixed, potentially mixed strategy. The follower, on the other hand, does not know the leader's strategy and dynamically reacts based on observations of the leader's previous actions. In the setting with observations, the leader may suffer from an inferability loss, i.e., a loss in performance relative to the setting where the follower has perfect information of the leader's strategy. We show that the inferability loss is upper-bounded by a function of the number of interactions and the stochasticity level of the leader's strategy, encouraging the use of inferable strategies with lower stochasticity levels. As a converse result, we also provide a game where the required number of interactions is lower-bounded by a function of the desired inferability loss.
Full Text Available
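
The Stackelberg abstract above describes a follower who reacts to the leader's previously observed actions rather than to the leader's true strategy. A minimal sketch of one such follower is given below, assuming it best-responds to the empirical frequency of observed actions. That follower model, the payoff matrix, and the names `empirical_strategy` and `follower_best_response` are illustrative assumptions; the abstract does not state them.

```python
import numpy as np


def empirical_strategy(observed_actions, n_actions):
    # Frequency estimate of the leader's mixed strategy.
    # Assumes at least one observation has been made.
    counts = np.bincount(observed_actions, minlength=n_actions)
    return counts / counts.sum()


def follower_best_response(leader_strategy, follower_payoff):
    # follower_payoff[i, j]: follower's payoff when the leader plays i
    # and the follower plays j.
    expected = leader_strategy @ follower_payoff
    return int(np.argmax(expected))


# Hypothetical 2x2 game in which the follower wants to match the leader.
follower_payoff = np.array([[1.0, 0.0], [0.0, 1.0]])
true_strategy = np.array([0.4, 0.6])

estimate = empirical_strategy([0, 0, 1], n_actions=2)
response_from_observations = follower_best_response(estimate, follower_payoff)
response_with_full_information = follower_best_response(true_strategy, follower_payoff)
```

With only three observations the follower's response differs from the full-information response, which is the kind of mismatch that an inferability loss measures; a more deterministic leader strategy would make the two agree after fewer interactions.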
