NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Anticipating Oblivious Opponents in Stochastic Games

Tasdighi_Kalat, Shadi; Sankaranarayanan, Sriram; Trivedi, Ashutosh (October 2024, Springer Proceedings in Advanced Robotics (SPAR))
Amato, Nancy; Driggs-Campbell, Katie; Ekenna, Chinwe; Morales, Marco; O'Kane, Jason (Ed.)
We present an approach for systematically anticipating the actions and policies employed by oblivious environments in concurrent stochastic games, while maximizing a reward function. Our main contribution lies in the synthesis of a finite information state machine (ISM) whose alphabet ranges over the actions of the environment. Each state of the ISM is mapped to a belief state about the policy used by the environment. We introduce a notion of consistency that guarantees that the belief states tracked by the ISM stays within a fixed distance of the precise belief state obtained by knowledge of the full history. We provide methods for checking consistency of an automaton and a synthesis approach which, upon successful termination, yields an ISM. We construct a Markov Decision Process (MDP) that serves as the starting point for computing optimal policies for maximizing a reward function defined over plays. We present an experimental evaluation over benchmark examples including human activity data for tasks such as cataract surgery and furniture assembly, wherein our approach successfully anticipates the policies and actions of the environment in order to maximize the reward.
more » « less
Full Text Available
The hexatope and octatope abstract domains for neural network verification

https://doi.org/10.1007/s10703-024-00457-y

Bak, Stanley; Dohmen, Taylor; Subramani, K; Trivedi, Ashutosh; Velasquez, Alvaro; Wojciechowski, Piotr (December 2024, Formal Methods in System Design)

Efficient verification algorithms for neural networks often depend on various abstract domains such as intervals, zonotopes, and linear star sets. The choice of the abstract domain presents an expressiveness vs. scalability trade-off: simpler domains are less precise but yield faster algorithms. This paper investigates the hexatope and octatope abstract domains in the context of neural net verification. Hexatopes are affine transformations of higher-dimensional hexagons, defined by difference constraint systems, and octatopes are affine transformations of higher-dimensional octagons, defined by unit-two-variable-per-inequality constraint systems. These domains generalize the idea of zonotopes which can be viewed as affine transformations of hypercubes. On the other hand, they can be considered as a restriction of linear star sets, which are affine transformations of arbitrary H-Polytopes. This distinction places hexatopes and octatopes firmly between zonotopes and linear star sets in their expressive power, but what about the efficiency of decision procedures? An important analysis problem for neural networks is the exact range computation problem that asks to compute the exact set of possible outputs given a set of possible inputs. For this, three computational procedures are needed: (1) optimization of a linear cost function; (2) affine mapping; and (3) over-approximating the intersection with a half-space. While zonotopes allow an efficient solution for these approaches, star sets solves these procedures via linear programming. We show that these operations are faster for hexatopes and octatopes than they are for the more expressive linear star sets by reducing the linear optimization problem over these domains to the minimum cost network flow, which can be solved in strongly polynomial time using the Out-of-Kilter algorithm. Evaluating exact range computation on several ACAS Xu neural network benchmarks, we find that hexatopes and octatopes show promise as a practical abstract domain for neural network verification.
more » « less
Free, publicly-accessible full text available December 1, 2025
What is Your Discount Factor?

Tasdighi_Kalat, Shadi; Sankaranarayanan, Sriram; Trivedi, Ashutosh (August 2024, Springer, Cham)
Hillston, Jane; Soudjiani, Sadegh (Ed.)
We study the problem of inferring the discount factor of an agent optimizing a discounted reward objective in a finite state Markov Decision Process (MDP). Discounted reward objectives are common in sequential optimization, reinforcement learning, and algorithmic game theory. The discount factor is an important parameter used in formulating the discounted reward. It captures the “time value” of the reward- i.e., how much reward at hand would equal a promised reward at a future time. Knowing an agent’s discount factor can provide valuable insights into their decision-making, and help predict their preferences in previously unseen environments. However, pinpointing the exact value of the discount factor used by the agent is a challenging problem. Ad-hoc guesses are often incorrect. This paper focuses on the problem of computing the range of possible discount factors for a rational agent given their policy. A naive solution to this problem can be quite expensive. A classic result by Smallwood shows that the interval [0, 1) of possible discount factor can be partitioned into finitely many sub-intervals, such that the optimal policy remains the same for each such sub-interval. Furthermore, optimal policies for neighboring sub-intervals diﬀer for a single state. We show how Smallwood’s result can be exploited to search for discount factor intervals for which a given policy is optimal by reducing it to polynomial root isolation. We extend the result to situations where the policy is suboptimal, but with a value function that is close to optimal. We develop numerical approaches to solve the discount factor elicitation problem and demonstrate the eﬀectiveness of our algorithms through some case studies.
more » « less
Full Text Available
Multi-Agent Reinforcement Learning for Alternating-Time Logic

Hahna, Ernst Moritz; Perez, Mateo; Schewe, Sven; Somenzi, Fabio; Trivedi, Ashutosh; Wojtczak, Dominik (October 2024, Open Access by IOS Press)
Endriss, Ulle; Melo, Francisco (Ed.)
Alternating-time temporal logic (ATL) extends branching time logic by enabling quantification over paths that result from the strategic choices made by multiple agents in various coalitions within the system. While classical temporal logics express properties of “closed” systems, ATL can express properties of “open” systems resulting from interactions among several agents. Reinforcement learning (RL) is a sampling-based approach to decision-making where learning agents, guided by a scalar reward function, discover optimal policies through repeated interactions with the environment. The challenge of translating high-level objectives into scalar rewards for RL has garnered increased interest, particularly following the success of model-free RL algorithms. This paper presents an approach for deploying model-free RL to verify multi-agent systems against ATL specifications. The key contribution of this paper is a verification procedure for model-free RL of quantitative and non-nested classic ATL properties, based on Q-learning, demonstrated on a natural subclass of non-nested ATL formulas.
more » « less
Full Text Available
Regular Reinforcement Learning

Dohmen, Taylor; Perez, Mateo; Somenzi, Fabio; Trivedi, Ashutosh (July 2024, Springer, Cham)
Gurfinkel, Arie; Ganesh, Vijay (Ed.)
In reinforcement learning, an agent incrementally refines a behavioral policy through a series of episodic interactions with its environment. This process can be characterized as explicit reinforcement learning, as it deals with explicit states and concrete transitions. Building upon the concept of symbolic model checking, we propose a symbolic variant of reinforcement learning, in which sets of states are represented through predicates and transitions are represented by predicate transformers. Drawing inspiration from regular model checking, we choose regular languages over the states as our predicates, and rational transductions as predicate transformations. We refer to this framework as regular reinforcement learning, and study its utility as a symbolic approach to reinforcement learning. Theoretically, we establish results around decidability, approximability, and eﬃcient learnability in the context of regular reinforcement learning. Towards practical applications, we develop a deep regular reinforcement learning algorithm, enabled by the use of graph neural networks. We showcase the applicability and eﬀectiveness of (deep) regular reinforcement learning through empirical evaluation on a diverse set of case studies.
more » « less
Full Text Available
Technical Challenges in Maintaining Tax Prep Software with Large Language Models

Gogani-Khiabani, Sina; Dewangan, Varsha; Olson, Nina; Trivedi, Ashutosh; Tizpaz-Niari, Saeid (August 2024, IRS-TPC)
McClelland, Robert; Johnson, Barry (Ed.)
As the US tax law evolves to adapt to ever-changing politico-economic realities, tax preparation software plays a significant role in helping taxpayers navigate these complexities. The dynamic nature of tax regulations poses a significant challenge to accurately and timely maintaining tax software artifacts. The state-of-the-art in maintaining tax prep software is time-consuming and error-prone as it involves manual code analysis combined with an expert interpretation of tax law amendments. We posit that the rigor and formality of tax amendment language, as expressed in IRS publications, makes it amenable to automatic translation to executable specifications (code). Our research efforts focus on identifying, understanding, and tackling technical challenges in leveraging Large Language Models (LLMs), such as ChatGPT and Llama, to faithfully extract code differentials from IRS publications and automatically integrate them with the prior version of the code to automate tax prep software maintenance.
more » « less
Full Text Available
Closure Certificates

https://doi.org/10.1145/3641513.3650120

Murali, Vishnu; Trivedi, Ashutosh; Zamani, Majid (May 2024, ACM)

Full Text Available
Predicting Fairness of ML Software Configurations

https://doi.org/10.1145/3663533.3664040

Robles_Herrera, Salvador; Monjezi, Verya; Kreinovich, Vladik; Trivedi, Ashutosh; Tizpaz-Niari, Saeid (July 2024, ACM)

Full Text Available
Neural Closure Certificates

https://doi.org/10.1609/aaai.v38i19.30141

Nadali, Alireza; Murali, Vishnu; Trivedi, Ashutosh; Zamani, Majid (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

Notions of transition invariants and closure certificates have seen recent use in the formal verification of controlled dynamical systems against \omega-regular properties.Unfortunately, existing approaches face limitations in two directions.First, they require a closed-form mathematical expression representing the model of the system.Such an expression may be difficult to find, too complex to be of any use, or unavailable due to security or privacy constraints.Second, finding such invariants typically rely on optimization techniques such as sum-of-squares (SOS) or satisfiability modulo theory (SMT) solvers.This restricts the classes of systems that need to be formally verified.To address these drawbacks, we introduce a notion of neural closure certificates.We present a data-driven algorithm that trains a neural network to represent a closure certificate.Our approach is formally correct under some mild assumptions, i.e., one is able to formally show that the unknown system satisfies the \omega-regular property of interest if a neural closure certificate can be computed.Finally, we demonstrate the efficacy of our approach with relevant case studies.
more » « less
Full Text Available
Transfer of Safety Controllers Through Learning Deep Inverse Dynamics Model

https://doi.org/10.1016/j.ifacol.2024.07.436

Nadali, Alireza; Trivedi, Ashutosh; Zamani, Majid (January 2024, IFAC-PapersOnLine)

Full Text Available

« Prev Next »

Search for: All records