 Award ID(s):
 2009022
 NSFPAR ID:
 10417924
 Editor(s):
 Bouajjani, A.; Holík, L.; Wu, Z.
 Date Published:
 Journal Name:
 International Symposium on Automated Technology for Verification and Analysis (ATVA 2022)
 Volume:
 13505
 Page Range / eLocation ID:
 303319
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

Bouajjani, A. ; Holík, L. ; Wu, Z. (Ed.)The expanding role of reinforcement learning (RL) in safetycritical system design has promoted omegaautomata as a way to express learning requirements—often nonMarkovian—with greater ease of expression and interpretation than scalar reward signals. When 𝜔automata were first proposed in modelfree RL, deterministic Rabin acceptance conditions were used in an attempt to provide a direct translation from omegaautomata to finite state “reward” machines defined over the same automaton structure (a memoryless reward translation). While these initial attempts to provide faithful, memoryless reward translations for Rabin acceptance conditions remained unsuccessful, translations were discovered for other acceptance conditions such as suitable, limitdeterministic Buechi acceptance or more generally, goodforMDP Buechi acceptance conditions. Yet, the question “whether a memoryless translation of Rabin conditions to scalar rewards exists” remained unresolved. This paper presents an impossibility result implying that any attempt to use Rabin automata directly (without extra memory) for modelfree RL is bound to fail. To establish this result, we show a link between a class of automata enabling memoryless reward translation to closure properties of its accepting and rejecting infinity sets, and to the insight that both the property and its complement need to allow for positional strategies for such an approach to work. We believe that such impossibility results will provide foundations for the application of RL to safetycritical systems.more » « less

Biere, Armin ; Parker, David (Ed.)We characterize the class of nondeterministic 𝜔automata that can be used for the analysis of finite Markov decision processes (MDPs). We call these automata ‘goodforMDPs’ (GFM). We show that GFM automata are closed under classic simulation as well as under more powerful simulation relations that leverage properties of optimal control strategies for MDPs. This closure enables us to exploit statespace reduction techniques, such as those based on direct and delayed simulation, that guarantee simulation equivalence. We demonstrate the promise of GFM automata by defining a new class of automata with favorable properties—they are Büchi automata with low branching degree obtained through a simple construction—and show that going beyond limitdeterministic automata may significantly benefit reinforcement learning.more » « less

The determinization of a nondeterministic Büchi automaton (NBA) is a fundamental construction of automata theory, with applications to probabilistic verification and reactive synthesis. The standard determinization constructions, such as the ones based on the SafraPiterman’s approach, work on the whole NBA. In this work we propose a divideandconquer determinization approach. To this end, we first classify the strongly connected components (SCCs) of the given NBA as inherently weak, deterministic accepting, and nondeterministic accepting. We then present how to determinize each type of SCC independently from the others; this results in an easier handling of the determinization algorithm that takes advantage of the structure of that SCC. Once all SCCs have been determinized, we show how to compose them so to obtain the final equivalent deterministic EmersonLei automaton, which can be converted into a deterministic Rabin automaton without blowup of states and transitions. We implement our algorithm in our tool COLA and empirically evaluate COLA with the stateoftheart tools Spot and Owl on a large set of benchmarks from the literature. The experimental results show that our prototype COLA outperforms Spot and Owl regarding the number of states and transitions.more » « less

Motivated by distributed data processing applications, we introduce a class of labeled directed acyclic graphs constructed using sequential and parallel composition operations, and study automata and logics over them. We show that deterministic and nondeterministic acceptors over such graphs have the same expressive power, which can be equivalently characterized by Monadic SecondOrder logic and the graded µcalculus. We establish closure under composition operations and decision procedures for membership, emptiness, and inclusion. A key feature of our graphs, called synchronized seriesparallel graphs (SSPG), is that parallel composition introduces a synchronization edge from the newly introduced source vertex to the sink. The transfer of information enabled by such edges is crucial to the determinization construction, which would not be possible for the traditional definition of seriesparallel graphs. SSPGs allow both ordered ranked parallelism and unordered unranked parallelism. The latter feature means that in the corresponding automata, the transition function needs to account for an arbitrary number of predecessors by counting each type of state only up to a specified constant, thus leading to a notion of counting complexity that is distinct from the classical notion of state complexity. The determinization construction translates a nondeterministic automaton with n states and k counting complexity to a deterministic automaton with 2 n 2 states and kn counting complexity, and both these bounds are shown to be tight. Furthermore, for nondeterministic automata a bound of 2 on counting complexity suffices without loss of expressiveness.more » « less

The problems of verification and realizability are two central themes in the analysis of reactive systems. When multiagent systems are considered, these problems have natural analogues of existence (nonemptiness) of purestrategy Nash equilibria and verification of purestrategy Nash equilibria. Recently, this body of work has begun to include finitehorizon temporal goals. With finitehorizon temporal goals, there is a natural hierarchy of goal representation, ranging from deterministic finite automata (DFA), to nondeterministic finite automata (NFA), and to alternating finite automata (AFA), with a worstcase exponential gap between each successive representation. Previous works showed that the realizability problem with DFA goals was PSPACEcomplete, while the realizability problem with temporal logic goals is in 2EXPTIME. In this work, we study both the realizability and the verification problems with respect to various goal representations. We first show that the realizability problem with NFA goals is EXPTIMEcomplete and with AFA goals is 2EXPTIMEcomplete, thus establishing strict complexity gaps between realizability with respect to DFA, NFA, and AFA goals. We then contrast these complexity gaps with the complexity of the verification problem, where we show that verification with respect to DFAs, NFA, and AFA goals is PSPACEcomplete.