Title: LTLf Synthesis under Partial Observability: From Theory to Practice
LTL synthesis is the problem of synthesizing a reactive system from a formal specification in Linear Temporal Logic. Extending the problem to allow for partial observability, where the system does not have direct access to all relevant information about the environment, generalizes it to a wider set of real-world applications, but the difficulty of implementing such an extension in practice has kept it in the realm of theory. Recently, it has been demonstrated that restricting LTL synthesis to systems with finite executions, by using LTL with finite-horizon semantics (LTLf), allows for significantly simpler implementations in practice. With the conceptual simplicity of LTLf, it becomes possible to explore extensions such as partial observability in practice for the first time. Previous work has analyzed the problem of LTLf synthesis under partial observability theoretically and suggested two possible algorithms, one with 3EXPTIME and another with 2EXPTIME complexity. In this work, we first prove a complexity lower bound conjectured in earlier work. Then, we complement the theoretical analysis by showing how the two algorithms can be integrated in practice into an established framework for LTLf synthesis. We furthermore identify a third, MSO-based, approach enabled by this framework. Our experimental evaluation reveals very different results from what the theory seems to suggest, with the 3EXPTIME algorithm often outperforming the 2EXPTIME approach. Furthermore, as long as it is able to overcome an initial memory bottleneck, the MSO-based approach often outperforms the others.
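The construction at the heart of handling partial observability in this setting is the belief-state (subset) construction over the DFA of the LTLf formula: the agent tracks the set of DFA states consistent with what it has observed so far. The snippet below is a minimal sketch of that standard construction, not the paper's implementation; the data structures and names are ours for illustration.

```python
# Hedged sketch: belief-state (subset) construction over a DFA whose
# inputs are split into an observable part and a hidden part.  The
# exponential blow-up of this construction is the source of the extra
# exponential in the 3EXPTIME approach.

def explore_beliefs(delta, q0, obs_alphabet, hidden_alphabet):
    """Build the belief-space transition relation of a DFA.

    delta: dict mapping (state, (obs, hidden)) -> state
    Returns a dict mapping (belief, obs) -> successor belief.
    """
    init = frozenset([q0])
    trans, frontier, seen = {}, [init], {init}
    while frontier:
        belief = frontier.pop()
        for obs in obs_alphabet:
            # Successors under every hidden valuation consistent with obs.
            succ = frozenset(delta[(q, (obs, h))]
                             for q in belief
                             for h in hidden_alphabet)
            trans[(belief, obs)] = succ
            if succ not in seen:
                seen.add(succ)
                frontier.append(succ)
    return trans

# Toy DFA over states 0..2: a step makes progress toward state 2 only
# when the observed bit matches the hidden bit.
delta = {(q, (o, h)): min(2, q + (1 if o == h else 0))
         for q in (0, 1, 2) for o in (0, 1) for h in (0, 1)}
print(explore_beliefs(delta, 0, obs_alphabet=(0, 1), hidden_alphabet=(0, 1)))
```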
Award ID(s): 1830549
NSF-PAR ID: 10190973
Journal Name: The Eleventh International Symposium on Games, Automata, Logics, and Formal Verification (GandALF)
Sponsoring Org: National Science Foundation
More Like this
  1. Many systems are naturally modeled as Markov Decision Processes (MDPs), combining probabilities and strategic actions. Given a model of a system as an MDP and some logical specification of system behavior, the goal of synthesis is to find a policy that maximizes the probability of achieving this behavior. A popular choice for defining behaviors is Linear Temporal Logic (LTL). Policy synthesis on MDPs for properties specified in LTL has been well studied. LTL, however, is defined over infinite traces, while many properties of interest are inherently finite. Linear Temporal Logic over finite traces (LTLf) has been used to express such properties, but no tools exist to solve policy synthesis for MDP behaviors given finite-trace properties. We present two algorithms for solving this synthesis problem: the first via reduction of LTLf to LTL and the second using native tools for LTLf. We compare the scalability of the two approaches and show that the native LTLf approach scales better than existing automaton-generation tools for LTL.
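To make the native approach concrete, the sketch below illustrates the standard product-MDP recipe such a pipeline rests on: translate the LTLf property into a DFA, take the product with the MDP, and maximize the probability of reaching an accepting DFA state. This is a hedged illustration of the general technique, not the paper's tool; the data structures, and the simplification of treating acceptance as absorbing, are our assumptions.

```python
# Hedged sketch: value iteration on the implicit product of an MDP and
# the DFA of an LTLf property, maximizing the probability of reaching
# an accepting DFA state (acceptance treated as absorbing).

def max_reach_probability(mdp, dfa, iters=1000):
    """mdp: dict state -> {action: [(prob, next_state, label)]}
    dfa: (delta, q0, accepting) with delta[(q, label)] -> q'."""
    delta, q0, accepting = dfa
    dfa_states = {q0} | set(delta.values())
    V = {(s, q): (1.0 if q in accepting else 0.0)
         for s in mdp for q in dfa_states}
    for _ in range(iters):
        for (s, q) in list(V):
            if q in accepting:
                continue  # already satisfied, value stays 1
            V[(s, q)] = max(
                sum(p * V[(t, delta[(q, lbl)])] for p, t, lbl in outcomes)
                for outcomes in mdp[s].values()
            )
    return V

# Toy example: one MDP state with a coin-flip action; the DFA accepts
# once it has seen the label "g".
mdp = {"s": {"flip": [(0.5, "s", "g"), (0.5, "s", "b")]}}
dfa = ({("q0", "g"): "qacc", ("q0", "b"): "q0",
        ("qacc", "g"): "qacc", ("qacc", "b"): "qacc"}, "q0", {"qacc"})
print(max_reach_probability(mdp, dfa)[("s", "q0")])  # -> approaches 1.0
```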
  2. Linear Temporal Logic (LTL) synthesis aims at automatically synthesizing a program that complies with desired properties expressed in LTL. Unfortunately, full LTL synthesis has proved too difficult computationally to perform in practice. There have been two success stories with LTL synthesis, both having to do with the form of the specification. The first is the GR(1) approach: use safety conditions to determine the possible transitions in a game between the environment and the agent, plus one powerful notion of fairness, Generalized Reactivity(1), or GR(1). The second, inspired by AI planning, focuses on finite-trace temporal synthesis, with LTLf (LTL on finite traces) as the specification language. In this paper we bring these two lines of work together. We first study the case in which we have an LTLf agent goal and a GR(1) assumption. We then add safety conditions for both the environment and the agent to the framework, obtaining a highly expressive yet still scalable form of LTL synthesis.
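For reference, the GR(1) fragment mentioned above has the following standard shape (textbook form, not quoted from this paper):

```latex
% GR(1) condition: recurring environment assumptions imply recurring
% agent guarantees (safety conditions constrain the moves of the game).
\[
  \Bigl(\bigwedge_{i=1}^{m} \Box\Diamond\,\alpha_i\Bigr)
  \;\rightarrow\;
  \Bigl(\bigwedge_{j=1}^{n} \Box\Diamond\,\beta_j\Bigr)
\]
% The paper then asks for an agent strategy that enforces an LTLf goal
% whenever the environment satisfies such an assumption.
```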
  3. We consider the problem of synthesizing good-enough (GE) strategies for linear temporal logic (LTL) over finite traces, or LTLf for short. The problem of synthesizing GE-strategies for an LTL formula φ over infinite traces reduces to the problem of synthesizing winning strategies for the formula (∃O.φ) ⇒ φ, where O is the set of propositions controlled by the system. We first prove that this reduction does not work for LTLf formulas. Then we show how to synthesize GE-strategies for LTLf formulas via GE-synthesis of LTL formulas. Unfortunately, this requires constructing deterministic parity automata on infinite words, which is computationally expensive. We then show how to synthesize GE-strategies for LTLf formulas by a reduction to solving games played on deterministic Büchi automata, based on an easier construction of deterministic automata on finite words. We show empirically that our specialized synthesis algorithm for GE-strategies outperforms the algorithms going through GE-synthesis of LTL formulas by orders of magnitude.
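As background, the good-enough requirement has the following standard formalization (our hedged paraphrase of the usual definition, with I the environment's propositions and O the system's):

```latex
% A strategy sigma is good-enough for phi iff it satisfies phi on every
% input that *some* output could satisfy:
\[
  \forall w_I \in (2^I)^\omega :\;
  \bigl(\exists w_O \in (2^O)^\omega .\; w_I \otimes w_O \models \varphi\bigr)
  \;\Rightarrow\;
  w_I \otimes \sigma(w_I) \models \varphi
\]
% Over infinite traces this is exactly what winning for the formula
% (EXISTS O. phi) => phi captures; the abstract's first result is that
% the analogous reduction fails over finite traces.
```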
  4. We present a data-driven framework for strategy synthesis for partially-known switched stochastic systems. The properties of the system are specified using linear temporal logic (LTL) over finite traces (LTLf), which is as expressive as LTL and enables interpretations over finite behaviors. The framework first learns the unknown dynamics via Gaussian process regression. It then builds a formal abstraction of the switched system as an uncertain Markov model, namely an Interval Markov Decision Process (IMDP), accounting for both the stochastic behavior of the system and the uncertainty in the learning step. Next, we synthesize a strategy on the resulting IMDP that maximizes the satisfaction probability of the LTLf specification and is robust against all the uncertainties in the abstraction. This strategy is then refined into a switching strategy for the original stochastic system. We show that this strategy is near-optimal and provide a bound on its distance (error) from the optimal strategy. We experimentally validate our framework on various case studies, including both linear and non-linear switched stochastic systems.
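The robust-synthesis step on the IMDP amounts to value iteration against an adversary that picks, within the given transition-probability intervals, the distribution minimizing the expected value. The sketch below shows the standard greedy computation of that inner minimum; it illustrates the general technique, not the paper's implementation.

```python
# Hedged sketch: the inner minimization of robust (pessimistic) value
# iteration on an Interval MDP.  Every successor gets its lower bound,
# then the remaining probability mass is pushed onto the lowest-valued
# successors first.

def worst_case_expectation(intervals, values):
    """intervals: {state: (lo, hi)} with sum(lo) <= 1 <= sum(hi).
    Returns the minimum of sum_s p(s) * values[s] over feasible p."""
    p = {s: lo for s, (lo, hi) in intervals.items()}
    remaining = 1.0 - sum(p.values())
    for s in sorted(intervals, key=lambda s: values[s]):
        lo, hi = intervals[s]
        bump = min(hi - lo, remaining)
        p[s] += bump
        remaining -= bump
    return sum(p[s] * values[s] for s in intervals)

# Toy check: the adversary shifts mass toward the low-valued successor.
print(worst_case_expectation({"a": (0.2, 0.8), "b": (0.2, 0.8)},
                             {"a": 0.0, "b": 1.0}))  # -> 0.2
```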
  5. Centralized Training for Decentralized Execution, where agents are trained offline in a centralized fashion and execute online in a decentralized manner, has become a popular approach in Multi-Agent Reinforcement Learning (MARL). In particular, it has become popular to develop actor-critic methods that train decentralized actors with a centralized critic, where the centralized critic is allowed access to global information about the entire system, including the true system state. Such centralized critics are possible given offline information and are not used for online execution. While these methods perform well in a number of domains and have become a de facto standard in MARL, using a centralized critic in this context has yet to be sufficiently analyzed theoretically or empirically. In this paper, we therefore formally analyze centralized and decentralized critic approaches, and analyze the effect of using state-based critics in partially observable environments. We derive theoretical results contrary to the common intuition: critic centralization is not strictly beneficial, and using state values can be harmful. We further prove that, in particular, state-based critics can introduce unexpected bias and variance compared to history-based critics. Finally, we demonstrate how the theory applies in practice by comparing different forms of critics on a wide range of common multi-agent benchmarks. The experiments show practical issues such as the difficulty of representation learning with partial observability, which highlights why the theoretical problems are often overlooked in the literature.
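A tiny numerical illustration (ours, not from the paper) of where the extra variance can come from: under partial observability one history can be consistent with several hidden states, so a state-based critic target fluctuates with the hidden state while a history-based target is stable.

```python
# Hedged toy example: a history-based critic targets V(h) = E[V(s) | h],
# whereas a state-based target V(s) varies with the hidden state even
# though the agents' history h is identical in both cases.

V_state = {"s1": 1.0, "s2": -1.0}   # true state values (toy numbers)
belief = {"s1": 0.5, "s2": 0.5}     # P(s | h) for some fixed history h

V_history = sum(p * V_state[s] for s, p in belief.items())
print("history-based target:", V_history)               # 0.0, stable
print("state-based targets :", list(V_state.values()))  # +1 / -1, noisy
```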