- Award ID(s): 1757207
- NSF-PAR ID: 10465102
- Date Published:
- Journal Name: Lecture Notes in Computer Science
- ISSN: 0302-9743
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
In this paper we develop a state transition function for partially observable multi-agent epistemic domains and implement it using Answer Set Programming (ASP). The transition function computes the next state upon the occurrence of a single action, so it can be used as a module in epistemic planners. Our transition function incorporates ontic, sensing, and announcement actions and allows for arbitrarily nested belief formulae and general common knowledge. A novel feature of our model is that upon an action occurrence, an observing agent corrects its (possibly wrong) initial beliefs about the action's precondition and its own observability. Through examples, we show that this step is necessary for robust state transition. We establish properties of our state transition function regarding its soundness in updating agents' beliefs consistently with their observability.
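To make the update step concrete, here is a minimal, hedged Python sketch of a single-action sensing update over a pointed-Kripke-style state. This is not the paper's ASP encoding; the `World` and `State` classes and the `sense` function are illustrative assumptions, showing only how observing agents prune accessibility links into worlds that disagree with the actual world on the sensed fluent, while oblivious agents keep their relations.

```python
# Illustrative sketch only; not the paper's ASP implementation.
from dataclasses import dataclass

@dataclass(frozen=True)
class World:
    facts: frozenset  # atomic fluents true in this world

@dataclass
class State:
    worlds: set   # all possible worlds
    real: World   # the designated (actual) world
    access: dict  # agent -> set of (World, World) accessibility pairs

def sense(state, fluent, observers):
    """Sensing `fluent`: observers drop accessibility links into worlds
    that disagree with the real world on `fluent`; others are unchanged."""
    truth = fluent in state.real.facts
    agreeing = {w for w in state.worlds if (fluent in w.facts) == truth}
    new_access = {
        agent: ({(u, v) for (u, v) in rel if v in agreeing}
                if agent in observers else set(rel))
        for agent, rel in state.access.items()
    }
    return State(worlds=set(state.worlds), real=state.real, access=new_access)

# Usage: after sensing "heads", agent A only considers "heads" worlds.
w1, w2 = World(frozenset({"heads"})), World(frozenset())
s0 = State({w1, w2}, w1, {"A": {(w1, w1), (w1, w2), (w2, w1), (w2, w2)}})
s1 = sense(s0, "heads", observers={"A"})
```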
-
Do agents know each other's strategies? In multi-process software construction, each process has access to the processes already constructed; but in typical human-robot interactions, a human may not announce its strategy to the robot (indeed, the human may not even know their own strategy). This question has often been overlooked when modeling and reasoning about multi-agent systems. In this work, we study how it impacts strategic reasoning. To do so, we consider Strategy Logic (SL), a well-established and highly expressive logic for strategic reasoning. Its usual semantics, which we call "white-box semantics", models systems in which agents "broadcast" their strategies. By adding imperfect information to the evaluation games for the usual semantics, we obtain a new semantics, called "black-box semantics", in which agents keep their strategies private. We consider the model-checking problem and show that the black-box semantics has much lower complexity than the white-box semantics for an important fragment of Strategy Logic.
-
In multi-agent domains (MADs), an agent's action may not only change the world and the agent's knowledge and beliefs about the world, but also other agents' knowledge and beliefs about the world and their knowledge and beliefs about other agents' knowledge and beliefs about the world. The goals of an agent in a multi-agent world may involve manipulating the knowledge and beliefs of other agents, and again, not just their knowledge/beliefs about the world, but also their knowledge about other agents' knowledge about the world. Our goal is to present an action language (mA+) that has the necessary features to address the above aspects in representing and reasoning about actions and change (RAC) in MADs. mA+ allows the representation of, and reasoning about, the different types of actions that an agent can perform in a domain where many other agents might be present -- such as world-altering actions, sensing actions, and announcement/communication actions. It also allows the specification of agents' dynamic awareness of action occurrences, which has implications for what agents know about the world and about other agents' knowledge of the world. mA+ considers three types of awareness: full awareness, partial awareness, and complete oblivion of an action occurrence and its effects. This keeps the language simple, yet powerful enough to address a large variety of knowledge-manipulation scenarios in MADs. The semantics of mA+ relies on the notion of a state, described by a pointed Kripke model, which encodes the agents' knowledge and the real state of the world; the semantics is defined by a transition function that maps pairs of actions and states into sets of states. We illustrate properties of the action theories, including properties that guarantee finiteness of the set of initial states and their practical implementability. Finally, we relate mA+ to other formalisms that contribute to RAC in MADs.
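As a rough illustration of the three awareness levels just described, the Python sketch below (all names assumed, not mA+'s formal definitions) classifies agents by how they observe an action occurrence; an mA+-style transition function would then update each agent's accessibility relation according to this classification.

```python
# Hedged sketch of the three awareness levels; names are illustrative.
from enum import Enum

class Awareness(Enum):
    FULL = "full"            # observes the action and learns its effects
    PARTIAL = "partial"      # observes that the action occurred, not its outcome
    OBLIVIOUS = "oblivious"  # unaware; beliefs remain as before the action

def classify(agent, action, observability):
    """observability: dict mapping (agent, action) -> Awareness;
    unmentioned agents default to complete oblivion."""
    return observability.get((agent, action), Awareness.OBLIVIOUS)

# Usage: B partially observes "open_box"; C is oblivious by default.
spec = {("A", "open_box"): Awareness.FULL,
        ("B", "open_box"): Awareness.PARTIAL}
assert classify("C", "open_box", spec) is Awareness.OBLIVIOUS
```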
-
Experimental data are often costly to obtain, which makes it difficult to calibrate complex models. For many models, an experimental design that produces the best calibration given a limited experimental budget is not obvious. This paper introduces a deep reinforcement learning (RL) algorithm for design of experiments that maximizes the information gain, measured by the Kullback–Leibler divergence obtained via the Kalman filter (KF). This combination enables experimental design for rapid online experiments where manual trial and error is not feasible in the high-dimensional parametric design space. We formulate possible configurations of experiments as a decision tree and a Markov decision process, where a finite choice of actions is available at each incremental step. Once an action is taken, a variety of measurements are used to update the state of the experiment. This new data leads to a Bayesian update of the parameters by the KF, which is used to enhance the state representation. In contrast to the Nash–Sutcliffe efficiency index, which requires additional sampling to test hypotheses for forward predictions, the KF can lower the cost of experiments by directly estimating the value of new data acquired through additional actions. In this work, our applications focus on mechanical testing of materials. Numerical experiments with complex, history-dependent models are used to verify the implementation and benchmark the performance of the RL-designed experiments.
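The reward computation this abstract describes can be sketched concretely under an assumed linear-Gaussian setting: one KF measurement update followed by the KL divergence between the parameter posterior and prior. The function names and the linear model y = Hθ + noise are illustrative assumptions, not the paper's implementation.

```python
# Sketch under an assumed linear-Gaussian model; not the paper's code.
import numpy as np

def kalman_update(mu, P, H, R, y):
    """One measurement update for y = H @ theta + noise with covariance R."""
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    mu_post = mu + K @ (y - H @ mu)
    P_post = (np.eye(len(mu)) - K @ H) @ P
    return mu_post, P_post

def kl_gaussian(mu0, P0, mu1, P1):
    """KL( N(mu1, P1) || N(mu0, P0) ): information gained about parameters."""
    d = len(mu0)
    P0_inv = np.linalg.inv(P0)
    diff = mu0 - mu1
    return 0.5 * (np.trace(P0_inv @ P1) + diff @ P0_inv @ diff - d
                  + np.log(np.linalg.det(P0) / np.linalg.det(P1)))

# Reward for the RL designer: info gain from the measurement an action yields.
mu0, P0 = np.zeros(2), np.eye(2)
H, R = np.array([[1.0, 0.5]]), np.array([[0.1]])
mu1, P1 = kalman_update(mu0, P0, H, R, y=np.array([0.3]))
reward = kl_gaussian(mu0, P0, mu1, P1)
```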
-
This paper addresses incomplete-information dynamic games, where the reward parameters of agents are private. Previous studies have shown that online belief update is necessary for deriving equilibrium policies of such games, especially for high-risk games such as vehicle interactions. However, updating beliefs in real time is computationally expensive, as it requires continuous computation of Nash equilibria of the sub-games starting from the current states. In this paper, we treat the triggering of a belief update as a policy defined on the agents' physical and belief states, and propose learning this policy through reinforcement learning (RL). Using a two-vehicle uncontrolled-intersection case, we show that intermittent belief update via RL is sufficient for safe interactions, reducing the computational cost of updates by 59% when agents have full observations of physical states. Simulation results also show that the belief update frequency increases as the noise in measurements of the vehicle positions becomes more significant.
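A minimal sketch of the idea, with all names assumed: the learned trigger is a binary policy over the joint physical/belief state, and the expensive equilibrium-based belief update runs only when the trigger fires.

```python
# Hedged sketch; `trigger_policy` and `update_belief` stand in for the
# learned RL policy and the costly sub-game equilibrium computation.
def interaction_step(physical_state, belief, trigger_policy, update_belief):
    if trigger_policy(physical_state, belief):          # learned binary action
        belief = update_belief(physical_state, belief)  # expensive Nash solve
    return belief

# Usage with a stand-in threshold trigger: update only when belief
# uncertainty (here a plain float) exceeds a bound.
belief = 0.8
trigger = lambda s, b: b > 0.5
update = lambda s, b: b * 0.1   # placeholder for the equilibrium update
belief = interaction_step(physical_state=None, belief=belief,
                          trigger_policy=trigger, update_belief=update)
```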