Autonomous agents in a multi-agent system work with each other to achieve their goals. However, in a partially observable world, current multi-agent systems are often less effective at achieving their goals. This limitation is due to the agents' lack of reasoning about other agents and their mental states, and to their inability to share required knowledge with other agents. This paper addresses these limitations by presenting a general approach for autonomous agents to work together in a multi-agent system. In this approach, an agent applies two main concepts: goal reasoning, to determine which goals to pursue and share, and theory of mind, to select the agent(s) with which to share goals and knowledge. We evaluate the performance of our multi-agent system in a Marine Life Survey Domain and compare it to another multi-agent system that randomly selects the agent(s) to which it delegates its goals.
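For illustration, a minimal sketch of the agent-selection step is below, assuming a simple belief model over teammates; the names (`Agent`, `BeliefModel`, `select_delegate`) and the selection rule are hypothetical and are not taken from the paper.

```python
# Illustrative only: pick a delegate believed capable of a goal and not already pursuing it.
from dataclasses import dataclass, field

@dataclass
class BeliefModel:
    """One agent's beliefs about another agent's mental state."""
    known_goals: set = field(default_factory=set)    # goals we believe it already holds
    capabilities: set = field(default_factory=set)   # actions we believe it can perform

@dataclass
class Agent:
    name: str
    beliefs: dict = field(default_factory=dict)      # teammate name -> BeliefModel

    def select_delegate(self, goal, required_capability):
        """Theory-of-mind-style selection: prefer the least-loaded capable teammate."""
        candidates = [
            other for other, model in self.beliefs.items()
            if required_capability in model.capabilities and goal not in model.known_goals
        ]
        return min(candidates, key=lambda o: len(self.beliefs[o].known_goals), default=None)

# Usage: delegate a survey goal to the least-loaded teammate believed able to survey.
agent = Agent("uv-1", beliefs={
    "uv-2": BeliefModel(known_goals={"survey(region-3)"}, capabilities={"survey"}),
    "uv-3": BeliefModel(capabilities={"survey", "refuel"}),
})
print(agent.select_delegate("survey(region-7)", "survey"))   # -> uv-3
```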
Case-based explanations and goal specific resource estimations
Autonomous agents often have sufficient resources to achieve the goals that are provided to them. However, in dynamic worlds where unexpected problems are bound to occur, an agent may formulate new goals with further resource requirements. Thus, agents should be smart enough to manage their goals and the limited resources they possess in an effective and flexible manner. We present an approach to the selection and monitoring of goals using resource estimation and goal priorities. To evaluate our approach, we designed an experiment on top of our previous work in a complex mine-clearance domain. The agent in this domain formulates its own goals by retrieving a case to explain uncovered discrepancies and generating goals from the explanation. Finally, we compare the performance of our approach to two alternatives.
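For illustration, a minimal sketch of priority- and resource-aware goal selection is below; the `Goal` fields and the greedy rule are assumptions for this example, and the paper's case-based cost estimation is abstracted into a single `estimated_cost` value.

```python
# Illustrative only: commit to the highest-priority goals that fit the resource budget.
from dataclasses import dataclass

@dataclass
class Goal:
    name: str
    priority: float        # higher means more important
    estimated_cost: float  # estimated resource usage (e.g., fuel units)

def select_goals(pending, available_resources):
    """Greedy selection by priority, skipping goals whose estimated cost does not fit."""
    committed, remaining = [], available_resources
    for goal in sorted(pending, key=lambda g: g.priority, reverse=True):
        if goal.estimated_cost <= remaining:
            committed.append(goal)
            remaining -= goal.estimated_cost
    return committed, remaining

pending = [Goal("clear-mine-A", 0.9, 40.0), Goal("survey-B", 0.8, 70.0), Goal("recharge", 0.6, 15.0)]
selected, left = select_goals(pending, available_resources=100.0)
print([g.name for g in selected], left)   # -> ['clear-mine-A', 'recharge'] 45.0
```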
- Award ID(s): 1849131
- PAR ID: 10352627
- Editor(s): Barták, Roman; Bell, Eric
- Date Published:
- Journal Name: Proceedings of the 33rd International Conference of the Florida Artificial Intelligence Research Society
- Page Range / eLocation ID: 407-412
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Congested traffic wastes billions of liters of fuel and is a significant contributor to greenhouse gas (GHG) emissions. Although convenient, ride-sharing services such as Uber and Lyft are becoming a significant contributor to these emissions, not only because of added traffic but also because drivers spend time on the road while waiting for passengers. To help improve the impact of ride sharing, we propose an algorithm to optimize the efficiency of drivers searching for customers. In our model, the main goal is to direct drivers represented as idle agents, i.e., not currently assigned a customer or resource, to locations where we predict new resources to appear. Our approach uses non-negative matrix factorization (NMF) to model and predict the spatio-temporal distributions of resources. To choose destinations for idle agents, we employ a greedy heuristic that strikes a balance between distance greed, i.e., avoiding long trips without resources, and resource greed, i.e., moving to a location where resources are expected to appear according to the NMF model. To ensure that agents neither oversupply areas where resources are predicted nor undersupply other areas, we randomize the destinations of agents using the predicted resource distribution within the local neighborhood of an agent. Our experimental evaluation shows that our approach reduces the search time of agents and the wait time of resources using real-world data from Manhattan, New York, USA. (A rough code sketch of the prediction-and-scoring idea appears after this list.)
-
In multi-agent systems, limited resources must be shared by individuals during missions to maximize the group utility of the system in the field. In this paper, we present a generalized adaptive self-organization process for multi-agent systems featuring fast and efficient distribution of a consumable and refillable on-board resource throughout the group. An adaptive inter-agent spacing (AIS) controller based on individual resource levels is proposed that spaces out high-resource-bearing agents throughout the group, including the group boundary extrema, and allows low-resource-bearing agents to adaptively occupy the in-between spaces, receiving resource from the high-resource-bearing agents without overcrowding. Experimental results for cases with and without the proposed AIS controller validate faster convergence of individual resource levels to the group mean resource level when the proposed AIS controller is used. The generalized approach of the self-organizing process allows flexibility in adapting the proposed AIS controller for various multi-agent applications. (A one-dimensional spacing sketch appears after this list.)
-
We study settings where a set of identical, reusable resources must be allocated in an online fashion to arriving agents. Each arriving agent is patient and willing to wait for some period of time to be matched. When matched, each agent occupies a resource for a certain amount of time and then releases it, gaining some utility from having done so. The goal of the system designer is to maximize overall utility given some prior knowledge of the distribution of arriving agents. We are particularly interested in settings where demand for the resources far outstrips supply, as is typical in the provision of social services, for example, homelessness resources. We formulate this problem as online bipartite matching with reusable resources and patient agents. We develop new, efficient nonmyopic algorithms for this class of problems and compare their performance with that of greedy algorithms in a variety of simulated settings, as well as in a setting calibrated to real-world data on household demand for homelessness services. We find substantial overall welfare benefits to using our nonmyopic algorithms, particularly in more extreme settings: those where agents are unwilling or unable to wait for resources, and where the ratio of resource demand to supply is particularly high. (A toy greedy baseline is sketched after this list.)
-
Mobile wireless networks present several challenges for any learning system, due to uncertain and variable device movement, a decentralized network architecture, and constraints on network resources. In this work, we use deep reinforcement learning (DRL) to learn a scalable and generalizable forwarding strategy for such networks. We make the following contributions: i) we use hierarchical RL to design DRL packet agents rather than device agents, to capture the packet-forwarding decisions that are made over time and improve training efficiency; ii) we use relational features to ensure generalizability of the learned forwarding strategy to a wide range of network dynamics and to enable offline training; and iii) we incorporate both forwarding goals and network resource considerations into packet decision-making by designing a weighted DRL reward function. Our results show that our DRL agent often achieves a delay per packet delivered similar to that of the optimal forwarding strategy and outperforms all other strategies, including state-of-the-art ones, even on scenarios on which the DRL agent was not trained. (A sketch of the relational features and weighted reward appears after this list.)
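Related to the ride-sharing item above: a rough sketch of the predict-then-score idea, assuming a gridded city and scikit-learn's NMF as one possible factorization backend. The grid layout, weights, and scoring rule are illustrative assumptions, not the paper's exact heuristic.

```python
# Illustrative only: predicted demand via NMF, then a destination score that
# trades off resource greed against distance greed. All parameters are assumed.
import numpy as np
from sklearn.decomposition import NMF  # assumed available; any NMF implementation works

# Historical resource counts: rows = grid cells, columns = time slots (synthetic here).
history = np.random.default_rng(0).poisson(3.0, size=(25, 48)).astype(float)

model = NMF(n_components=4, init="random", random_state=0, max_iter=500)
W = model.fit_transform(history)         # per-cell factors
H = model.components_                    # per-time-slot factors
predicted_next = W @ H[:, -1]            # predicted resources per cell for the next slot

def score_cells(agent_cell, grid_side=5, distance_weight=0.5):
    """Balance expected pickups (resource greed) against travel cost (distance greed)."""
    ax, ay = divmod(agent_cell, grid_side)
    scores = []
    for cell, expected in enumerate(predicted_next):
        cx, cy = divmod(cell, grid_side)
        travel = abs(ax - cx) + abs(ay - cy)   # Manhattan distance on the grid
        scores.append(expected - distance_weight * travel)
    return np.array(scores)

# Randomize the destination in proportion to the (clipped) scores so that
# many idle agents do not all pile onto the single best-scoring cell.
scores = np.clip(score_cells(agent_cell=12), 0.0, None)
probs = scores / scores.sum() if scores.sum() > 0 else np.full(len(scores), 1 / len(scores))
destination = np.random.default_rng(1).choice(len(probs), p=probs)
print("idle agent heads to cell", destination)
```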
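Related to the adaptive inter-agent spacing item above: a one-dimensional sketch in which the desired gap next to an agent grows with that agent's resource level, so high-resource agents spread out and low-resource agents settle into the gaps. The gains and the update rule are assumptions for illustration, not the paper's AIS control law.

```python
# Illustrative only: a 1-D resource-dependent spacing update with assumed gains.
import numpy as np

def ais_step(positions, resources, base_spacing=2.0, gain=0.1):
    """Nudge each adjacent pair toward a desired gap that scales with resource level."""
    order = np.argsort(positions)
    updated = positions.copy()
    for i, j in zip(order[:-1], order[1:]):            # neighbors in left-to-right order
        desired = base_spacing * (0.5 + resources[j])  # richer right-neighbor keeps more room
        error = (positions[j] - positions[i]) - desired
        updated[i] += gain * error / 2                 # gap too wide -> move together
        updated[j] -= gain * error / 2                 # gap too narrow -> move apart
    return updated

positions = np.array([0.0, 1.0, 2.5, 4.0])
resources = np.array([0.9, 0.2, 0.8, 0.1])   # normalized on-board resource levels
for _ in range(200):
    positions = ais_step(positions, resources)
print(np.round(np.diff(np.sort(positions)), 2))   # gaps adapt to neighboring resource levels
```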
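Related to the reusable-resource matching item above: a toy greedy baseline of the kind the nonmyopic algorithms are compared against, with an assumed arrival model (fields and values are made up for the example); the paper's nonmyopic algorithms themselves are not reproduced here.

```python
# Illustrative only: greedy allocation of reusable resources to patient agents.
import heapq
from dataclasses import dataclass

@dataclass
class Arrival:
    time: float      # when the agent arrives
    patience: float  # how long it will wait to be matched
    duration: float  # how long it occupies a resource once matched
    utility: float   # utility gained if served

def greedy_allocate(arrivals, num_resources):
    """Serve each arrival immediately if some resource frees up within its patience."""
    free_at = [0.0] * num_resources            # earliest time each resource becomes free
    heapq.heapify(free_at)
    total_utility = 0.0
    for a in sorted(arrivals, key=lambda x: x.time):
        start = max(a.time, free_at[0])
        if start <= a.time + a.patience:       # a resource frees up before the agent gives up
            heapq.heapreplace(free_at, start + a.duration)
            total_utility += a.utility
    return total_utility

arrivals = [Arrival(0.0, 2.0, 5.0, 1.0), Arrival(1.0, 1.0, 3.0, 2.0), Arrival(2.0, 4.0, 2.0, 1.5)]
print(greedy_allocate(arrivals, num_resources=1))   # greedy serves the first and third arrivals
```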
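Related to the DRL forwarding item above: a small sketch of relational features (expressed relative to the current packet holder) and a weighted reward that mixes the delivery goal with resource cost. The feature set, weights, and dictionary fields are hypothetical, not the paper's implementation.

```python
# Illustrative only: relational features and a weighted forwarding reward with assumed weights.
def relational_features(packet_ttl, holder, neighbor, destination_id, queue_len, max_queue=10):
    """Describe a candidate next hop relative to the current holder, so the
    same policy can be reused across topologies and network sizes."""
    progress = holder["dist_to_dest"] - neighbor["dist_to_dest"]  # hops gained toward the destination
    return [
        progress,
        packet_ttl / 64.0,                       # normalized remaining lifetime
        queue_len / max_queue,                   # congestion at the candidate next hop
        float(neighbor["id"] == destination_id), # is the candidate the destination itself?
    ]

def forwarding_reward(delivered, hops_gained, energy_cost,
                      w_goal=1.0, w_progress=0.3, w_resource=0.1):
    """Blend the delivery goal with progress and network-resource considerations."""
    return w_goal * float(delivered) + w_progress * hops_gained - w_resource * energy_cost

holder = {"id": 3, "dist_to_dest": 4}
neighbor = {"id": 7, "dist_to_dest": 3}
print(relational_features(packet_ttl=32, holder=holder, neighbor=neighbor,
                          destination_id=7, queue_len=2))
print(forwarding_reward(delivered=False, hops_gained=1, energy_cost=0.2))
```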