<?xml version="1.0" encoding="UTF-8"?><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcq="http://purl.org/dc/terms/"><records count="1" morepages="false" start="1" end="1"><record rownumber="1"><dc:product_type>Conference Paper</dc:product_type><dc:title>Multi-Agent Reinforcement Learning in Stochastic Networked Systems</dc:title><dc:creator>Yiheng Lin, Guannan Qu</dc:creator><dc:corporate_author/><dc:editor/><dc:description>We study multi-agent reinforcement learning (MARL) in a stochastic network of agents. The objective is to find localized policies that maximize the (discounted) global reward. In general, scalability is a challenge in this setting because the size of the global state/action space can be exponential in the number of agents. Scalable algorithms are only known in cases where dependencies are static, fixed and local, e.g., between neighbors in a fixed, time-invariant underlying graph. In this work, we propose a Scalable Actor Critic framework that applies in settings where the dependencies can be non-local and stochastic, and provide a finite-time error bound that shows how the convergence rate depends on the speed of information spread in the network. Additionally, as a byproduct of our analysis, we obtain novel finite-time convergence results for a general stochastic approximation scheme and for temporal difference learning with state aggregation, which apply beyond the setting of MARL in networked systems.</dc:description><dc:publisher/><dc:date>2021-01-01</dc:date><dc:nsf_par_id>10324693</dc:nsf_par_id><dc:journal_name>Advances in Neural Information Processing Systems 34 (NeurIPS 2021)</dc:journal_name><dc:journal_volume/><dc:journal_issue/><dc:page_range_or_elocation/><dc:issn/><dc:isbn/><dc:doi>https://doi.org/</dc:doi><dcq:identifierAwardId>2105648</dcq:identifierAwardId><dc:subject/><dc:version_number/><dc:location/><dc:rights/><dc:institution/><dc:sponsoring_org>National Science Foundation</dc:sponsoring_org></record></records></rdf:RDF>