We develop a general reinforcement learning framework for mean field control (MFC) problems. Such problems arise, for instance, as the limit of collaborative multi-agent control problems when the number of agents is very large. The asymptotic problem can be phrased as the optimal control of non-linear dynamics. This can also be viewed as a Markov decision process (MDP), but the key difference from the usual RL setup is that the dynamics and the reward now depend on the state's probability distribution itself. Alternatively, it can be recast as an MDP on the Wasserstein space of measures. In this work, we introduce generic model-free algorithms based on the state-action value function at the mean field level, and we prove convergence for a prototypical Q-learning method. We then implement an actor-critic method and report numerical results on two archetypal problems: a finite-space model motivated by a cyber security application and a continuous-space model motivated by an application to swarm motion.
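To illustrate the idea of treating the state distribution itself as the MDP state, the following is a minimal sketch of tabular Q-learning on a discretized two-state population. It is not the method of the paper: the dynamics, reward, grid size, action set, and all names (`step`, `to_index`, `mean_field_q_learning`, `DEFENSE_COST`) are hypothetical choices made only for illustration.

```python
import numpy as np

# All constants below are illustrative assumptions, not values from the paper.
N_GRID = 21          # grid points for mu = fraction of agents in state 1
ACTIONS = (0, 1)     # hypothetical population-level actions: 0 = idle, 1 = defend
GAMMA = 0.9          # discount factor
DEFENSE_COST = 0.3   # per-step cost of choosing action 1


def step(mu, action):
    """One step of an assumed mean field dynamics; returns (next_mu, reward)."""
    # The infection rate depends on mu itself: this is the mean field interaction.
    infect = 0.8 * mu if action == 0 else 0.2 * mu
    recover = 0.3
    next_mu = mu + (1.0 - mu) * infect - mu * recover
    reward = -mu - DEFENSE_COST * action
    return float(np.clip(next_mu, 0.0, 1.0)), reward


def to_index(mu):
    """Project a distribution (here a single number in [0, 1]) onto the grid."""
    return int(round(mu * (N_GRID - 1)))


def mean_field_q_learning(episodes=300, horizon=30, alpha=0.1, eps=0.2, seed=0):
    """Tabular Q-learning where the 'state' is the discretized distribution."""
    rng = np.random.default_rng(seed)
    q = np.zeros((N_GRID, len(ACTIONS)))
    for _ in range(episodes):
        mu = rng.uniform()
        for _ in range(horizon):
            i = to_index(mu)
            if rng.uniform() < eps:
                a = int(rng.integers(len(ACTIONS)))   # explore
            else:
                a = int(np.argmax(q[i]))              # exploit
            next_mu, r = step(mu, a)
            target = r + GAMMA * q[to_index(next_mu)].max()
            q[i, a] += alpha * (target - q[i, a])
            mu = next_mu
    return q


q_table = mean_field_q_learning()
```

The point of the sketch is only that the Q-table is indexed by a (discretized) distribution rather than by an individual agent's state; a finer grid or a function approximator would replace the table in any realistic setting.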
Students Do Not Always Mean What We Think They Mean: A Questioning Strategy to Elicit the Reasoning Behind Unexpected Causal Patterns in Student System Models
- PAR ID: 10445498
- Date Published:
- Journal Name: International Journal of Science and Mathematics Education
- Volume: 21
- Issue: 5
- ISSN: 1571-0068
- Page Range / eLocation ID: 1591 to 1614
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Mean curvature flow is the negative gradient flow of volume, so any hypersurface flows through hypersurfaces in the direction of steepest descent for volume and eventually becomes extinct in finite time. Before it becomes extinct, topological changes can occur as it goes through singularities. If the hypersurface is in general or generic position, then we explain what singularities can occur under the flow, what the flow looks like near these singularities, and what this implies for the structure of the singular set. At the end, we will briefly discuss how one may be able to use the flow in low-dimensional topology.
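The gradient-flow statement can be written out as follows. This is the standard formulation for a closed hypersurface, given here as a sketch; the symbols ($x$ for the position vector, $\mathbf{n}$ for the unit normal, $H$ for the mean curvature, $M_t$ for the evolving hypersurface) and the sign convention are assumptions, since the abstract fixes no notation.

```latex
% Mean curvature flow of closed hypersurfaces M_t in R^{n+1}
% (sign convention assumed: H = div_{M_t}(n))
\[
  \partial_t x = -H\,\mathbf{n},
  \qquad
  \frac{d}{dt}\,\operatorname{Vol}(M_t) = -\int_{M_t} H^2 \, d\mu_t \;\le\; 0 .
\]
```

The second identity is the first variation of volume evaluated along the flow; it shows that volume is non-increasing and strictly decreasing unless $H \equiv 0$.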

