
Title: A machine learning framework for solving high-dimensional mean field game and mean field control problems

Mean field games (MFG) and mean field control (MFC) are critical classes of multiagent models for the efficient analysis of massive populations of interacting agents. Their areas of application span topics in economics, finance, game theory, industrial engineering, crowd motion, and more. In this paper, we provide a flexible machine learning framework for the numerical solution of potential MFG and MFC models. State-of-the-art numerical methods for solving such problems utilize spatial discretization that leads to a curse of dimensionality. We approximately solve high-dimensional problems by combining Lagrangian and Eulerian viewpoints and leveraging recent advances from machine learning. More precisely, we work with a Lagrangian formulation of the problem and enforce the underlying Hamilton–Jacobi–Bellman (HJB) equation that is derived from the Eulerian formulation. Finally, a tailored neural network parameterization of the MFG/MFC solution helps us avoid any spatial discretization. Our numerical results include the approximate solution of 100-dimensional instances of optimal transport and crowd motion problems on a standard workstation and a validation using an Eulerian solver in two dimensions. These results open the door to much-anticipated applications of MFG and MFC models that are beyond reach with existing numerical methods.
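
To make the Lagrangian viewpoint in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a quadratic transport cost, so that particles move with velocity v = -∇Φ, and it replaces the neural network potential with a hypothetical closed-form Φ(z) = 0.5|z - target|². The names `grad_phi`, `lagrangian_cost`, and `target` are illustrative assumptions, and the HJB penalty mentioned in the abstract is omitted.

```python
import numpy as np

def grad_phi(z, target):
    """Gradient of a hypothetical potential Phi(z) = 0.5 * |z - target|^2.
    In a learned model this would be the gradient of a neural network."""
    return z - target

def lagrangian_cost(z0, target, n_steps=100, T=1.0):
    """Push sampled particles along dz/dt = -grad Phi(z) with forward Euler
    and accumulate the kinetic running cost 0.5 * |v|^2 on each trajectory.
    No spatial grid is used: the work per step scales with
    (number of samples) x (dimension)."""
    dt = T / n_steps
    z = z0.copy()
    cost = np.zeros(z0.shape[0])
    for _ in range(n_steps):
        v = -grad_phi(z, target)                  # assumed velocity field
        cost += 0.5 * np.sum(v**2, axis=1) * dt   # running cost per particle
        z = z + dt * v                            # forward Euler step
    return z, cost.mean()                         # Monte Carlo estimate of the cost

rng = np.random.default_rng(0)
d = 100                                           # dimension, as in the 100-dimensional examples
z0 = rng.standard_normal((256, d))                # samples from a hypothetical initial density
target = np.full(d, 3.0)                          # hypothetical target location
zT, mean_cost = lagrangian_cost(z0, target)
```

Because the state is carried by samples rather than grid values, doubling the dimension in this sketch doubles the work per step instead of squaring the number of unknowns, which is the property the abstract relies on to avoid the curse of dimensionality.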

Award ID(s):
1751636
NSF-PAR ID:
10143814
Journal Name:
Proceedings of the National Academy of Sciences
Volume:
117
Issue:
17
Page Range or eLocation-ID:
p. 9183-9193
ISSN:
0027-8424
Publisher:
Proceedings of the National Academy of Sciences
Sponsoring Org:
National Science Foundation
More Like this
  1. We develop a general reinforcement learning framework for mean field control (MFC) problems. Such problems arise, for instance, as the limit of collaborative multi-agent control problems when the number of agents is very large. The asymptotic problem can be phrased as the optimal control of nonlinear dynamics. This can also be viewed as a Markov decision process (MDP), but the key difference from the usual RL setup is that the dynamics and the reward now depend on the state's probability distribution itself. Alternatively, it can be recast as an MDP on the Wasserstein space of measures. In this work, we introduce generic model-free algorithms based on the state-action value function at the mean field level and we prove convergence for a prototypical Q-learning method. We then implement an actor-critic method and report numerical results on two archetypal problems: a finite space model motivated by a cyber security application and a continuous space model motivated by an application to swarm motion.
  2. Prior mathematical work of Constantin & Iyer (Commun. Pure Appl. Maths, vol. 61, 2008, pp. 330–345; Ann. Appl. Probab., vol. 21, 2011, pp. 1466–1492) has shown that incompressible Navier–Stokes solutions possess infinitely many stochastic Lagrangian conservation laws for vorticity, backward in time, which generalize the invariants of Cauchy (Sciences mathématiques et physique, vol. I, 1815, pp. 33–73) for smooth Euler solutions. We reformulate this theory for the case of wall-bounded flows by appealing to the Kuz'min (Phys. Lett. A, vol. 96, 1983, pp. 88–90)–Oseledets (Russ. Math. Surv., vol. 44, 1989, p. 210) representation of Navier–Stokes dynamics, in terms of the vortex-momentum density associated to a continuous distribution of infinitesimal vortex rings. The Constantin–Iyer theory provides an exact representation for vorticity at any interior point as an average over stochastic vorticity contributions transported from the wall. We point out relations of this Lagrangian formulation with the Eulerian theory of Lighthill (Boundary layer theory. In Laminar Boundary Layers (ed. L. Rosenhead), 1963, pp. 46–113)–Morton (Geophys. Astrophys. Fluid Dyn., vol. 28, 1984, pp. 277–308) for vorticity generation at solid walls, and also with a statistical result of Taylor (Proc. R. Soc. Lond. A, vol. 135, 1932, pp. 685–702)–Huggins (J. Low Temp. Phys., vol. 96, 1994, pp. 317–346), which connects dissipative drag with organized cross-stream motion of vorticity and which is closely analogous to the ‘Josephson–Anderson relation’ for quantum superfluids. We elaborate a Monte Carlo numerical Lagrangian scheme to calculate the stochastic Cauchy invariants and their statistics, given the Eulerian space–time velocity field. The method is validated using an online database of a turbulent channel-flow simulation (Graham et al., J. Turbul., vol. 17, 2016, pp. 181–215), where conservation of the mean Cauchy invariant is verified for two selected buffer-layer events corresponding to an ‘ejection’ and a ‘sweep’. The variances of the stochastic Cauchy invariants grow exponentially backward in time, however, revealing Lagrangian chaos of the stochastic trajectories undergoing both fluid advection and viscous diffusion.
  3. This project investigates numerical methods for solving fully coupled forward-backward stochastic differential equations (FBSDEs) of McKean-Vlasov type. Having numerical solvers for such mean field FBSDEs is of interest because of the potential application of these equations to optimization problems over a large population, for instance mean field games (MFG) and optimal mean field control problems. The theory for this kind of problem has met with great success since the early works on mean field games by Lasry and Lions, see [29], and by Huang, Caines, and Malhame, see [26]. Generally speaking, the purpose is to understand the continuum limit of optimizers or of equilibria (say, in the Nash sense) as the number of underlying players tends to infinity. When approached from the probabilistic viewpoint, solutions to these control problems (or games) can be described by coupled mean field FBSDEs, meaning that the coefficients depend on the marginal laws of the solution itself. In this note, we detail two methods for solving such FBSDEs, which we implement and apply to five benchmark problems. The first method uses a tree structure to represent the pathwise laws of the solution, whereas the second method uses a grid discretization to represent the time marginal laws of the solutions. Both are based on a Picard scheme; importantly, we combine each of them with a generic continuation method that makes it possible to extend the time horizon (or equivalently the coupling strength between the two equations) for which the Picard iteration converges.
  4. This project investigates numerical methods for solving fully coupled forward-backward stochastic differential equations (FBSDEs) of McKean-Vlasov type. Having numerical solvers for such mean field FBSDEs is of interest because of the potential application of these equations to optimization problems over a large population, for instance mean field games (MFG) and optimal mean field control problems. The theory for this kind of problem has met with great success since the early works on mean field games by Lasry and Lions, and by Huang, Caines, and Malhame. Generally speaking, the purpose is to understand the continuum limit of optimizers or of equilibria (say, in the Nash sense) as the number of underlying players tends to infinity. When approached from the probabilistic viewpoint, solutions to these control problems (or games) can be described by coupled mean field FBSDEs, meaning that the coefficients depend on the marginal laws of the solution itself. In this note, we detail two methods for solving such FBSDEs, which we implement and apply to five benchmark problems. The first method uses a tree structure to represent the pathwise laws of the solution, whereas the second method uses a grid discretization to represent the time marginal laws of the solutions. Both are based on a Picard scheme; importantly, we combine each of them with a generic continuation method that makes it possible to extend the time horizon (or equivalently the coupling strength between the two equations) for which the Picard iteration converges.
  5. Ocean volume and tracer transports are commonly computed on density surfaces because doing so approximates the semi-Lagrangian mean advective transport. The resulting density-averaged transport can be related approximately to Eulerian-averaged quantities via the Temporal Residual Mean (TRM), valid in the limit of small isopycnal height fluctuations. This article builds on a formulation of the TRM for volume fluxes within Neutral Density surfaces (the “NDTRM”), selected because Neutral Density surfaces are constructed to be as neutral as possible while still forming well-defined surfaces. This article derives a TRM, referred to as the “Neutral TRM” (NTRM), that approximates volume fluxes within surfaces whose vertical fluctuations are defined directly by the neutral relation. The purpose of the NTRM is to more closely approximate the semi-Lagrangian mean transport than the NDTRM, because the latter introduces errors associated with differences between the instantaneous state of the modeled/observed ocean and the reference climatology used to assign the Neutral Density variable. It is shown that the NDTRM collapses to the NTRM in the limiting case of a Neutral Density variable defined with reference to the Eulerian-mean salinity, potential temperature and pressure, rather than an external reference climatology, and therefore that the NTRM approximately advects this density variable. This prediction is verified directly using output from an idealized eddy-resolving numerical model. The NTRM therefore offers an efficient and accurate estimate of modeled semi-Lagrangian mean transports without reference to an external reference climatology, but requires that a Neutral Density variable be computed once from the model’s time-mean state in order to estimate isopycnal and diapycnal components of the transport.