The softmax policy gradient (PG) method, which performs gradient ascent under softmax policy parameterization, is arguably one of the de facto implementations of policy optimization in modern reinforcement learning. For
- Publication Date:
- NSF-PAR ID:
- 10392524
- Journal Name:
- Mathematical Programming
- ISSN:
- 0025-5610
- Publisher:
- Springer Science + Business Media
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract It has been recently established in David and Mayboroda (Approximation of Green functions and domains with uniformly rectifiable boundaries of all dimensions. arXiv:2010.09793) that on uniformly rectifiable sets the Green function is almost affine in the weak sense, and moreover, in some scenarios such Green function estimates are equivalent to the uniform rectifiability of a set. The present paper tackles a strong analogue of these results, starting with the “flagship” degenerate operators on sets with lower dimensional boundaries. We consider the elliptic operators $$L_{\beta ,\gamma } = - {\text {div}} D^{d+1+\gamma -n} \nabla $$ associated to a domain $$\Omega \subset {\mathbb {R}}^n$$ with a uniformly rectifiable boundary $$\Gamma $$ of dimension $$d < n-1$$, the now usual distance to the boundary $$D = D_\beta $$ given by $$D_\beta (X)^{-\beta } = \int _{\Gamma } |X-y|^{-d-\beta } d\sigma (y)$$ for $$X \in \Omega $$, where $$\beta >0$$ and $$\gamma \in (-1,1)$$. In this paper we show that the Green function $$G$$ for $$L_{\beta ,\gamma }$$, with pole at infinity, is well approximated by multiples of $$D^{1-\gamma }$$, in the sense that the function $$\big | D\nabla \big (\ln \big ( \frac{G}{D^{1-\gamma }} \big )\big )\big |^2$$ satisfies a Carleson measure estimate on $$\Omega $$. We underline that the strong and the weak results are different in nature and, of course, at the level…
-
Abstract The free multiplicative Brownian motion $$b_{t}$$ is the large-$$N$$ limit of the Brownian motion on $$\mathsf {GL}(N;\mathbb {C}),$$ in the sense of $$*$$-distributions. The natural candidate for the large-$$N$$ limit of the empirical distribution of eigenvalues is thus the Brown measure of $$b_{t}$$. In previous work, the second and third authors showed that this Brown measure is supported in the closure of a region $$\Sigma _{t}$$ that appeared in the work of Biane. In the present paper, we compute the Brown measure completely. It has a continuous density $$W_{t}$$ on $$\overline{\Sigma }_{t},$$ which is strictly positive and real analytic on $$\Sigma _{t}$$. This density has a simple form in polar coordinates: $$\begin{aligned} W_{t}(r,\theta )=\frac{1}{r^{2}}w_{t}(\theta ), \end{aligned}$$ where $$w_{t}$$ is an analytic function determined by the geometry of the region $$\Sigma _{t}$$. We show also that the spectral measure of free unitary Brownian motion $$u_{t}$$ is a “shadow” of the Brown measure of $$b_{t}$$, precisely mirroring the relationship between the circular and semicircular laws. We develop several new methods, based on stochastic differential equations and PDE, to prove these results.
-
Abstract Based on the recent development of the framework of Volterra rough paths (Harang and Tindel in Stoch Process Appl 142:34–78, 2021), we consider here the probabilistic construction of the Volterra rough path associated to the fractional Brownian motion with $$H>\frac{1}{2}$$ and for the standard Brownian motion. The Volterra kernel $$k(t,s)$$ is allowed to be singular and to behave similarly to $$|t-s|^{-\gamma }$$ for some $$\gamma \ge 0$$. The construction is done in both the Stratonovich and Itô senses. It is based on a modified Garsia–Rodemich–Rumsey lemma, which is of interest in its own right, as well as tools from Malliavin calculus. A discussion of challenges and potential extensions is provided.
-
Abstract We study the structure of the Liouville quantum gravity (LQG) surfaces that are cut out as one explores a conformal loop-ensemble $$\hbox {CLE}_{\kappa '}$$ for $$\kappa '$$ in (4, 8) that is drawn on an independent $$\gamma $$-LQG surface for $$\gamma ^2=16/\kappa '$$. The results are similar in flavor to the ones from our companion paper dealing with $$\hbox {CLE}_{\kappa }$$ for $$\kappa $$ in (8/3, 4), where the loops of the CLE are disjoint and simple. In particular, we encode the combined structure of the LQG surface and the $$\hbox {CLE}_{\kappa '}$$ in terms of stable growth-fragmentation trees or their variants, which also appear in the asymptotic study of peeling processes on decorated planar maps. This has consequences for questions that do a priori not involve LQG surfaces: In our paper entitled “CLE Percolations”, we described the law of interfaces obtained when coloring the loops of a $$\hbox {CLE}_{\kappa '}$$ independently into two colors with respective probabilities $$p$$ and $$1-p$$. This description was complete up to one missing parameter $$\rho $$. The results of the present paper about CLE on LQG allow us to determine its value in terms of $$p$$ and $$\kappa '$$. It shows in particular that $$\hbox {CLE}_{\kappa '}$$ and $$\hbox {CLE}_{16/\kappa '}$$ are related via a continuum analog of the Edwards-Sokal coupling between $$\hbox {FK}_q$$ percolation and the $$q$$-state Potts model (which makes sense even…
-
Abstract A long-standing problem in mathematical physics is the rigorous derivation of the incompressible Euler equation from Newtonian mechanics. Recently, Han-Kwan and Iacobelli (Proc Am Math Soc 149:3045–3061, 2021) showed that in the monokinetic regime, one can directly obtain the Euler equation from a system of $$N$$ particles interacting in $${\mathbb {T}}^d$$, $$d\ge 2$$, via Newton’s second law through a supercritical mean-field limit. Namely, the coupling constant $$\lambda $$ in front of the pair potential, which is Coulombic, scales like $$N^{-\theta }$$ for some $$\theta \in (0,1)$$, in contrast to the usual mean-field scaling $$\lambda \sim N^{-1}$$. Assuming $$\theta \in (1-\frac{2}{d(d+1)},1)$$, they showed that the empirical measure of the system is effectively described by the solution to the Euler equation as $$N\rightarrow \infty $$. Han-Kwan and Iacobelli asked if their range for $$\theta $$ was optimal. We answer this question in the negative by showing the validity of the incompressible Euler equation in the limit $$N\rightarrow \infty $$ for $$\theta \in (1-\frac{2}{d},1)$$. Our proof is based on Serfaty’s modulated-energy method, but compared to that of Han-Kwan and Iacobelli, crucially uses an improved “renormalized commutator” estimate to obtain the larger range for $$\theta $$. Additionally, we show that for $$\theta \le 1-\frac{2}{d}$$, one cannot, in general, expect convergence in the modulated energy notion of distance.