We present a stochastic descent algorithm for unconstrained optimization that is
particularly efficient when the objective function is slow to evaluate and gradients
are not easily obtained, as in some PDE-constrained optimization and machine
learning problems. The algorithm maps the gradient onto a low-dimensional ran-
dom subspace of dimension at each iteration, similar to coordinate descent but
without restricting directional derivatives to be along the axes. Without requiring
a full gradient, this mapping can be performed by computing directional deriva-
tives (e.g., via forward-mode automatic differentiation). We give proofs for conver-
gence in expectation under various convexity assumptions as well as probabilistic
convergence results under strong convexity. Our method provides a novel extension
of the well-known Gaussian smoothing technique to descent in subspaces of dimension
greater than one, opening the door to new analysis of Gaussian smoothing
when more than one directional derivative is used at each iteration. We also provide
a finite-dimensional variant of a special case of the Johnson–Lindenstrauss lemma.
Experimentally, we show that our method compares favorably to coordinate descent,
Gaussian smoothing, gradient descent and BFGS (when gradients are calculated via
forward-mode automatic differentiation) on problems from the machine learning
and shape optimization literature.
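The sketch below is an illustrative example of the random-subspace idea described above, not the authors' implementation: at each iteration a random projection matrix is drawn, the required directional derivatives are obtained by forward-mode automatic differentiation (JAX's jvp), and the iterate is updated in the spanned subspace. The step size alpha, subspace dimension ell, and the Gaussian scaling of the projection are assumptions made for this example.

```python
# Hedged sketch of random-subspace descent using forward-mode AD (JAX).
# Not the paper's reference code; alpha, ell, and the 1/sqrt(ell) scaling
# are illustrative choices.
import jax
import jax.numpy as jnp

def subspace_descent_step(f, x, key, ell=5, alpha=0.1):
    d = x.shape[0]
    # Random d x ell matrix; Gaussian entries scaled so E[P @ P.T] = I.
    P = jax.random.normal(key, (d, ell)) / jnp.sqrt(ell)
    # One directional derivative per column via a JVP, so the full
    # gradient is never formed.
    derivs = jnp.array([jax.jvp(f, (x,), (P[:, i],))[1] for i in range(ell)])
    # Lift the low-dimensional derivative information back to R^d and descend.
    return x - alpha * P @ derivs

# Usage on a simple quadratic objective:
f = lambda x: jnp.sum(x ** 2)
x = jnp.ones(100)
key = jax.random.PRNGKey(0)
for _ in range(200):
    key, sub = jax.random.split(key)
    x = subspace_descent_step(f, x, sub)
```

With ell = 1 and a Gaussian direction this reduces to a Gaussian-smoothing-style update, while ell = d with axis-aligned columns would recover full gradient descent; intermediate ell trades per-iteration cost against progress per step.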