skip to main content

Search for: All records

Creators/Authors contains: "Wright, Stephen J."

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available July 13, 2024
  2. Abstract

    We consider variants of a recently developed Newton-CG algorithm for nonconvex problems (Royer, C. W. & Wright, S. J. (2018) Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization. SIAM J. Optim., 28, 1448–1477) in which inexact estimates of the gradient and the Hessian information are used for various steps. Under certain conditions on the inexactness measures, we derive iteration complexity bounds for achieving $\epsilon $-approximate second-order optimality that match best-known lower bounds. Our inexactness condition on the gradient is adaptive, allowing for crude accuracy in regions with large gradients. We describe two variants of our approach, one in which the step size along the computed search direction is chosen adaptively, and another in which the step size is pre-defined. To obtain second-order optimality, our algorithms will make use of a negative curvature direction on some steps. These directions can be obtained, with high probability, using the randomized Lanczos algorithm. In this sense, all of our results hold with high probability over the run of the algorithm. We evaluate the performance of our proposed algorithms empirically on several machine learning models. Our approach is a first attempt to introduce inexact Hessian and/or gradient information into the Newton-CG algorithm of Royer & Wright (2018, Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization. SIAM J. Optim., 28, 1448–1477).

    more » « less
  3. Abstract

    We study a model for adversarial classification based on distributionally robust chance constraints. We show that under Wasserstein ambiguity, the model aims to minimize the conditional value-at-risk of the distance to misclassification, and we explore links to adversarial classification models proposed earlier and to maximum-margin classifiers. We also provide a reformulation of the distributionally robust model for linear classification, and show it is equivalent to minimizing a regularized ramp loss objective. Numerical experiments show that, despite the nonconvexity of this formulation, standard descent methods appear to converge to the global minimizer for this problem. Inspired by this observation, we show that, for a certain class of distributions, the only stationary point of the regularized ramp loss minimization problem is the global minimizer.

    more » « less
  4. null (Ed.)
  5. null (Ed.)
    Abstract We study minimization of a structured objective function, being the sum of a smooth function and a composition of a weakly convex function with a linear operator. Applications include image reconstruction problems with regularizers that introduce less bias than the standard convex regularizers. We develop a variable smoothing algorithm, based on the Moreau envelope with a decreasing sequence of smoothing parameters, and prove a complexity of $${\mathcal {O}}(\epsilon ^{-3})$$ O ( ϵ - 3 ) to achieve an $$\epsilon $$ ϵ -approximate solution. This bound interpolates between the $${\mathcal {O}}(\epsilon ^{-2})$$ O ( ϵ - 2 ) bound for the smooth case and the $${\mathcal {O}}(\epsilon ^{-4})$$ O ( ϵ - 4 ) bound for the subgradient method. Our complexity bound is in line with other works that deal with structured nonsmoothness of weakly convex functions. 
    more » « less