NSF PAR Search | NSF Public Access Repository


Stochastic Gradients: Optimization, Simulation, Randomization, and Sensitivity Analysis

https://doi.org/10.1080/24725854.2025.2469839

Fu, Michael C; Hu, Jiaqiao; Scheinberg, Katya (April 2025, IISE Transactions)

Free, publicly-accessible full text available April 15, 2026
High Probability Complexity Bounds for Adaptive Step Search Based on Stochastic Oracles

https://doi.org/10.1137/22M1512764

Jin, Billy; Scheinberg, Katya; Xie, Miaolan (September 2024, SIAM Journal on Optimization)

Full Text Available
Sample complexity analysis for adaptive optimization algorithms with stochastic oracles

https://doi.org/10.1007/s10107-024-02078-z

Jin, Billy; Scheinberg, Katya; Xie, Miaolan (April 2024, Mathematical Programming)

Full Text Available
Stochastic Adaptive Regularization Method with Cubics: a High Probability Complexity Bound

https://doi.org/10.1109/WSC60868.2023.10408598

Scheinberg, Katya; Xie, Miaolan (December 2023, IEEE)

Full Text Available
First- and second-order high probability complexity bounds for trust-region methods with noisy oracles

https://doi.org/10.1007/s10107-023-01999-5

Cao, Liyuan; Berahas, Albert S.; Scheinberg, Katya (July 2023, Mathematical Programming)

Full Text Available
A Theoretical and Empirical Comparison of Gradient Approximations in Derivative-Free Optimization

https://doi.org/10.1007/s10208-021-09513-z

Berahas, Albert S.; Cao, Liyuan; Choromanski, Krzysztof; Scheinberg, Katya (May 2021, Foundations of Computational Mathematics)
In this paper, we analyze several methods for approximating gradients of noisy functions using only function values. These methods include finite differences, linear interpolation, Gaussian smoothing, and smoothing on a sphere. The methods differ in the number of functions sampled, the choice of the sample points, and the way in which the gradient approximations are derived. For each method, we derive bounds on the number of samples and the sampling radius which guarantee favorable convergence properties for a line search or fixed step size descent method. To this end, we use the results in Berahas et al. (Global convergence rate analysis of a generic line search algorithm with noise, arXiv:1910.04055, 2019) and show how each method can satisfy the sufficient conditions, possibly only with some sufficiently large probability at each iteration, as happens to be the case with Gaussian smoothing and smoothing on a sphere. Finally, we present numerical results evaluating the quality of the gradient approximations as well as their performance in conjunction with a line search derivative-free optimization algorithm.
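As a rough illustration of two of the gradient approximations named in this abstract, the following minimal sketch compares forward finite differences with a Gaussian smoothing estimate on a simple quadratic. The function names, the test function, the sampling radius `sigma`, and the sample count are all assumptions chosen for illustration; they are not taken from the paper, and the test function here is noise-free.

```python
import random


def forward_difference_gradient(f, x, h=1e-6):
    """Approximate the gradient with one extra function value per coordinate."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h
        grad.append((f(shifted) - fx) / h)
    return grad


def gaussian_smoothing_gradient(f, x, sigma=1e-2, num_samples=1000, seed=0):
    """Average (f(x + sigma*u) - f(x)) / sigma * u over Gaussian directions u."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(x)
    fx = f(x)
    grad = [0.0] * n
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        shifted = [xi + sigma * ui for xi, ui in zip(x, u)]
        scale = (f(shifted) - fx) / sigma
        for i in range(n):
            grad[i] += scale * u[i]
    return [g / num_samples for g in grad]


def quadratic(x):
    """Hypothetical test function f(x) = 0.5 * ||x||^2, whose gradient is x."""
    return 0.5 * sum(xi * xi for xi in x)


x0 = [1.0, -2.0, 0.5]
fd_grad = forward_difference_gradient(quadratic, x0)
gs_grad = gaussian_smoothing_gradient(quadratic, x0, num_samples=20000)
print("finite differences:", fd_grad)
print("Gaussian smoothing:", gs_grad)
```

In this sketch the finite-difference estimate uses one function value per coordinate plus the base point, while the Gaussian smoothing estimate is a Monte Carlo average whose accuracy depends on the number of sampled directions.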
Full Text Available
Adaptive Stochastic Optimization: A Framework for Analyzing Stochastic Optimization Algorithms

https://doi.org/10.1109/MSP.2020.3003539

Curtis, Frank E.; Scheinberg, Katya (September 2020, IEEE Signal Processing Magazine)
Full Text Available
Inexact SARAH algorithm for stochastic optimization

https://doi.org/10.1080/10556788.2020.1818081

Nguyen, Lam M.; Scheinberg, Katya; Takáč, Martin (September 2020, Optimization Methods and Software)
Full Text Available
A Stochastic Line Search Method with Expected Complexity Analysis

https://doi.org/10.1137/18M1216250

Paquette, Courtney; Scheinberg, Katya (January 2020, SIAM Journal on Optimization)
Full Text Available
SGD and Hogwild! Convergence Without the Bounded Gradients Assumption

Nguyen, Lam M.; Nguyen, Phuong Ha; Dijk, Marten van; Richtárik, Peter; Scheinberg, Katya; Takáč, Martin (January 2018, Proceedings of Machine Learning Research)

Stochastic gradient descent (SGD) is the optimization algorithm of choice in many machine learning applications such as regularized empirical risk minimization and training deep neural networks. The classical analysis of convergence of SGD is carried out under the assumption that the norm of the stochastic gradient is uniformly bounded. While this might hold for some loss functions, it is always violated for cases where the objective function is strongly convex. In (Bottou et al., 2016) a new analysis of convergence of SGD is performed under the assumption that stochastic gradients are bounded with respect to the true gradient norm. Here we show that for stochastic problems arising in machine learning such a bound always holds. Moreover, we propose an alternative convergence analysis of SGD in the diminishing learning rate regime, which results in more relaxed conditions than those in (Bottou et al., 2016). We then move on to the asynchronous parallel setting, and prove convergence of the Hogwild! algorithm in the same regime, obtaining the first convergence results for this method in the case of a diminishing learning rate.
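The diminishing learning rate regime mentioned in this abstract can be sketched on a one-dimensional strongly convex problem. Everything below is an assumption made for illustration: the objective F(w) = E[0.5 (w - xi)^2] with xi drawn from a Gaussian, the schedule eta_t = 1/(t + 1), and the function name `sgd_diminishing_step`. It is not the schedule or the analysis from the paper, and it does not cover the asynchronous Hogwild! setting.

```python
import random


def sgd_diminishing_step(mean, num_steps=10000, w0=5.0, seed=0):
    """Run SGD on F(w) = E[0.5 * (w - xi)^2] with xi ~ N(mean, 1).

    The stochastic gradient is w - xi, which is not uniformly bounded in w.
    The step size eta_t = 1 / (t + 1) is an illustrative diminishing schedule.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    w = w0
    for t in range(num_steps):
        xi = rng.gauss(mean, 1.0)
        stochastic_grad = w - xi
        eta = 1.0 / (t + 1)
        w -= eta * stochastic_grad
    return w


true_minimizer = 2.0
w_final = sgd_diminishing_step(mean=true_minimizer)
print("final iterate:", w_final, "minimizer:", true_minimizer)
```

With this particular schedule the iterate equals the running average of the samples drawn so far, so it approaches the minimizer (the mean of xi) as the number of steps grows.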
Full Text Available
