-
Gradient descent (GD) is a collection of continuous optimization methods that have achieved immense success in practice. Owing to data science applications, GD with diminishing step sizes has become a prominent variant. While this variant of GD has been well studied in the literature for objectives with globally Lipschitz continuous gradients or under the requirement of bounded iterates, objectives from data science problems do not satisfy such assumptions. Thus, in this work, we provide a novel global convergence analysis of GD with diminishing step sizes for differentiable nonconvex functions whose gradients are only locally Lipschitz continuous. Through our analysis, we generalize what is known about gradient descent with diminishing step sizes, including interesting topological facts, and we elucidate the varied behaviors that can occur in the previously overlooked divergence regime. In sum, we provide a general global convergence analysis of GD with diminishing step sizes under realistic conditions for data science problems.
Free, publicly accessible full text available September 30, 2025.
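The variant studied in this abstract can be sketched in a few lines; the objective below and the schedule α_k = α₀/(k+1) (which satisfies Σα_k = ∞ and Σα_k² < ∞) are illustrative choices, not the paper's analysis:

```python
import numpy as np

def gd_diminishing(grad, x0, alpha0=0.4, iters=200):
    """Gradient descent with diminishing step sizes alpha_k = alpha0 / (k + 1),
    so the step sizes sum to infinity while their squares sum to a finite value."""
    x = np.asarray(x0, dtype=float)
    for k in range(iters):
        x = x - (alpha0 / (k + 1)) * grad(x)
    return x

# Illustrative smooth objective f(x) = ||x||^2 with gradient 2x.
x_star = gd_diminishing(lambda x: 2.0 * x, x0=[1.0, -1.0])
```

Because the step sizes are summable-square but not summable, the iterates can still travel arbitrarily far (relevant to the divergence regime the abstract mentions) while the per-step perturbation vanishes.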
-
A sequential quadratic optimization algorithm is proposed for solving smooth nonlinear-equality-constrained optimization problems in which the objective function is defined by an expectation. The algorithmic structure of the proposed method is based on a step decomposition strategy that is known in the literature to be widely effective in practice, wherein each search direction is computed as the sum of a normal step (toward linearized feasibility) and a tangential step (toward objective decrease in the null space of the constraint Jacobian). However, the proposed method is distinct from others in the literature in that it both allows the use of stochastic objective gradient estimates and possesses convergence guarantees even in the setting in which the constraint Jacobians may be rank-deficient. The results of numerical experiments demonstrate that the algorithm offers superior performance when compared with popular alternatives.
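The step decomposition described above can be sketched with plain linear algebra. This shows only the geometry of the two steps, using a deterministic gradient and pseudoinverses (which also behave sensibly when the Jacobian is rank-deficient); it is not the paper's stochastic algorithm:

```python
import numpy as np

def decomposed_direction(g, c, J):
    """Search direction d = v + u for min f(x) s.t. c(x) = 0:
    v: normal step, minimum-norm least-squares solution of J v = -c
       (moves toward linearized feasibility);
    u: tangential step, negative objective gradient projected onto null(J)
       (seeks objective decrease without disturbing linearized feasibility)."""
    Jp = np.linalg.pinv(J)             # well-defined even for rank-deficient J
    v = -Jp @ c
    P = np.eye(J.shape[1]) - Jp @ J    # orthogonal projector onto null(J)
    u = -P @ g
    return v + u

# One constraint x_1 - 0.5 = 0 in R^2, with objective gradient g = (1, 1):
d = decomposed_direction(np.array([1.0, 1.0]), np.array([0.5]),
                         np.array([[1.0, 0.0]]))
```

Here the normal step moves x_1 by -0.5 to satisfy the linearized constraint, while the tangential step descends along the free coordinate x_2.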
-
Parameter calibration aims to estimate unobservable parameters used in a computer model by using physical process responses and computer model outputs. In the literature, existing studies calibrate all parameters simultaneously using an entire data set. However, in certain applications, some parameters are associated with only a subset of the data. For example, in building energy simulation, cooling (heating) season parameters should be calibrated using data collected during the cooling (heating) season only. This study provides a new multiblock calibration approach that accounts for such heterogeneity. Unlike existing studies that build emulators for the computer model response, such as the widely used Bayesian calibration approach, we consider multiple loss functions to be minimized, each for a block of parameters and its corresponding data subset, and estimate the parameters using a nonlinear optimization technique. We present the convergence properties under certain conditions and quantify the parameter estimation uncertainties. The superiority of our approach is demonstrated through numerical studies and a real-world building energy simulation case study.
History: Bianca Maria Colosimo served as the senior editor for this article.
Funding: This work was partially supported by the National Science Foundation [Grants CMMI-1662553, CMMI-2226348, and CBET-1804321].
Data Ethics & Reproducibility Note: The code capsule is available on Code Ocean at https://codeocean.com/capsule/8623151/tree/v1 and in the e-Companion to this article (available at https://doi.org/10.1287/ijds.2023.0029).
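The multiblock idea can be illustrated with a toy example. Everything here is a stand-in under stated assumptions: a hypothetical one-parameter linear "computer model" per season, synthetic data, and a hand-rolled gradient-descent minimizer in place of the nonlinear optimization technique the study actually uses:

```python
import numpy as np

rng = np.random.default_rng(0)

def computer_model(x, theta):
    # Hypothetical stand-in for the simulator output.
    return theta * x

# Synthetic "physical" responses, one data subset per season, each
# governed by its own parameter block.
theta_true = {"cooling": 2.0, "heating": -1.0}
data = {}
for season, th in theta_true.items():
    x = rng.uniform(0.0, 1.0, 50)
    y = computer_model(x, th) + 0.01 * rng.standard_normal(50)
    data[season] = (x, y)

def fit_block(x, y, t0=0.0, lr=0.1, iters=500):
    """Minimize this block's loss sum((y - t*x)^2) by gradient descent,
    touching only this block's data subset."""
    t = t0
    for _ in range(iters):
        t -= lr * (-2.0 * np.sum(x * (y - t * x))) / len(x)
    return t

# Each parameter block is calibrated against its own season's data only.
estimates = {s: fit_block(x, y) for s, (x, y) in data.items()}
```

The key structural point mirrored here is that each block's loss sees only its own subset of data, rather than one loss over the pooled data set.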
-
In this paper, we analyze several methods for approximating gradients of noisy functions using only function values. These methods include finite differences, linear interpolation, Gaussian smoothing, and smoothing on a sphere. The methods differ in the number of functions sampled, the choice of the sample points, and the way in which the gradient approximations are derived. For each method, we derive bounds on the number of samples and the sampling radius which guarantee favorable convergence properties for a line search or fixed step size descent method. To this end, we use the results in Berahas et al. (Global convergence rate analysis of a generic line search algorithm with noise, arXiv:1910.04055, 2019) and show how each method can satisfy the sufficient conditions, possibly only with some sufficiently large probability at each iteration, as happens to be the case with Gaussian smoothing and smoothing on a sphere. Finally, we present numerical results evaluating the quality of the gradient approximations as well as their performance in conjunction with a line search derivative-free optimization algorithm.
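Two of the estimator families surveyed can be sketched directly from their definitions; the constants below (difference step h, smoothing radius σ, sample count) are illustrative, not the tuned values the paper's analysis prescribes:

```python
import numpy as np

def fd_gradient(f, x, h=1e-5):
    """Forward finite-difference gradient: one extra evaluation per coordinate."""
    n = len(x)
    fx = f(x)
    g = np.zeros(n)
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g

def gaussian_smoothed_gradient(f, x, sigma=1e-3, samples=500, rng=None):
    """Monte Carlo estimate of the Gaussian-smoothed gradient:
    E[ (f(x + sigma*u) - f(x)) / sigma * u ],  u ~ N(0, I).
    Randomness means the estimate is only accurate with high probability."""
    rng = np.random.default_rng(rng)
    fx = f(x)
    g = np.zeros_like(np.asarray(x, dtype=float))
    for _ in range(samples):
        u = rng.standard_normal(len(x))
        g += (f(x + sigma * u) - fx) / sigma * u
    return g / samples
```

The contrast matches the abstract's framing: the finite-difference estimator is deterministic and uses n+1 structured sample points, while the Gaussian-smoothing estimator uses random directions and satisfies accuracy conditions only with sufficiently large probability.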