Title: A Robust Accelerated Optimization Algorithm for Strongly Convex Functions
This work proposes an accelerated first-order algorithm we call the Robust Momentum Method for optimizing smooth strongly convex functions. The algorithm has a single scalar parameter that can be tuned to trade off robustness to gradient noise versus worst-case convergence rate. At one extreme, the algorithm is faster than Nesterov's Fast Gradient Method by a constant factor but more fragile to noise. At the other extreme, the algorithm reduces to the Gradient Method and is very robust to noise. The algorithm design technique is inspired by methods from classical control theory and the resulting algorithm has a simple analytical form. Algorithm performance is verified on a series of numerical simulations in both noise-free and relative gradient noise cases.
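As a rough illustration of the iteration family the abstract describes, here is a minimal Python sketch of a momentum method with a single tunable rate parameter rho. The update form and the formulas for alpha, beta, and gamma follow the analytical expressions given in the published paper (arXiv:1710.04753); since they do not appear in this abstract, treat them as assumptions to verify against the source. The quadratic test problem is purely illustrative.

```python
import numpy as np

def robust_momentum(grad, x0, L, m, rho, iters=500):
    """Sketch of a tunable-momentum iteration.

    rho in [1 - 1/sqrt(kappa), 1 - 1/kappa] trades worst-case rate
    (small rho, faster) against robustness to gradient noise (large rho).
    Parameter formulas are an assumption taken from the published paper.
    """
    kappa = L / m
    alpha = kappa * (1.0 - rho) ** 2 * (1.0 + rho) / L
    beta = kappa * rho ** 3 / (kappa - 1.0)
    gamma = rho ** 3 / ((kappa - 1.0) * (1.0 - rho) ** 2 * (1.0 + rho))
    x_prev = x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        y = x + gamma * (x - x_prev)                  # extrapolated point
        x, x_prev = x + beta * (x - x_prev) - alpha * grad(y), x
    return x

# Illustrative strongly convex quadratic with m = 1, L = 100 (kappa = 100):
A = np.diag(np.linspace(1.0, 100.0, 20))
x_min = robust_momentum(lambda x: A @ x, np.ones(20), L=100.0, m=1.0,
                        rho=1.0 - 1.0 / np.sqrt(100.0))
```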
Award ID(s):
1656951 1750162
PAR ID:
10080865
Author(s) / Creator(s):
Date Published:
Journal Name:
American Control Conference
Page Range / eLocation ID:
1376 - 1381
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Adversarial training (AT) is a widely recognized defense mechanism for improving the robustness of deep neural networks against adversarial attacks. It is built on min-max optimization (MMO), where the minimizer (i.e., the defender) seeks a robust model that minimizes the worst-case training loss in the presence of adversarial examples crafted by the maximizer (i.e., the attacker). However, the conventional MMO method makes AT hard to scale. Thus, FAST-AT (Wong et al., 2020) and other recent algorithms attempt to simplify MMO by replacing its maximization step with a single gradient sign-based attack generation step. Although easy to implement, FAST-AT lacks theoretical guarantees, and its empirical performance is unsatisfactory due to robust catastrophic overfitting when training with strong adversaries. In this paper, we advance FAST-AT from the fresh perspective of bi-level optimization (BLO). We first show that the commonly used FAST-AT is equivalent to using a stochastic gradient algorithm to solve a linearized BLO problem involving a sign operation. However, the discrete nature of the sign operation makes it difficult to understand the algorithm's performance. Inspired by BLO, we design and analyze a new set of robust training algorithms termed Fast Bilevel AT (FAST-BAT), which effectively defends against sign-based projected gradient descent (PGD) attacks without using any gradient sign method or explicit robust regularization. In practice, we show that our method yields substantial robustness improvements over baselines across multiple models and datasets.
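For readers unfamiliar with the single-step attack this abstract refers to, a minimal PyTorch sketch of a FAST-AT-style training step follows. It shows the generic random-start FGSM pattern from Wong et al. (2020), not FAST-BAT itself; `model`, `optimizer`, and the hyperparameters are placeholders.

```python
import torch
import torch.nn.functional as F

def fast_at_step(model, x, y, eps, alpha, optimizer):
    """One FAST-AT-style step: a single random-start FGSM attack stands in
    for the inner maximization (a sketch of the generic pattern, not the
    paper's code; clamping inputs to a valid pixel range is omitted)."""
    # Attack: random start inside the eps-ball, one gradient-sign step.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    F.cross_entropy(model(x + delta), y).backward()
    with torch.no_grad():
        delta = (delta + alpha * delta.grad.sign()).clamp_(-eps, eps)
    # Defense: minimize the loss on the perturbed examples.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x + delta), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```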
  2. In this paper, we study the robustness of policy optimization (in particular, the Gauss–Newton gradient descent algorithm, which is equivalent to policy iteration in reinforcement learning) subject to noise at each iteration. By invoking the concept of input-to-state stability and utilizing Lyapunov's direct method, it is shown that, if the noise is sufficiently small, the policy iteration algorithm converges to a small neighborhood of the optimal solution even in the presence of noise at each iteration. Explicit expressions for the upper bound on the noise and for the size of the neighborhood to which the policies ultimately converge are provided. Based on Willems' fundamental lemma, a learning-based policy iteration algorithm is proposed. The persistent excitation condition can be readily guaranteed by checking the rank of the Hankel matrix built from an exploration signal. The robustness of the learning-based policy iteration to measurement noise and unknown system disturbances is theoretically demonstrated via the input-to-state stability of the policy iteration. Several numerical simulations are conducted to demonstrate the efficacy of the proposed method.
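The rank test mentioned above is easy to state concretely. The NumPy sketch below builds the Hankel matrix of a scalar exploration signal and checks persistent excitation by its rank; the function names and the scalar-input simplification are assumptions of this sketch.

```python
import numpy as np

def hankel(u, depth):
    """Stack `depth` shifted copies of the signal into a Hankel matrix."""
    T = len(u)
    return np.array([u[i:i + T - depth + 1] for i in range(depth)])

def persistently_exciting(u, order):
    """Scalar-input case: u is persistently exciting of order `order`
    iff its depth-`order` Hankel matrix has full row rank."""
    return np.linalg.matrix_rank(hankel(np.asarray(u, float), order)) == order

# A random exploration signal is generically persistently exciting:
rng = np.random.default_rng(0)
print(persistently_exciting(rng.standard_normal(50), order=6))  # True
```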
  3. The purpose of the current study was to introduce a Deep learning-based Accelerated and Noise-Suppressed Estimation (DANSE) method for reconstructing quantitative maps of the biological-tissue cellular-specific (R2t*) and hemodynamic-specific (R2') metrics of quantitative gradient-recalled echo (qGRE) MRI. The DANSE method adapts a supervised learning paradigm to train a convolutional neural network for robust estimation of R2t* and R2' maps with significantly reduced sensitivity to noise and to the adverse effects of macroscopic (B0) magnetic field inhomogeneities, directly from the gradient-recalled echo (GRE) magnitude images. The R2t* and R2' maps used for training were generated by voxel-by-voxel fitting of a previously developed biophysical qGRE model, which accounts for tissue, hemodynamic, and B0-inhomogeneity contributions to the multi-gradient-echo GRE signal, using a nonlinear least squares (NLLS) algorithm. We show that the DANSE model efficiently estimates the aforementioned qGRE maps and preserves all the features of the NLLS approach with significant improvements, including noise suppression and computation speed (from many hours to seconds). The noise-suppression feature of DANSE is especially prominent for data with low signal-to-noise ratio (SNR ~ 50-100), where DANSE-generated R2t* and R2' maps had up to three times smaller errors than those of the NLLS method. The DANSE method enables fast reconstruction of qGRE maps with significantly reduced sensitivity to noise and magnetic field inhomogeneities, and it does not require any information about field inhomogeneities during application. It exploits spatial and echo-time-dependent patterns in the GRE data together with knowledge previously gained from the biophysical model, thus producing high-quality qGRE maps even in environments with high noise levels. These features, along with fast computational speed, can lead to broad qGRE clinical and research applications.
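Schematically, the supervised paradigm this abstract describes pairs multi-echo GRE magnitude images with NLLS-fitted target maps. The PyTorch sketch below shows that training pattern only; the toy architecture, echo count, and names are assumptions, not the DANSE network.

```python
import torch
import torch.nn as nn

# Illustrative CNN mapping multi-gradient-echo magnitude images (one input
# channel per echo) to two parameter maps (R2t*, R2'), trained against
# NLLS-fitted targets.
n_echoes = 10
net = nn.Sequential(
    nn.Conv2d(n_echoes, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 2, 3, padding=1),        # 2 output maps: R2t*, R2'
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def train_step(gre_magnitude, nlls_maps):
    """gre_magnitude: (batch, n_echoes, H, W); nlls_maps: (batch, 2, H, W)."""
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(gre_magnitude), nlls_maps)
    loss.backward()
    opt.step()
    return loss.item()
```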
  4. Robust Markov decision processes (RMDPs) provide a promising framework for computing reliable policies in the face of model errors. Many successful reinforcement learning algorithms build on variations of policy-gradient methods, but adapting these methods to RMDPs has been challenging. As a result, the applicability of RMDPs to large, practical domains remains limited. This paper proposes a new Double-Loop Robust Policy Gradient (DRPG) method, the first generic policy-gradient method for RMDPs. In contrast with prior robust policy-gradient algorithms, DRPG monotonically reduces approximation errors to guarantee convergence to a globally optimal policy in tabular RMDPs. We introduce a novel parametric transition kernel and solve the inner-loop robust problem via a gradient-based method. Finally, our numerical results demonstrate the utility of the new algorithm and confirm its global convergence properties.
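A generic double-loop skeleton may help fix ideas. The tabular NumPy sketch below alternates an inner loop that perturbs the transition kernel toward a worst case with an outer policy-gradient step. It illustrates the double-loop pattern only; the finite-difference gradients, crude simplex renormalization, and step sizes are assumptions, not the paper's DRPG.

```python
import numpy as np

def value(policy, P, r, gamma):
    """Exact tabular policy evaluation: V = (I - gamma*P_pi)^-1 r_pi."""
    P_pi = np.einsum('sa,sat->st', policy, P)   # policy: (S,A), P: (S,A,S)
    r_pi = np.einsum('sa,sa->s', policy, r)
    return np.linalg.solve(np.eye(r.shape[0]) - gamma * P_pi, r_pi)

def num_grad(f, x, h=1e-5):
    """Finite-difference gradient, to keep the sketch dependency-free."""
    g = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'])
    while not it.finished:
        e = np.zeros_like(x); e[it.multi_index] = h
        g[it.multi_index] = (f(x + e) - f(x - e)) / (2 * h)
        it.iternext()
    return g

def renorm(p):
    """Crude stand-in for projection onto the probability simplex."""
    p = np.clip(p, 1e-8, None)
    return p / p.sum(axis=-1, keepdims=True)

def double_loop_sketch(policy, P, r, gamma=0.9, outer=50, inner=10,
                       eta_pi=0.5, eta_P=0.5):
    """Inner loop: adversary lowers the value by moving the kernel.
    Outer loop: one policy-gradient ascent step against the worst case."""
    J = lambda pi, Pk: value(pi, Pk, r, gamma).mean()
    for _ in range(outer):
        for _ in range(inner):
            P = renorm(P - eta_P * num_grad(lambda Pk: J(policy, Pk), P))
        policy = renorm(policy + eta_pi * num_grad(lambda pi: J(pi, P), policy))
    return policy, P

# Tiny random MDP for illustration:
rng = np.random.default_rng(0)
S, A = 4, 2
pi_rob, P_worst = double_loop_sketch(np.full((S, A), 1.0 / A),
                                     renorm(rng.random((S, A, S))),
                                     rng.random((S, A)))
```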
  5. Variational quantum algorithms rely on the optimization of parameterized quantum circuits in noisy settings. The back-propagation procedure commonly used in classical machine learning is not directly applicable in this setting because quantum states collapse after measurement. Gradient estimation therefore constitutes a significant overhead in gradient-based optimization of such quantum circuits. This paper introduces a random coordinate descent algorithm as a practical and easy-to-implement alternative to the full gradient descent algorithm; it requires only one partial derivative at each iteration. Motivated by the behavior of measurement noise in the practical optimization of parameterized quantum circuits, this paper presents an optimization problem setting that is amenable to analysis. Under this setting, the random coordinate descent algorithm exhibits the same level of stochastic stability as the full gradient approach, making it just as resilient to noise. The complexity of the random coordinate descent method is generally no worse than that of gradient descent and can be much better for various quantum optimization problems with anisotropic Lipschitz constants. Theoretical analysis and extensive numerical experiments validate our findings. Published by the American Physical Society, 2024.
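The one-partial-derivative-per-iteration idea can be sketched in a few lines. Below, a toy trigonometric cost with additive noise stands in for a circuit evaluation (a real use would query the device), and the standard parameter-shift rule supplies the single partial derivative; the cost model and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a noisy parameterized-circuit cost. Like many circuit
# costs, it is a sinusoid in each parameter, so the standard parameter-shift
# rule gives the exact partial derivative from two evaluations.
w = rng.uniform(0.5, 2.0, size=8)        # anisotropic curvature weights
def cost(theta, noise=0.01):
    return float(w @ np.cos(theta) + noise * rng.standard_normal())

def partial_shift(f, theta, i):
    """One partial derivative from two (noisy) cost evaluations."""
    e = np.zeros_like(theta)
    e[i] = np.pi / 2
    return (f(theta + e) - f(theta - e)) / 2.0

theta = rng.uniform(-np.pi, np.pi, size=w.size)
for _ in range(2000):
    i = rng.integers(theta.size)          # one random coordinate per step
    theta[i] -= 0.2 * partial_shift(cost, theta, i)

print(cost(theta, noise=0.0))             # approaches the minimum, -sum(w)
```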