 Award ID(s):
 1727757
 NSFPAR ID:
 10273075
 Date Published:
 Journal Name:
International Conference on Learning Representations
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

Stochastic gradient descent (SGD) is the optimization algorithm of choice in many machine learning applications, such as regularized empirical risk minimization and training deep neural networks. The classical convergence analysis of SGD is carried out under the assumption that the norm of the stochastic gradient is uniformly bounded. While this may hold for some loss functions, it is always violated when the objective function is strongly convex. In (Bottou et al., 2016), a new convergence analysis of SGD is performed under the assumption that stochastic gradients are bounded with respect to the true gradient norm. Here we show that for stochastic problems arising in machine learning, such a bound always holds. Moreover, we propose an alternative convergence analysis of SGD with a diminishing learning rate, which results in more relaxed conditions than those in (Bottou et al., 2016). We then move on to the asynchronous parallel setting and prove convergence of the Hogwild! algorithm in the same regime, obtaining the first convergence results for this method in the case of a diminishing learning rate.
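The diminishing-learning-rate regime the abstract refers to can be illustrated on a toy problem. The sketch below runs SGD on a strongly convex least-squares objective with a step size that decays over iterations; the schedule eta_t = eta0 / (1 + c*t) and all constants are illustrative assumptions for the demo, not the paper's analysis.

```python
import numpy as np

# Minimal sketch of SGD with a diminishing learning-rate schedule on a
# strongly convex least-squares problem; constants are illustrative.
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star                          # noiseless targets: x_star is the minimizer

x = np.zeros(d)
eta0, c, t = 0.05, 1e-3, 0
for epoch in range(100):
    for i in rng.permutation(n):
        eta = eta0 / (1.0 + c * t)      # diminishing step size
        g = (A[i] @ x - b[i]) * A[i]    # stochastic gradient of 0.5*(a_i^T x - b_i)^2
        x -= eta * g
        t += 1

print(np.linalg.norm(x - x_star))       # distance to the optimum shrinks
```

Because the targets are noiseless, the iterates approach the minimizer despite using only one sample per step.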

As predictive models are increasingly being deployed in high-stakes decision making (e.g., loan approvals), there has been growing interest in post-hoc techniques which provide recourse to affected individuals. These techniques generate recourses under the assumption that the underlying predictive model does not change. However, in practice, models are often updated regularly for a variety of reasons (e.g., dataset shifts), thereby rendering previously prescribed recourses ineffective. To address this problem, we propose a novel framework, RObust Algorithmic Recourse (ROAR), that leverages adversarial training to find recourses that are robust to model shifts. To the best of our knowledge, this work proposes the first solution to this critical problem. We also carry out a theoretical analysis which underscores the importance of constructing recourses that are robust to model shifts: 1) we quantify the probability of invalidation for recourses generated without accounting for model shifts; 2) we prove that the additional cost incurred by the robust recourses output by our framework is bounded. Experimental evaluation on multiple synthetic and real-world datasets demonstrates the efficacy of the proposed framework.
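The adversarial-training idea behind robust recourse can be sketched on a linear scorer, where the inner maximization over bounded model shifts has a closed form. Everything below (the function name, loss weights, and problem instance) is an illustrative assumption in the spirit of the framework, not the paper's implementation.

```python
import numpy as np

def robust_recourse(x, w, b, eps=0.3, lam=10.0, lr=0.05, steps=1000):
    """Toy sketch of recourse that stays valid under bounded model shifts.
    For a linear scorer f(x) = w.x + b, the worst shift with ||delta|| <= eps
    lowers the score by eps*||x'||, so the inner max is available in closed
    form; we then descend on a cost-plus-robust-validity objective."""
    xp = x.astype(float).copy()
    for _ in range(steps):
        norm = np.linalg.norm(xp) + 1e-12
        worst = w @ xp + b - eps * norm          # score under the worst-case shift
        sig = 1.0 / (1.0 + np.exp(worst))        # sigmoid(-worst)
        # gradient of ||xp - x||^2 + lam * softplus(-worst)
        grad = 2.0 * (xp - x) - lam * sig * (w - eps * xp / norm)
        xp -= lr * grad
    return xp

x = np.array([-1.0, -1.0])                # individual rejected by f(x) = x1 + x2
w, b = np.array([1.0, 1.0]), 0.0
xp = robust_recourse(x, w, b)
```

The returned point remains on the favorable side of the decision boundary even for the worst model within the shift ball, at the price of moving somewhat further than a non-robust recourse would.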

Networks of stiff fibers govern the elasticity of biological structures such as the extracellular matrix of collagen. These networks are known to stiffen nonlinearly under shear or extensional strain. Recently, it has been shown that such stiffening is governed by a strain-controlled, athermal but critical phase transition, from a floppy phase below the critical strain to a rigid phase above it. While this phase transition has been extensively studied numerically and experimentally, a complete analytical theory for it remains elusive. Here, we present an effective medium theory (EMT) for this mechanical phase transition of fiber networks. We extend a previous EMT appropriate for linear elasticity to incorporate nonlinear effects via an anharmonic Hamiltonian. The mean-field predictions of this theory, including the critical exponents, scaling relations, and non-affine fluctuations, qualitatively agree with previous experimental and numerical results.

In this paper, we propose Nesterov Accelerated Shuffling Gradient (NASG), a new algorithm for convex finite-sum minimization problems. Our method integrates the traditional Nesterov acceleration momentum with different shuffling sampling schemes. We show that our algorithm achieves an improved rate of O(1/T) using unified shuffling schemes, where T is the number of epochs. This rate is better than that of any other shuffling gradient method in the convex regime. Our convergence analysis requires neither a bounded-domain nor a bounded-gradient assumption. For randomized shuffling schemes, we improve the convergence bound further. Under a suitable initial condition, we show that our method converges faster within a small neighborhood of the solution. Numerical simulations demonstrate the efficiency of our algorithm.
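The combination of Nesterov-style momentum with per-epoch reshuffling can be sketched on a convex finite-sum least-squares problem. The hyperparameters (eta, beta) and the problem setup below are assumptions for the demo, not the schedule analyzed in the paper.

```python
import numpy as np

# Illustrative sketch: Nesterov momentum with random reshuffling
# on a convex finite-sum least-squares objective.
rng = np.random.default_rng(1)
n, d = 100, 4
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star                      # noiseless, so x_star is the minimizer

x = np.zeros(d)
x_prev = x.copy()
eta, beta = 0.005, 0.9
for epoch in range(200):
    for i in rng.permutation(n):    # reshuffle the component order each epoch
        y = x + beta * (x - x_prev)          # Nesterov look-ahead point
        g = (A[i] @ y - b[i]) * A[i]         # gradient of 0.5*(a_i^T y - b_i)^2
        x_prev, x = x, y - eta * g           # gradient step from the look-ahead

print(np.linalg.norm(x - x_star))
```

Each epoch visits every component exactly once in a fresh random order, which is the shuffling regime the abstract's O(1/T) epoch-count rate refers to.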

Abstract: We investigate error bounds for numerical solutions of divergence structure linear elliptic partial differential equations (PDEs) on compact manifolds without boundary. Our focus is on a class of monotone finite difference approximations, which provide a strong form of stability that guarantees the existence of a bounded solution. In many settings including the Dirichlet problem, it is easy to show that the resulting solution error is proportional to the formal consistency error of the scheme. We make the surprising observation that this need not be true for PDEs posed on compact manifolds without boundary. We propose a particular class of approximation schemes built around an underlying monotone scheme with consistency error $O(h^{\alpha })$. By carefully constructing barrier functions, we prove that the solution error is bounded by $O(h^{\alpha /(d+1)})$ in dimension $d$. We also provide a specific example where this predicted convergence rate is observed numerically. Using these error bounds, we further design a family of provably convergent approximations to the solution gradient.
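A monotone finite-difference discretization on a compact manifold without boundary can be demonstrated on the simplest case, the circle. The sketch below solves the well-posed divergence-structure equation -u'' + u = f periodically with the standard 3-point stencil, whose sign pattern makes it monotone; it is an illustrative toy, not one of the schemes constructed in the paper.

```python
import numpy as np

# Monotone 3-point scheme for -u'' + u = f on the circle [0, 2*pi):
# positive diagonal, negative off-diagonals, with periodic wrap-around.
def solve_on_circle(N):
    h = 2 * np.pi / N
    x = h * np.arange(N)
    f = 2 * np.sin(x)          # chosen so the exact solution is sin(x)
    A = ((2 / h**2 + 1) * np.eye(N)
         - (1 / h**2) * (np.roll(np.eye(N), 1, axis=1)
                         + np.roll(np.eye(N), -1, axis=1)))
    u = np.linalg.solve(A, f)
    return np.max(np.abs(u - np.sin(x)))   # sup-norm solution error

e1, e2 = solve_on_circle(40), solve_on_circle(80)
print(e1 / e2)   # halving h quarters the error for this well-posed toy
```

For this zeroth-order-term-stabilized toy the error tracks the O(h^2) consistency error directly; the paper's point is that for general problems on boundaryless manifolds this proportionality can fail, motivating the barrier-function analysis.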