

Title: Non-convex low-rank matrix recovery from corrupted random linear measurements
Recent work has demonstrated the effectiveness of gradient descent for recovering low-rank matrices from random linear measurements in a globally convergent manner. However, its performance is highly sensitive to outliers that may take arbitrary values, which are common in practice. In this paper, we propose a truncated gradient descent algorithm to improve robustness against outliers, where the truncation is performed to rule out the contributions of samples that deviate significantly from the sample median. A restricted isometry property with respect to the sample median is introduced to provide a theoretical footing for the proposed algorithm under the Gaussian orthogonal ensemble. Extensive numerical experiments validate the superior performance of the proposed algorithm.
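A minimal NumPy sketch of the median-truncation idea is below. It assumes symmetric rank-r measurements y_i ≈ ⟨A_i, UUᵀ⟩ with a few arbitrarily corrupted entries, uses a simple spectral-style initialization, and keeps only the samples whose residual magnitude stays within a constant multiple of the sample median; the step size, threshold, and factorized parameterization are illustrative choices, not the exact algorithm or constants analyzed in the paper.

```python
import numpy as np

def median_truncated_gd(y, A, r, step=0.002, thresh=3.0, iters=200):
    """Sketch: recover a PSD low-rank matrix X = U U^T from measurements
    y_i ~= <A_i, X>, a fraction of which carry arbitrary outliers.
    A has shape (m, n, n); y has shape (m,)."""
    m, n, _ = A.shape
    # spectral-style initialization from the (un-truncated) measurements
    S = np.einsum('i,ijk->jk', y, A) / m
    S = (S + S.T) / 2
    vals, vecs = np.linalg.eigh(S)
    U = vecs[:, -r:] * np.sqrt(np.maximum(vals[-r:], 1e-12))
    for _ in range(iters):
        X = U @ U.T
        resid = y - np.einsum('ijk,jk->i', A, X)
        # truncation: keep samples whose residual is comparable to the sample median
        scale = np.median(np.abs(resid))
        keep = np.abs(resid) <= thresh * scale
        # gradient of 0.5 * sum_kept (y_i - <A_i, U U^T>)^2 with respect to U
        G = -np.einsum('i,ijk->jk', resid[keep], A[keep])
        U = U - step * ((G + G.T) @ U) / max(keep.sum(), 1)
    return U @ U.T
```

The truncation set is recomputed at every iteration, so which samples are treated as outliers can change as the estimate improves.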
Award ID(s):
1650449
NSF-PAR ID:
10048415
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
2017 International Conference on Sampling Theory and Applications (SampTA)
Page Range / eLocation ID:
134 to 137
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Growth mixture modeling is a popular analytic tool for longitudinal data analysis. It detects latent groups based on the shapes of growth trajectories. Traditional growth mixture modeling assumes that outcome variables are normally distributed within each class. When data violate this normality assumption, however, it is well documented that traditional growth mixture modeling misleads researchers in determining the number of latent classes as well as in estimating parameters. To address nonnormal data in growth mixture modeling, robust methods based on various nonnormal distributions have been developed. As a new robust approach, growth mixture modeling based on conditional medians has been proposed. In this article, we present the results of two simulation studies that evaluate the performance of median-based growth mixture modeling in identifying the correct number of latent classes when data follow the normality assumption or have outliers. We also compared the performance of median-based growth mixture modeling to that of traditional growth mixture modeling and of robust growth mixture modeling based on t distributions. For identifying the number of latent classes in growth mixture modeling, the following three Bayesian model comparison criteria were considered: the deviance information criterion, the Watanabe-Akaike information criterion, and leave-one-out cross-validation. For median-based and t-based growth mixture modeling, our results showed that model selection accuracy remained high across all conditions in this study (ranging from 87% to 100%). In traditional growth mixture modeling, however, model selection accuracy was greatly influenced by the proportion of outliers: when the sample size was 500 and the proportion of outliers was 0.05, the correct model was preferred in about 90% of the replications, but the percentage dropped to about 40% as the proportion of outliers increased to 0.15.
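As a toy illustration of why conditional medians resist outliers (this is not the Bayesian growth mixture model evaluated in the study above), the NumPy snippet below contaminates a single linear growth trajectory with a small fraction of outlying subjects and compares the occasion-wise mean and median curves.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(5)                                   # five measurement occasions
n = 200
# one latent class with a linear growth trajectory: y = 1 + 0.5 t + noise
Y = 1.0 + 0.5 * t + rng.normal(0, 0.5, size=(n, len(t)))
# contaminate 10% of subjects with heavy-tailed outliers
idx = rng.choice(n, size=n // 10, replace=False)
Y[idx] += rng.normal(0, 10.0, size=(len(idx), len(t)))

print("true trajectory:    ", 1.0 + 0.5 * t)
print("sample mean curve:  ", Y.mean(axis=0).round(2))        # pulled by outliers
print("sample median curve:", np.median(Y, axis=0).round(2))  # stays near the truth
```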
  2. The mean squared error loss is widely used in many applications, including auto-encoders, multi-target regression, and matrix factorization, to name a few. Despite computational advantages due to its differentiability, it is not robust to outliers. In contrast, ℓp norms are known to be robust, but cannot be optimized via, e.g., stochastic gradient descent, as they are non-differentiable. We propose an algorithm inspired by so-called model-based optimization (MBO), which replaces a non-convex objective with a convex model function and alternates between optimizing the model function and updating the solution. We apply this to robust regression, proposing SADM, a stochastic variant of the Online Alternating Direction Method of Multipliers (OADM) to solve the inner optimization in MBO. We show that SADM converges with the rate O(log T / T). Finally, we demonstrate experimentally (a) the robustness of ℓp norms to outliers and (b) the efficiency of our proposed model-based algorithms in comparison with gradient methods on autoencoders and multi-target regression.
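The robustness contrast between the squared loss and ℓp losses can be reproduced with a small experiment. The sketch below is not the SADM/OADM solver from the paper; it fits a linear model with the ordinary least-squares (mean squared error) solution and with an ℓp objective (p = 1.2) solved by iteratively reweighted least squares, on data containing 10% gross outliers.

```python
import numpy as np

def fit_lp(X, y, p=1.2, iters=50, eps=1e-6):
    """Sketch: minimize sum_i |y_i - x_i^T w|^p by iteratively
    reweighted least squares (not the paper's SADM solver)."""
    w = np.linalg.lstsq(X, y, rcond=None)[0]        # start from the MSE fit
    for _ in range(iters):
        r = np.abs(y - X @ w) + eps
        W = r ** (p - 2)                            # per-sample IRLS weights
        w = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * y))
    return w

rng = np.random.default_rng(1)
X = np.c_[np.ones(300), rng.normal(size=(300, 2))]
w_true = np.array([1.0, 2.0, -3.0])
y = X @ w_true + 0.1 * rng.normal(size=300)
y[:30] += rng.normal(0, 50, size=30)                # 10% gross outliers

w_mse = np.linalg.lstsq(X, y, rcond=None)[0]
print("MSE fit error:", np.linalg.norm(w_mse - w_true))
print("l_p fit error:", np.linalg.norm(fit_lp(X, y) - w_true))
```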
  3. We present a mathematical analysis of a non-convex energy landscape for robust subspace recovery. We prove that an underlying subspace is the only stationary point and local minimizer in a specified neighborhood under a deterministic condition on a dataset. If the deterministic condition is satisfied, we further show that a geodesic gradient descent method over the Grassmannian manifold can exactly recover the underlying subspace when the method is properly initialized. Proper initialization by principal component analysis is guaranteed with a simple deterministic condition. Under slightly stronger assumptions, the gradient descent method with a piecewise constant step-size scheme achieves linear convergence. The practicality of the deterministic condition is demonstrated on some statistical models of data, and the method achieves almost state-of-the-art recovery guarantees on the Haystack Model for different regimes of sample size and ambient dimension. In particular, when the ambient dimension is fixed and the sample size is large enough, we show that our gradient method can exactly recover the underlying subspace for any fixed fraction of outliers (less than 1). 
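A compact sketch of the recovery scheme described above: initialize with PCA and descend the least-absolute-deviations energy F(V) = Σᵢ ||(I − VVᵀ)xᵢ|| over orthonormal bases V. A QR retraction stands in for the exact geodesic step on the Grassmannian, and the fixed step size replaces the paper's piecewise constant step-size scheme.

```python
import numpy as np

def robust_subspace(X, d, step=0.01, iters=300, eps=1e-8):
    """Sketch: descend F(V) = sum_i ||(I - V V^T) x_i|| over orthonormal
    bases V (D x d) of a candidate subspace; X has shape (n, D)."""
    # PCA-style initialization: top right singular vectors of the data
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:d].T
    for _ in range(iters):
        P = X @ V                                   # (n, d) coefficients
        R = X - P @ V.T                             # residuals (I - VV^T) x_i
        norms = np.linalg.norm(R, axis=1) + eps
        G = -(X / norms[:, None]).T @ P             # Euclidean gradient in V
        G = G - V @ (V.T @ G)                       # project to the tangent space
        V, _ = np.linalg.qr(V - step * G)           # retract back to orthonormal columns
    return V
```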
  4. Wasserstein gradient flows provide a powerful means of understanding and solving many diffusion equations. Specifically, Fokker-Planck equations, which model the diffusion of probability measures, can be understood as gradient descent over entropy functionals in Wasserstein space. This equivalence, introduced by Jordan, Kinderlehrer and Otto, inspired the so-called JKO scheme to approximate these diffusion processes via an implicit discretization of the gradient flow in Wasserstein space. Solving the optimization problem associated with each JKO step, however, presents serious computational challenges. We introduce a scalable method to approximate Wasserstein gradient flows, targeted to machine learning applications. Our approach relies on input-convex neural networks (ICNNs) to discretize the JKO steps, which can be optimized by stochastic gradient descent. Contrary to previous work, our method does not require domain discretization or particle simulation. As a result, we can sample from the measure at each time step of the diffusion and compute its probability density. We demonstrate the performance of our algorithm by computing diffusions following the Fokker-Planck equation and apply it to unnormalized density sampling as well as nonlinear filtering.
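The ingredient that makes the JKO steps tractable in the work above is the input-convex neural network. The NumPy forward pass below shows only the structural constraints that guarantee convexity of the output in the input (nonnegative weights on the hidden-to-hidden path and convex, nondecreasing activations); the actual method trains such a network with stochastic gradient descent to parameterize each JKO step, which this sketch does not attempt.

```python
import numpy as np

def softplus(z):
    return np.logaddexp(0.0, z)                 # convex, nondecreasing activation

def icnn_forward(x, Wx, Wz, b):
    """Forward pass of a small input-convex neural network (sketch).
    The output is convex in x because each hidden state is a convex,
    nondecreasing function of an affine map of x plus a nonnegatively
    weighted combination of the previous (convex) hidden state."""
    z = softplus(Wx[0] @ x + b[0])
    for k in range(1, len(Wx)):
        z = softplus(np.abs(Wz[k - 1]) @ z + Wx[k] @ x + b[k])  # abs() enforces nonnegativity
    return float(z.sum())                       # scalar potential, convex in x

# toy usage with random (untrained) weights
rng = np.random.default_rng(0)
dims = [3, 16, 16, 1]
Wx = [rng.normal(size=(dims[k + 1], dims[0])) for k in range(len(dims) - 1)]
Wz = [rng.normal(size=(dims[k + 1], dims[k])) for k in range(1, len(dims) - 1)]
b = [rng.normal(size=dims[k + 1]) for k in range(len(dims) - 1)]
print(icnn_forward(rng.normal(size=3), Wx, Wz, b))
```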
  5. Practical engineering designs typically involve many load cases. For topology optimization with many deterministic load cases, a large number of linear systems of equations must be solved at each optimization step, leading to an enormous computational cost. To address this challenge, we propose a mirror descent stochastic approximation (MD-SA) framework with various step size strategies to solve topology optimization problems with many load cases. We reformulate the deterministic objective function and gradient into stochastic ones through randomization, derive the MD-SA update, and develop algorithmic strategies. The proposed MD-SA algorithm requires only low accuracy in the stochastic gradient and thus uses only a single sample per optimization step (i.e., the sample size is always one). As a result, we reduce the number of linear systems to solve per step from hundreds to one, which drastically reduces the total computational cost, while maintaining a similar design quality. For example, for one of the design problems, the total number of linear systems to solve and wall clock time are reduced by factors of 223 and 22, respectively.
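The core of MD-SA is the combination of a single-sample stochastic gradient with a mirror-descent update. The sketch below assumes a user-supplied routine `grad_one_case(x, j)` (hypothetical here) that performs the finite element solve for load case j and returns the corresponding design gradient; the exponentiated-gradient form with a volume rescaling is one common mirror-map choice for box-constrained densities and is not necessarily the paper's exact update.

```python
import numpy as np

def md_sa_step(x, grad_one_case, m, step, volume_frac, rng):
    """One MD-SA step (sketch). Rather than summing adjoint gradients over
    all m load cases (m linear solves), sample one case j uniformly and use
    m * grad_j as an unbiased estimate of the full gradient (one solve)."""
    j = rng.integers(m)
    g = m * grad_one_case(x, j)     # grad_one_case: hypothetical FE solve + adjoint for case j
    # exponentiated-gradient (mirror descent) update on element densities,
    # rescaled so the material volume fraction stays approximately fixed
    x_new = x * np.exp(-step * g)
    x_new *= volume_frac * x.size / x_new.sum()
    return np.clip(x_new, 1e-3, 1.0)
```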