Abstract We consider the problem of estimating the factors of a rank-$$1$$ matrix with i.i.d. Gaussian, rank-$$1$$ measurements that are nonlinearly transformed and corrupted by noise. Considering two prototypical choices for the nonlinearity, we study the convergence properties of a natural alternating update rule for this non-convex optimization problem starting from a random initialization. We show sharp convergence guarantees for a sample-split version of the algorithm by deriving a deterministic one-step recursion that is accurate even in high-dimensional problems. Notably, while the infinite-sample population update is uninformative and suggests exact recovery in a single step, the algorithm—and our deterministic one-step prediction—converges geometrically fast from a random initialization. Our sharp, non-asymptotic analysis also exposes several other fine-grained properties of this problem, including how the nonlinearity and noise level affect convergence behaviour. On a technical level, our results are enabled by showing that the empirical error recursion can be predicted by our deterministic one-step updates within fluctuations of the order $$n^{-1/2}$$ when each iteration is run with $$n$$ observations. Our technique leverages leave-one-out tools originating in the literature on high-dimensional $$M$$-estimation and provides an avenue for sharply analyzing complex iterative algorithms from a random initialization in other high-dimensional optimization problems with random data. 
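As a rough illustration of the sample-split alternating update described in this abstract, the sketch below runs alternating least squares from a random initialization in the simplest setting. It is not the authors' implementation. It assumes an identity nonlinearity, unit-norm factors, small additive Gaussian noise, and a fresh batch of Gaussian measurements for every half-update; the function names and all parameter values are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def ls_update(feats, other, fixed, y):
    """Minimize sum_i (y_i - <other_i, fixed> * <feats_i, u>)^2 over u (normal equations)."""
    d = len(feats[0])
    gram = [[0.0] * d for _ in range(d)]
    rhs = [0.0] * d
    for f, o, yi in zip(feats, other, y):
        a = dot(o, fixed)  # scalar weight contributed by the factor held fixed
        for j in range(d):
            rhs[j] += a * yi * f[j]
            for k in range(d):
                gram[j][k] += a * a * f[j] * f[k]
    return solve(gram, rhs)


def sample_batch(rng, n, d, mu_star, nu_star, sigma):
    """Draw n fresh measurements y_i = <x_i, mu*> <z_i, nu*> + noise (identity link assumed)."""
    X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    Z = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    y = [dot(x, mu_star) * dot(z, nu_star) + sigma * rng.gauss(0, 1)
         for x, z in zip(X, Z)]
    return X, Z, y


def alternating_minimization(d=5, n=300, iters=10, sigma=0.05, seed=0):
    """Return |cos angle(mu_t, mu*)| after each iteration, starting from a random guess."""
    rng = random.Random(seed)
    mu_star = normalize([rng.gauss(0, 1) for _ in range(d)])
    nu_star = normalize([rng.gauss(0, 1) for _ in range(d)])
    mu = normalize([rng.gauss(0, 1) for _ in range(d)])  # random initialization
    nu = normalize([rng.gauss(0, 1) for _ in range(d)])
    history = []
    for _ in range(iters):
        X, Z, y = sample_batch(rng, n, d, mu_star, nu_star, sigma)  # fresh batch
        mu = ls_update(X, Z, nu, y)   # update mu with nu held fixed
        X, Z, y = sample_batch(rng, n, d, mu_star, nu_star, sigma)  # fresh batch
        nu = ls_update(Z, X, mu, y)   # update nu with mu held fixed
        history.append(abs(dot(normalize(mu), mu_star)))
    return history


if __name__ == "__main__":
    for t, c in enumerate(alternating_minimization(), start=1):
        print(f"iteration {t}: |cos| = {c:.4f}")
```

The absolute cosine is tracked because the factors are only identifiable up to a joint sign and scale. Drawing new data for each update is what makes the iterate independent of the batch it is fitted to, which is the property the sample-split analysis relies on.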
                    
                            
                            Composite optimization for robust rank one bilinear sensing
                        
                    
    
            Abstract We consider the task of recovering a pair of vectors from a set of rank one bilinear measurements, possibly corrupted by noise. Most notably, the problem of robust blind deconvolution can be modeled in this way. We consider a natural nonsmooth formulation of the rank one bilinear sensing problem and show that its moduli of weak convexity, sharpness and Lipschitz continuity are all dimension independent, under favorable statistical assumptions. This phenomenon persists even when up to half of the measurements are corrupted by noise. Consequently, standard algorithms, such as the subgradient and prox-linear methods, converge at a rapid dimension-independent rate when initialized within a constant relative error of the solution. We complete the paper with a new initialization strategy, complementing the local search algorithms. The initialization procedure is both provably efficient and robust to outlying measurements. Numerical experiments, on both simulated and real data, illustrate the developed theory and methods. 
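To make the nonsmooth formulation concrete, here is a minimal sketch of a subgradient method on the objective f(w, x) = (1/m) Σ |⟨l_i, w⟩⟨r_i, x⟩ − b_i|. It is not the paper's code. It assumes noiseless Gaussian measurements, so the minimum value is 0 and a Polyak step size can be used, and it starts from a small perturbation of the truth as a stand-in for the initialization procedure mentioned in the abstract. All names and parameter values are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def loss_and_subgradient(L, R, b, w, x):
    """Return f(w, x) = mean |<l_i, w><r_i, x> - b_i| and one subgradient in w and x."""
    m = len(b)
    loss = 0.0
    gw = [0.0] * len(w)
    gx = [0.0] * len(x)
    for l, r, bi in zip(L, R, b):
        lw, rx = dot(l, w), dot(r, x)
        res = lw * rx - bi
        loss += abs(res) / m
        s = (res > 0) - (res < 0)  # sign of the residual (0 at a kink)
        for j in range(len(w)):
            gw[j] += s * rx * l[j] / m
        for j in range(len(x)):
            gx[j] += s * lw * r[j] / m
    return loss, gw, gx


def polyak_subgradient(L, R, b, w, x, iters=200, tol=1e-12):
    """Subgradient method with Polyak step f / ||g||^2, valid when min f = 0."""
    for _ in range(iters):
        loss, gw, gx = loss_and_subgradient(L, R, b, w, x)
        g2 = dot(gw, gw) + dot(gx, gx)
        if loss < tol or g2 == 0.0:
            break
        step = loss / g2
        w = [wi - step * gi for wi, gi in zip(w, gw)]
        x = [xi - step * gi for xi, gi in zip(x, gx)]
    return w, x


def outer_error(w, x, w_star, x_star):
    """Frobenius norm of w x^T - w* x*^T, computed from inner products."""
    sq = (dot(w, w) * dot(x, x) + dot(w_star, w_star) * dot(x_star, x_star)
          - 2.0 * dot(w, w_star) * dot(x, x_star))
    return math.sqrt(max(sq, 0.0))


def run_demo(d=5, m=200, seed=0, radius=0.1):
    """Return (initial error, final error) of the rank-one estimate."""
    rng = random.Random(seed)
    w_star = normalize([rng.gauss(0, 1) for _ in range(d)])
    x_star = normalize([rng.gauss(0, 1) for _ in range(d)])
    L = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(m)]
    R = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(m)]
    b = [dot(l, w_star) * dot(r, x_star) for l, r in zip(L, R)]  # noiseless
    # Assumed warm start: the truth plus a perturbation of norm `radius`.
    dw = normalize([rng.gauss(0, 1) for _ in range(d)])
    dx = normalize([rng.gauss(0, 1) for _ in range(d)])
    w0 = [a + radius * e for a, e in zip(w_star, dw)]
    x0 = [a + radius * e for a, e in zip(x_star, dx)]
    w, x = polyak_subgradient(L, R, b, w0, x0)
    return outer_error(w0, x0, w_star, x_star), outer_error(w, x, w_star, x_star)


if __name__ == "__main__":
    e0, e1 = run_demo()
    print(f"initial error {e0:.3e}, final error {e1:.3e}")
```

The error is measured on the outer product because the pair (w, x) is only determined up to the rescaling (αw, x/α). With corrupted measurements the minimum value is no longer known to be 0, so the Polyak step used here would have to be replaced by another step-size rule.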
        
    
    
- PAR ID: 10233390
- Date Published:
- Journal Name: Information and Inference: A Journal of the IMA
- ISSN: 2049-8772
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Finding an approximate second-order stationary point (SOSP) is a well-studied and fundamental problem in stochastic nonconvex optimization with many applications in machine learning. However, this problem is poorly understood in the presence of outliers, limiting the use of existing nonconvex algorithms in adversarial settings. In this paper, we study the problem of finding SOSPs in the strong contamination model, where a constant fraction of datapoints are arbitrarily corrupted. We introduce a general framework for efficiently finding an approximate SOSP with dimension-independent accuracy guarantees, using $$\widetilde{O}({D^2}/{\epsilon})$$ samples where $$D$$ is the ambient dimension and $$\epsilon$$ is the fraction of corrupted datapoints. As a concrete application of our framework, we apply it to the problem of low rank matrix sensing, developing efficient and provably robust algorithms that can tolerate corruptions in both the sensing matrices and the measurements. In addition, we establish a Statistical Query lower bound providing evidence that the quadratic dependence on $$D$$ in the sample complexity is necessary for computationally efficient algorithms.
- Finding an approximate second-order stationary point (SOSP) is a well-studied and fundamental problem in stochastic nonconvex optimization with many applications in machine learning. However, this problem is poorly understood in the presence of outliers, limiting the use of existing nonconvex algorithms in adversarial settings. In this paper, we study the problem of finding SOSPs in the strong contamination model, where a constant fraction of datapoints are arbitrarily corrupted. We introduce a general framework for efficiently finding an approximate SOSP with dimension-independent accuracy guarantees, using $$O(D^2/\epsilon)$$ samples where $$D$$ is the ambient dimension and $$\epsilon$$ is the fraction of corrupted datapoints. As a concrete application of our framework, we apply it to the problem of low rank matrix sensing, developing efficient and provably robust algorithms that can tolerate corruptions in both the sensing matrices and the measurements. In addition, we establish a Statistical Query lower bound providing evidence that the quadratic dependence on $$D$$ in the sample complexity is necessary for computationally efficient algorithms.
- Robust PCA is a widely used statistical procedure for recovering an underlying low-rank matrix from grossly corrupted observations. This work considers the problem of robust PCA as a nonconvex optimization problem on the manifold of low-rank matrices and proposes two algorithms based on manifold optimization. It is shown that, with a properly designed initialization, the proposed algorithms are guaranteed to converge linearly to the underlying low-rank matrix. Compared with previous work based on the factorization of low-rank matrices (Yi et al., 2016), the proposed algorithms theoretically reduce the dependence on the condition number of the underlying low-rank matrix. Simulations and real data examples confirm the competitive performance of our method.
- We study the problem of estimating a large, low-rank matrix corrupted by additive noise of unknown covariance, assuming one has access to additional side information in the form of noise-only measurements. We study the Whiten-Shrink-reColour (WSC) workflow, where a ‘noise covariance whitening’ transformation is applied to the observations, followed by appropriate singular value shrinkage and a ‘noise covariance re-colouring’ transformation. We show that under the mean square error loss, a unique, asymptotically optimal shrinkage nonlinearity exists for the WSC denoising workflow, and calculate it in closed form. To this end, we calculate the asymptotic eigenvector rotation of the random spiked F-matrix ensemble, a result which may be of independent interest. With sufficiently many pure-noise measurements, our optimally tuned WSC denoising workflow outperforms, in mean square error, matrix denoising algorithms based on optimal singular value shrinkage that do not make similar use of noise-only side information; numerical experiments show that our procedure’s relative performance is particularly strong in challenging statistical settings with high dimensionality and large degree of heteroscedasticity.
 An official website of the United States government