This work studies our recently developed algorithm, the decentralized alternating projected gradient descent algorithm (Dec-AltGDmin), for recovering a low rank (LR) matrix from independent columnwise linear projections in a decentralized setting. Decentralized means that the observed data is spread across L agents and there is no central coordinating node. Because this problem is non-convex and involves a subspace recovery step, most of the existing decentralized optimization literature does not apply. Using extensive numerical simulations and comparisons of communication, time, and sample complexity, we demonstrate that (i) existing decentralized gradient descent (GD) approaches fail, and (ii) other common solution approaches from the LR recovery literature (projected GD, alternating GD, and alternating minimization (AltMin)) have either a higher communication (and time) complexity or a higher sample complexity. Communication complexity is often the most important concern in decentralized learning.
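To make the problem setting concrete, the following is a minimal sketch of how synthetic data for LR matrix recovery from independent columnwise linear projections could be generated and spread across L agents. The sizes, the Gaussian measurement matrices, the symbols `U`, `B`, `A`, `Y`, and the choice to partition columns evenly across agents are all illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n x q matrix of rank r, m measurements per column, L agents.
n, q, r, m, L = 30, 12, 2, 10, 4

# Low rank matrix X = U B, with U an n x r orthonormal basis (assumed model).
U = np.linalg.qr(rng.standard_normal((n, r)))[0]
B = rng.standard_normal((r, q))
X = U @ B

# Independent measurement matrix A[k] (m x n) for every column k (assumed Gaussian).
A = rng.standard_normal((q, m, n))

# Columnwise projections: Y[:, k] = A[k] @ X[:, k].
Y = np.einsum("kmn,nk->mk", A, X)

# Assumption: each agent holds the measurements of a disjoint subset of columns.
agent_columns = np.array_split(np.arange(q), L)
agent_data = [(A[cols], Y[:, cols]) for cols in agent_columns]
```

Each agent sees only its own `(A[cols], Y[:, cols])` pair, so any estimate of the shared column space `U` has to be agreed on by exchanging information with neighbors.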
Comparing Decentralized Gradient Descent Approaches and Guarantees
This work studies our recently developed decentralized alternating projected gradient descent algorithm, called Dec-AltProjGDmin, for solving the following low-rank (LR) matrix recovery problem: recover an LR matrix from independent column-wise linear projections (LR column-wise Compressive Sensing). In recent work, we presented constructive convergence guarantees for Dec-AltProjGDmin under simple assumptions. By "constructive", we mean that a lower bound on the convergence time needed to achieve any error level ε is provided. However, our guarantee was stated for the equal neighbor consensus algorithm (at each iteration, each node computes the average of the data of all its neighbors), while most existing results do not assume a specific consensus algorithm and instead state guarantees in terms of the eigenvalues of the weights matrix. To compare with these results, we first modify our result to be in this form. Our second and main contribution is a theoretical and experimental comparison of our new result with the best existing one from the decentralized GD literature that also provides a convergence time bound for values of ε that are large enough. The existing guarantee is for a different problem setting and holds under different assumptions than ours, and hence the comparison is not clear-cut. However, we are not aware of any other provably correct algorithms for decentralized LR matrix recovery in any other settings either.
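The equal neighbor consensus step and its link to the weights matrix eigenvalues can be illustrated with a minimal sketch. It assumes a ring network, scalar data at each node, and that each node counts itself among its neighbors; the function names are hypothetical and this is not the Dec-AltProjGDmin implementation.

```python
import numpy as np

def ring_adjacency(num_nodes):
    """Adjacency matrix of a ring: node i is linked to nodes i-1 and i+1."""
    A = np.zeros((num_nodes, num_nodes))
    for i in range(num_nodes):
        A[i, (i + 1) % num_nodes] = 1.0
        A[i, (i - 1) % num_nodes] = 1.0
    return A

def equal_neighbor_weights(adjacency):
    """Weights matrix W in which every node averages itself and its neighbors.

    Assumption: a self-loop is added, so each row of W has equal entries
    1 / (degree + 1) on the node and its neighbors.
    """
    A = np.asarray(adjacency, dtype=float)
    A = A + np.eye(A.shape[0])
    return A / A.sum(axis=1, keepdims=True)

def run_consensus(W, values, num_rounds):
    """Apply num_rounds of neighbor averaging: z <- W z."""
    z = np.asarray(values, dtype=float)
    for _ in range(num_rounds):
        z = W @ z
    return z

def second_largest_eig_magnitude(W):
    """Second largest eigenvalue magnitude of W; smaller means faster mixing."""
    magnitudes = np.sort(np.abs(np.linalg.eigvals(W)))[::-1]
    return magnitudes[1]

num_nodes = 6
W = equal_neighbor_weights(ring_adjacency(num_nodes))
x0 = np.arange(num_nodes, dtype=float)      # one scalar per node
gamma = second_largest_eig_magnitude(W)
z = run_consensus(W, x0, num_rounds=50)     # every entry approaches x0.mean()
```

On a regular graph such as this ring, `W` is symmetric and doubly stochastic, so the nodes agree on the true average and the disagreement after t rounds shrinks by about `gamma**t`. This is why a guarantee written in terms of the eigenvalues of `W` covers the equal neighbor rule as one special case among other weight choices.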
- Award ID(s): 2213069
- PAR ID: 10437390
- Date Published:
- Journal Name: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- Page Range / eLocation ID: 1 to 5
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- This paper introduces a novel framework called Mode-wise Principal Subspace Pursuit (MOP-UP) to extract hidden variations in both the row and column dimensions for matrix data. To enhance the understanding of the framework, we introduce a class of matrix-variate spiked covariance models that serve as inspiration for the development of the MOP-UP algorithm. The MOP-UP algorithm consists of two steps: Average Subspace Capture (ASC) and Alternating Projection. These steps are specifically designed to capture the row-wise and column-wise dimension-reduced subspaces which contain the most informative features of the data. ASC utilizes a novel average projection operator as initialization and achieves exact recovery in the noiseless setting. We analyse the convergence and non-asymptotic error bounds of MOP-UP, introducing a blockwise matrix eigenvalue perturbation bound that proves the desired bound, where classic perturbation bounds fail. The effectiveness and practical merits of the proposed framework are demonstrated through experiments on both simulated and real datasets. Lastly, we discuss generalizations of our approach to higher-order data.
- We study the low rank regression problem y = Mx + ε, where x and y are d1 and d2 dimensional vectors respectively. We consider the extreme high-dimensional setting where the number of observations n is less than d1 + d2. Existing algorithms are designed for settings where n is typically as large as rank(M)(d1+d2). This work provides an efficient algorithm which only involves two SVDs, and establishes statistical guarantees on its performance. The algorithm decouples the problem by first estimating the precision matrix of the features, and then solving the matrix de-noising problem. To complement the upper bound, we introduce new techniques for establishing lower bounds on the performance of any algorithm for this problem. Our preliminary experiments confirm that our algorithm often outperforms existing baselines, and is always at least competitive.
- This work considers two related learning problems in a federated attack-prone setting: federated principal components analysis (PCA) and federated low rank column-wise sensing (LRCS). The node attacks are assumed to be Byzantine, which means that the attackers are omniscient and can collude. We introduce a novel provably Byzantine-resilient communication-efficient and sample-efficient algorithm, called Subspace-Median, that solves the PCA problem and is a key part of the solution for the LRCS problem. We also study the most natural Byzantine-resilient solution for federated PCA, a geometric median based modification of the federated power method, and explain why it is not useful. Our second main contribution is a complete alternating gradient descent (GD) and minimization (altGDmin) algorithm for Byzantine-resilient horizontally federated LRCS and sample and communication complexity guarantees for it. Extensive simulation experiments are used to corroborate our theoretical guarantees. The ideas that we develop for LRCS are easily extendable to other LR recovery problems as well.
- Estimating the ε-approximate quantiles or ranks of a stream is a fundamental task in data monitoring. Given a stream x_1, ..., x_n from a universe U with total order, an additive-error quantile sketch M allows us to approximate the rank of any query y ∈ U up to additive εn error. In 2001, Greenwald and Khanna gave a deterministic algorithm (GK sketch) that solves the ε-approximate quantiles estimation problem using O(ε^{-1} log(εn)) space [greenwald2001space]; this algorithm was shown to be optimal by Cormode and Veselý in 2020 [cormode2020tight]. However, due to the intricacy of the GK sketch and its analysis, over-simplified versions of the algorithm are implemented in practical applications, often without any known theoretical guarantees. In fact, it has remained an open question whether the GK sketch can be simplified while maintaining the optimal space bound. In this paper, we resolve this open question by giving a simplified deterministic algorithm that stores at most (2 + o(1)) ε^{-1} log(εn) elements and solves the additive-error quantile estimation problem; as a side benefit, our algorithm achieves a smaller constant factor than the (11/2) ε^{-1} log(εn) space bound in the original GK sketch [greenwald2001space]. Our algorithm features an easier analysis and still achieves the same optimal asymptotic space complexity as the original GK sketch. Lastly, our simplification enables an efficient data structure implementation, with a worst-case runtime of O(log(1/ε) + log log(εn)) per element for the ordinary ε-approximate quantile estimation problem. Also, for the related "weighted" quantile estimation problem, we give efficient data structures for our simplified algorithm which guarantee a worst-case per-element runtime of O(log(1/ε) + log log(εW_n/w_min)), achieving an improvement over the previous upper bound of [assadi2023generalizing].

