Distributed Least Squares in Small Space via Sketching and Bias Reduction

Garg, Sachin; Tan, Kevin; Dereziński, Michał

Matrix sketching is a powerful tool for reducing the size of large data matrices. Yet there are fundamental limitations to this size reduction when we want to recover an accurate estimator for a task such as least squares regression. We show that these limitations can be circumvented in the distributed setting by designing sketching methods that minimize the bias of the estimator, rather than its error. In particular, we give a sparse sketching method running in optimal space and current matrix multiplication time, which recovers a nearly-unbiased least squares estimator using two passes over the data. This leads to new communication-efficient distributed averaging algorithms for least squares and related tasks, which directly improve on several prior approaches. Our key novelty is a new bias analysis for sketched least squares, giving a sharp characterization of its dependence on the sketch sparsity. The techniques include new higher moment restricted Bai-Silverstein inequalities, which are of independent interest to the non-asymptotic analysis of deterministic equivalents for random matrices that arise from sketching.
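
The general idea of averaging sketched least squares estimators can be illustrated with a minimal sketch. This is not the paper's algorithm: the sparse sketch below is a generic one (a few random ±1 entries per column), and the function names and parameters (`sparse_sketch`, `sketched_lstsq`, `averaged_sketched_lstsq`, `nnz_per_col`, `q`) are hypothetical, chosen only for illustration. Each of `q` simulated machines solves a small sketched problem, and the estimates are averaged. Averaging reduces variance but not bias, which is why a nearly-unbiased sketched estimator matters in this setting.

```python
import numpy as np


def sparse_sketch(m, n, nnz_per_col, rng):
    """Build an m x n sparse sketching matrix (illustrative, not the paper's).

    Each column receives nnz_per_col entries of +-1/sqrt(nnz_per_col) at
    uniformly random rows; row collisions within a column simply add up.
    """
    rows = rng.integers(0, m, size=(nnz_per_col, n))
    signs = rng.choice([-1.0, 1.0], size=(nnz_per_col, n))
    cols = np.broadcast_to(np.arange(n), (nnz_per_col, n))
    S = np.zeros((m, n))
    # Unbuffered accumulation so repeated (row, col) pairs are summed.
    np.add.at(S, (rows, cols), signs / np.sqrt(nnz_per_col))
    return S


def sketched_lstsq(A, b, m, nnz_per_col, rng):
    """Solve min_x ||S A x - S b||_2 for one freshly drawn sketch S."""
    S = sparse_sketch(m, A.shape[0], nnz_per_col, rng)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x


def averaged_sketched_lstsq(A, b, m, nnz_per_col, q, seed=0):
    """Average q independent sketched estimates (one per simulated machine)."""
    rng = np.random.default_rng(seed)
    estimates = [sketched_lstsq(A, b, m, nnz_per_col, rng) for _ in range(q)]
    return np.mean(estimates, axis=0)
```

In a real distributed deployment each machine would return only its `d`-dimensional estimate, so the communication cost per machine is independent of the number of data rows.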

Award ID(s): 2338655

PAR ID: 10582040

Author(s) / Creator(s): Garg, Sachin; Tan, Kevin; Dereziński, Michał

Publisher / Repository: Advances in Neural Information Processing Systems (NeurIPS 2024)

Date Published: 2024-12-10

Volume: 37

Page Range / eLocation ID: 73745-73782

Format(s): Medium: X

Location: Vancouver, Canada

Sponsoring Org: National Science Foundation
