skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 11:00 PM ET on Friday, May 2 until 12:00 AM ET on Saturday, May 3 due to maintenance. We apologize for the inconvenience.


Title: Learning Linear Models Using Distributed Iterative Hessian Sketching
This work considers the problem of learning the Markov parameters of a linear system from ob- served data. Recent non-asymptotic system identification results have characterized the sample complexity of this problem in the single and multi-rollout setting. In both instances, the number of samples required in order to obtain acceptable estimates can produce optimization problems with an intractably large number of decision variables for a second-order algorithm. We show that a randomized and distributed Newton algorithm based on Hessian-sketching can produce ε-optimal solutions and converges geometrically. Moreover, the algorithm is trivially parallelizable. Our re- sults hold for a variety of sketching matrices and we illustrate the theory with numerical examples.  more » « less
Award ID(s):
2144634
PAR ID:
10390333
Author(s) / Creator(s):
;
Editor(s):
Firoozi, R.; Mehr, N.; Yel, E.; Antonova, R; Bohg, J.; Schwager, M.; Kochenderfer, M.
Date Published:
Journal Name:
Proceedings of Machine Learning Research
Volume:
168
ISSN:
2640-3498
Page Range / eLocation ID:
1-14
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration. Randomized sketching has emerged as a powerful technique for constructing estimates of the Hessian which can be used to perform approximate Newton steps. This involves multiplication by a random sketching matrix, which introduces a trade-off between the computational cost of sketching and the convergence rate of the optimization algorithm. A theoretically desirable but practically much too expensive choice is to use a dense Gaussian sketching matrix, which produces unbiased estimates of the exact Newton step and which offers strong problem-independent convergence guarantees. We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching, without substantially affecting its convergence properties. This approach, called Newton LESS, is based on a recently introduced sketching technique: LEverage Score Sparsified (LESS) embeddings. We prove that Newton-LESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings, not just up to constant factors but even down to lower order terms, for a large class of optimization tasks. In particular, this leads to a new state-of-the-art convergence result for an iterative least squares solver. Finally, we extend LESS embeddings to include uniformly sparsified random sign matrices which can be implemented efficiently and which perform well in numerical experiments. 
    more » « less
  2. Abstract Linear regression is a classic method of data analysis. In recent years, sketching—a method of dimension reduction using random sampling, random projections or both—has gained popularity as an effective computational approximation when the number of observations greatly exceeds the number of variables. In this paper, we address the following question: how does sketching affect the statistical properties of the solution and key quantities derived from it? To answer this question, we present a projector-based approach to sketched linear regression that is exact and that requires minimal assumptions on the sketching matrix. Therefore, downstream analyses hold exactly and generally for all sketching schemes. Additionally, a projector-based approach enables derivation of key quantities from classic linear regression that account for the combined model- and algorithm-induced uncertainties. We demonstrate the usefulness of a projector-based approach in quantifying and enabling insight on excess uncertainties and bias-variance decompositions for sketched linear regression. Finally, we demonstrate how the insights from our projector-based analyses can be used to produce practical sketching diagnostics to aid the design of judicious sketching schemes. 
    more » « less
  3. We present a new class of preconditioned iterative methods for solving linear systems of the form Ax = b. Our methods are based on constructing a low-rank Nyström approximation to A using sparse random matrix sketching. This approximation is used to construct a preconditioner, which itself is inverted quickly using additional levels of random sketching and preconditioning. We prove that the convergence of our methods depends on a natural average condition number of A, which improves as the rank of the Nyström approximation increases. Concretely, this allows us to obtain faster runtimes for a number of fundamental linear algebraic problems: 1. We show how to solve any n × n linear system that is well-conditioned except for k outlying large singular values in Õ (n2.065 + kω) time, improving on a recent result of [Derezmski, Yang, STOC 2024] for all k ≳ n0.78. 2. We give the first Õ (n2 + dλω) time algorithm for solving a regularized linear system (A+λΙ)x = b, where A is positive semidefinite with effective dimension dλ = tr(A(A + λΙ)-1). This problem arises in applications like Gaussian process regression. 3. We give faster algorithms for approximating Schatten p-norms and other matrix norms. For example, for the Schatten 1-norm (nuclear norm), we give an algorithm that runs in Õ (n2.11) time, improving on an Õ (n2.18) method of [Musco et al., ITCS 2018]. All results are proven in the real RAM model of computation. Interestingly, previous state-of-the-art algorithms for most of the problems above relied on stochastic iterative methods, like stochastic coordinate and gradient descent. Our work takes a completely different approach, instead leveraging tools from matrix sketching. 
    more » « less
  4. We present a new class of preconditioned iterative methods for solving linear systems of the form Ax=b. Our methods are based on constructing a low-rank Nyström approximation to A using sparse random matrix sketching. This approximation is used to construct a preconditioner, which itself is inverted quickly using additional levels of random sketching and preconditioning. We prove that the convergence of our methods depends on a natural average condition number of A, which improves as the rank of the Nyström approximation increases. Concretely, this allows us to obtain faster runtimes for a number of fundamental linear algebraic problems: 1. We show how to solve any n×n linear system that is well-conditioned except for k outlying large singular values in O~(n^2.065+k^ω) time, improving on a recent result of [Dereziński, Yang, STOC 2024] for all k≳n^0.78. 2. We give the first O~(n^2+d_λ^ω) time algorithm for solving a regularized linear system (A+λI)x=b, where A is positive semidefinite with effective dimension d_λ=tr(A(A+λI)^{−1}). This problem arises in applications like Gaussian process regression. 3. We give faster algorithms for approximating Schatten p-norms and other matrix norms. For example, for the Schatten 1-norm (nuclear norm), we give an algorithm that runs in O~(n ^{2.11}) time, improving on an O~(n ^{2.18}) method of [Musco et al., ITCS 2018]. All results are proven in the real RAM model of computation. Interestingly, previous state-of-the-art algorithms for most of the problems above relied on stochastic iterative methods, like stochastic coordinate and gradient descent. Our work takes a completely different approach, instead leveraging tools from matrix sketching. 
    more » « less
  5. Perspective sketching is a skill that is required for a variety of jobs including, but not limited to, architectural design, graphic design, and engineering. Sketching however, is a difficult skill to grasp for people early and can take a while to learn. Recently, there have been many intelligent tutoring systems (ITSs) designed to help improve people’s drawing skills. The feedback system for the perspective drawing lessons in SketchTivity, one such ITS, is currently limited to smoothness, speed, and accuracy of the lines. Our team plans to improve upon this feedback system so that the feedback provided to a user is now more nuanced as well as more actionable to reaffirm future learning. To evaluate our system we will conduct a user study with 40 students that involves going through several sketching lessons and then sketching a street corner in 2-D perspective. We plan to run a between-subjects user study with our participants to determine if our adjustment has any effect on the improvement of sketching skills and the usability of the application. We hope to determine that providing the user with data for their smoothness, speed, and accuracy after four sketching prompts can cause an overall improvement in the students’ scores in comparison to at the end of their sketching session. The algorithm that we created to identify a student’s potential issue we hope will be able to provide accurate, actionable feedback in most situations. The visual alterations we made to SketchTivity we expect to have a positive impact on the perspective feedback system and alter the students sketching performance. In future iterations the algorithm should be further refined and the data collected from the students sketches should be further developed to provide more data to create more actionable recommendations for improved sketching performance and retention. 
    more » « less