Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

We investigate the adversarial robustness of streaming algorithms. In this context, an algorithm is considered robust if its performance guarantees hold even if the stream is chosen adaptively by an adversary that observes the outputs of the algorithm along the stream and can react in an online manner. While deterministic streaming algorithms are inherently robust, many central problems in the streaming literature do not admit sublinearspace deterministic algorithms; on the other hand, classical spaceefficient randomized algorithms for these problems are generally not adversarially robust. This raises the natural question of whether there exist efficient adversarially robust (randomized) streaming algorithms for these problems. In this work, we show that the answer is positive for various important streaming problems in the insertiononly model, including distinct elements and more generally F p estimation, F p heavy hitters, entropy estimation, and others. For all of these problems, we develop adversarially robust (1+ε)approximation algorithms whose required space matches that of the best known nonrobust algorithms up to a poly(log n , 1/ε) multiplicative factor (and in some cases even up to a constant factor). Towards this end, we develop several generic tools allowing one to efficiently transform a nonrobust streaming algorithm into a robust onemore »Free, publiclyaccessible full text available April 30, 2023

We study the $\ell_p$ regression problem, which requires finding $\mathbf{x}\in\mathbb R^{d}$ that minimizes $\\mathbf{A}\mathbf{x}\mathbf{b}\_p$ for a matrix $\mathbf{A}\in\mathbb R^{n \times d}$ and response vector $\mathbf{b}\in\mathbb R^{n}$. There has been recent interest in developing subsampling methods for this problem that can outperform standard techniques when $n$ is very large. However, all known subsampling approaches have run time that depends exponentially on $p$, typically, $d^{\mathcal{O}(p)}$, which can be prohibitively expensive. We improve on this work by showing that for a large class of common \emph{structured matrices}, such as combinations of lowrank matrices, sparse matrices, and Vandermonde matrices, there are subsampling based methods for $\ell_p$ regression that depend polynomially on $p$. For example, we give an algorithm for $\ell_p$ regression on Vandermonde matrices that runs in time $\mathcal{O}(n\log^3 n+(dp^2)^{0.5+\omega}\cdot\text{polylog}\,n)$, where $\omega$ is the exponent of matrix multiplication. The polynomial dependence on $p$ crucially allows our algorithms to extend naturally to efficient algorithms for $\ell_\infty$ regression, via approximation of $\ell_\infty$ by $\ell_{\mathcal{O}(\log n)}$. Of practical interest, we also develop a new subsampling algorithm for $\ell_p$ regression for arbitrary matrices, which is simpler than previous approaches for $p \ge 4$.Free, publiclyaccessible full text available April 25, 2023

Free, publiclyaccessible full text available January 1, 2023

Free, publiclyaccessible full text available January 1, 2023

Free, publiclyaccessible full text available January 1, 2023

Free, publiclyaccessible full text available January 1, 2023

Free, publiclyaccessible full text available January 1, 2023

Free, publiclyaccessible full text available January 1, 2023

We consider algorithms with access to an unknown matrix M ε F n×d via matrixvector products , namely, the algorithm chooses vectors v 1 , ⃛ , v q , and observes Mv 1 , ⃛ , Mv q . Here the v i can be randomized as well as chosen adaptively as a function of Mv 1 , ⃛ , Mv i1 . Motivated by applications of sketching in distributed computation, linear algebra, and streaming models, as well as connections to areas such as communication complexity and property testing, we initiate the study of the number q of queries needed to solve various fundamental problems. We study problems in three broad categories, including linear algebra, statistics problems, and graph problems. For example, we consider the number of queries required to approximate the rank, trace, maximum eigenvalue, and norms of a matrix M; to compute the AND/OR/Parity of each column or row of M, to decide whether there are identical columns or rows in M or whether M is symmetric, diagonal, or unitary; or to compute whether a graph defined by M is connected or trianglefree. We also show separations for algorithms that are allowed to obtain matrixvector products onlymore »

Free, publiclyaccessible full text available January 1, 2023