Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

Free, publiclyaccessible full text available June 2, 2024

We present CausalSim, a causal framework for unbiased tracedriven simulation. Current tracedriven simulators assume that the interventions being simulated (e.g., a new algorithm) would not affect the validity of the traces. However, realworld traces are often biased by the choices algorithms make during trace collection, and hence replaying traces under an intervention may lead to incorrect results. CausalSim addresses this challenge by learning a causal model of the system dynamics and latent factors capturing the underlying system conditions during trace collection. It learns these models using an initial randomized control trial (RCT) under a fixed set of algorithms, and then applies them to remove biases from trace data when simulating new algorithms. Key to CausalSim is mapping unbiased tracedriven simulation to a tensor completion problem with extremely sparse observations. By exploiting a basic distributional invariance property present in RCT data, CausalSim enables a novel tensor completion method despite the sparsity of observations. Our extensive evaluation of CausalSim on both real and synthetic datasets, including more than ten months of real data from the Puffer video streaming system shows it improves simulation accuracy, reducing errors by 53% and 61% on average compared to expertdesigned and supervised learning baselines. Moreover, CausalSim provides markedly different insights about ABR algorithms compared to the biased baseline simulator, which we validate with a real deploymentmore » « less

We study the task of learning state representations from potentially highdimensional observations, with the goal of controlling an unknown partially observable system. We pursue a direct latent model learning approach, where a dynamic model in some latent state space is learned by predicting quantities directly related to planning (e.g., costs) without reconstructing the observations. In particular, we focus on an intuitive costdriven state representation learning method for solving Linear Quadratic Gaussian (LQG) control, one of the most fundamental partially observable control problems. As our main results, we establish finitesample guarantees of finding a nearoptimal state representation function and a nearoptimal controller using the directly learned latent model. To the best of our knowledge, despite various empirical successes, prior to this work it was unclear if such a costdriven latent model learner enjoys finitesample guarantees. Our work underscores the value of predicting multistep costs, an idea that is key to our theory, and notably also an idea that is known to be empirically valuable for learning state representations.more » « less

Sampling edges from a graph in sublinear time is a fundamental problem and a powerful subroutine for designing sublineartime algorithms. Suppose we have access to the vertices of the graph and know a constantfactor approximation to the number of edges. An algorithm for pointwise εapproximate edge sampling with complexity has been given by Eden and Rosenbaum [SOSA 2018]. This has been later improved by Tetek and Thorup [STOC 2022] to . At the same time, time is necessary. We close the problem, by giving an algorithm with complexity for the task of sampling an edge exactly uniformly.more » « less

Algorithmic Decorrelation and Planted Clique in Dependent Random Graphs: The Case of Extra TrianglesWe aim to understand the extent to which the noise distribution in a planted signalplusnoise problem impacts its computational complexity. To that end, we consider the planted clique and planted dense subgraph problems, but in a different ambient graph. Instead of Erd˝osR´enyi G(n, p), which has independent edges, we take the ambient graph to be the random graph with triangles (RGT) obtained by adding triangles to G(n, p). We show that the RGT can be efficiently mapped to the corresponding G(n, p), and moreover, that the planted clique (or dense subgraph) is approximately preserved under this mapping. This constitutes the first averagecase reduction transforming dependent noise to independent noise. Together with the easier direction of mapping the ambient graph from Erd˝osR´enyi to RGT, our results yield a strong equivalence between models. In order to prove our results, we develop a new general framework for reasoning about the validity of averagecase reductions based on low sensitivity to perturbations.more » « less

An εapproximate quantile sketch over a stream of n inputs approximates the rank of any query point q—that is, the number of input points less than q—up to an additive error of εn, generally with some probability of at least 1−1/ poly(n), while consuming o(n) space. While the celebrated KLL sketch of Karnin, Lang, and Liberty achieves a provably optimal quantile approximation algorithm over worstcase streams, the approximations it achieves in practice are often far from optimal. Indeed, the most commonly used technique in practice is Dunning’s tdigest, which often achieves much better approximations than KLL on realworld data but is known to have arbitrarily large errors in the worst case. We apply interpolation techniques to the streaming quantiles problem to attempt to achieve better approximations on realworld data sets than KLL while maintaining similar guarantees in the worst case.more » « less

We study statistical/computational tradeoffs for the following density estimation problem: given kdistributionsv1,...,vk overadiscretedomain of size n, and sampling access to a distribution p, identify vi that is “close” to p. Our main result is the first data structure that, given a sublinear (in n) number of samples from p, identifies vi in time sublinear in k. We also give an improved version of the algorithm of (Acharya et al., 2018) that reports vi in time linear in k. The experimental evaluation of the latter algorithm shows that it achieves a significant reduction in the number of operations needed to achieve a given accuracy compared to prior work.more » « less

Understanding the shape of a distribution of data is of interest to people in a great variety of fields, as it may affect the types of algorithms used for that data. We study one such problem in the framework of {\em distribution property testing}, characterizing the number of samples required to to distinguish whether a distribution has a certain property or is far from having that property. In particular, given samples from a distribution, we seek to characterize the tail of the distribution, that is, understand how many elements appear infrequently. We develop an algorithm based on a careful bucketing scheme that distinguishes lighttailed distributions from nonlighttailed ones with respect to a definition based on the hazard rate, under natural smoothness and ordering assumptions. We bound the number of samples required for this test to succeed with high probability in terms of the parameters of the problem, showing that it is polynomial in these parameters. Further, we prove a hardness result that implies that this problem cannot be solved without any assumptions.more » « less

Kernel matrices, as well as weighted graphs represented by them, are ubiquitous objects in machine learning, statistics and other related fields. The main drawback of using kernel methods (learning and inference using kernel matrices) is efficiency – given n input points, most kernelbased algorithms need to materialize the full n × n kernel matrix before performing any subsequent computation, thus incurring Ω(n^2) runtime. Breaking this quadratic barrier for various problems has therefore, been a subject of extensive research efforts. We break the quadratic barrier and obtain subquadratic time algorithms for several fundamental linearalgebraic and graph processing primitives, including approximating the top eigenvalue and eigenvector, spectral sparsification, solving lin ear systems, local clustering, lowrank approximation, arboricity estimation and counting weighted triangles. We build on the recently developed Kernel Density Estimation framework, which (after preprocessing in time subquadratic in n) can return estimates of row/column sums of the kernel matrix. In particular, we de velop efficient reductions from weighted vertex and weighted edge sampling on kernel graphs, simulating random walks on kernel graphs, and importance sampling on matrices to Kernel Density Estimation and show that we can generate samples from these distributions in sublinear (in the support of the distribution) time. Our reductions are the central ingredient in each of our applications and we believe they may be of independent interest. We empirically demonstrate the efficacy of our algorithms on lowrank approximation (LRA) and spectral sparsi fication, where we observe a 9x decrease in the number of kernel evaluations over baselines for LRA and a 41x reduction in the graph size for spectral sparsification.more » « less