skip to main content


Search for: All records

Award ID contains: 2022448

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available June 2, 2024
  2. We present CausalSim, a causal framework for unbiased trace-driven simulation. Current trace-driven simulators assume that the interventions being simulated (e.g., a new algorithm) would not affect the validity of the traces. However, real-world traces are often biased by the choices algorithms make during trace collection, and hence replaying traces under an intervention may lead to incorrect results. CausalSim addresses this challenge by learning a causal model of the system dynamics and latent factors capturing the underlying system conditions during trace collection. It learns these models using an initial randomized control trial (RCT) under a fixed set of algorithms, and then applies them to remove biases from trace data when simulating new algorithms. Key to CausalSim is mapping unbiased trace-driven simulation to a tensor completion problem with extremely sparse observations. By exploiting a basic distributional invariance property present in RCT data, CausalSim enables a novel tensor completion method despite the sparsity of observations. Our extensive evaluation of CausalSim on both real and synthetic datasets, including more than ten months of real data from the Puffer video streaming system shows it improves simulation accuracy, reducing errors by 53% and 61% on average compared to expert-designed and supervised learning baselines. Moreover, CausalSim provides markedly different insights about ABR algorithms compared to the biased baseline simulator, which we validate with a real deployment 
    more » « less
  3. We study the task of learning state representations from potentially high-dimensional observations, with the goal of controlling an unknown partially observable system. We pursue a direct latent model learning approach, where a dynamic model in some latent state space is learned by predicting quantities directly related to planning (e.g., costs) without reconstructing the observations. In particular, we focus on an intuitive cost-driven state representation learning method for solving Linear Quadratic Gaussian (LQG) control, one of the most fundamental partially observable control problems. As our main results, we establish finite-sample guarantees of finding a near-optimal state representation function and a near-optimal controller using the directly learned latent model. To the best of our knowledge, despite various empirical successes, prior to this work it was unclear if such a cost-driven latent model learner enjoys finite-sample guarantees. Our work underscores the value of predicting multi-step costs, an idea that is key to our theory, and notably also an idea that is known to be empirically valuable for learning state representations. 
    more » « less
  4. Sampling edges from a graph in sublinear time is a fundamental problem and a powerful subroutine for designing sublinear-time algorithms. Suppose we have access to the vertices of the graph and know a constant-factor approximation to the number of edges. An algorithm for pointwise ε-approximate edge sampling with complexity has been given by Eden and Rosenbaum [SOSA 2018]. This has been later improved by Tetek and Thorup [STOC 2022] to . At the same time, time is necessary. We close the problem, by giving an algorithm with complexity for the task of sampling an edge exactly uniformly. 
    more » « less
  5. We aim to understand the extent to which the noise distribution in a planted signal-plusnoise problem impacts its computational complexity. To that end, we consider the planted clique and planted dense subgraph problems, but in a different ambient graph. Instead of Erd˝os-R´enyi G(n, p), which has independent edges, we take the ambient graph to be the random graph with triangles (RGT) obtained by adding triangles to G(n, p). We show that the RGT can be efficiently mapped to the corresponding G(n, p), and moreover, that the planted clique (or dense subgraph) is approximately preserved under this mapping. This constitutes the first average-case reduction transforming dependent noise to independent noise. Together with the easier direction of mapping the ambient graph from Erd˝os-R´enyi to RGT, our results yield a strong equivalence between models. In order to prove our results, we develop a new general framework for reasoning about the validity of average-case reductions based on low sensitivity to perturbations. 
    more » « less
  6. An ε-approximate quantile sketch over a stream of n inputs approximates the rank of any query point q—that is, the number of input points less than q—up to an additive error of εn, generally with some probability of at least 1−1/ poly(n), while consuming o(n) space. While the celebrated KLL sketch of Karnin, Lang, and Liberty achieves a provably optimal quantile approximation algorithm over worst-case streams, the approximations it achieves in practice are often far from optimal. Indeed, the most commonly used technique in practice is Dunning’s t-digest, which often achieves much better approximations than KLL on realworld data but is known to have arbitrarily large errors in the worst case. We apply interpolation techniques to the streaming quantiles problem to attempt to achieve better approximations on real-world data sets than KLL while maintaining similar guarantees in the worst case. 
    more » « less
  7. We study statistical/computational tradeoffs for the following density estimation problem: given kdistributionsv1,...,vk overadiscretedomain of size n, and sampling access to a distribution p, identify vi that is “close” to p. Our main result is the first data structure that, given a sublinear (in n) number of samples from p, identifies vi in time sublinear in k. We also give an improved version of the algorithm of (Acharya et al., 2018) that reports vi in time linear in k. The experimental evaluation of the latter algorithm shows that it achieves a significant reduction in the number of operations needed to achieve a given accuracy compared to prior work. 
    more » « less
  8. Understanding the shape of a distribution of data is of interest to people in a great variety of fields, as it may affect the types of algorithms used for that data. We study one such problem in the framework of {\em distribution property testing}, characterizing the number of samples required to to distinguish whether a distribution has a certain property or is far from having that property. In particular, given samples from a distribution, we seek to characterize the tail of the distribution, that is, understand how many elements appear infrequently. We develop an algorithm based on a careful bucketing scheme that distinguishes light-tailed distributions from non-light-tailed ones with respect to a definition based on the hazard rate, under natural smoothness and ordering assumptions. We bound the number of samples required for this test to succeed with high probability in terms of the parameters of the problem, showing that it is polynomial in these parameters. Further, we prove a hardness result that implies that this problem cannot be solved without any assumptions. 
    more » « less
  9. Kernel matrices, as well as weighted graphs represented by them, are ubiquitous objects in machine learning, statistics and other related fields. The main drawback of using kernel methods (learning and inference using kernel matrices) is efficiency – given n input points, most kernel-based algorithms need to materialize the full n × n kernel matrix before performing any subsequent computation, thus incurring Ω(n^2) runtime. Breaking this quadratic barrier for various problems has therefore, been a subject of extensive research efforts. We break the quadratic barrier and obtain subquadratic time algorithms for several fundamental linear-algebraic and graph processing primitives, including approximating the top eigenvalue and eigenvector, spectral sparsification, solving lin- ear systems, local clustering, low-rank approximation, arboricity estimation and counting weighted triangles. We build on the recently developed Kernel Density Estimation framework, which (after preprocessing in time subquadratic in n) can return estimates of row/column sums of the kernel matrix. In particular, we de- velop efficient reductions from weighted vertex and weighted edge sampling on kernel graphs, simulating random walks on kernel graphs, and importance sampling on matrices to Kernel Density Estimation and show that we can generate samples from these distributions in sublinear (in the support of the distribution) time. Our reductions are the central ingredient in each of our applications and we believe they may be of independent interest. We empirically demonstrate the efficacy of our algorithms on low-rank approximation (LRA) and spectral sparsi- fication, where we observe a 9x decrease in the number of kernel evaluations over baselines for LRA and a 41x reduction in the graph size for spectral sparsification. 
    more » « less