Title: Precise Error Estimation for Sketch-based Flow Measurement
As a class of approximate measurement approaches, sketching algorithms have significantly improved the estimation of network flow information using limited resources. While these algorithms enjoy sound error-bound analysis under worst-case scenarios, their actual errors can vary significantly with the incoming flow distribution, making their traditional error bounds too "loose" to be useful in practice. In this paper, we propose a simple yet rigorous error estimation method to more precisely analyze the errors for posterior sketch queries by leveraging the knowledge from the sketch counters. This approach will enable network operators to understand how accurate the current measurements are and make appropriate decisions accordingly (e.g., identify potential heavy users or answer "what-if" questions to better provision resources). Theoretical analysis and trace-driven experiments show that our estimated bounds on sketch errors are much tighter than previous ones and match the actual error bounds in most cases.
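The contrast between a worst-case bound and a counter-aware estimate can be made concrete with a toy Count-Min sketch. The noise correction below is the Count-Mean-Min idea (averaging the other counters in a row to estimate the collision mass in the queried cell), used here only as a stand-in for posterior, counter-driven error estimation; it is not the estimator proposed in this paper, and all names and parameters are illustrative.

```python
import math
import random

class CountMinSketch:
    """Minimal Count-Min sketch contrasting the classical a priori error
    bound with a posterior-style estimate derived from the counters."""

    def __init__(self, width=272, depth=3, seed=42):
        self.w, self.d = width, depth
        self.rows = [[0] * width for _ in range(depth)]
        rng = random.Random(seed)
        self.seeds = [rng.getrandbits(32) for _ in range(depth)]
        self.total = 0  # total stream size N

    def _col(self, i, key):
        return hash((self.seeds[i], key)) % self.w

    def update(self, key, count=1):
        self.total += count
        for i in range(self.d):
            self.rows[i][self._col(i, key)] += count

    def query(self, key):
        # Classical estimate: min over rows; never underestimates.
        return min(self.rows[i][self._col(i, key)] for i in range(self.d))

    def worst_case_bound(self):
        # A priori Count-Min guarantee: error <= (e / width) * N w.h.p.
        return math.e / self.w * self.total

    def counter_noise_estimate(self, key):
        # Posterior view: the mean of the *other* counters in a row
        # estimates the collision mass hashed into the queried cell.
        est = [(self.total - self.rows[i][self._col(i, key)]) / (self.w - 1)
               for i in range(self.d)]
        return min(est)

if __name__ == "__main__":
    cms = CountMinSketch()
    for pkt in range(100_000):
        cms.update(f"flow{pkt % 500}")   # 500 flows of 200 packets each
    print("estimate:", cms.query("flow7"))                     # true size: 200
    print("worst-case error bound:", cms.worst_case_bound())   # ~999
    print("counter-based estimate:", cms.counter_noise_estimate("flow7"))  # ~368
```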
Award ID(s):
2107086 2106946
PAR ID:
10343435
Date Published:
Journal Name:
Proceedings of the 21st ACM Internet Measurement Conference (IMC '21)
Page Range / eLocation ID:
113 to 121
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Backward error analysis offers a method for assessing the quality of numerical programs in the presence of floating-point rounding errors. However, techniques from the numerical analysis literature for quantifying backward error require substantial human effort, and there are currently no tools or automated methods for statically deriving sound backward error bounds. To address this gap, we propose Bean, a typed first-order programming language designed to express quantitative bounds on backward error. Bean's type system combines a graded coeffect system with strict linearity to soundly track the flow of backward error through programs. We prove the soundness of our system using a novel categorical semantics, where every Bean program denotes a triple of related transformations that together satisfy a backward error guarantee. To illustrate Bean's potential as a practical tool for automated backward error analysis, we implement a variety of standard algorithms from numerical linear algebra in Bean, establishing fine-grained backward error bounds via typing in a compositional style. We also develop a prototype implementation of Bean that infers backward error bounds automatically. Our evaluation shows that these inferred bounds match worst-case theoretical relative backward error bounds from the literature, underscoring Bean's utility in validating a key property of numerical programs: numerical stability.
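As a small numerical illustration of the quantity Bean bounds (not of Bean itself), the sketch below computes the smallest normwise relative backward error of an approximate linear solve via the classical Rigal–Gaches formula η = ||b − Ax̂|| / (||A||·||x̂|| + ||b||); a backward-stable solver keeps η near the unit roundoff.

```python
import numpy as np

def normwise_backward_error(A, b, x_hat):
    """Smallest relative normwise backward error of x_hat as a solution of
    Ax = b (Rigal-Gaches): eta = ||b - A x_hat|| / (||A|| ||x_hat|| + ||b||)."""
    r = b - A @ x_hat
    return np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(x_hat)
                                + np.linalg.norm(b))

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))
b = rng.standard_normal(100)
x_hat = np.linalg.solve(A, b)   # LU with partial pivoting

# A backward-stable solver keeps eta within a modest multiple of the unit
# roundoff (~1.1e-16 for float64) -- the kind of bound Bean certifies statically.
print(f"backward error: {normwise_backward_error(A, b, x_hat):.2e}")
```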
  2. Due to the mainstream adoption of cloud computing and its rapidly increasing energy usage, the efficient management of cloud computing resources has become an important issue. A key challenge in managing these resources lies in the volatility of their demand. While a wide variety of online algorithms (e.g., Receding Horizon Control, Online Balanced Descent) have been designed, it is hard for cloud operators to pick the right one; in particular, these algorithms vary greatly in their use of predictions and in their performance guarantees. This paper studies an automatic algorithm selection scheme that operates in real time. To this end, we empirically study the prediction errors in real-world cloud computing traces. The results show that prediction errors differ across prediction algorithms, across virtual machines, and over the time horizon. Based on these observations, we propose a simple prediction error model and prove upper bounds on the dynamic regret of several online algorithms. We then apply the empirical and theoretical results to create a simple online meta-algorithm that chooses the best algorithm on the fly. Numerical simulations demonstrate that the performance of the designed policy is close to that of the best algorithm in hindsight.
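A minimal version of such a meta-algorithm is the classic Hedge strategy: keep one weight per candidate algorithm, follow the currently best-weighted one, and exponentially down-weight every candidate by its observed loss. The sketch below assumes per-round losses are observable for all candidates and omits the paper's prediction-error model; the data is hypothetical.

```python
import math

def hedge(loss_sequence, eta=0.1):
    """Hedge-style meta-algorithm over candidate online algorithms.
    loss_sequence[t][i] is the cost of candidate i in round t."""
    k = len(loss_sequence[0])
    weights = [1.0] * k
    total = 0.0
    for losses in loss_sequence:
        choice = max(range(k), key=lambda i: weights[i])  # follow current leader
        total += losses[choice]
        for i in range(k):                                # full-information update
            weights[i] *= math.exp(-eta * losses[i])
    return total

# Three hypothetical candidates whose costs shift when demand volatility changes.
rounds = ([[0.2, 0.7, 0.9]] * 50) + ([[0.8, 0.1, 0.6]] * 50)
print("meta-algorithm loss:", hedge(rounds))                               # ~40.2
print("best in hindsight:  ", min(sum(r[i] for r in rounds) for i in range(3)))  # 40.0
```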
  3. Per-flow size measurement is key to many streaming applications and management systems, particularly in high-speed networks. Performing such measurement on the data plane of a network device at the line rate requires on-chip memory and computing resources that are shared with other key network functions. This leads to the need for very compact and fast data structures, called sketches, which trade space for accuracy. Such a need also arises in other application contexts involving extremely large data sets. The goal of sketch design is two-fold: to measure flow sizes as accurately as possible and to do so as efficiently as possible (for low overhead and thus high processing throughput). Existing sketches can be broadly categorized into multi-update sketches and single-update sketches: the former are more accurate but carry larger overhead, while the latter incur small overhead but offer poor accuracy. This paper proposes the Single update Sketch with a Variable counter Structure (SSVS), a new sketch design that is several times faster than existing multi-update sketches at comparable accuracy, and several times more accurate than existing single-update sketches at comparable overhead. The new design embodies several technical contributions that integrate the enabling properties of both multi-update and single-update sketches in a novel structure, effectively controlling the measurement error with minimum processing overhead.
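The two update models can be contrasted with toy sketches under the same memory budget: a multi-update (Count-Min-style) sketch touches d counters per packet, while a single-update sketch touches exactly one. SSVS's variable counter structure is not reproduced here; this only illustrates the overhead/accuracy trade-off the paper navigates.

```python
import random

class MultiUpdateSketch:
    """Count-Min-style multi-update sketch: every packet updates d counters
    (d memory accesses per packet, but the min over rows damps collisions)."""
    def __init__(self, width=1024, depth=4, seed=7):
        rng = random.Random(seed)
        self.seeds = [rng.getrandbits(32) for _ in range(depth)]
        self.rows = [[0] * width for _ in range(depth)]
        self.w = width

    def update(self, flow):
        for s, row in zip(self.seeds, self.rows):
            row[hash((s, flow)) % self.w] += 1

    def query(self, flow):
        return min(row[hash((s, flow)) % self.w]
                   for s, row in zip(self.seeds, self.rows))

class SingleUpdateSketch:
    """One hash and one counter increment per packet: minimal per-packet
    overhead, but every colliding flow inflates the same counter."""
    def __init__(self, width=4096, seed=7):
        self.seed, self.w = seed, width
        self.row = [0] * width

    def update(self, flow):
        self.row[hash((self.seed, flow)) % self.w] += 1

    def query(self, flow):
        return self.row[hash((self.seed, flow)) % self.w]

# Same budget (4096 counters); the single-update sketch is ~4x cheaper per packet.
multi, single = MultiUpdateSketch(), SingleUpdateSketch()
for pkt in range(200_000):
    flow = f"flow{pkt % 2000}"
    multi.update(flow)
    single.update(flow)
print(multi.query("flow3"), single.query("flow3"))   # true size: 100
```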
  4. Estimating the ε-approximate quantiles or ranks of a stream is a fundamental task in data monitoring. Given a stream x_1, ..., x_n from a totally ordered universe U, an additive-error quantile sketch allows us to approximate the rank of any query y ∈ U up to additive εn error. In 2001, Greenwald and Khanna gave a deterministic algorithm (the GK sketch) that solves the ε-approximate quantile estimation problem using O(ε^{-1} log(εn)) space [Greenwald and Khanna, 2001]; this algorithm was recently shown to be optimal by Cormode and Veselý [2020]. However, due to the intricacy of the GK sketch and its analysis, over-simplified versions of the algorithm are implemented in practical applications, often without any known theoretical guarantees. In fact, it has remained an open question whether the GK sketch can be simplified while maintaining the optimal space bound. In this paper, we resolve this open question by giving a simplified deterministic algorithm that stores at most (2 + o(1))ε^{-1} log(εn) elements and solves the additive-error quantile estimation problem; as a side benefit, our algorithm achieves a smaller constant factor than the (11/2)ε^{-1} log(εn) space bound of the original GK sketch. Our algorithm features an easier analysis and still achieves the same optimal asymptotic space complexity as the original GK sketch. Lastly, our simplification enables an efficient data structure implementation with a worst-case per-element runtime of O(log(1/ε) + log log(εn)) for the ordinary ε-approximate quantile estimation problem. For the related "weighted" quantile estimation problem, we give efficient data structures for our simplified algorithm that guarantee a worst-case per-element runtime of O(log(1/ε) + log log(ε W_n / w_min)), improving on the previous upper bound of Assadi et al. [2023].
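The additive-error rank guarantee itself is easy to state in code: keeping every ⌈εn⌉-th element of the sorted stream answers any rank query within εn. The offline sketch below illustrates only the target guarantee, not the GK algorithm or the paper's streaming simplification.

```python
import bisect

def build_quantile_summary(stream, eps):
    """Offline illustration of the additive-error guarantee: keep every
    ceil(eps*n)-th element of the sorted stream, so any rank query is
    answered within eps*n. GK achieves comparable space in one streaming
    pass; this is only the target guarantee, not the algorithm."""
    data = sorted(stream)
    n = len(data)
    step = max(1, int(eps * n))
    summary = data[::step] + [data[-1]]
    ranks = list(range(0, n, step)) + [n - 1]
    return summary, ranks

def approx_rank(summary, ranks, y):
    """Rank of the largest kept element <= y; off by at most eps*n."""
    i = bisect.bisect_right(summary, y) - 1
    return ranks[max(i, 0)]

stream = list(range(10_000))
summary, ranks = build_quantile_summary(stream, eps=0.01)
print(len(summary), approx_rank(summary, ranks, 4242))  # true rank ~4242, error <= 100
```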
  5. The Tucker tensor decomposition is a natural extension of the singular value decomposition (SVD) to multiway data. We propose to accelerate Tucker tensor decomposition algorithms by using randomization and parallelization. We present two algorithms that scale to large data and many processors, significantly reduce both computation and communication cost compared to previous deterministic and randomized approaches, and obtain nearly the same approximation errors. The key idea in our algorithms is to perform randomized sketches with Kronecker-structured random matrices, which reduces computation compared to unstructured matrices and can be implemented using a fundamental tensor computational kernel. We provide probabilistic error analysis of our algorithms and implement a new parallel algorithm for the structured randomized sketch. Our experimental results demonstrate that our combination of randomization and parallelization achieves accurate Tucker decompositions much faster than alternative approaches. We observe up to a 16X speedup over the fastest deterministic parallel implementation on 3D simulation data. 
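A single-node sketch of the idea, under the assumption that the randomized HOSVD variant shown here is representative: for each mode, the unfolding is compressed by contracting the other modes with small Gaussian matrices, which amounts to a Kronecker-structured random sketch applied through the tensor-times-matrix (einsum) kernel. The paper's parallel algorithm and probabilistic error analysis go well beyond this.

```python
import numpy as np

def randomized_hosvd_3d(X, ranks, oversample=2, seed=0):
    """Randomized HOSVD sketch for a 3-way tensor. For each mode n, the
    unfolding X_(n) is compressed by contracting the two other modes with
    small Gaussian matrices -- i.e., multiplying by a Kronecker-structured
    random matrix, realized with the tensor-times-matrix kernel (einsum)."""
    rng = np.random.default_rng(seed)
    dims, subs = X.shape, 'ijk'
    s = [min(d, r + oversample) for d, r in zip(dims, ranks)]
    G = [rng.standard_normal((d, si)) for d, si in zip(dims, s)]

    factors = []
    for n in range(3):
        a, b = (m for m in range(3) if m != n)
        # Y reshapes to X_(n) @ (G[a] kron G[b]): a sketched mode-n unfolding.
        Y = np.einsum(f'ijk,{subs[a]}a,{subs[b]}b->{subs[n]}ab', X, G[a], G[b])
        Y = Y.reshape(dims[n], -1)
        U, _, _ = np.linalg.svd(Y, full_matrices=False)
        factors.append(U[:, :ranks[n]])  # orthonormal basis for the sketched range

    # Core tensor: project X onto the three factor subspaces.
    core = np.einsum('ijk,ia,jb,kc->abc', X, *factors)
    return core, factors

# Exactly rank-(5,5,5) test tensor: the randomized sketch recovers it almost surely.
rng = np.random.default_rng(1)
U_true = [rng.standard_normal((30, 5)) for _ in range(3)]
X = np.einsum('abc,ia,jb,kc->ijk', rng.standard_normal((5, 5, 5)), *U_true)
core, F = randomized_hosvd_3d(X, (5, 5, 5))
X_hat = np.einsum('abc,ia,jb,kc->ijk', core, *F)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))   # ~1e-15
```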