skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Hierarchical Virtual Bitmaps for Spread Estimation in Traffic Measurement
This paper introduces a hierarchical traffic model for spread measurement of network traffic flows. The hierarchical model, which aggregates lower level flows into higher-level flows in a hierarchical structure, will allow us to measure network traffic at different granularities at once to support diverse traffic analysis from a grand view to fine-grained details. The spread of a flow is the number of distinct elements (under measurement) in the flow, where the flow label (that identifies packets belonging to the flow) and the elements (which are defined based on application need) can be found in packet headers or payload. Traditional flow spread estimators are designed without hierarchical traffic modeling in mind, and incur high overhead when they are applied to each level of the traffic hierarchy. In this paper, we propose a new Hierarchical Virtual bitmap Estimator (HVE) that performs simultaneous multi-level traffic measurement, at the same cost of a traditional estimator, without degrading measurement accuracy. We implement the proposed solution and perform experiments based on real traffic traces. The experimental results demonstrate that HVE improves measurement throughput by 43% to 155%, thanks to the reduction of perpacket processing overhead. For small to medium flows, its measurement accuracy is largely similar to traditional estimators that work at one level at a time. For large aggregate and base flows, its accuracy is better, with up to 97% smaller error in our experiments.  more » « less
Award ID(s):
1909077 1719222
PAR ID:
10297513
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Proceedings of 6th International Conference on Networks, Communications, Wireless, and Mobile Computing
Page Range / eLocation ID:
221 to 238
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Measuring flow spread in real time from large, high-rate data streams has numerous practical applications, where a data stream is modeled as a sequence of data items from different flows and the spread of a flow is the number of distinct items in the flow. Past decades have witnessed tremendous performance improvement for single-flow spread estimation. However, when dealing with numerous flows in a data stream, it remains a significant challenge to measure per-flow spread accurately while reducing memory footprint. The goal of this paper is to introduce new multi-flow spread estimation designs that incur much smaller processing overhead and query overhead than the state of the art, yet achieves significant accuracy improvement in spread estimation. We formally analyze the performance of these new designs. We implement them in both hardware and software, and use real-world data traces to evaluate their performance in comparison with the state of the art. The experimental results show that our best sketch significantly improves over the best existing work in terms of estimation accuracy, data item processing throughput, and online query throughput. 
    more » « less
  2. Data streaming has many applications in network monitoring, web services, e-commerce, stock trading, social networks, and distributed sensing. This paper introduces a new problem of real-time burst detection in flow spread, which differs from the traditional problem of burst detection in flow size. It is practically significant with potential applications in cybersecurity, network engineering, and trend identification on the Internet. It is a challenging problem because estimating flow spread requires us to remember all past data items and detecting bursts in real time requires us to minimize spread estimation overhead, which was not the priority in most prior work. This paper provides the first efficient, real-time solution for spread burst detection. It is designed based on a new real-time super spreader identifier, which outperforms the state of the art in terms of both accuracy and processing overhead. The super spreader identifier is in turn based on a new sketch design for real-time spread estimation, which outperforms the best existing sketches. 
    more » « less
  3. Fast and accurate knowledge of power flows and power injections is needed for a variety of applications in the electric grid. Phasor measurement units (PMUs) can be used to directly compute them at high speeds; however, a large number of PMUs will be needed for computing all the flows and injections. Similarly, if they are calculated from the outputs of a linear state estimator, then their accuracy will deteriorate due to the quadratic relationship between voltage and power. This paper employs machine learning to perform fast and accurate flow and injection estimation in power systems that are sparsely observed by PMUs. We train a deep neural network (DNN) to learn the mapping function between PMU measurements and power flows/injections. The relation between power flows and injections is incorporated into the DNN by adding a linear constraint to its loss function. The results obtained using the IEEE 118-bus system indicate that the proposed approach performs more accurate flow/injection estimation in severely unobservable power systems compared to other data-driven methods. 
    more » « less
  4. null (Ed.)
    Core-Stateless Fair Queueing (CSFQ) is a scalable algorithm proposed more than two decades ago to achieve fair queueing without keeping per-flow state in the network. Unfortunately, CSFQ did not take off, in part because it required protocol changes (i.e., adding new fields to the packet header), and hardware support to process packets at line rate. In this paper, we argue that two emerging trends are making CSFQ relevant again: (1) cloud computing which makes it feasible to change the protocol within the same datacenter or across datacenters owned by the same provider, and (2) programmable switches which can implement sophisticated packet processing at line rate. To this end, we present the first realization of CSFQ using programmable switches. In addition, we generalize CSFQ to a multi-level hierarchy, which naturally captures the traffic in today's datacenters, e.g., tenants at the first level and flows of each tenant at the second level of the hierarchy. We call this scheduler Hierarchical Core-Stateless Fair Queueing (HCSFQ), and show that it is able to accurately approximate hierarchical fair queueing. HCSFQ is highly scalable: it uses just a single FIFO queue, does not perform per-packet scheduling, and only needs to maintain state for the interior nodes of the hierarchy. We present analytical results to prove the lower bounds of HCSFQ. Our testbed experiments and large-scale simulations show that CSFQ and HCSFQ can provide fair bandwidth allocation and ensure isolation. 
    more » « less
  5. Experimentation tools facilitate exploration of Tor performance and security research problems and allow researchers to safely and privately conduct Tor experiments without risking harm to real Tor users. However, researchers using these tools configure them to generate network traffic based on simplifying assumptions and outdated measurements and without understanding the efficacy of their configuration choices. In this work, we design a novel technique for dynamically learning Tor network traffic models using hidden Markov modeling and privacy-preserving measurement techniques. We conduct a safe but detailed measurement study of Tor using 17 relays (~2% of Tor bandwidth) over the course of 6 months, measuring general statistics and models that can be used to generate a sequence of streams and packets. We show how our measurement results and traffic models can be used to generate traffic flows in private Tor networks and how our models are more realistic than standard and alternative network traffic generation~methods. 
    more » « less