The increasing ubiquity of network traffic and the new online applications’ deployment has increased traffic analysis complexity. Traditionally, network administrators rely on recognizing well-known static ports for classifying the traffic flowing their networks. However, modern network traffic uses dynamic ports and is transported over secure application-layer protocols (e.g., HTTPS, SSL, and SSH). This makes it a challenging task for network administrators to identify online applications using traditional port-based approaches. One way for classifying the modern network traffic is to use machine learning (ML) to distinguish between the different traffic attributes such as packet count and size, packet inter-arrival time, packet send–receive ratio, etc. This paper presents the design and implementation of NetScrapper, a flow-based network traffic classifier for online applications. NetScrapper uses three ML models, namely K-Nearest Neighbors (KNN), Random Forest (RF), and Artificial Neural Network (ANN), for classifying the most popular 53 online applications, including Amazon, Youtube, Google, Twitter, and many others. We collected a network traffic dataset containing 3,577,296 packet flows with different 87 features for training, validating, and testing the ML models. A web-based user-friendly interface is developed to enable users to either upload a snapshot of their network traffic to NetScrapper or sniff the network traffic directly from the network interface card in real time. Additionally, we created a middleware pipeline for interfacing the three models with the Flask GUI. Finally, we evaluated NetScrapper using various performance metrics such as classification accuracy and prediction time. Most notably, we found that our ANN model achieves an overall classification accuracy of 99.86% in recognizing the online applications in our dataset.
more »
« less
Leo: Online ML-based Traffic Classification at Multi-Terabit Line Rate
Online traffic classification enables critical applications such as network intrusion detection and prevention, providing Quality-of-Service, and real-time IoT analytics. However, with increasing network speeds, it has become extremely challenging to analyze and classify traffic online. In this paper, we present Leo, a system for online traffic classification at multi-terabit line rates. At its core, Leo implements an online machine learning (ML) model for traffic classification, namely the decision tree, in the network switch's data plane. Leo's design is fast (can classify packets at switch's line rate), scalable (can automatically select a resource-efficient design for the class of decision tree models a user wants to support), and runtime programmable (the model can be updated on-the-fly without switch downtime), while achieving high model accuracy. We implement Leo on top of Intel Tofino switches. Our evaluations show that Leo is able to classify traffic at line rate with nominal latency overhead, can scale to model sizes more than twice as large as state-of-the-art data plane ML classification systems, while achieving classification accuracy on-par with an offline traffic classifier.
more »
« less
- Award ID(s):
- 2239829
- PAR ID:
- 10570724
- Publisher / Repository:
- USENIX Symposium on Networked Systems Design and Implementation (NSDI)
- Date Published:
- ISBN:
- 978-1-939133-39-7
- Page Range / eLocation ID:
- 1573 - 1591
- Format(s):
- Medium: X
- Location:
- Santa Clara, CA
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract In today’s interconnected world, network traffic is replete with adversarial attacks. As technology evolves, these attacks are also becoming increasingly sophisticated, making them even harder to detect. Fortunately, artificial intelligence (AI) and, specifically machine learning (ML), have shown great success in fast and accurate detection, classification, and even analysis of such threats. Accordingly, there is a growing body of literature addressing how subfields of AI/ML (e.g., natural language processing (NLP)) are getting leveraged to accurately detect evasive malicious patterns in network traffic. In this paper, we delve into the current advancements in ML-based network traffic classification using image visualization. Through a rigorous experimental methodology, we first explore the process of network traffic to image conversion. Subsequently, we investigate how machine learning techniques can effectively leverage image visualization to accurately classify evasive malicious traces within network traffic. Through the utilization of production-level tools and utilities in realistic experiments, our proposed solution achieves an impressive accuracy rate of 99.48% in detecting fileless malware, which is widely regarded as one of the most elusive classes of malicious software.more » « less
-
Recently, a multi-agent based network automation architecture has been proposed. The architecture is named multi-agent based network automation of the network management system (MANA-NMS). The architectural framework introduced atomized network functions (ANFs). ANFs should be autonomous, atomic, and intelligent agents. Such agents should be implemented as an independent decision element, using machine/deep learning (ML/DL) as an internal cognitive and reasoning part. Using these atomic and intelligent agents as a building block, a MANA-NMS can be composed using the appropriate functions. As a continuation toward implementation of the architecture MANA-NMS, this paper presents a network traffic prediction agent (NTPA) and a network traffic classification agent (NTCA) for a network traffic management system. First, an NTPA is designed and implemented using DL algorithms, i.e., long short-term memory (LSTM), gated recurrent unit (GRU), multilayer perceptrons (MLPs), and convolutional neural network (CNN) algorithms as a reasoning and cognitive part of the agent. Similarly, an NTCA is designed using decision tree (DT), K-nearest neighbors (K-NN), support vector machine (SVM), and naive Bayes (NB) as a cognitive component in the agent design. We then measure the NTPA prediction accuracy, training latency, prediction latency, and computational resource consumption. The results indicate that the LSTM-based NTPA outperforms compared to GRU, MLP, and CNN-based NTPA in terms of prediction accuracy, and prediction latency. We also evaluate the accuracy of the classifier, training latency, classification latency, and computational resource consumption of NTCA using the ML models. The performance evaluation shows that the DT-based NTCA performs the best.more » « less
-
A programmable data plane composed of P4 switches, smartNICs, and hosts running software network functions can provide new opportunities for network security. Much work in this area has focused on monitoring high volume traffic such as denial of service attacks or heavy-hitter detection. However, slow attacks that carefully use small amounts of traffic to have a highly negative effect are much more challenging to detect since they typically require fine-grained analysis of all flows. Our work is exploring how a programmable data plane can provide accurate attack detection at nearly line rate while overcoming challenges such as the limited memory space available on network devices.more » « less
-
With the development of space-air-ground integrated networks, Low Earth Orbit (LEO) satellite networks are envisioned to play a crucial role in providing data transmission services in the 6G era. However, the increasing number of connected devices leads to a surge in data volume and bursty traffic patterns. Ensuring the communication stability of LEO networks has thus become essential. While Lyapunov optimization has been applied to network optimization for decades and can guarantee stability when traffic rates remain within the capacity region, its applicability in LEO satellite networks is limited due to the bursty and dynamic nature of LEO network traffic. To address this issue, we propose a robust Lyapunov optimization method to ensure stability in LEO satellite networks. We theoretically show that for a stabilizable network system, traffic rates do not have to always stay within the capacity region at every time slot. Instead, the network can accommodate temporary capacity region violations, while ensuring the long-term network stability. Extensive simulations under various traffic conditions validate the effectiveness of the robust Lyapunov optimization method, demonstrating that LEO satellite networks can maintain stability under finite violations of the capacity region.more » « less
An official website of the United States government

