The increasing ubiquity of network traffic and the new online applications’ deployment has increased traffic analysis complexity. Traditionally, network administrators rely on recognizing well-known static ports for classifying the traffic flowing their networks. However, modern network traffic uses dynamic ports and is transported over secure application-layer protocols (e.g., HTTPS, SSL, and SSH). This makes it a challenging task for network administrators to identify online applications using traditional port-based approaches. One way for classifying the modern network traffic is to use machine learning (ML) to distinguish between the different traffic attributes such as packet count and size, packet inter-arrival time, packet send–receive ratio, etc. This paper presents the design and implementation of NetScrapper, a flow-based network traffic classifier for online applications. NetScrapper uses three ML models, namely K-Nearest Neighbors (KNN), Random Forest (RF), and Artificial Neural Network (ANN), for classifying the most popular 53 online applications, including Amazon, Youtube, Google, Twitter, and many others. We collected a network traffic dataset containing 3,577,296 packet flows with different 87 features for training, validating, and testing the ML models. A web-based user-friendly interface is developed to enable users to either upload a snapshot of their network traffic to NetScrapper or sniff the network traffic directly from the network interface card in real time. Additionally, we created a middleware pipeline for interfacing the three models with the Flask GUI. Finally, we evaluated NetScrapper using various performance metrics such as classification accuracy and prediction time. Most notably, we found that our ANN model achieves an overall classification accuracy of 99.86% in recognizing the online applications in our dataset.
more »
« less
Encrypted Network Traffic Analysis and Classification Utilizing Machine Learning
Encryption is a fundamental security measure to safeguard data during transmission to ensure confidentiality while at the same time posing a great challenge for traditional packet and traffic inspection. In response to the proliferation of diverse network traffic patterns from Internet-of-Things devices, websites, and mobile applications, understanding and classifying encrypted traffic are crucial for network administrators, cybersecurity professionals, and policy enforcement entities. This paper presents a comprehensive survey of recent advancements in machine-learning-driven encrypted traffic analysis and classification. The primary goals of our survey are two-fold: First, we present the overall procedure and provide a detailed explanation of utilizing machine learning in analyzing and classifying encrypted network traffic. Second, we review state-of-the-art techniques and methodologies in traffic analysis. Our aim is to provide insights into current practices and future directions in encrypted traffic analysis and classification, especially machine-learning-based analysis.
more »
« less
- Award ID(s):
- 2325452
- PAR ID:
- 10548804
- Publisher / Repository:
- MDPI
- Date Published:
- Journal Name:
- Sensors
- Volume:
- 24
- Issue:
- 11
- ISSN:
- 1424-8220
- Page Range / eLocation ID:
- 3509
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Traffic classification has been studied for two decades and applied to a wide range of applications from QoS provisioning and billing in ISPs to security-related applications in firewalls and intrusion detection systems. Port-based, data packet inspection, and classical machine learning methods have been used extensively in the past, but their accuracy have been declined due to the dramatic changes in the Internet traffic, particularly the increase in encrypted traffic. With the proliferation of deep learning methods, researchers have recently investigated these methods for traffic classification task and reported high accuracy. In this article, we introduce a general framework for deep-learning-based traffic classification. We present commonly used deep learning methods and their application in traffic classification tasks. Then, we discuss open problems, challenges, and opportunities for traffic classification.more » « less
-
Traffic classification has been studied for two decades and applied to a wide range of applications from QoS provisioning and billing in ISPs to security-related applications in firewalls and intrusion detection systems. Port-based, data packet inspection, and classical machine learning methods have been used extensively in the past, but their accuracy have been declined due to the dramatic changes in the Internet traffic, particularly the increase in encrypted traffic. With the proliferation of deep learning methods, researchers have recently investigated these methods for traffic classification task and reported high accuracy. In this article, we introduce a general framework for deep-learning-based traffic classification. We present commonly used deep learning methods and their application in traffic classification tasks. Then, we discuss openmore » « less
-
With the increase in data transmissions and network traffic over the years, there has been an increase in concerns about protecting network data and information from snooping. With this concern, encryptions are incorporated into network protocols. From wireless protocols to web and phone applications, systems that handle the going and coming of data on the network have applied different kinds of encryptions to protect the confidentiality and integrity of their data transfers. The addition of encryptions poses a new question. What will be observed from encrypted traffic data? This work in progress research delivers an in-depth overview of the ZigBee protocol and analyzes encrypted ZigBee traffic on the ZigBee network. From our analysis, we developed possible strategies for ZigBee traffic analysis. Adopting the proposed strategy makes it possible to detect encrypted traffic activities and patterns of use on the ZigBee network. To the best of our knowledge, this is the first work that tries to understand encrypted ZigBee traffic. By understanding what can be gained from encrypted traffic, this work will benefit the security and privacy of the ZigBee protocol.more » « less
-
Website Fingerprinting (WF) is a traffic analysis attack that enables an eavesdropper to infer the victim's web activity even when encrypted and even when using the Tor anonymity system. Using deep learning classifiers, the attack can reach up to 98% accuracy. Existing WF defenses are either too expensive in terms of bandwidth and latency overheads (e.g. 2-3 times as large or slow) or ineffective against the latest attacks. In this work, we explore a novel defense based on the idea of adversarial examples that have been shown to undermine machine learning classifiers in other domains. Our Adversarial Traces defense adds padding to a Tor traffic trace in a manner that reliably fools the classifier into classifying it as coming from a different site. The technique drops the accuracy of the state-of-the-art attack from 98% to 60%, while incurring a reasonable 47% bandwidth overhead, showing its promise as a possible defense for Tor.more » « less
An official website of the United States government

