The recent spate of cyber attacks towards Internet of Things (IoT) devices in smart homes calls for effective techniques to understand, characterize, and unveil IoT device activities.
In this paper, we present a new system, named IoTAthena, to unveil IoT device activities from raw network traffic consisting of timestamped IP packets. IoTAthena characterizes each IoT
device activity using an activity signature consisting of an ordered sequence of IP packets with inter-packet time intervals. IoTAthena has two novel polynomial time algorithms, sigMatch
and actExtract. For any given signature, sigMatch can capture all matches of the signature in the raw network traffic. Using sigMatch as a subfunction, actExtract can accurately unveil the sequence of various IoT device activities from the raw network traffic. Using the network traffic of heterogeneous IoT devices collected at the router of a real-world smart home testbed and a public IoT dataset, we demonstrate that IoTAthena is able to characterize and generate activity signatures of IoT device activities and accurately unveil the sequence of IoT device activities from raw network traffic.
more »
« less
This content will become publicly available on January 1, 2025
Detecting Smart Home Device Activities Using Packet-Level Signatures from Encrypted Traffic
Despite the significant benefits of the widespread adoption of smart home Internet of Things (IoT) devices, these devices are known to be vulnerable to active and passive attacks. Existing literature has demonstrated the ability to infer the activities of these devices by analyzing their network traffic. In this study, we introduce a packet-based signature generation and detection system that can identify specific events associated with IoT devices by extracting simple features from raw encrypted network traffic. Unlike existing techniques that depend on specific time windows, our approach automatically determines the optimal number of packets to generate unique signatures, making it more resilient to network jitters. We evaluate the effectiveness, uniqueness, and correctness of our signatures by training and testing our system using four public datasets and an emulated dataset with varying network delays, verifying known signatures and discovering new ones. Our system achieved an average recall and precision of 98-99% and 98-100%, respectively, demonstrating the effectiveness and feasibility of using packet-level signatures to detect IoT device activities.
more »
« less
- Award ID(s):
- 2219866
- PAR ID:
- 10534500
- Publisher / Repository:
- IEEE
- Date Published:
- Journal Name:
- IEEE Transactions on Dependable and Secure Computing
- ISSN:
- 1545-5971
- Page Range / eLocation ID:
- 1 to 12
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Recent advances in cyber-physical systems, artificial intelligence, and cloud computing have driven the wide deployments of Internet-of-things (IoT) in smart homes. As IoT devices often directly interact with the users and environments, this paper studies if and how we could explore the collective insights from multiple heterogeneous IoT devices to infer user activities for home safety monitoring and assisted living. Specifically, we develop a new system, namely IoTMosaic, to first profile diverse user activities with distinct IoT device event sequences, which are extracted from smart home network traffic based on their TCP/IP data packet signatures. Given the challenges of missing and out-of-order IoT device events due to device malfunctions or varying network and system latencies, IoTMosaic further develops simple yet effective approximate matching algorithms to identify user activities from real-world IoT network traffic. Our experimental results on thousands of user activities in the smart home environment over two months show that our proposed algorithms can infer different user activities from IoT network traffic in smart homes with the overall accuracy, precision, and recall of 0.99, 0.99, and 1.00, respectively.more » « less
-
Understanding network traffic characteristics of IoT devices plays a critical role in improving both the performance and security of IoT devices, including IoT device identification, classification, and anomaly detection. Although a number of existing research efforts have developed machine-learning based algorithms to help address the challenges in improving the security of IoT devices, none of them have provided detailed studies on the network traffic characteristics of IoT devices. In this paper we collect and analyze the network traffic generated in a typical smart homes environment consisting of a set of common IoT (and non-IoT) devices. We analyze the network traffic characteristics of IoT devices from three complementary aspects: remote network servers and port numbers that IoT devices connect to, flow-level traffic characteristics such as flow duration, and packet-level traffic characteristics such as packet inter-arrival time. Our study provides critical insights into the operational and behavioral characteristics of IoT devices, which can help develop more effective security and performance algorithms for IoT devices.more » « less
-
Over the years, honeypots emerged as an important security tool to understand attacker intent and deceive attackers to spend time and resources. Recently, honeypots are being deployed for Internet of things (IoT) devices to lure attackers, and learn their behavior. However, most of the existing IoT honeypots, even the high interaction ones, are easily detected by an attacker who can observe honeypot traffic due to lack of real network traffic originating from the honeypot. This implies that, to build better honeypots and enhance cyber deception capabilities, IoT honeypots need to generate realistic network traffic flows. To achieve this goal, we propose a novel deep learning based approach for generating traffic flows that mimic real network traffic due to user and IoT device interactions.A key technical challenge that our approach overcomes is scarcity of device-specific IoT traffic data to effectively train a generator.We address this challenge by leveraging a core generative adversarial learning algorithm for sequences along with domain specific knowledge common to IoT devices.Through an extensive experimental evaluation with 18 IoT devices, we demonstrate that the proposed synthetic IoT traffic generation tool significantly outperforms state of the art sequence and packet generators in remaining indistinguishable from real traffic even to an adaptive attacker.more » « less
-
Synthetic traffic generation can produce sufficient data for model training of various traffic analysis tasks for IoT networks with few costs and ethical concerns. However, with the increasing functionalities of the latest smart devices, existing approaches can neither customize the traffic generation of various device functions nor generate traffic that preserves the sequentiality among packets as the real traffic. To address these limitations, this paper proposes IoTGemini, a novel framework for high-quality IoT traffic generation, which consists of a Device Modeling Module and a Traffic Generation Module. In the Device Modeling Module, we propose a method to obtain the profiles of the device functions and network behaviors, enabling IoTGemini to customize the traffic generation like using a real IoT device. In the Traffic Generation Module, we design a Packet Sequence Generative Adversarial Network (PS-GAN), which can generate synthetic traffic with high fidelity of both per-packet fields and sequential relationships. We set up a real-world IoT testbed to evaluate IoTGemini. The experiment result shows that IoTGemini can achieve great effectiveness in device modeling, high fidelity of synthetic traffic generation, and remarkable usability to downstream tasks on different traffic datasets and downstream traffic analysis tasks.more » « less