skip to main content

Title: IoTFlowGenerator: Crafting Synthetic IoT Device Traffic Flows for Cyber Deception
Over the years, honeypots emerged as an important security tool to understand attacker intent and deceive attackers to spend time and resources. Recently, honeypots are being deployed for Internet of things (IoT) devices to lure attackers, and learn their behavior. However, most of the existing IoT honeypots, even the high interaction ones, are easily detected by an attacker who can observe honeypot traffic due to lack of real network traffic originating from the honeypot. This implies that, to build better honeypots and enhance cyber deception capabilities, IoT honeypots need to generate realistic network traffic flows. To achieve this goal, we propose a novel deep learning based approach for generating traffic flows that mimic real network traffic due to user and IoT device interactions.A key technical challenge that our approach overcomes is scarcity of device-specific IoT traffic data to effectively train a generator.We address this challenge by leveraging a core generative adversarial learning algorithm for sequences along with domain specific knowledge common to IoT devices.Through an extensive experimental evaluation with 18 IoT devices, we demonstrate that the proposed synthetic IoT traffic generation tool significantly outperforms state of the art sequence and packet generators in remaining indistinguishable from real traffic even to an adaptive attacker.

more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Date Published:
Journal Name:
The International FLAIRS Conference Proceedings
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Billions of devices in the Internet of Things (IoT) are inter-connected over the internet and communicate with each other or end users. IoT devices communicate through messaging bots. These bots are important in IoT systems to automate and better manage the work flows. IoT devices are usually spread across many applications and are able to capture or generate substantial influx of big data. The integration of IoT with cloud computing to handle and manage big data, requires considerable security measures in order to prevent cyber attackers from adversarial use of such large amount of data. An attacker can simply utilize the messaging bots to perform malicious activities on a number of devices and thus bots pose serious cybersecurity hazards for IoT devices. Hence, it is important to detect the presence of malicious bots in the network. In this paper we propose an evidence theory-based approach for malicious bot detection. Evidence Theory, a.k.a. Dempster Shafer Theory (DST) is a probabilistic reasoning tool and has the unique ability to handle uncertainty, i.e. in the absence of evidence. It can be applied efficiently to identify a bot, especially when the bots have dynamic or polymorphic behavior. The key characteristic of DST is that the detection system may not need any prior information about the malicious signatures and profiles. In this work, we propose to analyze the network flow characteristics to extract key evidence for bot traces. We then quantify these pieces of evidence using apriori algorithm and apply DST to detect the presence of the bots. 
    more » « less
  2. Green wireless networks Wake-up radio Energy harvesting Routing Markov decision process Reinforcement learning 1. Introduction With 14.2 billions of connected things in 2019, over 41.6 billions expected by 2025, and a total spending on endpoints and services that will reach well over $1.1 trillion by the end of 2026, the Internet of Things (IoT) is poised to have a transformative impact on the way we live and on the way we work [1–3]. The vision of this ‘‘connected continuum’’ of objects and people, however, comes with a wide variety of challenges, especially for those IoT networks whose devices rely on some forms of depletable energy support. This has prompted research on hardware and software solutions aimed at decreasing the depen- dence of devices from ‘‘pre-packaged’’ energy provision (e.g., batteries), leading to devices capable of harvesting energy from the environment, and to networks – often called green wireless networks – whose lifetime is virtually infinite. Despite the promising advances of energy harvesting technologies, IoT devices are still doomed to run out of energy due to their inherent constraints on resources such as storage, processing and communica- tion, whose energy requirements often exceed what harvesting can provide. The communication circuitry of prevailing radio technology, especially, consumes relevant amount of energy even when in idle state, i.e., even when no transmissions or receptions occur. Even duty cycling, namely, operating with the radio in low energy consumption ∗ Corresponding author. E-mail address: (G. Koutsandria). (sleep) mode for pre-set amounts of time, has been shown to only mildly alleviate the problem of making IoT devices durable [4]. An effective answer to eliminate all possible forms of energy consumption that are not directly related to communication (e.g., idle listening) is provided by ultra low power radio triggering techniques, also known as wake-up radios [5,6]. Wake-up radio-based networks allow devices to remain in sleep mode by turning off their main radio when no communication is taking place. Devices continuously listen for a trigger on their wake-up radio, namely, for a wake-up sequence, to activate their main radio and participate to communication tasks. Therefore, devices wake up and turn their main radio on only when data communication is requested by a neighboring device. Further energy savings can be obtained by restricting the number of neighboring devices that wake up when triggered. This is obtained by allowing devices to wake up only when they receive specific wake-up sequences, which correspond to particular protocol requirements, including distance from the destina- tion, current energy status, residual energy, etc. This form of selective awakenings is called semantic addressing [7]. Use of low-power wake-up radio with semantic addressing has been shown to remarkably reduce the dominating energy costs of communication and idle listening of traditional radio networking [7–12]. This paper contributes to the research on enabling green wireless networks for long lasting IoT applications. Specifically, we introduce a ABSTRACT This paper presents G-WHARP, for Green Wake-up and HARvesting-based energy-Predictive forwarding, a wake-up radio-based forwarding strategy for wireless networks equipped with energy harvesting capabilities (green wireless networks). Following a learning-based approach, G-WHARP blends energy harvesting and wake-up radio technology to maximize energy efficiency and obtain superior network performance. Nodes autonomously decide on their forwarding availability based on a Markov Decision Process (MDP) that takes into account a variety of energy-related aspects, including the currently available energy and that harvestable in the foreseeable future. Solution of the MDP is provided by a computationally light heuristic based on a simple threshold policy, thus obtaining further computational energy savings. The performance of G-WHARP is evaluated via GreenCastalia simulations, where we accurately model wake-up radios, harvestable energy, and the computational power needed to solve the MDP. Key network and system parameters are varied, including the source of harvestable energy, the network density, wake-up radio data rate and data traffic. We also compare the performance of G-WHARP to that of two state-of-the-art data forwarding strategies, namely GreenRoutes and CTP-WUR. Results show that G-WHARP limits energy expenditures while achieving low end-to-end latency and high packet delivery ratio. Particularly, it consumes up to 34% and 59% less energy than CTP-WUR and GreenRoutes, respectively. 
    more » « less
  3. The security of Internet-of-Things (IoT) devices in the residential environment is important due to their widespread presence in homes and their sensing and actuation capabilities. However, securing IoT devices is challenging due to their varied designs, deployment longevity, multiple manufacturers, and potentially limited availability of long-term firmware updates. Attackers have exploited this complexity by specifically targeting IoT devices, with some recent high-profile cases affecting millions of devices. In this work, we explore access control mechanisms that tightly constrain access to devices at the residential router, with the goal of precluding access that is inconsistent with legitimate users' goals. Since many residential IoT devices are controlled via applications on smartphones, we combine application sensors on phones with sensors at residential routers to analyze workflows. We construct stateful filters at residential routers that can require user actions within a registered smartphone to enable network access to an IoT device. In doing so, we constrain network packets only to those that are consistent with the user's actions. In our experiments, we successfully identified 100% of malicious traffic while correctly allowing more than 98% of legitimate network traffic. The approach works across device types and manufacturers with straightforward API and state machine construction for each new device workflow. 
    more » « less
  4. The Internet has been experiencing immense growth in multimedia traffic from mobile devices. The increase in traffic presents many challenges to user-centric networks, network operators, and service providers. Foremost among these challenges is the inability of networks to determine the types of encrypted traffic and thus the level of network service the traffic needs for maintaining an acceptable quality of experience. Therefore, end devices are a natural fit for performing traffic classification since end devices have more contextual information about the device usage and traffic. This paper proposes a novel approach that classifies multimedia traffic types produced and consumed on mobile devices. The technique relies on a mobile device’s detection of its multimedia context characterized by its utilization of different media input/output components, e.g., camera, microphone, and speaker. We develop an algorithm, MediaSense, which senses the states of multiple I/O components and identifies the specific multimedia context of a mobile device in real-time. We demonstrate that MediaSense classifies encrypted multimedia traffic in real-time as accurately as deep learning approaches and with even better generalizability. 
    more » « less
  5. Sural, Shamik ; Lu, Haibing (Ed.)

    Modern network infrastructures are in a constant state of transformation, in large part due to the exponential growth of Internet of Things (IoT) devices. The unique properties of IoT-connected networks, such as heterogeneity and non-standardized protocol, have created critical security holes and network mismanagement. In this paper we propose a new measurement tool, Intrinsic Dimensionality (ID), to aid in analyzing and classifying network traffic. A proxy for dataset complexity, ID can be used to understand the network as a whole, aiding in tasks such as network management and provisioning. We use ID to evaluate several modern network datasets empirically. Showing that, for network and device-level data, generated using IoT methodologies, the ID of the data fits into a low dimensional representation. Additionally we explore network data complexity at the sample level using Local Intrinsic Dimensionality (LID) and propose a novel unsupervised intrusion detection technique, the Weighted Hamming LID Estimator. We show that the algortihm performs better on IoT network datasets than the Autoencoder, KNN, and Isolation Forests. Finally, we propose the use of synthetic data as an additional tool for both network data measurement as well as intrusion detection. Synthetically generated data can aid in building a more robust network dataset, while also helping in downstream tasks such as machine learning based intrusion detection models. We explore the effects of synthetic data on ID measurements, as well as its role in intrusion detection systems.

    more » « less