This content will become publicly available on November 10, 2024

Title: The intrinsic dimensionality of network datasets and its applications

Modern network infrastructures are in a constant state of transformation, in large part due to the exponential growth of Internet of Things (IoT) devices. The unique properties of IoT-connected networks, such as heterogeneity and non-standardized protocols, have created critical security holes and network mismanagement. In this paper, we propose a new measurement tool, Intrinsic Dimensionality (ID), to aid in analyzing and classifying network traffic. As a proxy for dataset complexity, ID can be used to understand the network as a whole, aiding in tasks such as network management and provisioning. We use ID to evaluate several modern network datasets empirically, showing that, for network- and device-level data generated using IoT methodologies, the ID of the data fits into a low-dimensional representation. Additionally, we explore network data complexity at the sample level using Local Intrinsic Dimensionality (LID) and propose a novel unsupervised intrusion detection technique, the Weighted Hamming LID Estimator. We show that the algorithm performs better on IoT network datasets than the Autoencoder, KNN, and Isolation Forests. Finally, we propose the use of synthetic data as an additional tool for both network data measurement and intrusion detection. Synthetically generated data can aid in building a more robust network dataset, while also helping in downstream tasks such as machine-learning-based intrusion detection models. We explore the effects of synthetic data on ID measurements, as well as its role in intrusion detection systems.
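
As an illustration of the sample-level measurement mentioned in the abstract, the following is a minimal sketch of the standard maximum-likelihood LID estimator computed from k-nearest-neighbor distances. It is not the paper's Weighted Hamming LID Estimator, whose details are not given in this record; the function name `lid_mle`, the Euclidean metric, and the default `k=20` are assumptions made for this example only.

```python
import numpy as np


def lid_mle(data, k=20):
    """Per-sample LID estimates from k-nearest-neighbor distances.

    Uses the maximum-likelihood (Hill-type) form
        LID(x) = -1 / mean_i( log(r_i / r_k) ),
    where r_1 <= ... <= r_k are the distances from x to its k nearest
    neighbors. The Euclidean metric is an assumption of this sketch.
    """
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    if not 1 < k < n:
        raise ValueError("k must satisfy 1 < k < number of samples")

    # Pairwise Euclidean distances via the Gram matrix.
    # This needs O(n^2) memory, which is acceptable for a small demo.
    sq = np.sum(data ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (data @ data.T)
    dists = np.sqrt(np.maximum(d2, 0.0))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor

    # Sorted distances to the k nearest neighbors of every sample.
    knn = np.sort(dists, axis=1)[:, :k]
    knn = np.maximum(knn, 1e-12)  # guard against duplicate points

    # Mean log-ratio against the k-th neighbor distance (always <= 0).
    mean_log = np.log(knn / knn[:, -1:]).mean(axis=1)
    return -1.0 / np.minimum(mean_log, -1e-12)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data: a 2-D plane embedded in 10 dimensions,
    # compared with data that fills all 10 dimensions.
    plane = rng.uniform(size=(500, 2)) @ rng.normal(size=(2, 10))
    cube = rng.uniform(size=(500, 10))
    print("mean LID, embedded plane:", lid_mle(plane).mean())
    print("mean LID, full 10-D cube:", lid_mle(cube).mean())
```

On synthetic data like this, samples that lie on a low-dimensional surface should receive markedly lower estimates than samples that fill the ambient space. A detector built on this idea could, for instance, flag samples whose estimate deviates strongly from the rest, although the thresholding rule would be a separate design choice.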

 
Award ID(s): 1822118 2123761 1715458
NSF-PAR ID: 10499066
Editor(s): Sural, Shamik; Lu, Haibing
Publisher / Repository: IOS Press
Journal Name: Journal of Computer Security
Volume: 31
Issue: 6
ISSN: 0926-227X
Page Range / eLocation ID: 679 to 704
Sponsoring Org: National Science Foundation
More Like this
  1. The Internet of Things (IoT) is revolutionizing society by connecting people and devices seamlessly and providing enhanced user experience and functionalities. However, the unique properties of IoT networks, such as heterogeneity and non-standardized protocol, have created critical security holes and network mismanagement. We propose a new measurement tool for IoT network data to aid in analyzing and classifying such network traffic. We use evidence from both security and machine learning research, which suggests that the complexity of a dataset can be used as a metric to determine the trustworthiness of data. We test the complexity of IoT networks using Intrinsic Dimensionality (ID), a theoretical complexity measurement based on the observation that a few variables can often describe high dimensional datasets. We use ID to evaluate four modern IoT network datasets empirically, showing that, for network and device-level data generated using IoT methodologies, the ID of the data fits into a low dimensional representation; this makes such data amenable to the use of machine learning algorithms for anomaly detection.
  2. With the rapid development of the Internet of Things (IoT), network security challenges are becoming increasingly complex, and the scale of intrusion attacks against networks is gradually increasing. Researchers have therefore proposed Intrusion Detection Systems and continue to design more effective systems to defend against attacks. One issue to consider is how to use limited computing power to process complex network data efficiently. In this paper, taking the AWID dataset as an example, we propose an efficient data processing method to mitigate the interference caused by redundant data, and we design a lightweight deep learning-based model to analyze and predict the data category. We achieve an overall accuracy of 99.77% and an accuracy of 97.95% for attacks on the AWID dataset, with a detection rate of 99.98% for the injection attack. Our model has low computational overhead and a fast response time after training, making it feasible to deploy on IoT edge nodes with weak computational power.

     
  3. With the proliferation of IoT devices, securing the IoT-based network is of paramount importance. IoT-based networks consist of diversely purposed IoT devices. This diversity of IoT devices necessitates diverse dataset analysis to ensure effective implementation of Machine Learning (ML)-based cybersecurity. However, much-demanded real-world IoT data is still in short supply for active ML-based IoT data analysis. This paper presents an in-depth analysis of the real-time IoT-23 dataset [9]. Exhaustive analysis of all 20 scenarios of the IoT-23 dataset reveals the consistency between feature selection methods and detection algorithms. The proposed ML-based intrusion detection system (ML-IDS) achieves significant improvement in detection accuracy, which in some scenarios reaches 100%. Our analysis also demonstrates that the required number of features for a high detection rate of greater than 99% remains small, i.e., 2 or 3, enabling ML-IDS implementation even with resource-constrained IoT processors. 
  4. Ayahiko Niimi, Future University-Hakodate (Ed.)
    Traditional Network Intrusion Detection Systems (NIDS) encounter difficulties due to the exponential growth of network traffic data and the demands of modern attacks. This paper presents a novel network intrusion classification framework using transfer learning from the VGG-16 pre-trained model. In the first step, the framework extracts features using weights pre-trained on the ImageNet dataset; in the second step, it applies a deep neural network to the extracted features for intrusion classification. We applied the framework to NSL-KDD, a benchmark dataset for network intrusion, to evaluate its performance. We also implemented other pre-trained models, such as VGG19, MobileNet, ResNet-50, and Inception V3, to evaluate and compare performance. This paper also presents both binary classification (normal vs. attack) and multi-class classification (classifying types of attacks) for network intrusion detection. The experimental results show that feature extraction using VGG-16 outperforms the other pre-trained models, producing better accuracy, precision, recall, and false alarm rates.
  5. Network intrusion detection systems (NIDSs) play an essential role in the defense of computer networks by identifying unauthorized access to those networks and investigating potential security breaches. Traditional NIDSs have difficulty combating newly created, sophisticated, and unpredictable security attacks. Hence, there is an increasing need for automatic intrusion detection solutions that can detect malicious activities more accurately and avoid high false alarm rates (FPR). In this paper, we propose a novel network intrusion detection framework using a deep neural network based on the pre-trained VGG-16 architecture. The framework, TL-NID (Transfer Learning for Network Intrusion Detection), is a two-step process: in the first step, features are extracted using VGG-16 pre-trained on the ImageNet dataset, and in the second step, a deep neural network is applied to the extracted features for classification. We applied TL-NID to NSL-KDD, a benchmark dataset for network intrusion, to evaluate the performance of the proposed framework. The experimental results show that our proposed method can effectively learn from the NSL-KDD dataset, producing realistic performance in terms of accuracy, precision, recall, and false alarm rate. This study also aims to motivate security researchers to exploit different state-of-the-art pre-trained models for network intrusion detection problems through valuable knowledge transfer.