skip to main content

Title: Bayesian Hyperparameter Optimization for Deep Neural Network-Based Network Intrusion Detection
Traditional network intrusion detection approaches encounter feasibility and sustainability issues to combat modern, sophisticated, and unpredictable security attacks. Deep neural networks (DNN) have been successfully applied for intrusion detection problems. The optimal use of DNN-based classifiers requires careful tuning of the hyper-parameters. Manually tuning the hyperparameters is tedious, time-consuming, and computationally expensive. Hence, there is a need for an automatic technique to find optimal hyperparameters for the best use of DNN in intrusion detection. This paper proposes a novel Bayesian optimization-based framework for the automatic optimization of hyperparameters, ensuring the best DNN architecture. We evaluated the performance of the proposed framework on NSL-KDD, a benchmark dataset for network intrusion detection. The experimental results show the framework’s effectiveness as the resultant DNN architecture demonstrates significantly higher intrusion detection performance than the random search optimization-based approach in terms of accuracy, precision, recall, and f1-score.
; ; ; ; ; ; ; ; ;
Award ID(s):
2100134 1723586 2100115 1723578
Publication Date:
Journal Name:
2021 IEEE International Conference on Big Data (Big Data)
Page Range or eLocation-ID:
5413 to 5419
Sponsoring Org:
National Science Foundation
More Like this
  1. Network intrusion detection systems (NIDSs) play an essential role in the defense of computer networks by identifying a computer networks' unauthorized access and investigating potential security breaches. Traditional NIDSs encounters difficulties to combat newly created sophisticated and unpredictable security attacks. Hence, there is an increasing need for automatic intrusion detection solution that can detect malicious activities more accurately and prevent high false alarm rates (FPR). In this paper, we propose a novel network intrusion detection framework using a deep neural network based on the pretrained VGG-16 architecture. The framework, TL-NID (Transfer Learning for Network Intrusion Detection), is a two-step process where features are extracted in the first step, using VGG-16 pre-trained on ImageNet dataset and in the 2 nd step a deep neural network is applied to the extracted features for classification. We applied TL-NID on NSL-KDD, a benchmark dataset for network intrusion, to evaluate the performance of the proposed framework. The experimental results show that our proposed method can effectively learn from the NSL-KDD dataset with producing a realistic performance in terms of accuracy, precision, recall, and false alarm. This study also aims to motivate security researchers to exploit different state-of-the-art pre-trained models for network intrusion detection problems throughmore »valuable knowledge transfer.« less
  2. Application-layer transfer configurations play a crucial role in achieving desirable performance in high-speed networks. However, finding the optimal configuration for a given transfer task is a difficult problem as it depends on various factors including dataset characteristics, network settings, and background traffic. The state-of-the-art transfer tuning solutions rely on real-time sample transfers to evaluate various configurations and estimate the optimal one. However, existing approaches to run sample transfers incur high delay and measurement errors, thus significantly limit the efficiency of the transfer tuning algorithms. In this paper, we introduce adaptive feed forward deep neural network (DNN) to minimize the error rate of sample transfers without increasing their execution time. We ran 115K file transfers in four different high-speed networks and used their logs to train an adaptive DNN that can quickly and accurately predict the throughput of sample transfers by analyzing instantaneous throughput values. The results gathered in various networks with rich set of transfer configurations indicate that the proposed model reduces error rate by up to 50% compared to the state-of-the-art solutions while keeping the execution time low. We also show that one can further reduce delay or error rate by tuning hyperparameters of the model to meet specificmore »needs of user or application. Finally, transfer learning analysis reveals that the model developed in one network would yield accurate results in other networks with similar transfer convergence characteristics, alleviating the needs to run an extensive data collection and model derivation efforts for each network.« less
  3. Detecting small objects (e.g., manhole covers, license plates, and roadside milestones) in urban images is a long-standing challenge mainly due to the scale of small object and background clutter. Although convolution neural network (CNN)-based methods have made significant progress and achieved impressive results in generic object detection, the problem of small object detection remains unsolved. To address this challenge, in this study we developed an end-to-end network architecture that has three significant characteristics compared to previous works. First, we designed a backbone network module, namely Reduced Downsampling Network (RD-Net), to extract informative feature representations with high spatial resolutions and preserve local information for small objects. Second, we introduced an Adjustable Sample Selection (ADSS) module which frees the Intersection-over-Union (IoU) threshold hyperparameters and defines positive and negative training samples based on statistical characteristics between generated anchors and ground reference bounding boxes. Third, we incorporated the generalized Intersection-over-Union (GIoU) loss for bounding box regression, which efficiently bridges the gap between distance-based optimization loss and area-based evaluation metrics. We demonstrated the effectiveness of our method by performing extensive experiments on the public Urban Element Detection (UED) dataset acquired by Mobile Mapping Systems (MMS). The Average Precision (AP) of the proposed method was 81.71%,more »representing an improvement of 1.2% compared with the popular detection framework Faster R-CNN.« less
  4. Architecture reverse engineering has become an emerging attack against deep neural network (DNN) implemen- tations. Several prior works have utilized side-channel leakage to recover the model architecture while the an DNN is executing on a hardware acceleration platform. In this work, we target an open- source deep-learning accelerator, Versatile Tensor Accelerator (VTA), and utilize electromagnetic (EM) side-channel leakage to comprehensively learn the association between DNN architecture configurations and EM emanations. We also consider the holistic system – including the low-level tensor program code of the VTA accelerator on a Xilinx FPGA, and explore the effect of such low- level configurations on the EM leakage. Our study demonstrates that both the optimization and configuration of tensor programs will affect the EM side-channel leakage. Gaining knowledge of the association between low-level tensor program and the EM emanations, we propose NNReArch, a lightweight tensor program scheduling framework against side- channel-based DNN model architecture reverse engineering. Specifically, NNReArch targets reshaping the EM traces of different DNN operators, through scheduling the tensor program execution of the DNN model so as to confuse the adversary. NNReArch is a comprehensive protection framework supporting two modes, a balanced mode that strikes a balance between the DNN model confidentialitymore »and execution performance, and a secure mode where the most secure setting is chosen. We imple- ment and evaluate the proposed framework on the open-source VTA with state-of-the-art DNN architectures. The experimental results demonstrate that NNReArch can efficiently enhance the model architecture security with a small performance overhead. In addition, the proposed obfuscation technique makes reverse engineering of the DNN architecture significantly harder.« less
  5. Typical cybersecurity solutions emphasize on achieving defense functionalities. However, execution efficiency and scalability are equally important, especially for real-world deployment. Straightforward mappings of cybersecurity applications onto HPC platforms may significantly underutilize the HPC devices’ capacities. On the other hand, the sophisticated implementations are quite difficult: they require both in-depth understandings of cybersecurity domain-specific characteristics and HPC architecture and system model. In our work, we investigate three sub-areas in cybersecurity, including mobile software security, network security, and system security. They have the following performance issues, respectively: 1) The flow- and context-sensitive static analysis for the large and complex Android APKs are incredibly time-consuming. Existing CPU-only frameworks/tools have to set a timeout threshold to cease the program analysis to trade the precision for performance. 2) Network intrusion detection systems (NIDS) use automata processing as its searching core and requires line-speed processing. However, achieving high-speed automata processing is exceptionally difficult in both algorithm and implementation aspects. 3) It is unclear how the cache configurations impact time-driven cache side-channel attacks’ performance. This question remains open because it is difficult to conduct comparative measurement to study the impacts. In this dissertation, we demonstrate how application-specific characteristics can be leveraged to optimize implementations on various typesmore »of HPC for faster and more scalable cybersecurity executions. For example, we present a new GPU-assisted framework and a collection of optimization strategies for fast Android static data-flow analysis that achieve up to 128X speedups against the plain GPU implementation. For network intrusion detection systems (IDS), we design and implement an algorithm capable of eliminating the state explosion in out-of-order packet situations, which reduces up to 400X of the memory overhead. We also present tools for improving the usability of Micron’s Automata Processor. To study the cache configurations’ impact on time-driven cache side-channel attacks’ performance, we design an approach to conducting comparative measurement. We propose a quantifiable success rate metric to measure the performance of time-driven cache attacks and utilize the GEM5 platform to emulate the configurable cache.« less