Title: Impact of Dataset and Model Parameters on Machine Learning Performance for the Detection of GPS Spoofing Attacks on Unmanned Aerial Vehicles
GPS spoofing attacks are a severe threat to unmanned aerial vehicles. These attacks manipulate the true state of the vehicle and can mislead the system without raising alarms. Several techniques, including machine learning, have been proposed to detect these attacks. Most studies applied machine learning models without identifying the best hyperparameters, using feature selection and importance techniques, or ensuring that the dataset used is unbiased and balanced. No current studies have discussed the impact of model parameters and dataset characteristics on the performance of machine learning models; this paper fills that gap by evaluating the impact of hyperparameters, regularization parameters, dataset size, correlated features, and imbalanced datasets on the performance of six of the most commonly known machine learning techniques: Classification and Regression Decision Tree, Artificial Neural Network, Random Forest, Logistic Regression, Gaussian Naïve Bayes, and Support Vector Machine. Thirteen features extracted from legitimate and simulated GPS attack signals are used in this investigation. The evaluation uses four metrics: accuracy, probability of misdetection, probability of false alarm, and probability of detection. The results indicate that hyperparameters, regularization parameters, correlated features, dataset size, and imbalanced datasets adversely affect a machine learning model’s performance. The results also show that, after removing correlated features and using tuned parameters on a balanced dataset, the Classification and Regression Decision Tree classifier achieves an accuracy of 99.99%, a probability of detection of 99.98%, a probability of misdetection of 0.2%, and a probability of false alarm of 1.005%. Under similar conditions, Random Forest achieves an accuracy of 99.94%, a probability of detection of 99.6%, a probability of misdetection of 0.4%, and a probability of false alarm of 1.01%.
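The four evaluation metrics named in the abstract can be computed from confusion-matrix counts. The sketch below assumes the conventional definitions (detection = TP / (TP + FN), misdetection = FN / (TP + FN), false alarm = FP / (FP + TN)) and a label convention of 1 for a spoofed signal and 0 for a legitimate one; the abstract does not state the paper's own formulas, and the names `ConfusionCounts`, `count_outcomes`, and `detection_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # spoofed samples flagged as spoofed
    fn: int  # spoofed samples passed as legitimate (missed)
    fp: int  # legitimate samples flagged as spoofed
    tn: int  # legitimate samples passed as legitimate


def count_outcomes(y_true, y_pred):
    """Tally outcomes. Assumed convention: 1 = spoofed, 0 = legitimate."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fn, fp, tn)


def detection_metrics(counts):
    """Return the four metrics as fractions in [0, 1]."""
    positives = counts.tp + counts.fn  # all truly spoofed samples
    negatives = counts.fp + counts.tn  # all truly legitimate samples
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one spoofed and one legitimate sample")
    return {
        "accuracy": (counts.tp + counts.tn) / (positives + negatives),
        "p_detection": counts.tp / positives,
        "p_misdetection": counts.fn / positives,
        "p_false_alarm": counts.fp / negatives,
    }
```

Calling `detection_metrics(count_outcomes(y_true, y_pred))` on held-out labels and predictions yields all four values at once. On a heavily imbalanced dataset, accuracy alone can look high while the probability of misdetection is poor, which is one reason to report the metrics together.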
Award ID(s):
2006674
NSF-PAR ID:
10440042
Author(s) / Creator(s):
Date Published:
Journal Name:
Applied science
Volume:
13
Issue:
1
ISSN:
0885-1549
Page Range / eLocation ID:
383-398
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  2. With the increasing use of Unmanned Aerial Vehicles (UAVs) in military and civilian applications, the security of this technology has become a critical concern. UAVs’ positioning and navigation activities are highly dependent on the Global Positioning System (GPS), which provides accurate locations for these vehicles. However, because civilian GPS signals are open and unencrypted, malicious users can target them in multiple ways, including by launching GPS spoofing attacks. To address this security issue, numerous techniques have been proposed to detect and classify these attacks, including supervised machine learning techniques. However, no studies have focused on unsupervised models to detect these attacks. In this paper, we compare the performance of several supervised models with that of unsupervised models in terms of accuracy, probability of detection, probability of misdetection, probability of false alarm, processing time, training time, prediction time, and memory size. The supervised models are Gaussian Naïve Bayes, Classification and Regression Decision Tree, Logistic Regression, Random Forest, Linear-Support Vector Machine, and Artificial Neural Network. The unsupervised models are Principal Component Analysis, K-means clustering, and Autoencoder. The results show that the Classification and Regression Decision Tree model outperforms the other supervised and unsupervised models in detecting and classifying GPS spoofing attacks. 
  3. Unmanned aerial vehicles are prone to several cyber-attacks, including Global Positioning System spoofing. Several techniques have been proposed for detecting such attacks. However, recurrent and frequent Global Positioning System spoofing incidents show a need for effective security solutions to protect unmanned aerial vehicles. In this paper, we propose two dynamic selection techniques, Metric Optimized Dynamic selector and Weighted Metric Optimized Dynamic selector, which identify the most effective classifier for the detection of such attacks. We develop a one-stage ensemble feature selection method to identify and discard the correlated and low-importance features from the dataset. We implement the proposed techniques using ten machine-learning models and compare their performance in terms of five evaluation metrics: accuracy, probability of detection, probability of false alarm, probability of misdetection, and processing time. The proposed techniques dynamically choose the classifier with the best results for detecting attacks. The results indicate that the proposed dynamic techniques outperform the existing ensemble models with an accuracy of 99.6%, a probability of detection of 98.9%, a probability of false alarm of 1.56%, a probability of misdetection of 1.09%, and a processing time of 1.24 s. 
  4. Unmanned Aerial Vehicles (UAVs) have been widely used in military and civilian areas. The positioning and return-to-home tasks of UAVs depend heavily on the Global Positioning System (GPS). However, civilian GPS signals are not encrypted, which leaves UAVs open to numerous cyber-attacks, including GPS spoofing attacks. In these spoofing attacks, a malicious user transmits counterfeit GPS signals. Numerous studies have proposed techniques to detect these attacks. However, these techniques have some limitations, including low probability of detection, high probability of misdetection, and high probability of false alarm. In this paper, we investigate and compare the performance of three ensemble-based machine learning techniques, namely bagging, stacking, and boosting, in detecting GPS attacks. The evaluation metrics are the accuracy, probability of detection, probability of misdetection, probability of false alarm, memory size, processing time, and prediction time per sample. The results show that the stacking model has the best performance compared to the two other ensemble models in terms of all the considered evaluation metrics. 
  5. Unmanned Aerial Systems (UAS) heavily depend on the Global Positioning System (GPS) for navigation. However, the unencrypted civilian GPS signals are subject to different types of threats, including GPS spoofing attacks. In this paper, we evaluate five instance-based learning models for GPS spoofing detection in UAS, namely K Nearest Neighbor, Radius Neighbor, Linear Support Vector Machine (SVM), C-SVM, and Nu-SVM. We used software-defined radio units to collect and extract features from satellite signals. Then, we simulated three types of GPS spoofing attacks: simplistic, intermediate, and sophisticated. The evaluation results show that Nu-SVM outperforms the other instance-based learning classifiers in terms of accuracy, probability of detection, probability of false alarm, and probability of misdetection. In addition, the model shows good computational performance regarding memory usage and processing time in the detection phase. 
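One of the related abstracts above describes dynamically choosing the classifier with the best metric values. The sketch below illustrates only the general idea of weighted-score selection. It is not the Metric Optimized Dynamic selector or its weighted variant, whose details are not given on this page. The scoring rule, the function names, the weights, and the numbers in `EXAMPLE_RESULTS` are all assumptions made up for illustration, not results from any of the listed papers.

```python
def weighted_score(metrics, weights):
    """Higher is better: reward accuracy and detection, penalize both error rates.

    This scoring rule is an assumption for illustration only.
    """
    return (
        weights["accuracy"] * metrics["accuracy"]
        + weights["p_detection"] * metrics["p_detection"]
        - weights["p_false_alarm"] * metrics["p_false_alarm"]
        - weights["p_misdetection"] * metrics["p_misdetection"]
    )


def select_classifier(results, weights):
    """Return the name of the classifier with the highest weighted score."""
    if not results:
        raise ValueError("no classifier results to choose from")
    return max(results, key=lambda name: weighted_score(results[name], weights))


# Made-up validation metrics (fractions, not percentages) for two hypothetical models.
EXAMPLE_RESULTS = {
    "decision_tree": {
        "accuracy": 0.99, "p_detection": 0.98,
        "p_false_alarm": 0.02, "p_misdetection": 0.02,
    },
    "naive_bayes": {
        "accuracy": 0.90, "p_detection": 0.85,
        "p_false_alarm": 0.10, "p_misdetection": 0.15,
    },
}

# Equal weights; raising one weight makes that metric dominate the choice.
EQUAL_WEIGHTS = {
    "accuracy": 1.0, "p_detection": 1.0,
    "p_false_alarm": 1.0, "p_misdetection": 1.0,
}
```

With these inputs, `select_classifier(EXAMPLE_RESULTS, EQUAL_WEIGHTS)` picks the entry whose combined score is highest. In practice the metric values would come from a validation split, and the weights would reflect which error type is costlier for the application.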