Internet of Things (IoT) devices have increased drastically in complexity and prevalence within the last decade. Alongside the proliferation of IoT devices and applications, attacks targeting them have gained popularity. Recent large-scale attacks such as Mirai and VPNFilter highlight the lack of comprehensive defenses for IoT devices. Existing security solutions are inadequate against skilled adversaries who mount sophisticated, stealthy attacks on IoT devices. Powerful provenance-based intrusion detection systems have been successfully deployed on resource-rich servers and desktops to identify advanced stealthy attacks. However, IoT devices lack the memory, storage, and computing resources to apply these provenance analysis techniques directly on the device. This paper presents ProvIoT, a novel federated edge-cloud security framework that enables on-device syscall-level behavioral anomaly detection in IoT devices. ProvIoT applies federated learning techniques to overcome data and privacy limitations while minimizing network overhead. Infrequent on-device training of the local model requires less than 10% CPU overhead; syncing with the global models requires sending and receiving 2 MB over the network. During normal offline operation, ProvIoT periodically incurs less than 10% CPU overhead and less than 65 MB memory usage for data summarization and anomaly detection. Our evaluation shows that ProvIoT detects fileless malware and stealthy APT attacks with an average F1 score of 0.97 in heterogeneous real-world IoT applications. ProvIoT is a step towards extending provenance analysis to resource-constrained IoT devices, beginning with well-resourced IoT devices such as the Raspberry Pi, Jetson Nano, and Google TPU.
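The abstract above does not say which federated learning algorithm ProvIoT uses. As a rough illustration of the general idea only, the sketch below shows sample-weighted parameter averaging (commonly called federated averaging), in which a server combines locally trained model parameters without seeing any device's raw data. The function name, the flat-list parameter layout, and the weighting by sample count are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine per-device model parameters into one global parameter vector.

    Hypothetical helper: each device contributes a flat list of parameters,
    weighted by how many local samples it trained on.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all client models must have the same shape")
        # Each client's share of the global model is proportional to its data.
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


if __name__ == "__main__":
    # Two made-up devices; the second trained on three times as much data.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Only the parameter vectors and sample counts cross the network in this scheme, which is the property that lets a federated design limit both data exposure and network overhead.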
                            System Call Processing Using Lightweight NLP for IoT Behavioral Malware Detection
                        
                    
    
            Although much of the work in behavioral malware detection lies in collecting the most explanatory data and choosing the most effective machine learning models, the processing of the data can prove to be the most important step in the data pipeline. In this work, we collect kernel-level system calls on a resource-constrained Internet of Things (IoT) device, apply lightweight Natural Language Processing (NLP) techniques to the data, and feed the processed data to two simple machine learning classification models: Logistic Regression (LR) and a Neural Network (NN). For the data processing, we group the system calls into n-grams, ordered by the timestamps at which the calls were recorded. To demonstrate the effectiveness, or lack thereof, of using n-grams, we deploy two types of malware onto the IoT device: a Denial-of-Service (DoS) attack and an Advanced Persistent Threat (APT) malware. We examine the effects of using lightweight NLP on malware like the DoS and the stealthy APT malware. For stealthier malware, such as the APT, more advanced but far more resource-intensive NLP techniques will likely increase detection capability; we leave this for future work.
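The n-gram step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the (timestamp, syscall name) event format, the function names, and the choice of n = 2 are assumptions made for the example. The resulting counts are the kind of feature a simple classifier such as Logistic Regression could consume.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed event format: (timestamp in seconds, system call name).
Event = Tuple[float, str]


def syscall_ngrams(events: Iterable[Event], n: int) -> List[Tuple[str, ...]]:
    """Order system calls by timestamp and slide a window of length n over them."""
    if n < 1:
        raise ValueError("n must be at least 1")
    ordered = [name for _, name in sorted(events, key=lambda e: e[0])]
    return [tuple(ordered[i:i + n]) for i in range(len(ordered) - n + 1)]


def ngram_counts(events: Iterable[Event], n: int) -> Counter:
    """Count how often each n-gram occurs; the counts can serve as features."""
    return Counter(syscall_ngrams(events, n))


if __name__ == "__main__":
    # Made-up trace, deliberately out of order to show the timestamp sort.
    trace = [(3.0, "read"), (1.0, "open"), (2.0, "read"), (4.0, "close")]
    print(syscall_ngrams(trace, 2))
    print(ngram_counts(trace, 2))
```

Sorting before windowing matters because an n-gram only carries behavioral meaning when its calls are adjacent in time; larger n captures longer call patterns at the cost of a sparser feature space.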
- Award ID(s): 1816387
- PAR ID: 10465253
- Date Published:
- Journal Name: Ubiquitous Security (UbiSec 2022)
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Security research on smart devices mostly focuses on malware installation and activation, privilege escalation, remote control, financial charges, theft of personal information, and permission use. Less attention has been paid to the deceptive mechanisms, which are critical for the success of malware on smart devices. Generally, malware first gets installed and then continues operating on the device without attracting suspicion from users. To do so, smart device malware uses various techniques to conceal itself, e.g., hiding activity, muting the phone, and deleting call logs. In this work, we developed an approach to semi-automatically reveal unknown malware hiding techniques. First, it extracts SMH behaviors from malware descriptions by using natural language processing techniques. Second, it maps SMH behaviors to SMH-related APIs based on the analysis of API documents. Third, it performs static analysis on the malware apps that contain unknown SMH behaviors to extract the code segments related to the SMH API calls. For those verified SMH code segments, we describe the techniques used for unknown SMH behaviors based on the code segments. Our experiment tested 119 malware apps with hiding behaviors. The F-measure is 85.58%, indicating that our approach is quite effective.
- Internet-of-Things (IoT) devices are vulnerable to malware and require new mitigation techniques due to their limited resources. To that end, previous research has used periodic Remote Attestation (RA) or Traffic Analysis (TA) to detect malware in IoT devices. However, RA is expensive, and TA only raises suspicion without confirming malware presence. To solve this, we design MADEA, the first system that blends RA and TA to offer a comprehensive approach to malware detection for the IoT ecosystem. TA builds profiles of expected packet traces during benign operations of each device and then uses them to detect malware from network traffic in real time. RA confirms the presence or absence of malware on the device. MADEA achieves a 100% true positive rate. It also outperforms other approaches with 160× faster detection time. Finally, without MADEA, effective periodic RA can consume at least ∼14× the amount of energy that a device needs in one hour.
- Less attention has been paid to the deceptive mechanisms of malware on smart devices. Smart device malware uses various techniques to conceal itself, e.g., hiding activity, muting the phone, and deleting call logs. In this work, we developed a novel approach to semi-automatically detect malware hiding behaviors. To detect malware hiding behaviors more effectively and thoroughly, our prototype checks multiple mediums, including vision, sound, vibration, phone calls, messages, and system logs. Our experiments show that the approach can detect malware hiding behaviors. The F-measure is 87.7%, indicating that our approach is quite effective.
- Inertial navigation provides a small-footprint, low-power, and low-cost pathway for localization in GPS-denied environments on extremely resource-constrained Internet-of-Things (IoT) platforms. Traditionally, application-specific heuristics and physics-based kinematic models are used to mitigate the curse of drift in inertial odometry. These techniques, albeit lightweight, fail to handle domain shifts and environmental non-linearities. Recently, deep neural-inertial sequence learning has shown superior odometric resolution in capturing non-linear motion dynamics without human knowledge over heuristic-based methods. These AI-based techniques are data-hungry, suffer from excessive resource usage, and cannot guarantee following the underlying system physics. This paper highlights the unique methods, opportunities, and challenges in porting real-time AI-enhanced inertial navigation algorithms onto IoT platforms. First, we discuss how platform-aware neural architecture search coupled with ultra-lightweight model backbones can yield neural-inertial odometry models that are 31–134× smaller yet achieve or exceed the localization resolution of state-of-the-art AI-enhanced techniques. The framework can generate models suitable for locating humans, animals, underwater sensors, aerial vehicles, and precision robots. Next, we showcase how techniques from neurosymbolic AI can yield physics-informed and interpretable neural-inertial navigation models. Afterward, we present opportunities for fine-tuning pre-trained odometry models in a new domain with as little as 1 minute of labeled data, while discussing inexpensive data collection and labeling techniques. Finally, we identify several open research challenges that demand careful consideration moving forward.
 An official website of the United States government