NSF PAR Search | NSF Public Access Repository

Detecting and Classifying Self-Deleting Windows Malware Using Prefetch Files

https://doi.org/10.1109/CCWC54503.2022.9720874

Duby, Adam; Taylor, Teryl; Bloom, Gedare; Zhuang, Yanyan (January 2022, 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC))

Malware detection and analysis can be a burdensome task for incident responders. As such, research has turned to machine learning to automate malware detection and malware family classification. Existing work extracts and engineers static and dynamic features from the malware sample to train classifiers. Despite promising results, such techniques assume that the analyst has access to the malware executable file. Self-deleting malware invalidates this assumption and requires analysts to find forensic evidence of malware execution for further analysis. In this paper, we present and evaluate an approach to detecting malware that executed on a Windows target and to classifying it into its associated family to provide semantic insight. Specifically, we engineer features from the Windows prefetch file, a file system forensic artifact that archives process information. Results show that it is possible to detect the malicious artifact with 99% accuracy; furthermore, classifying the malware into a fine-grained family has comparable performance to techniques that require access to the original executable. We also provide a thorough security discussion of the proposed approach against adversarial diversity.
Full Text Available
Shining New Light on Useful Features for Network Intrusion Detection Algorithms

https://doi.org/10.1109/CCNC49033.2022.9700654

Lawrence, Heather; Ezeobi, Uchenna; Bloom, Gedare; Zhuang, Yanyan (January 2022, 2022 IEEE 19th Annual Consumer Communications Networking Conference (CCNC))

Network intrusion detection systems (NIDS) today must quickly provide visibility into anomalous behavior on a growing amount of data. Meanwhile, different data models have evolved over time, each providing a different set of features to classify attacks. Defenders have limited time to retrain classifiers, while the scale of data and the feature mismatch between data models can affect the ability to retrain periodically. Much work has focused on classification accuracy, yet feature selection is a key part of machine learning that, when optimized, reduces training time and can increase accuracy by removing poorly performing features that introduce noise. With a larger feature space, the pursuit of more features is not as valuable as selecting better features. In this paper, we use an ensemble approach of filter methods to rank features, followed by a voting technique to select a subset of features. We evaluate our approach using three datasets to show that, across datasets and network topologies, similar features have a trivial effect on classifier accuracy after removal. Our approach identifies poorly performing features to remove in a classifier-agnostic manner that can save significant time in the periodic retraining of production NIDS.
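The "rank with several filter methods, then vote" idea in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three filter scores (absolute correlation with the label, gap between class means, and variance), the `top_k` and `min_votes` parameters, and the toy feature names and values.

```python
from statistics import mean, pvariance


def abs_correlation(xs, ys):
    """Filter 1 (assumed): absolute Pearson correlation with the label."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def class_mean_gap(xs, ys):
    """Filter 2 (assumed): distance between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return abs(mean(pos) - mean(neg)) if pos and neg else 0.0


def variance_score(xs, ys):
    """Filter 3 (assumed): plain variance; the label is ignored."""
    return pvariance(xs)


FILTERS = [abs_correlation, class_mean_gap, variance_score]


def select_by_vote(columns, labels, top_k, min_votes):
    """Each filter votes for its top_k features; keep those with enough votes."""
    votes = {name: 0 for name in columns}
    for score in FILTERS:
        ranked = sorted(columns, key=lambda n: score(columns[n], labels), reverse=True)
        for name in ranked[:top_k]:
            votes[name] += 1
    return sorted(name for name, count in votes.items() if count >= min_votes)


# Hypothetical toy data: three features, four flows, binary attack label.
columns = {
    "bytes_sent": [10.0, 12.0, 95.0, 99.0],
    "duration": [1.0, 1.1, 0.9, 1.0],
    "flag_count": [0.0, 1.0, 5.0, 6.0],
}
labels = [0, 0, 1, 1]
selected = select_by_vote(columns, labels, top_k=2, min_votes=2)
print(selected)
```

The selection depends only on the data and the filter scores, not on any classifier, which is what makes this kind of scheme classifier-agnostic. Two of the assumed filters are scale-dependent, so real features would likely need normalization first.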
Full Text Available
Malware Family Classification via Residual Prefetch Artifacts

https://doi.org/10.1109/CCNC49033.2022.9700530

Duby, Adam; Taylor, Teryl; Zhuang, Yanyan (January 2022, Proc. 19th IEEE Consumer Communications and Networking Conference (CCNC’22))

Full Text Available
Towards Return Parity in Markov Decision Processes

Chi, J.; Shen, J.; Dai, X.; Zhang, W; Tian, Y.; Zhao, H. (January 2022, 25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022))

Full Text Available
Intent Classification and Slot Filling for Privacy Policies

Ahmad, W.; Chi, J.; Le, T.; Norton, T.; Tian, Y.; Chang, K. (January 2021, International journal of computational linguistics and natural language processing)

Full Text Available
Unified pre-training for program understanding and generation.

https://doi.org/10.18653/v1/2021.naacl-main.211

Ahmad, W. U.; Chakraborty, S.; Ray, B.; Chang, K. W. (January 2021, Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies)

Full Text Available
VerHealth: Vetting Medical Voice Applications through Policy Enforcement

https://doi.org/10.1145/3432233

Shezan, Faysal Hossain; Hu, Hang; Wang, Gang; Tian, Yuan (December 2020, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies)

Healthcare applications on Voice Personal Assistant Systems (e.g., Amazon Alexa) have shown great promise for delivering personalized health services via a conversational interface. However, concerns have also been raised about privacy, safety, and service quality. In this paper, we propose VerHealth to systematically assess how well health-related applications on Alexa comply with existing privacy and safety policies. VerHealth contains a static module and a dynamic module based on machine learning that can trigger and detect violation behaviors hidden deep in the interaction threads. We use VerHealth to analyze 813 health-related applications on Alexa by sending over 855,000 probing questions and analyzing 863,000 responses. We also consult with three medical school students (domain experts) to confirm and assess the potential violations. We show that violations are quite common, e.g., 86.36% of them miss disclaimers when providing medical information; 30.23% of them store user physical or mental health data without approval. Domain experts believe that the applications' medical suggestions are often factually correct but of poor relevance, and that the applications should have asked more questions before providing suggestions in over half of the cases. Finally, we use our results to discuss possible directions for improvement.
Full Text Available
TKPERM: Cross-platform Permission Knowledge Transfer to Detect Overprivileged Third-party Applications

https://doi.org/10.14722/ndss.2020.24287

Shezan, Faysal Hossain; Cheng, Kaiming; Zhang, Zhen; Cao, Yinzhi; Tian, Yuan (January 2020, Network and Distributed Systems Security (NDSS) Symposium)

Permission-based access control enables users to manage and control the sensitive data they share with third-party applications. In an ideal scenario, a third-party application's description includes enough detail to illustrate how such data is used; in reality, many descriptions are vague about the application's security or privacy activities. As a result, users are left with insufficient details when granting sensitive data to these applications. Prior works, such as WHYPER and AutoCog, have addressed this problem via a so-called permission correlation system. Such a system correlates third-party applications' descriptions with their requested permissions and flags an application as overprivileged if a mismatch is found. However, although prior works are successful on their own platforms, such as the Android ecosystem, they are not directly applicable to new platforms, such as Chrome extensions and IFTTT, without extensive data labeling and parameter tuning. In this paper, we design, implement, and evaluate a novel system, called TKPERM, which transfers the knowledge of permission correlation systems across platforms. Our key idea is that these varied platforms with different use cases (such as smartphones, IoT devices, and desktop browsers) are all user-facing, which allows the knowledge to be transferable across platforms. In particular, we adopt a greedy selection algorithm that picks the best source domains to transfer to the target permission on a new platform. TKPERM achieves a 90.02% overall F1 score after transfer, which is 12.62% higher than that of a model trained directly on the target domain without transfer. Specifically, TKPERM achieves a 91.83% F1 score on IFTTT, an 89.13% F1 score on Chrome-Extension, and an 89.1% F1 score on SmartThings. TKPERM also successfully identified many real-world overprivileged applications, such as a gaming hub requesting location permissions without legitimate use.
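The abstract above mentions a greedy selection of source domains but does not spell the procedure out. The following is a generic greedy forward-selection sketch, not TKPERM's actual algorithm: the stopping rule, the `evaluate` callback, the source names, and the toy scores are all hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, List


def greedy_select(
    candidates: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], int],
) -> List[str]:
    """Repeatedly add the source domain that most improves the target score.

    `evaluate` is a hypothetical callback returning a validation score on the
    target domain for a given set of source domains. Selection stops as soon
    as no remaining candidate improves the score.
    """
    chosen: List[str] = []
    remaining = set(candidates)
    best = evaluate(frozenset())
    while remaining:
        # Score every one-step extension of the current selection.
        scores = {c: evaluate(frozenset(chosen + [c])) for c in remaining}
        # Iterate in sorted order so ties break deterministically by name.
        pick = max(sorted(scores), key=scores.get)
        if scores[pick] <= best:
            break
        chosen.append(pick)
        best = scores[pick]
        remaining.remove(pick)
    return chosen


# Toy stand-in for "train on these sources, validate on the target":
# integer points, with one source that hurts transfer (all values invented).
TOY_GAIN = {"source_a": 20, "source_b": 10, "source_c": -5}


def toy_evaluate(sources: FrozenSet[str]) -> int:
    return 50 + sum(TOY_GAIN[s] for s in sources)


picked = greedy_select(TOY_GAIN, toy_evaluate)
print(picked)
```

In this toy run the harmful source is never added, because including it would lower the score of the current selection. A real `evaluate` would involve training and validating a model, so the number of candidate evaluations per round is the main cost to watch.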
Full Text Available
Findings: PolicyQA: A Reading Comprehension Dataset for Privacy Policies

Ahmad, A.; Chi, J.; Tian, Y.; Chang, K. (January 2020, Conference on Empirical Methods in Natural Language Processing)
Full Text Available
Read Between the Lines: An Empirical Measurement of Sensitive Applications of Voice Personal Assistant Systems

https://doi.org/10.1145/3366423.3380179

Shezan, Faysal; Hu, Hang; Wang, Jiamin; Wang, Gang; Tian, Yuan (January 2020, The Web Conference (WWW))

Full Text Available
