You Can’t Judge a Binary by Its Header: Data-Code Separation for Non-Standard ARM Binaries using Pseudo Labels

Benkraouda, Hadjer; Diwan, Nirav; Wang, Gang

Citation Details

This content will become publicly available on May 12, 2026

You Can’t Judge a Binary by Its Header: Data-Code Separation for Non-Standard ARM Binaries using Pseudo Labels

Static binary analysis is critical to various security tasks such as vulnerability discovery and malware detection. In recent years, binary analysis has faced new challenges as vendors of the Internet of Things (IoT) and Industrial Control Systems (ICS) continue to introduce customized or non-standard binary formats that existing tools cannot readily process. Reverse-engineering each of the new formats is costly as it requires extensive expertise and analysts’ time. In this paper, we investigate the first step to automate the analysis of non-standard binaries, which is to recognize the bytes representing “code” from “data” (i.e., data-code separation). We propose Loadstar, and its key idea is to use the abundant labeled data from standard binaries to train a classifier and adapt it for processing unlabeled non-standard binaries. We use a pseudo-label-based method for domain adaption and leverage knowledge-inspired rules for pseudo-label correction, which serves as the guardrail for the adaption process. A key advantage of the system is that it does not require labeling any non-standard binaries. Using three datasets of non-standard PLC binaries, we evaluate Loadstar and show it outperforms existing tools in terms of both accuracy and processing speed. We will share the tool (open source) with the community. more »

Award ID(s):: 2229876

PAR ID:: 10594582

Author(s) / Creator(s):: Benkraouda, Hadjer; Diwan, Nirav; Wang, Gang

Publisher / Repository:: In Proceedings of the 46th IEEE Symposium on Security and Privacy (IEEE SP)

Date Published:: 2025-05-12

Format(s):: Medium: X

Location:: San Francisco, CA

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
This content will become publicly available on May 12, 2026
Conference Paper:
The DOI is not currently available.

More Like this