

Title: DAISY: Dynamic-Analysis-Induced Source Discovery for Sensitive Data
Mobile apps are widely used and often process users’ sensitive data. Many taint analysis tools have been applied to analyze sensitive information flows and report data leaks in apps. These tools require a list of sources (where sensitive data is accessed) as input, and researchers have constructed such lists for the Android platform by identifying Android API methods that allow access to sensitive data. However, app developers may also define their own methods, or use third-party libraries’ methods, for accessing data. Such source methods are difficult to collect because they are specific to individual apps, and the many third-party libraries available on the market evolve over time. To address this problem, we propose DAISY, a Dynamic-Analysis-Induced Source discoverY approach for identifying methods that return sensitive information from apps and third-party libraries. Trained on an automatically labeled data set of methods and their calling contexts, DAISY identifies sensitive methods in unseen apps. We evaluated DAISY on real-world apps, and the results show that it achieves an overall precision of 77.9% when reporting its most confident results. Most of the identified sources and leaks cannot be detected by existing techniques.
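A minimal sketch of the automatic-labeling idea the abstract describes: during a dynamic run on a test device whose sensitive values are known in advance, any method whose observed return value contains one of those values is labeled as a candidate source, and the labeled (method, calling context) rows become training data for the classifier. The trace format, the SENSITIVE_VALUES map, and the method names below are illustrative assumptions, not DAISY's actual implementation.

    # Hypothetical sketch of dynamic-analysis-induced source labeling.
    # Assumes a runtime trace of (method signature, calling context, return value)
    # tuples collected while exercising the app on an instrumented device.
    SENSITIVE_VALUES = {  # ground-truth values for the test device (assumed)
        "device_id": "358240051111110",
        "email": "testuser@example.com",
        "latitude": "37.4219983",
    }

    def label_trace(trace):
        """Label a traced method as a candidate source if its return value
        contains a known sensitive value; the labeled rows can then be used to
        train a classifier over method signatures and calling contexts."""
        labeled = []
        for signature, calling_context, return_value in trace:
            text = str(return_value)
            is_source = any(v in text for v in SENSITIVE_VALUES.values())
            labeled.append((signature, calling_context, int(is_source)))
        return labeled

    # An app-defined getter that wraps a platform API would be labeled as a
    # source even though it is not itself an Android API method.
    trace = [
        ("com.example.Profile.getUserEmail()", "LoginActivity.onSubmit",
         "testuser@example.com"),
        ("com.example.Util.formatDate()", "MainActivity.onCreate", "2023-01-05"),
    ]
    print(label_trace(trace))
    # [('com.example.Profile.getUserEmail()', 'LoginActivity.onSubmit', 1),
    #  ('com.example.Util.formatDate()', 'MainActivity.onCreate', 0)]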
Award ID(s):
2007718 1846467 2221843 1948244 1736209
PAR ID:
10397875
Author(s) / Creator(s):
Date Published:
Journal Name:
ACM Transactions on Software Engineering and Methodology
ISSN:
1049-331X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. There has been a proliferation of mobile apps in the Medical and Health&Fitness categories. These apps have a wide audience, from medical providers, to patients, to end users who want to track their fitness goals. The low barrier to entry on mobile app stores raises questions about the diligence and competence of the developers who publish these apps, especially regarding the practices they use for user data collection, processing, and storage. To help understand the nature of the data that is collected, how it is processed, and where it is sent, we developed a tool named PIT (Personal Information Tracker) and made it available as open source. We used PIT to perform a multi-faceted study on 2,832 Android apps: 2,211 Medical apps and 621 Health&Fitness apps. We first define Personal Information (PI) as 17 different groups of sensitive information, e.g., the user’s identity, address, financial information, medical history, or anthropometric data. PIT first extracts the elements in the app’s User Interface (UI) where this information is collected. The collected information could be processed by the app’s own code or by third-party code; our approach disambiguates between the two. Next, PIT tracks, via static analysis, where the information is “leaked”, i.e., where it escapes the scope of the app, either locally on the phone or remotely via the network. Then, we conduct a link analysis that examines the URLs an app connects with, to understand the origin and destination of the data that apps collect and process. We found that most apps leak 1–5 PI items (email, credit card, phone number, address, and name being the most frequent). Leak destinations include the network (25%), local databases (37%), logs (23%), and files or I/O (15%). While Medical apps have more leaks overall, as they collect data on medical history, surprisingly, Health&Fitness apps also collect, and leak, medical data. We also found that leaks due to third-party code (e.g., code for ads, analytics, or user engagement) are much more numerous (2x–12x) than leaks due to the app’s own code. Finally, our link analysis shows that most apps access 20–80 URLs (typically third-party URLs and Cloud APIs), though some apps could access more than 1,000 URLs.
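    A rough Python sketch (not PIT's actual code) of two steps described above, under assumed inputs: statically detected leaks are bucketed by destination category, and the URLs an app contacts are split into first-party and third-party using the developer's domain. The sink-API prefixes and the domain heuristic are illustrative assumptions.

      # Hypothetical sketch: bucket detected leaks by destination category and
      # split the URLs an app contacts into first-party vs. third-party.
      from collections import Counter
      from urllib.parse import urlparse

      SINK_CATEGORIES = {  # illustrative sink-API prefixes
          "java.net.": "network",
          "okhttp3.": "network",
          "android.database.sqlite.": "local database",
          "android.util.Log.": "logs",
          "java.io.": "files/IO",
      }

      def categorize_leaks(leaks):
          """leaks: iterable of (pi_item, sink_api) pairs found by static analysis."""
          counts = Counter()
          for _pi_item, sink_api in leaks:
              for prefix, category in SINK_CATEGORIES.items():
                  if sink_api.startswith(prefix):
                      counts[category] += 1
                      break
          return counts

      def split_urls(urls, developer_domain):
          """Very rough first-party/third-party split based on the developer's domain."""
          first, third = [], []
          for url in urls:
              host = urlparse(url).netloc
              (first if host.endswith(developer_domain) else third).append(url)
          return first, third

      leaks = [("email", "okhttp3.Call.execute"), ("name", "android.util.Log.d")]
      print(categorize_leaks(leaks))  # Counter({'network': 1, 'logs': 1})
      print(split_urls(["https://api.example.com/u", "https://ads.tracker.net/p"],
                       "example.com"))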
  2. It is commonly assumed that the availability of “free” mobile apps comes at the cost of consumer privacy, and that paying for apps could offer consumers protection from behavioral advertising and long-term tracking. This work empirically evaluates the validity of this assumption by investigating the degree to which “free” apps and their paid premium versions differ in their bundled code, their declared permissions, and their data collection behaviors and privacy practices. We compare pairs of free and paid apps using a combination of static and dynamic analysis. We also examine the differences in the privacy policies within pairs. We rely on static analysis to determine the requested permissions and third-party SDKs in each app; we use dynamic analysis to detect sensitive data collected by remote services at the network traffic level; and we compare text versions of privacy policies to identify differences in the disclosure of data collection behaviors. In total, we analyzed 1,505 pairs of free Android apps and their paid counterparts, with free apps randomly drawn from the Google Play Store’s category-level top charts. Our results show that over our corpus of free and paid pairs, there is no clear evidence that paying for an app will guarantee protection from extensive data collection. Specifically, 48% of the paid versions reused all of the same third-party libraries as their free versions, while 56% of the paid versions inherited all of the free versions’ Android permissions to access sensitive device resources (when considering free apps that include at least one third-party library and request at least one Android permission). Additionally, our dynamic analysis reveals that 38% of the paid apps exhibit all of the same data collection and transmission behaviors as their free counterparts. Our exploration of privacy policies reveals that only 45% of the pairs provide a privacy policy of some sort, and less than 1% of the pairs overall have policies that differ between free and paid versions. 
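    The pairwise comparison described above can be sketched as follows (an illustrative approximation, not the study's code): for each free/paid pair, check whether the paid version retains all of the free version's third-party libraries and all of its Android permissions, applying the same eligibility filter the abstract mentions. The input format is an assumption.

      # Hypothetical sketch of comparing free apps with their paid counterparts.
      def compare_pairs(pairs):
          """pairs: list of dicts with 'free'/'paid' keys, each mapping to
          {'permissions': set, 'libraries': set} extracted by static analysis."""
          same_libs = same_perms = eligible = 0
          for pair in pairs:
              free, paid = pair["free"], pair["paid"]
              if not free["libraries"] or not free["permissions"]:
                  continue  # mirror the eligibility filter noted in the abstract
              eligible += 1
              same_libs += free["libraries"] <= paid["libraries"]      # subset test
              same_perms += free["permissions"] <= paid["permissions"]
          return {"reused all libraries": same_libs / eligible,
                  "inherited all permissions": same_perms / eligible}

      pairs = [{
          "free": {"permissions": {"INTERNET", "ACCESS_FINE_LOCATION"},
                   "libraries": {"com.ads.sdk"}},
          "paid": {"permissions": {"INTERNET", "ACCESS_FINE_LOCATION"},
                   "libraries": set()},
      }]
      print(compare_pairs(pairs))
      # {'reused all libraries': 0.0, 'inherited all permissions': 1.0}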
  3. Android’s flexible communication model allows interactions among third-party apps, but it also leads to inter-app security vulnerabilities. Specifically, malicious apps can eavesdrop on interactions between other apps or exploit the functionality of those apps, which can expose a user’s sensitive information to attackers. While the state-of-the-art tools have focused on detecting inter-app vulnerabilities in Android, they neither accurately analyze realistically large numbers of apps nor effectively deliver the identified issues to users. This paper presents SEALANT, a novel tool that combines static analysis and visualization techniques that, together, enable accurate identification of inter-app vulnerabilities as well as their systematic visualization. SEALANT statically analyzes architectural information of a given set of apps, infers vulnerable communication channels where inter-app attacks can be launched, and visualizes the identified information in a compositional representation. SEALANT has been demonstrated to accurately identify inter-app vulnerabilities from hundreds of real-world Android apps and to effectively deliver the identified information to users. 
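    One way to picture the channel inference described above (a simplified sketch, not SEALANT's implementation): an implicit Intent action sent by one app that matches an exported component's intent filter in another app is flagged as a channel where interception or exploitation could occur. The app names, actions, and input format are assumptions.

      # Hypothetical sketch of inferring vulnerable inter-app communication channels.
      def find_channels(apps):
          """apps: dict app_name -> {'sends': [action, ...],
                                     'filters': [(component, action, exported), ...]}"""
          channels = []
          for sender, info in apps.items():
              for action in info["sends"]:
                  for receiver, rinfo in apps.items():
                      if receiver == sender:
                          continue
                      for component, filter_action, exported in rinfo["filters"]:
                          if exported and filter_action == action:
                              channels.append((sender, action, receiver, component))
          return channels

      apps = {
          "BankingApp": {"sends": ["com.example.SHARE_BALANCE"], "filters": []},
          "WidgetApp": {"sends": [],
                        "filters": [("BalanceReceiver",
                                     "com.example.SHARE_BALANCE", True)]},
      }
      for channel in find_channels(apps):
          print("Potentially vulnerable channel:", channel)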
  4. In an era marked by ubiquitous reliance on mobile applications for nearly every need, the opacity of apps’ behavior poses significant threats to their users’ privacy. Although major data protection regulations require apps to disclose their data practices transparently, previous studies have pointed out difficulties in doing so. To further delve into this issue, this article describes an automated method to capture data-sharing practices in Android apps and assess their proper disclosure according to the EU General Data Protection Regulation. We applied the method to 9,000 random Android apps, unveiling an uncomfortable reality: over 80% of Android applications that transfer personal data off device potentially fail to meet GDPR transparency requirements. We further investigate the role of third-party libraries, shedding light on the source of this problem and pointing towards measures to address it. 
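    At its core, the transparency check described above compares the observed recipients of off-device personal-data transfers against the recipients an app's privacy policy discloses; the sketch below illustrates that comparison under an assumed input format and is not the article's actual pipeline.

      # Hypothetical sketch of a GDPR-style transparency check.
      def check_transparency(observed_recipients, disclosed_recipients):
          """Both arguments are sets of recipient domains: the first observed
          receiving personal data off device, the second named in the policy."""
          undisclosed = observed_recipients - disclosed_recipients
          return {"compliant": not undisclosed,
                  "undisclosed recipients": sorted(undisclosed)}

      print(check_transparency({"analytics.vendor.com", "ads.vendor.net"},
                               {"analytics.vendor.com"}))
      # {'compliant': False, 'undisclosed recipients': ['ads.vendor.net']}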
  5. The Internet of Things is growing rapidly, with many connected devices now available to consumers. With this growth, the IoT apps that manage these devices from smartphones raise significant security concerns. Typically, these apps are secured via sensitive credentials, such as an email address and password, that need to be validated through specific servers, and thus require permission to access the Internet. Unfortunately, even when the developers of these apps are well-intentioned, such apps can be non-trivial to secure so as to guarantee that users’ credentials do not leak to unauthorized servers on the Internet. For example, if the app relies on third-party libraries, as many do, those libraries can potentially capture and leak sensitive credentials. Bugs in the applications can also result in exploitable vulnerabilities that leak credentials. This paper presents our work in progress on a prototype that enables developers to control how information flows within the app, from sensitive UI data to specific servers. We extend FlowFence to enforce fine-grained information flow policies on sensitive UI data.
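    A minimal sketch of the kind of fine-grained policy the prototype aims to enforce (an illustration of the idea, not FlowFence's API): data captured from specific credential fields in the UI may flow only to an allowlisted authentication server, and any other destination is denied. The field names and server host are assumptions.

      # Hypothetical policy sketch: credential fields may only reach allowlisted hosts.
      POLICY = {
          "login_email_field": {"auth.example-iot.com"},
          "login_password_field": {"auth.example-iot.com"},
      }

      def allow_flow(ui_source, destination_host):
          """Permit a flow only if the policy allows data from ui_source to reach
          destination_host; sources not covered by the policy are unrestricted."""
          allowed = POLICY.get(ui_source)
          return allowed is None or destination_host in allowed

      print(allow_flow("login_password_field", "auth.example-iot.com"))  # True
      print(allow_flow("login_password_field", "ads.thirdparty.net"))    # False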