SPT-code: sequence-to-sequence pre-training for learning source code representations

Niu, Changan; Li, Chuanyi; Ng, Vincent; Ge, Jidong; Huang, Liguo; Luo, Bin

doi:10.1145/3510003.3510096

Android mobile applications collect information in various ways to provide users with functionality and services. An Android app's permission manifest and privacy policy are documents that tell users what types of information are being collected. However, the information types mentioned in these documents are often abstract and do not include the fine-grained information types collected through user input fields in applications. Existing approaches focus on API calls in the application code and are able to reveal what information types are being collected. However, they are unable to identify information types based on direct user input, which is a major source of private information. In this paper, we propose to directly apply a natural language processing approach to Android layout code to identify the information types associated with input fields in applications.
