skip to main content


Title: Amandroid: A Precise and General Inter-component Data Flow Analysis Framework for Security Vetting of Android Apps
We present a new approach to static analysis for security vetting of Android apps and a general framework called Amandroid. Amandroid determines points-to information for all objects in an Android app component in a flow and context-sensitive (user-configurable) way and performs data flow and data dependence analysis for the component. Amandroid also tracks inter-component communication activities. It can stitch the component-level information into the app-level information to perform intra-app or inter-app analysis. In this article, (a) we show that the aforementioned type of comprehensive app analysis is completely feasible in terms of computing resources with modern hardware, (b) we demonstrate that one can easily leverage the results from this general analysis to build various types of specialized security analyses—in many cases the amount of additional coding needed is around 100 lines of code, and (c) the result of those specialized analyses leveraging Amandroid is at least on par and often exceeds prior works designed for the specific problems, which we demonstrate by comparing Amandroid’s results with those of prior works whenever we can obtain the executable of those tools. Since Amandroid’s analysis directly handles inter-component control and data flows, it can be used to address security problems that result from interactions among multiple components from either the same or different apps. Amandroid’s analysis is sound in that it can provide assurance of the absence of the specified security problems in an app with well-specified and reasonable assumptions on Android runtime system and its library.  more » « less
Award ID(s):
1718214
NSF-PAR ID:
10063123
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
ACM transactions on information and system security
Volume:
21
Issue:
3
ISSN:
1094-9224
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Android’s flexible communication model allows interactions among third-party apps, but it also leads to inter-app security vulnerabilities. Specifically, malicious apps can eavesdrop on interactions between other apps or exploit the functionality of those apps, which can expose a user’s sensitive information to attackers. While the state-of-the-art tools have focused on detecting inter-app vulnerabilities in Android, they neither accurately analyze realistically large numbers of apps nor effectively deliver the identified issues to users. This paper presents SEALANT, a novel tool that combines static analysis and visualization techniques that, together, enable accurate identification of inter-app vulnerabilities as well as their systematic visualization. SEALANT statically analyzes architectural information of a given set of apps, infers vulnerable communication channels where inter-app attacks can be launched, and visualizes the identified information in a compositional representation. SEALANT has been demonstrated to accurately identify inter-app vulnerabilities from hundreds of real-world Android apps and to effectively deliver the identified information to users. 
    more » « less
  2. It takes great effort to manually or semi-automatically convert free-text phenotype narratives (e.g., morphological descriptions in taxonomic works) to a computable format before they can be used in large-scale analyses. We argue that neither a manual curation approach nor an information extraction approach based on machine learning is a sustainable solution to produce computable phenotypic data that are FAIR (Findable, Accessible, Interoperable, Reusable) (Wilkinson et al. 2016). This is because these approaches do not scale to all biodiversity, and they do not stop the publication of free-text phenotypes that would need post-publication curation. In addition, both manual and machine learning approaches face great challenges: the problem of inter-curator variation (curators interpret/convert a phenotype differently from each other) in manual curation, and keywords to ontology concept translation in automated information extraction, make it difficult for either approach to produce data that are truly FAIR. Our empirical studies show that inter-curator variation in translating phenotype characters to Entity-Quality statements (Mabee et al. 2007) is as high as 40% even within a single project. With this level of variation, curated data integrated from multiple curation projects may still not be FAIR. The key causes of this variation have been identified as semantic vagueness in original phenotype descriptions and difficulties in using standardized vocabularies (ontologies). We argue that the authors describing characters are the key to the solution. Given the right tools and appropriate attribution, the authors should be in charge of developing a project's semantics and ontology. This will speed up ontology development and improve the semantic clarity of the descriptions from the moment of publication. In this presentation, we will introduce the Platform for Author-Driven Computable Data and Ontology Production for Taxonomists, which consists of three components: a web-based, ontology-aware software application called 'Character Recorder,' which features a spreadsheet as the data entry platform and provides authors with the flexibility of using their preferred terminology in recording characters for a set of specimens (this application also facilitates semantic clarity and consistency across species descriptions); a set of services that produce RDF graph data, collects terms added by authors, detects potential conflicts between terms, dispatches conflicts to the third component and updates the ontology with resolutions; and an Android mobile application, 'Conflict Resolver,' which displays ontological conflicts and accepts solutions proposed by multiple experts. a web-based, ontology-aware software application called 'Character Recorder,' which features a spreadsheet as the data entry platform and provides authors with the flexibility of using their preferred terminology in recording characters for a set of specimens (this application also facilitates semantic clarity and consistency across species descriptions); a set of services that produce RDF graph data, collects terms added by authors, detects potential conflicts between terms, dispatches conflicts to the third component and updates the ontology with resolutions; and an Android mobile application, 'Conflict Resolver,' which displays ontological conflicts and accepts solutions proposed by multiple experts. Fig. 1 shows the system diagram of the platform. The presentation will consist of: a report on the findings from a recent survey of 90+ participants on the need for a tool like Character Recorder; a methods section that describes how we provide semantics to an existing vocabulary of quantitative characters through a set of properties that explain where and how a measurement (e.g., length of perigynium beak) is taken. We also report on how a custom color palette of RGB values obtained from real specimens or high-quality specimen images, can be used to help authors choose standardized color descriptions for plant specimens; and a software demonstration, where we show how Character Recorder and Conflict Resolver can work together to construct both human-readable descriptions and RDF graphs using morphological data derived from species in the plant genus Carex (sedges). The key difference of this system from other ontology-aware systems is that authors can directly add needed terms to the ontology as they wish and can update their data according to ontology updates. a report on the findings from a recent survey of 90+ participants on the need for a tool like Character Recorder; a methods section that describes how we provide semantics to an existing vocabulary of quantitative characters through a set of properties that explain where and how a measurement (e.g., length of perigynium beak) is taken. We also report on how a custom color palette of RGB values obtained from real specimens or high-quality specimen images, can be used to help authors choose standardized color descriptions for plant specimens; and a software demonstration, where we show how Character Recorder and Conflict Resolver can work together to construct both human-readable descriptions and RDF graphs using morphological data derived from species in the plant genus Carex (sedges). The key difference of this system from other ontology-aware systems is that authors can directly add needed terms to the ontology as they wish and can update their data according to ontology updates. The software modules currently incorporated in Character Recorder and Conflict Resolver have undergone formal usability studies. We are actively recruiting Carex experts to participate in a 3-day usability study of the entire system of the Platform for Author-Driven Computable Data and Ontology Production for Taxonomists. Participants will use the platform to record 100 characters about one Carex species. In addition to usability data, we will collect the terms that participants submit to the underlying ontology and the data related to conflict resolution. Such data allow us to examine the types and the quantities of logical conflicts that may result from the terms added by the users and to use Discrete Event Simulation models to understand if and how term additions and conflict resolutions converge. We look forward to a discussion on how the tools (Character Recorder is online at http://shark.sbs.arizona.edu/chrecorder/public) described in our presentation can contribute to producing and publishing FAIR data in taxonomic studies. 
    more » « less
  3. Mobile devices today provide a hardware-protected mode called Trusted Execution Environment (TEE) to help protect users from a compromised OS and hypervisor. Today TEE can only be leveraged either by vendor apps or by developers who work with the vendor. Since vendors consider third-party app code untrusted inside the TEE, to allow an app to leverage TEE, app developers have to write the app code in a tailored way to work with the vendor’s SDK. We proposed a novel design to integrate TEE with mobile OS to allow any app to leverage the TEE. Our design incorporates TEE support at the OS level, allowing apps to leverage the TEE without adding app-specific code into the TEE, and while using existing interface to interact with the mobile OS. We implemented our design, called TruZ-Droid, by integrating TrustZone TEE with the Android OS. TruZ-Droid allows apps to leverage the TEE to protect the following: (i) user’s secret input and confirmation, and (ii) sending of user’s secrets to the authorized server. We built a prototype using the TrustZone-enabled HiKey board to evaluate our design. We demonstrated TruZ-Droid’s effectiveness by adding new security features to existing apps to protect user’s sensitive information and attest user’s confirmation. TruZ-Droid’s real-world use case evaluation shows that apps can leverage TrustZone while using existing OS APIs. Our usability study proves that users can correctly interact with TruZ-Droid to protect their security sensitive activities and data. 
    more » « less
  4. The European General Data Protection Regulation (GDPR) mandates a data controller (e.g., an app developer) to provide all information specified in Articles (Arts.) 13 and 14 to data subjects (e.g., app users) regarding how their data are being processed and what are their rights. While some studies have started to detect the fulfillment of GDPR requirements in a privacy policy, their exploration only focused on a subset of mandatory GDPR requirements. In this paper, our goal is to explore the state of GDPR-completeness violations in mobile apps' privacy policies. To achieve our goal, we design the PolicyChecker framework by taking a rule and semantic role based approach. PolicyChecker automatically detects completeness violations in privacy policies based not only on all mandatory GDPR requirements but also on all if-applicable GDPR requirements that will become mandatory under specific conditions. Using PolicyChecker, we conduct the first large-scale GDPR-completeness violation study on 205,973 privacy policies of Android apps in the UK Google Play store. PolicyChecker identified 163,068 (79.2%) privacy policies containing data collection statements; therefore, such policies are regulated by GDPR requirements. However, the majority (99.3%) of them failed to achieve the GDPR-completeness with at least one unsatisfied requirement; 98.1% of them had at least one unsatisfied mandatory requirement, while 73.0% of them had at least one unsatisfied if-applicable requirement logic chain. We conjecture that controllers' lack of understanding of some GDPR requirements and their poor practices in composing a privacy policy can be the potential major causes behind the GDPR-completeness violations. We further discuss recommendations for app developers to improve the completeness of their apps' privacy policies to provide a more transparent personal data processing environment to users. 
    more » « less
  5. null (Ed.)
    Residential proxy has emerged as a service gaining popularity recently, in which proxy providers relay their customers’ network traffic through millions of proxy peers under their control. We find that many of these proxy peers are mobile devices, whose role in the proxy network can have significant security implications since mobile devices tend to be privacy and resource-sensitive. However, little effort has been made so far to understand the extent of their involvement, not to mention how these devices are recruited by the proxy network and what security and privacy risks they may pose. In this paper, we report the first measurement study on the mobile proxy ecosystem. Our study was made possible by a novel measurement infrastructure, which enabled us to identify proxy providers, to discover proxy SDKs (software development kits), to detect Android proxy apps built upon the proxy SDKs, to harvest proxy IP addresses, and to understand proxy traffic. The information collected through this infrastructure has brought to us new understandings of this ecosystem and important security discoveries. More specifically, 4 proxy providers were found to offer app developers mobile proxy SDKs as a competitive app monetization channel, with $50K per month per 1M MAU (monthly active users). 1,701 Android APKs (belonging to 963 Android apps) turn out to have integrated those proxy SDKs, with most of them available on Google Play with at least 300M installations in total. Furthermore, 48.43% of these APKs are flagged by at least 5 anti-virus engines as malicious, which could explain why 86.60% of the 963 Android apps have been removed from Google Play by Oct 2019. Besides, while these apps display user consent dialogs on traffic relay, our user study indicates that the user consent texts are quite confusing. We even discover a proxy SDK that stealthily relays traffic without showing any notifications. We also captured 625K cellular proxy IPs, along with a set of suspicious activities observed in proxy traffic such as ads fraud. We have reported our findings to affected parties, offered suggestions, and proposed the methodologies to detect proxy apps and proxy traffic. 
    more » « less