skip to main content

Title: Do You Get What You Pay For? Comparing The Privacy Behaviors of Free vs. Paid Apps
It is commonly assumed that the availability of “free” mobile apps comes at the cost of consumer privacy, and that paying for apps could offer consumers protection from behavioral advertising and long-term tracking. This work empirically evaluates the validity of this assumption by investigating the degree to which “free” apps and their paid premium versions differ in their bundled code, their declared permissions, and their data collection behaviors and privacy practices. We compare pairs of free and paid apps using a combination of static and dynamic analysis. We also examine the differences in the privacy policies within pairs. We rely on static analysis to determine the requested permissions and third-party SDKs in each app; we use dynamic analysis to detect sensitive data collected by remote services at the network traffic level; and we compare text versions of privacy policies to identify differences in the disclosure of data collection behaviors. In total, we analyzed 1,505 pairs of free Android apps and their paid counterparts, with free apps randomly drawn from the Google Play Store’s category-level top charts. Our results show that over our corpus of free and paid pairs, there is no clear evidence that paying for an app will guarantee protection from more » extensive data collection. Specifically, 48% of the paid versions reused all of the same third-party libraries as their free versions, while 56% of the paid versions inherited all of the free versions’ Android permissions to access sensitive device resources (when considering free apps that include at least one third-party library and request at least one Android permission). Additionally, our dynamic analysis reveals that 38% of the paid apps exhibit all of the same data collection and transmission behaviors as their free counterparts. Our exploration of privacy policies reveals that only 45% of the pairs provide a privacy policy of some sort, and less than 1% of the pairs overall have policies that differ between free and paid versions. « less
; ; ; ; ; ; ;
Award ID(s):
Publication Date:
Journal Name:
The Workshop on Technology and Consumer Protection (ConPro ’19)
Sponsoring Org:
National Science Foundation
More Like this
  1. It is commonly assumed that “free” mobile apps come at the cost of consumer privacy and that paying for apps could offer consumers protection from behavioral advertising and long-term tracking. This work empirically evaluates the validity of this assumption by comparing the privacy practices of free apps and their paid premium versions, while also gauging consumer expectations surrounding free and paid apps. We use both static and dynamic analysis to examine 5,877 pairs of free Android apps and their paid counterparts for differences in data collection practices and privacy policies between pairs. To understand user expectations for paid apps, we conducted a 998-participant online survey and found that consumers expect paid apps to have better security and privacy behaviors. However, there is no clear evidence that paying for an app will actually guarantee protection from extensive data collection in practice. Given that the free version had at least one thirdparty library or dangerous permission, respectively, we discovered that 45% of the paid versions reused all of the same third-party libraries as their free versions, and 74% of the paid versions had all of the dangerous permissions held by the free app. Likewise, our dynamic analysis revealed that 32% of themore »paid apps exhibit all of the same data collection and transmission behaviors as their free counterparts. Finally, we found that 40% of apps did not have a privacy policy link in the Google Play Store and that only 3.7% of the pairs that did reflected differences between the free and paid versions.« less
  2. The dominant privacy framework of the information age relies on notions of “notice and consent.” That is, service providers will disclose, often through privacy policies, their data collection practices, and users can then consent to their terms. However, it is unlikely that most users comprehend these disclosures, which is due in no small part to ambiguous, deceptive, and misleading statements. By comparing actual collection and sharing practices to disclosures in privacy policies, we demonstrate the scope of the problem. Through analysis of 68,051 apps from the Google Play Store, their corresponding privacy policies, and observed data transmissions, we investigated the potential misrepresentations of apps in the Designed For Families (DFF) program, inconsistencies in disclosures regarding third-party data sharing, as well as contradictory disclosures about secure data transmissions. We find that of the 8,030 DFF apps (i.e., apps directed at children), 9.1% claim that their apps are not directed at children, while 30.6% claim to have no knowledge that the received data comes from children. In addition, we observe that 10.5% of 68,051 apps share personal identifiers with third-party service providers, yet do not declare any in their privacy policies, and only 22.2% of the apps explicitly name third parties. Thismore »ultimately makes it not only difficult, but in most cases impossible, for users to establish where their personal data is being processed. Furthermore, we find that 9,424 apps do not use TLS when transmitting personal identifiers, yet 28.4% of these apps claim to take measures to secure data transfer. Ultimately, these divergences between disclosures and actual app behaviors illustrate the ridiculousness of the notice and consent framework.« less
  3. Furnell, Steven (Ed.)
    A huge amount of personal and sensitive data is shared on Facebook, which makes it a prime target for attackers. Adversaries can exploit third-party applications connected to a user’s Facebook profile (i.e., Facebook apps) to gain access to this personal information. Users’ lack of knowledge and the varying privacy policies of these apps make them further vulnerable to information leakage. However, little has been done to identify mismatches between users’ perceptions and the privacy policies of Facebook apps. We address this challenge in our work. We conducted a lab study with 31 participants, where we received data on how they share information in Facebook, their Facebook-related security and privacy practices, and their perceptions on the privacy aspects of 65 frequently-used Facebook apps in terms of data collection, sharing, and deletion. We then compared participants’ perceptions with the privacy policy of each reported app. Participants also reported their expectations about the types of information that should not be collected or shared by any Facebook app. Our analysis reveals significant mismatches between users’ privacy perceptions and reality (i.e., privacy policies of Facebook apps), where we identified over-optimism not only in users’ perceptions of information collection, but also on their self-efficacy in protectingmore »their information in Facebook despite experiencing negative incidents in the past. To the best of our knowledge, this is the first study on the gap between users’ privacy perceptions around Facebook apps and the reality. The findings from this study offer directions for future research to address that gap through designing usable, effective, and personalized privacy notices to help users to make informed decisions about using Facebook apps.« less
  4. Modern smartphone platforms implement permission-based models to protect access to sensitive data and system resources. However, apps can circumvent the permission model and gain access to protected data without user consent by using both covert and side channels. Side channels present in the implementation of the permission system allow apps to access protected data and system resources without permission; whereas covert channels enable communication between two colluding apps so that one app can share its permission-protected data with another app lacking those permissions. Both pose threats to user privacy. In this work, we make use of our infrastructure that runs hundreds of thousands of apps in an instrumented environment. This testing environment includes mechanisms to monitor apps' runtime behaviour and network traffic. We look for evidence of side and covert channels being used in practice by searching for sensitive data being sent over the network for which the sending app did not have permissions to access it. We then reverse engineer the apps and third-party libraries responsible for this behaviour to determine how the unauthorized access occurred. We also use software fingerprinting methods to measure the static prevalence of the technique that we discover among other apps in our corpus.more »Using this testing environment and method, we uncovered a number of side and covert channels in active use by hundreds of popular apps and third-party SDKs to obtain unauthorized access to both unique identifiers as well as geolocation data. We have responsibly disclosed our findings to Google and have received a bug bounty for our work.« less
  5. Mobile apps nowadays are often packaged with third-party ad libraries to monetize user data. Many mobile ad networks exploit these mobile apps to extract sensitive real-time geographical data about the users for location-based targeted advertising. However, the massive collection of sensitive information by the ad networks has raised serious privacy concerns. Unfortunately, the extent and granularity of private data collection of the location-based ad networks remain obscure. In this work, we present a mobile tracking measurement study to characterize the severity and significance of location-based private data collection in mobile ad networks, by using an automated fine-grained data collection instrument running across different geographical areas. We perform extensive threat assessments for different ad networks using 1,100 popular apps running across 10 different cities. This study discovers that the number of location-based ads tend to be positively correlated with the population density of locations, ad networks' data collection behaviors differ across different locations, and most ad networks are capable of collecting precise location data. Detailed analysis further reveals the significant impact of geolocation on the tracking behavior of targeted ads, and a noteworthy security concern for advertising organizations to aggregate different types of private user data across multiple apps for amore »better targeted ad experience.« less