The increasing societal concern for consumer information privacy has led to the enforcement of privacy regulations worldwide. In an effort to adhere to privacy regulations such as the General Data Protection Regulation (GDPR), many companies’ privacy policies have become increasingly lengthy and complex. In this study, we adopted the computational design science paradigm to design a novel privacy policy evolution analytics framework to help identify how companies change and present their privacy policies based on privacy regulations. The framework includes a self-attentive annotation system (SAAS) that automatically annotates paragraph-length segments in privacy policies to help stakeholders identify data practices of interest for further investigation. We rigorously evaluated SAAS against state-of-the-art machine learning (ML) and deep learning (DL)-based methods on a well-established privacy policy dataset, OPP-115. SAAS outperformed conventional ML and DL models in terms of F1-score by statistically significant margins. We demonstrate the proposed framework’s practical utility with an in-depth case study of GDPR’s impact on Amazon’s privacy policies. The case study results indicate that Amazon’s post-GDPR privacy policy potentially violates a fundamental principle of GDPR by causing consumers to exert more effort to find information about first-party data collection. Given the increasing importance of consumer information privacy, the proposed framework has important implications for regulators and companies. We discuss several design principles followed by the SAAS that can help guide future design science-based e-commerce, health, and privacy research.
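The self-attentive annotation idea behind SAAS can be illustrated with a minimal sketch: token embeddings of a paragraph are collapsed into a single vector through a learned attention distribution, which a downstream classifier could then label with a data-practice category. This is a toy illustration in the spirit of self-attentive models, not the paper's actual SAAS architecture; all dimensions, weight matrices, and names here are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def self_attentive_pooling(H, W_s1, w_s2):
    """Collapse a sequence of token embeddings H (n x d) into one
    paragraph vector via a learned self-attention distribution."""
    scores = np.tanh(H @ W_s1) @ w_s2      # (n,) unnormalized attention scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()               # softmax over tokens
    return weights @ H, weights            # (d,) attention-weighted paragraph vector

# Toy dimensions: 6 tokens, 8-dim embeddings, 4 attention units.
H = rng.normal(size=(6, 8))
W_s1 = rng.normal(size=(8, 4))
w_s2 = rng.normal(size=(4,))

pooled, weights = self_attentive_pooling(H, W_s1, w_s2)
print(pooled.shape, round(float(weights.sum()), 6))
```

In a trained system, `W_s1` and `w_s2` would be learned jointly with the classifier, and the attention weights offer a degree of interpretability: they indicate which tokens the annotation decision leaned on.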
Health Compliance Through a Transparent Supply Chain
Privacy regimes are increasingly taking center stage, bringing cases against violators and introducing new regulations to safeguard consumer rights. Health regulations largely predate generic privacy regulations, yet health entities still fail to meet regulatory requirements. Prior work suggests that third-party code is responsible for a significant portion of these violations. We therefore propose using Software Bills of Materials (SBOMs) as an effective intervention for communicating the compliance limitations and expectations surrounding third-party code, helping developers make informed decisions.
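To make the SBOM idea concrete, the sketch below builds a pared-down, CycloneDX-flavored SBOM entry that annotates a third-party component with a compliance note a developer could act on. The component name, version, and property values are hypothetical, and the property fields are illustrative rather than a complete rendering of the CycloneDX specification.

```python
import json

# Minimal CycloneDX-style SBOM (illustrative subset of the format).
# The component and its compliance annotation are invented examples.
sbom = {
    "bomFormat": "CycloneDX",
    "components": [
        {
            "type": "library",
            "name": "example-analytics-sdk",   # hypothetical third-party SDK
            "version": "2.1.0",
            "properties": [
                {
                    "name": "compliance:health-data",
                    "value": "transmits device identifiers; not reviewed for health regulations",
                },
            ],
        }
    ],
}

print(json.dumps(sbom, indent=2))
```

The point of such an entry is that the compliance caveat travels with the dependency metadata, so a health-app developer sees the limitation before shipping the component, rather than after a violation.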
- Award ID(s): 2217771
- PAR ID: 10545904
- Publisher / Repository: IEEE Workshop on Technology and Consumer Protection (ConPro ’24)
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Sri Lanka recently passed its first privacy legislation, covering a wide range of sectors including health. As a precursor to effective stakeholder engagement in the health domain, aimed at understanding the most effective way to implement the legislation in healthcare, we analyzed 41 popular mobile apps and web portals. We found that 78% of the tested systems have third-party domains receiving sensitive health data with minimal visibility to consumers. We discuss how this will create potential issues in preparing for the new privacy legislation.
Abstract: Health data is considered to be sensitive and personal; both governments and software platforms have enacted specific measures to protect it. Consumer apps that collect health data are becoming more popular, but raise new privacy concerns as they collect unnecessary data, share it with third parties, and track users. However, developers of these apps are not necessarily knowingly endangering users’ privacy; some may simply face challenges working with health features. To scope these challenges, we qualitatively analyzed 269 privacy-related posts on Stack Overflow by developers of health apps for Android- and iOS-based systems. We found that health-specific access control structures (e.g., enhanced requirements for permissions and authentication) underlie several privacy-related challenges developers face. The specific nature of the problems often differed between the platforms: for example, additional verification steps for Android developers, or confusing feedback about incorrectly formulated permission scopes for iOS. Developers also face problems introduced by third-party libraries. Official documentation plays a key part in understanding privacy requirements, but in some cases may itself cause confusion. We discuss the implications of our findings and propose ways to improve developers’ experience of working with health-related features, and consequently the privacy of their apps’ end users.
It is commonly assumed that the availability of “free” mobile apps comes at the cost of consumer privacy, and that paying for apps could offer consumers protection from behavioral advertising and long-term tracking. This work empirically evaluates the validity of this assumption by investigating the degree to which “free” apps and their paid premium versions differ in their bundled code, their declared permissions, and their data collection behaviors and privacy practices. We compare pairs of free and paid apps using a combination of static and dynamic analysis. We also examine the differences in the privacy policies within pairs. We rely on static analysis to determine the requested permissions and third-party SDKs in each app; we use dynamic analysis to detect sensitive data collected by remote services at the network traffic level; and we compare text versions of privacy policies to identify differences in the disclosure of data collection behaviors. In total, we analyzed 1,505 pairs of free Android apps and their paid counterparts, with free apps randomly drawn from the Google Play Store’s category-level top charts. Our results show that over our corpus of free and paid pairs, there is no clear evidence that paying for an app will guarantee protection from extensive data collection. Specifically, 48% of the paid versions reused all of the same third-party libraries as their free versions, while 56% of the paid versions inherited all of the free versions’ Android permissions to access sensitive device resources (when considering free apps that include at least one third-party library and request at least one Android permission). Additionally, our dynamic analysis reveals that 38% of the paid apps exhibit all of the same data collection and transmission behaviors as their free counterparts. 
Our exploration of privacy policies reveals that only 45% of the pairs provide a privacy policy of some sort, and less than 1% of the pairs overall have policies that differ between the free and paid versions.
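The permission- and library-inheritance comparison described above reduces, at its core, to set-containment checks between a free app's declared permissions and bundled SDKs and those of its paid counterpart. The sketch below shows that core check on invented data; the app pair, permission names, and SDK package names are hypothetical, and the real study extracted these sets via static analysis rather than hand-built dictionaries.

```python
def compare_pair(free, paid):
    """Classify a free/paid app pair by how much of the free version's
    third-party code and sensitive permissions the paid version retains.
    Uses set superset checks: True means the paid app kept everything."""
    return {
        "all_permissions_inherited": paid["permissions"] >= free["permissions"],
        "all_libraries_reused": paid["libraries"] >= free["libraries"],
    }

# Hypothetical pair: the paid app keeps both permissions but drops the ads SDK.
free_app = {
    "permissions": {"ACCESS_FINE_LOCATION", "READ_CONTACTS"},
    "libraries": {"com.example.ads", "com.example.analytics"},
}
paid_app = {
    "permissions": {"ACCESS_FINE_LOCATION", "READ_CONTACTS"},
    "libraries": {"com.example.analytics"},
}

result = compare_pair(free_app, paid_app)
print(result)  # permissions fully inherited, libraries not fully reused
```

Aggregating such per-pair results over a corpus yields statistics like the 48% library-reuse and 56% permission-inheritance figures reported above.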
Identifying privacy-sensitive data leaks by mobile applications has been a topic of great research interest for the past decade. Technically, such data flows are not “leaks” if they are disclosed in a privacy policy. To address this limitation in automated analysis, recent work has combined program analysis of applications with analysis of privacy policies to determine flow-to-policy consistency, and hence violations thereof. However, this prior work has a fundamental weakness: it does not differentiate the entity (e.g., first party vs. third party) receiving the privacy-sensitive data. In this paper, we propose POLICHECK, which formalizes and implements an entity-sensitive flow-to-policy consistency model. We use POLICHECK to study 13,796 applications and their privacy policies and find that up to 42.4% of applications either incorrectly disclose or omit disclosing their privacy-sensitive data flows. Our results also demonstrate the significance of considering entities: without considering the entity, prior approaches would falsely classify up to 38.4% of applications as having privacy-sensitive data flows consistent with their privacy policies. These false classifications include data flows to third parties that are omitted (e.g., the policy states only the first party collects the data type), incorrect (e.g., the policy states the third party does not collect the data type), and ambiguous (e.g., the policy has conflicting statements about the data type collection). By defining a novel automated, entity-sensitive flow-to-policy consistency analysis, POLICHECK provides the highest-precision method to date to determine if applications properly disclose their privacy-sensitive behaviors.
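The entity-sensitive consistency model can be sketched as follows: each observed data flow is judged only against the policy statements that name the same receiving entity, which is what distinguishes the omitted, incorrect, and ambiguous cases described above. This is a toy illustration in the spirit of POLICHECK, not its actual formalization; the statement schema, entity names, and data types are invented for the example.

```python
def classify_flow(flow, policy):
    """Toy entity-sensitive flow-to-policy consistency check: a flow
    (entity, data_type) is judged against only the policy statements
    about that same entity and data type, not the policy as a whole."""
    entity, data_type = flow
    stmts = [s for s in policy if s["entity"] == entity and s["data"] == data_type]
    collects = {s["collects"] for s in stmts}
    if collects == {True}:
        return "clear"        # the policy discloses this entity's collection
    if not stmts:
        return "omitted"      # the policy is silent about this entity
    if collects == {False}:
        return "incorrect"    # the policy denies what the app actually does
    return "ambiguous"        # conflicting statements about this entity

# Hypothetical policy: discloses first-party location collection,
# denies collection by an advertiser, says nothing about analytics.
policy = [
    {"entity": "first party", "data": "location", "collects": True},
    {"entity": "advertiser",  "data": "location", "collects": False},
]

print(classify_flow(("first party", "location"), policy))  # clear
print(classify_flow(("advertiser", "location"), policy))   # incorrect
print(classify_flow(("analytics", "location"), policy))    # omitted
```

An entity-insensitive checker would collapse all three flows into one question ("does the policy mention location collection?") and mark them all consistent, which is exactly the misclassification the entity-sensitive model avoids.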