

Title: User-friendly yet rarely read: A case study on the redesign of an online HIPAA authorization
In this paper we describe the iterative evaluation and refinement of a consent flow for a chatbot being developed by a large U.S. health insurance company. This chatbot’s use of a cloud service provider triggers a requirement for users to agree to a HIPAA authorization. We highlight remote usability study and online survey findings indicating that simplifying the interface and language of the consent flow can improve the user experience and help users who read the content understand how their data may be used. However, we observe that most users in our studies, even those using our improved consent flows, missed important information in the authorization until we asked them to review it again. We also show that many people are overconfident about the privacy and security of healthcare data and that many people believe HIPAA protects their health data in far more contexts than it actually does. Given that our redesigns, which followed best practices, produced few meaningful improvements in informed consent, we argue for the need for research on alternate approaches to health data disclosures such as standardized disclosures; methods borrowed from clinical research contexts such as multimedia formats, quizzes, and conversational approaches; and automated privacy assistants.
Award ID(s): 2150217
NSF-PAR ID: 10392520
Journal Name: Proceedings on Privacy Enhancing Technologies
Volume: 2022
Issue: 3
ISSN: 2299-0984
Page Range / eLocation ID: 558 to 581
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. In the United States, sensitive health information is protected under the Health Insurance Portability and Accountability Act (HIPAA). This act limits the disclosure of Protected Health Information (PHI) without the patient’s consent or knowledge. However, as medical care becomes web-integrated, many providers have chosen to use third-party web trackers for measurement and marketing purposes. This presents a security concern: third-party JavaScript requested by an online healthcare system can read the website’s contents, and ensuring PHI is not unintentionally or maliciously leaked becomes difficult. In this paper, we investigate health information breaches in online medical records, focusing on 459 online patient portals and 4 telehealth websites. We find 14% of patient portals include Google Analytics, which reveals (at a minimum) the fact that the user visited the health provider website, while 5 portals and 4 telehealth websites contained JavaScript-based services disclosing PHI, including medications and lab results, to third parties. The most significant PHI breaches were on behalf of Google and Facebook trackers. In the latter case, an estimated 4.5 million site visitors per month were potentially exposed to leaks of personal information (names, phone numbers) and medical information (test results, medications). We notified healthcare providers of the PHI breaches and found that only 15.7% took action to correct leaks. Healthcare operators lacked the technical expertise to identify PHI breaches caused by third-party trackers. After notifying Epic, a healthcare portal vendor, of the PHI leaks, we received a prompt response and observed extensive mitigation across providers, suggesting vendor notification is an effective intervention against PHI disclosures.
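    As a concrete illustration of the kind of measurement this study describes, the minimal sketch below inventories third-party script sources on a single page. The portal URL and the heuristic of flagging any script host that differs from the page's own host are illustrative assumptions; this is not the authors' measurement pipeline, which covered hundreds of portals and observed actual data flows.

```python
# Minimal sketch: list third-party <script> sources on one page, the kind of
# inventory the study above performs at much larger scale. The portal URL is
# a hypothetical placeholder, not a site from the study.
from urllib.parse import urlparse

import requests
from bs4 import BeautifulSoup

PORTAL_URL = "https://portal.example-health.org/login"  # hypothetical portal

def third_party_scripts(url: str) -> list[str]:
    """Return external script URLs whose host differs from the page's host."""
    page_host = urlparse(url).netloc
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    external = []
    for tag in soup.find_all("script", src=True):
        src = tag["src"]
        host = urlparse(src).netloc
        if host and host != page_host:  # relative srcs (same host) are skipped
            external.append(src)
    return external

if __name__ == "__main__":
    for src in third_party_scripts(PORTAL_URL):
        print(src)  # e.g., a google-analytics.com script would flag GA
```

    Note that this only catches scripts referenced statically in the HTML; trackers injected at runtime would require loading the page in an instrumented or headless browser, which is how large-scale web-privacy measurements are typically performed.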
  2. The dominant privacy framework of the information age relies on notions of “notice and consent.” That is, service providers will disclose, often through privacy policies, their data collection practices, and users can then consent to their terms. However, it is unlikely that most users comprehend these disclosures, which is due in no small part to ambiguous, deceptive, and misleading statements. By comparing actual collection and sharing practices to disclosures in privacy policies, we demonstrate the scope of the problem. Through analysis of 68,051 apps from the Google Play Store, their corresponding privacy policies, and observed data transmissions, we investigated the potential misrepresentations of apps in the Designed For Families (DFF) program, inconsistencies in disclosures regarding third-party data sharing, as well as contradictory disclosures about secure data transmissions. We find that of the 8,030 DFF apps (i.e., apps directed at children), 9.1% claim that their apps are not directed at children, while 30.6% claim to have no knowledge that the received data comes from children. In addition, we observe that 10.5% of 68,051 apps share personal identifiers with third-party service providers, yet do not declare any in their privacy policies, and only 22.2% of the apps explicitly name third parties. This ultimately makes it not only difficult, but in most cases impossible, for users to establish where their personal data is being processed. Furthermore, we find that 9,424 apps do not use TLS when transmitting personal identifiers, yet 28.4% of these apps claim to take measures to secure data transfer. Ultimately, these divergences between disclosures and actual app behaviors illustrate the ridiculousness of the notice and consent framework. 
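    The core comparison in this study, observed data flows versus what the privacy policy discloses, can be pictured with a small, hedged sketch. The sample transmissions and domains below are invented for illustration; the real inputs are captured app network traffic and parsed policy text.

```python
# Minimal sketch of the policy-vs-behavior comparison described above: flag
# third-party recipients observed in traffic but never named in the policy,
# and identifier transmissions made without TLS. All sample data is invented.
from dataclasses import dataclass

@dataclass
class Transmission:
    recipient_domain: str   # third party that received the data
    identifier: str         # e.g., "advertising_id", "email"
    uses_tls: bool          # whether the request was sent over HTTPS

declared_third_parties = {"crashlytics.com"}          # named in the policy
observed = [
    Transmission("crashlytics.com", "advertising_id", True),
    Transmission("adtracker.example.net", "advertising_id", False),
]

undisclosed = {t.recipient_domain for t in observed
               if t.recipient_domain not in declared_third_parties}
plaintext = [(t.recipient_domain, t.identifier)
             for t in observed if not t.uses_tls]

print("Recipients not named in the policy:", undisclosed)
print("Identifiers sent without TLS:", plaintext)
```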
  3. Patient-generated health data (PGHD), created and captured from patients via wearable devices and mobile apps, are proliferating outside of clinical settings. Examples include sleep tracking, fitness trackers, continuous glucose monitors, and RFID-enabled implants, with many additional biometric or health surveillance applications in development or envisioned. These data are included in growing stockpiles of personal health data being mined for insight via big data analytics and artificial intelligence/deep learning technologies. Governing these data resources to facilitate patient care and health research while preserving individual privacy and autonomy will be challenging, as PGHD are the least regulated domains of digitalized personal health data (U.S. Department of Health and Human Services, 2018). When patients themselves collect digitalized PGHD using “apps” provided by technology firms, these data fall outside of conventional health data regulation, such as HIPAA. Instead, PGHD are maintained primarily on the information technology infrastructure of vendors, and data are governed under the IT firm’s own privacy policies and within the firm’s intellectual property rights. Dominant narratives position these highly personal data as valuable resources to transform healthcare, stimulate innovation in medical research, and engage individuals in their health and healthcare. However, ensuring privacy, security, and equity of benefits from PGHD will be challenging. PGHD can be aggregated and, despite putative “deidentification,” be linked with other health, economic, and social data for predictive analytics. As large tech companies enter the healthcare sector (e.g., Google Health is partnering with Ascension Health to analyze the PHI of millions of people across 21 U.S. states), the lack of harmonization between regulatory regimes may render existing safeguards to preserve patient privacy and control over their PHI ineffective. While healthcare providers are bound to adhere to health privacy laws, Big Tech comes under more relaxed regulatory regimes that will facilitate monetizing PGHD. We explore three existing data protection regimes relevant to PGHD in the United States that are currently in tension with one another: federal and state health-sector laws, data use and reuse for research and innovation, and industry self-regulation by large tech companies. We then identify three types of structures (organizational, regulatory, technological/algorithmic) which could synergistically help enact needed regulatory oversight while limiting the friction and economic costs of regulation. This analysis provides a starting point for further discussions and negotiations among stakeholders and regulators.
  4. An increasing number of people are sharing information through text messages, emails, and social media without proper privacy checks. In many situations, this could lead to serious privacy threats. This paper presents a methodology for providing extra safety precautions without being intrusive to users. We have developed and evaluated a model to help users take control of their shared information by automatically identifying text (i.e., a sentence or a transcribed utterance) that might contain personal or private disclosures. We apply off-the-shelf natural language processing tools to derive linguistic features such as part-of-speech tags, syntactic dependencies, and entity relations. From these features, we train a multichannel convolutional neural network classifier to identify short texts containing personal or private disclosures. We show how our model can notify users if a piece of text discloses personal or private information, and we evaluate our approach in a binary classification task, achieving 93% accuracy on our own labeled dataset and 86% on a ground-truth dataset. Unlike document classification tasks in natural language processing, our framework is developed to take sentence-level context into consideration.
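    For readers unfamiliar with the architecture mentioned above, the following is a minimal, hypothetical sketch of a multichannel (multi-kernel-size) convolutional text classifier in PyTorch. The vocabulary size, embedding dimension, kernel sizes, and filter counts are placeholder values, and the additional linguistic-feature channels the paper uses (POS tags, dependencies, entity relations) are omitted.

```python
# Minimal sketch of a multichannel CNN text classifier of the kind described
# above. Hyperparameters are illustrative, not the paper's actual settings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultichannelTextCNN(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=100,
                 kernel_sizes=(2, 3, 4), num_filters=64, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # One convolutional "channel" per kernel size, applied in parallel.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes]
        )
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed, seq)
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = torch.cat(pooled, dim=1)            # concat all channels
        return self.fc(features)                       # (batch, num_classes)

# Toy forward pass: a batch of two "sentences" of 12 token ids each.
model = MultichannelTextCNN()
logits = model(torch.randint(0, 5000, (2, 12)))
print(logits.shape)  # torch.Size([2, 2]) -> disclosure vs. non-disclosure
```

    Each kernel size acts as a separate channel over the token sequence; max-pooling over time gives a fixed-length feature vector per channel, and the concatenated vectors feed the final classification layer.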