Title: Data Flow Maps - Increasing Data Processing Transparency and Privacy Compliance in the Enterprise
In recent years, well-known cyber breaches have placed growing pressure on organizations to implement proper privacy and data protection standards. Attacks involving the theft of employee and customer personal information have damaged the reputations of well-known brands, resulting in significant financial costs. As a result, governments across the globe are actively examining and strengthening laws to better protect the personal data of their citizens. The General Data Protection Regulation (GDPR) updates European privacy law with an array of provisions that better protect consumers and require organizations to account for privacy in their business processes through “privacy-by-design” and “privacy-by-default” principles. In the US, the National Privacy Research Strategy (NPRS) makes several recommendations that reinforce the need for organizations to better protect data. In response to these rapid developments in privacy compliance, data flow mapping has emerged as a valuable tool. Data flow mapping depicts the flow of data through a system or process, enumerating the specific data elements handled and identifying the risks at different stages of the data lifecycle. This Article explains the critical features of a data flow map and discusses how mapping may improve the transparency of the data lifecycle, while recognizing the limitations in building out data flow maps and the difficulties of keeping maps up to date. The Article then explores how data flow mapping may support data collection, transfer, storage, and destruction practices pursuant to various privacy regulations. Finally, a hypothetical case study is presented to show how data flow mapping was used by an organization to stay compliant with privacy rules and to improve the transparency of information flows.
Award ID(s):
1654085
NSF-PAR ID:
10039657
Journal Name:
Washington and Lee law review
Volume:
73
Issue:
2
ISSN:
0043-0463
Page Range / eLocation ID:
802-828
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Background The use of wearables facilitates data collection at a previously unobtainable scale, enabling the construction of complex predictive models with the potential to improve health. However, the highly personal nature of these data requires strong privacy protection against data breaches and the use of data in a way that users do not intend. One method to protect user privacy while taking advantage of sharing data across users is federated learning, a technique that allows a machine learning model to be trained using data from all users while only storing a user’s data on that user’s device. By keeping data on users’ devices, federated learning protects users’ private data from data leaks and breaches on the researcher’s central server and provides users with more control over how and when their data are used. However, there are few rigorous studies on the effectiveness of federated learning in the mobile health (mHealth) domain. Objective We review federated learning and assess whether it can be useful in the mHealth field, especially for addressing common mHealth challenges such as privacy concerns and user heterogeneity. The aims of this study are to describe federated learning in an mHealth context, apply a simulation of federated learning to an mHealth data set, and compare the performance of federated learning with the performance of other predictive models. Methods We applied a simulation of federated learning to predict the affective state of 15 subjects using physiological and motion data collected from a chest-worn device for approximately 36 minutes. We compared the results from this federated model with those from a centralized or server model and with the results from training individual models for each subject. 
Results In a 3-class classification problem using physiological and motion data to predict whether the subject was undertaking a neutral, amusing, or stressful task, the federated model achieved 92.8% accuracy on average, the server model achieved 93.2% accuracy on average, and the individual model achieved 90.2% accuracy on average. Conclusions Our findings support the potential for using federated learning in mHealth. The results showed that the federated model performed better than a model trained separately on each individual and nearly as well as the server model. As federated learning offers more privacy than a server model, it may be a valuable option for designing sensitive data collection methods. 
  2. To prepare for the age of the intelligent, highly connected, and autonomous vehicle, a new approach is needed to the concepts of granting consent, managing privacy, and dealing with the need to interact quickly and meaningfully. Additionally, in an environment where personal data is rapidly shared with a multitude of independent parties, there is a need to reduce the information asymmetry that currently exists between the user and data-collecting entities. This Article rethinks the traditional notice and consent model in the context of real-time communication between vehicles, between vehicles and infrastructure, or between vehicles and their other surroundings, and proposes a re-engineering of current privacy concepts to prepare for a rapidly approaching digital future. In this future, multiple independent actors such as vehicles or other machines may seek personal information at a rate that makes the traditional informed consent model untenable. This Article proposes a two-step approach. To meet and balance users' needs for a seamless experience while preserving their rights to privacy, the first step is a less static consent paradigm that can better support personal data in systems that use machine-based, real-time communication and automation. As a second step, the Article proposes a radical rethinking of the current privacy protection system by sharing the vision of “Privacy as a Service,” an independently managed method of granular technical privacy control that can better protect individual privacy while facilitating high-frequency communication in a machine-to-machine environment. 
  3. An essential requirement of any information management system is to protect data and resources against breach or improper modification while ensuring data access for legitimate users. Systems handling personal data are mandated to track its flow to comply with data protection regulations. We have built a novel framework that integrates a semantically rich data privacy knowledge graph with Hyperledger Fabric blockchain technology to develop an automated access-control and audit mechanism that enforces users' data privacy policies while sharing their data with third parties. Our blockchain-based data-sharing solution addresses two of the most critical challenges: transaction verification and permissioned data obfuscation. Our solution ensures accountability for data sharing in the cloud by incorporating a secure and efficient system for end-to-end provenance. In this paper, we describe this framework along with the comprehensive, semantically rich knowledge graph that we have developed to capture rules embedded in data privacy policy documents. Our framework can be used by organizations to automate compliance for their cloud datasets. 
  4.
    Security and privacy, regardless of the instance, are predominant topics for most organizations. Bioinformatics and the study of computational biology are no exception. The purpose of this report is to discuss the many different privacy concerns as they pertain to the field of bioinformatics, as well as the usage and storage of personal biodata. Given the varying threats that target average users of technology, are the capability and infrastructure currently in place to protect users against a leak or breach of personal data? This study discusses the different concerns surrounding the field of bioinformatics and how data and personal information are currently stored, and makes recommendations on how to mitigate the risks associated with the usage and storage of personal biodata. This study includes interviews with bioinformaticians and industry professionals, a survey of adults who may be affected, and a review of current legislation that addresses personal data protection. 
  5. Involving the public in scientific discovery offers opportunities for engagement, learning, participation, and action. Since its launch in 2007, the CitSci.org platform has supported hundreds of community-driven citizen science projects involving thousands of participants who have generated close to a million scientific measurements around the world. Members using CitSci.org follow their curiosities and concerns to develop, lead, or simply participate in research projects. While professional scientists are trained to make ethical determinations related to the collection of, access to, and use of information, citizen scientists and practitioners may be less aware of such issues and more likely to become involved in ethical dilemmas. In this era of big and open data, where data sharing is encouraged and open science is promoted, privacy and openness considerations can often be overlooked. Platforms that support the collection, use, and sharing of data and personal information need to consider their responsibility to protect the rights to and ownership of data, the provision of protection options for data and members, and at the same time provide options for openness. This requires critically considering both intended and unintended consequences of the use of platforms, data, and volunteer information. Here, we use our journey developing CitSci.org to argue that incorporating customization into platforms through flexible design options for project managers shifts the decision-making from top-down to bottom-up and allows project design to be more responsive to goals. To protect both people and data, we developed—and continue to improve—options that support various levels of “open” and “closed” access permissions for data and membership participation. 
These options support diverse governance styles that are responsive to data uses, traditional and indigenous knowledge sensitivities, intellectual property rights, personally identifiable information concerns, volunteer preferences, and sensitive data protections. We present a typology for citizen science openness choices, their ethical considerations, and strategies that we are actively putting into practice to expand privacy options and governance models based on the unique needs of individual projects using our platform. 