

Title: Data Market Discipline: From Financial Regulation to Data Governance
Privacy regulation has traditionally been the remit of consumer protection, and privacy harm is cast as a contractual harm arising from interpersonal exchanges between data subjects and data collectors. This frames corporate surveillance of people primarily as a consumer harm. In this article, we argue that the modern economy of personal data is better understood as an extension of the financial system. The data economy intersects with capital markets in ways that may increase systemic and systematic financial risks. We contribute a new regulatory approach to privacy harms: as a source of risk correlated across households, firms, and the economy as a whole. We consider adapting tools from macroprudential regulation, designed to mitigate financial crises, to the market for personal data. We identify both promises and pitfalls of viewing individual privacy through the lens of the financial system.
Award ID(s): 2105301
NSF-PAR ID: 10327398
Editor(s): Johnson, Kristin N.; Reyes, Carla L.
Journal Name: Journal of International and Comparative Law
Volume: 8
Issue: 2
ISSN: 2313-3775
Page Range / eLocation ID: 459-486
Sponsoring Org: National Science Foundation
More Like this
  1. Research on keystroke dynamics has great potential to offer continuous authentication that complements conventional authentication methods in combating insider threats and identity theft before more harm is done to genuine users. Unfortunately, the large amount of data required for free-text keystroke authentication often contains personally identifiable information (PII) and personally sensitive information, such as a user's first and last name, the username and password for an account, bank card numbers, and social security numbers. As a result, there are privacy risks associated with keystroke data that must be mitigated before the data are shared with other researchers. We conduct a systematic study to remove PII from a recent large keystroke dataset. We find substantial amounts of PII in the dataset, including names, usernames and passwords, social security numbers, and bank card numbers, which, if leaked, could lead to various harms to the user, including personal embarrassment, blackmail, financial loss, and identity theft. We thoroughly evaluate the effectiveness of our detection program for each kind of PII. We demonstrate that our PII detection program can achieve near-perfect recall at the expense of losing some useful information (lower precision). Finally, we demonstrate that removing PII from the original dataset has only a negligible impact on the detection error tradeoff of the free-text authentication algorithm by Gunetti and Picardi. We hope that this experience report will be useful in informing the design of privacy removal in future keystroke-dynamics-based user authentication systems.
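As a rough illustration of the kind of pattern-based detection such a study implies, the sketch below redacts a few structured PII types (social security numbers, card numbers, email addresses) with regular expressions. The patterns and the redact() helper are assumptions for illustration, not the detection program evaluated in the abstract, which also handles names, usernames, and passwords; broad patterns like these favor recall over precision, mirroring the tradeoff the authors report.

```python
import re

# Minimal, pattern-based sketch of PII redaction (illustrative assumption,
# not the study's detection program). Real detectors also cover names,
# usernames, and passwords, which need dictionaries and context.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # e.g. 123-45-6789
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),      # 13-16 digit card numbers
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def redact(text: str) -> str:
    """Replace every matched PII span with a typed placeholder token."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label.upper()}>", text)
    return text

print(redact("ssn 123-45-6789, card 4111 1111 1111 1111, mail a@b.com"))
# -> ssn <SSN>, card <CARD>, mail <EMAIL>
```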
  2. In recent years, well-known cyber breaches have placed growing pressure on organizations to implement proper privacy and data protection standards. Attacks involving the theft of employee and customer personal information have damaged the reputations of well-known brands, resulting in significant financial costs. As a result, governments across the globe are actively examining and strengthening laws to better protect the personal data of their citizens. The General Data Protection Regulation (GDPR) updates European privacy law with an array of provisions that better protect consumers and require organizations to account for privacy in their business processes through “privacy-by-design” and “privacy-by-default” principles. In the US, the National Privacy Research Strategy (NPRS) makes several recommendations that reinforce the need for organizations to better protect data. In response to these rapid developments in privacy compliance, data flow mapping has emerged as a valuable tool. A data flow map depicts the flow of data through a system or process, enumerating the specific data elements handled while identifying the risks at different stages of the data lifecycle. This Article explains the critical features of a data flow map and discusses how mapping may improve the transparency of the data lifecycle, while recognizing the limitations of building data flow maps and the difficulty of keeping them up to date. The Article then explores how data flow mapping may support data collection, transfer, storage, and destruction practices pursuant to various privacy regulations. Finally, a hypothetical case study shows how data flow mapping was used by an organization to stay compliant with privacy rules and to improve the transparency of information flows.
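To make the idea concrete, the sketch below records data flow map entries as structured records, one per lifecycle stage, each listing the data elements handled and the risks identified. The schema (stage, system, data_elements, risks, retention) and the example signup flow are illustrative assumptions, not a schema prescribed by the Article or by the GDPR.

```python
from dataclasses import dataclass, field

# Illustrative sketch of a data flow map as structured records, one entry per
# lifecycle stage. Schema and example flow are assumptions for illustration.
@dataclass
class FlowStep:
    stage: str                   # collection, transfer, storage, or destruction
    system: str                  # system or process handling the data
    data_elements: list[str]     # specific data elements handled at this step
    risks: list[str] = field(default_factory=list)
    retention: str = "unspecified"

customer_signup_flow = [
    FlowStep("collection", "web signup form", ["name", "email"],
             risks=["over-collection"]),
    FlowStep("transfer", "CRM sync job", ["name", "email"],
             risks=["unencrypted transit"]),
    FlowStep("storage", "CRM database", ["name", "email"],
             risks=["excess retention"], retention="24 months"),
    FlowStep("destruction", "scheduled purge", ["name", "email"]),
]

for step in customer_signup_flow:
    print(f"{step.stage:<12} {step.system:<18} elements={step.data_elements} risks={step.risks}")
```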
  3. Around the world, people increasingly generate data through their everyday activities. Much of this happens unwittingly through sensors, cameras, and other surveillance tools on roads, in cities, and at the workplace. However, how individuals and governments think about privacy varies significantly around the world. In this article, we explore differences between people’s attitudes toward privacy and data collection practices in the United States and the Netherlands, two countries with very different regulatory approaches to governing consumer privacy. Through a factorial vignette survey deployed in both countries, we identify specific contextual factors associated with concerns about how personal data are used. Using Nissenbaum’s framework of privacy as contextual integrity to guide our analysis, we consider the role that five factors play in this assessment: actors (those using the data), data type, amount of data collected, reported purpose of data use, and inferences drawn from the data. Findings indicate nationally bound differences as well as shared concerns, and point to future directions for cross-cultural privacy research.
  4. The trouble with data is that it frequently provides only an imperfect representation of a phenomenon of interest. Experts who are familiar with their datasets will often make implicit, mental corrections when analyzing a dataset, or will be cautious not to be overly confident about their findings if caveats are present. However, personal knowledge about the caveats of a dataset is typically not incorporated in a structured way, which is problematic if others who lack that knowledge interpret the data. In this work, we define such analysts' knowledge about datasets as data hunches. We differentiate data hunches from uncertainty and discuss types of hunches. We then explore ways of recording data hunches, and, based on a prototypical design, develop recommendations for designing visualizations that support data hunches. We conclude by discussing various challenges associated with data hunches, including the potential for harm and challenges for trust and privacy. We envision that data hunches will empower analysts to externalize their knowledge, facilitate collaboration and communication, and support the ability to learn from others' data hunches. 
  5. Purpose Existing algorithms for predicting suicide risk rely solely on data from electronic health records, but such models could be improved by incorporating publicly available socioeconomic data – such as financial, legal, life event and sociodemographic data. The purpose of this study is to understand the complex ethical and privacy implications of incorporating such data within the health context. This paper presents results from a survey exploring the general public’s knowledge of, and concerns about, such publicly available data and the appropriateness of using it in suicide risk prediction algorithms. Design/methodology/approach A survey was developed to measure public opinion about privacy concerns with using socioeconomic data across different contexts. Respondents were presented with multiple vignettes describing scenarios situated in medical, private business and social media contexts, and were asked to rate their level of concern with each scenario and to indicate which factor contributed most to that concern. Specific to suicide prediction, respondents were presented with various data attributes that could potentially be used in a suicide risk algorithm and were asked to rate how concerned they would be if each attribute were used for this purpose. Findings The authors found considerable concern across the contexts represented in the vignettes, with the greatest concern in vignettes that focused on the use of personal information within the medical context. On the question of incorporating socioeconomic data within suicide risk prediction models, the results show clear concern from all participants about data attributes related to income, crime and court records, and assets. Data about one’s household were also of particular concern to respondents, suggesting that even if people might be comfortable with their own data being used for risk modeling, data about other household members are more problematic. Originality/value Previous studies of the privacy concerns that arise when data from various contexts of people’s lives are integrated into algorithmic and related computational models have approached these questions from individual contexts. This study differs in that it captures the variation in privacy concerns across multiple contexts. It also specifically assesses the ethical concerns related to a suicide prediction model, determines people’s awareness of how public select data attributes are, and identifies which of these attributes generate the most concern in such a context. To the best of the authors’ knowledge, this is the first study to pursue these questions.