Future online safety technologies should consider the privacy needs of adolescents (ages 13-17) and support their ability to self-regulate their online behaviors and navigate online risks. To do this, adolescent online safety researchers and practitioners must shift towards solutions that are more teen-centric by designing privacy-preserving online safety solutions for teens. In this paper, we discuss privacy challenges we have encountered in conducting adolescent online safety research. We discuss privacy concerns of teens in regard to sharing their private social media data with researchers and potentially taking part in a user study where they share some of this information with their parents. Our research emphasizes a need for more privacy-preserving interventions for teens.
more »
« less
Safeguarding Privacy in Genome Research: A Comprehensive Framework for Authors
Abstract As genomic research continues to advance, sharing of genomic data and research outcomes has become increasingly important for fostering collaboration and accelerating scientific discovery. However, such data sharing must be balanced with the need to protect the privacy of individuals whose genetic information is being utilized. This paper presents a bidirectional framework for evaluating privacy risks associated with data shared (both in terms of summary statistics and research datasets) in genomic research papers, particularly focusing on re-identification risks such as membership inference attacks (MIA). The framework consists of a structured workflow that begins with a questionnaire designed to capture researchers’ (authors’) self-reported data sharing practices and privacy protection measures. Responses are used to calculate the risk of re-identification for their study (paper) when compared with the National Institutes of Health (NIH) genomic data sharing policy. Any gaps in compliance help us to identify potential vulnerabilities and encourage the researchers to enhance their privacy measures before submitting their research for publication. The paper also demonstrates the application of this framework, using published genomic research as case study scenarios to emphasize the importance of implementing bidirectional frameworks to support trustworthy open science and genomic data sharing practices.
more »
« less
- Award ID(s):
- 2427505
- PAR ID:
- 10627878
- Publisher / Repository:
- Proceedings of AMIA Joint Summits on Translational Science
- Date Published:
- Format(s):
- Medium: X
- Institution:
- Proceedings of AMIA Joint Summits on Translational Science
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract One of the major challenges in ensuring global food security is the ever‐changing biotic risk affecting the productivity and efficiency of the global food supply system. Biotic risks that threaten food security include pests and diseases that affect pre‐ and postharvest terrestrial agriculture and aquaculture. Strategies to minimize this risk depend heavily on plant and animal disease research. As data collected at high spatial and temporal resolutions become increasingly available, epidemiological models used to assess and predict biotic risks have become more accurate and, thus, more useful. However, with the advent of Big Data opportunities, a number of challenges have arisen that limit researchers’ access to complex, multi‐sourced, multi‐scaled data collected on pathogens, and their associated environments and hosts. Among these challenges, one of the most limiting factors is data privacy concerns from data owners and collectors. While solutions, such as the use of de‐identifying and anonymizing tools that protect sensitive information are recognized as effective practices for use by plant and animal disease researchers, there are comparatively few platforms that include data privacy by design that are accessible to researchers. We describe how the general thinking and design used for data sharing and analysis platforms can intrinsically address a number of these data privacy‐related challenges that are a barrier to researchers wanting to access data. We also describe how some of the data privacy concerns confronting plant and animal disease researchers are addressed by way of the GEMS informatics platform.more » « less
-
Big Data empowers the farming community with the information needed to optimize resource usage, increase productivity, and enhance the sustainability of agricultural practices. The use of Big Data in farming requires the collection and analysis of data from various sources such as sensors, satellites, and farmer surveys. While Big Data can provide the farming community with valuable insights and improve efficiency, there is significant concern regarding the security of this data as well as the privacy of the participants. Privacy regulations, such as the European Union’s General Data Protection Regulation (GDPR), the EU Code of Conduct on agricultural data sharing by contractual agreement, and the proposed EU AI law, have been created to address the issue of data privacy and provide specific guidelines on when and how data can be shared between organizations. To make confidential agricultural data widely available for Big Data analysis without violating the privacy of the data subjects, we consider privacy-preserving methods of data sharing in agriculture. Synthetic data that retains the statistical properties of the original data but does not include actual individuals’ information provides a suitable alternative to sharing sensitive datasets. Deep learning-based synthetic data generation has been proposed for privacy-preserving data sharing. However, there is a lack of compliance with documented data privacy policies in such privacy-preserving efforts. In this study, we propose a novel framework for enforcing privacy policy rules in privacy-preserving data generation algorithms. We explore several available agricultural codes of conduct, extract knowledge related to the privacy constraints in data, and use the extracted knowledge to define privacy bounds in a privacy-preserving generative model. We use our framework to generate synthetic agricultural data and present experimental results that demonstrate the utility of the synthetic dataset in downstream tasks. We also show that our framework can evade potential threats, such as re-identification and linkage issues, and secure data based on applicable regulatory policy rules.more » « less
-
The COVID-19 pandemic highlights the need for broad dissemination of case surveillance data. Local and global public health agencies have initiated efforts to do so, but there remains limited data available, due in part to concerns over privacy. As a result, current COVID-19 case surveillance data sharing policies are based on strong adversarial assumptions, such as the expectation that an attacker can readily re-identify individuals based on their distinguishability in a dataset. There are various re-identification risk measures to account for adversarial capabilities; however, the current array insufficiently accounts for real world data challenges - particularly issues of missing records in resources of identifiable records that adversaries may rely upon to execute attacks (e.g., 10 50-year-old male in the de-identified dataset vs. 5 50-year-old male in the identified dataset). In this paper, we introduce several approaches to amend such risk measures and assess re-identification risk in light of how an attacker's capabilities relate to missing records. We demonstrate the potential for these measures through a record linkage attack using COVID-19 case surveillance data and voter registration records in the state of Florida. Our findings demonstrate that adversarial assumptions, as realized in a risk measure, can dramatically affect re-identification risk estimation. Notably, we show that the re-identification risk is likely to be substantially smaller than the typical risk thresholds, which suggests that more detailed data could be shared publicly than is currently the case.more » « less
-
How do researchers in fieldwork-intensive disciplines protect sensitive data in the field, how do they assess their own practices, and how do they arrive at them? This article reports the results of a qualitative study with 36 semi-structured interviews with qualitative and multi-method researchers in political science and humanitarian aid/migration studies. We find that researchers frequently feel ill-prepared to handle the management of sensitive data in the field and find that formal institutions provide little support. Instead, they use a patchwork of sources to devise strategies for protecting their informants and their data. We argue that this carries substantial risks for the security of the data as well as their potential for later sharing and re-use. We conclude with some suggestions for effectively supporting data management in fieldwork-intensive research without unduly adding to the burden on researchers conducting it.more » « less
An official website of the United States government

