This content will become publicly available on February 1, 2026

Title: Considerations for social networks and health data sharing: An overview
The use of network analysis as a tool has increased exponentially as more clinical researchers see the benefits of network data for modeling of infectious disease transmission or translational activities in a variety of areas, including patient-caregiving teams, provider networks, patient-support networks, and adoption of health behaviors or treatments, to name a few. Yet, relational data such as network data carry a higher risk of deductive disclosure. Cases of reidentification have occurred, and they are expected to become more common as computational ability increases. Recent data sharing policies aim to promote reproducibility, support replicability, and protect federal investment in the effort to collect these research data by making them available for secondary analyses. However, typical practices to protect individual-level clinical research data may not be sufficiently protective of participant privacy in the case of network data, nor in some cases do they permit secondary data analysis. When sharing data, researchers must balance security, accessibility, reproducibility, and adaptability (suitability for secondary analyses). Here, we provide background about applying network analysis to health and clinical research, describe the pros and cons of applying typical practices for sharing clinical data to network data, and provide recommendations for sharing network data.
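As a minimal illustration of why relational data can carry a higher risk of deductive disclosure, the sketch below counts how many nodes in a pseudonymized network have a unique structural "fingerprint". This is not the authors' method: the fingerprint (a node's degree plus the sorted degrees of its neighbors), the function names, and the toy edge list are all assumptions chosen for illustration. A node whose fingerprint is unique could in principle be picked out by someone who knows only how many contacts that person has and how connected those contacts are, even though names have been removed.

```python
from collections import Counter


def structural_fingerprints(edges):
    """Map each node to (degree, sorted degrees of its neighbors).

    `edges` is an iterable of undirected (u, v) pairs of pseudonymous IDs.
    The fingerprint definition is an illustrative assumption.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}
    return {
        node: (degree[node], tuple(sorted(degree[m] for m in nbrs)))
        for node, nbrs in neighbors.items()
    }


def uniquely_identifiable(edges):
    """Return the nodes whose fingerprint is shared with no other node."""
    fingerprints = structural_fingerprints(edges)
    counts = Counter(fingerprints.values())
    return sorted(n for n, fp in fingerprints.items() if counts[fp] == 1)


# Hypothetical toy network; the IDs stand in for de-identified participants.
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]

if __name__ == "__main__":
    print(uniquely_identifiable(EDGES))
```

In this toy network, nodes `b` and `c` are structurally interchangeable, while the remaining nodes are each distinguishable from structure alone. Real assessments of disclosure risk would need richer attacker models than this sketch assumes.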
Award ID(s):
2024271 2140024
PAR ID:
10565596
Author(s) / Creator(s):
; ; ; ; ; ; ;
Publisher / Repository:
ScienceDirect, Annals of Epidemiology
Date Published:
Journal Name:
Annals of Epidemiology
Volume:
102
Issue:
C
ISSN:
1047-2797
Page Range / eLocation ID:
28 to 35
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Background: With advances in digital health technologies and proliferation of biomedical data in recent years, applications of machine learning in health care and medicine have gained considerable attention. While inpatient settings are equipped to generate rich clinical data from patients, there is a dearth of actionable information that can be used for pursuing secondary research for specific clinical conditions. Objective: This study focused on applying unsupervised machine learning techniques for traumatic brain injury (TBI), which is the leading cause of death and disability among children and adults aged less than 44 years. Specifically, we present a case study to demonstrate the feasibility and applicability of subspace clustering techniques for extracting patterns from data collected from TBI patients. Methods: Data for this study were obtained from the Progesterone for Traumatic Brain Injury, Experimental Clinical Treatment–Phase III (PROTECT III) trial, which included a cohort of 882 TBI patients. We applied subspace-clustering methods (density-based, cell-based, and clustering-oriented methods) to this data set and compared the performance of the different clustering methods. Results: The analyses showed the following three clusters of laboratory physiological data: (1) international normalized ratio (INR), (2) INR, chloride, and creatinine, and (3) hemoglobin and hematocrit. While all subclustering algorithms had a reasonable accuracy in classifying patients by mortality status, the density-based algorithm had a higher F1 score and coverage. Conclusions: Clustering approaches serve as an important step for phenotype definition and validation in clinical domains such as TBI, where patient and injury heterogeneity are among the major reasons for failure of clinical trials. The results from this study provide a foundation to develop scalable clustering algorithms for further research and validation.
  2. Patient-generated data (PGD) show great promise for informing the delivery of personalized and patient-centered care. However, patients' data tracking does not automatically lead to data sharing and discussion with clinicians, which can make it difficult to utilize and derive optimal benefit from PGD. In this paper, we investigate whether and how patients share their PGD with clinicians and the types of challenges that arise within this context. We describe patients' immediate experiences of PGD sharing with clinicians, based on our short onsite interviews with 57 patients who had just met with a clinician at a university health center. Our analyses identified overarching patterns in patients' PGD sharing practices and the associated challenges that arise from the information asymmetry between patients and clinicians and from patients' reliance on their memory to share their PGD. We discuss the implications of our findings for designing PGD-integrated health IT systems in ways to support patients' tracking of relevant PGD, clinicians' effective engagement with patients around PGD, and the efficient sharing and review of PGD within clinical settings. 
  3. A range of regulatory pressures emanating from funding agencies and scholarly journals increasingly encourage researchers to engage in formal data sharing practices. As academic libraries continue to refine their role in supporting researchers in this data sharing space, one particular challenge has been finding new ways to meaningfully engage with campus researchers. Libraries help shape norms and encourage data sharing through education and training, and there has been significant growth in the services these institutions are able to provide and the ways in which library staff are able to collaborate and communicate with researchers. Evidence also suggests that within disciplines, normative pressures and expectations around professional conduct have a significant impact on data sharing behaviors (Kim and Adler 2015; Sigit Sayogo and Pardo 2013; Zenk-Moltgen et al. 2018). Duke University Libraries' Research Data Management program has recently centered part of its outreach strategy on leveraging peer networks and social modeling to encourage and normalize robust data sharing practices among campus researchers. The program has hosted two panel discussions on issues related to data management, specifically data sharing and research reproducibility. This paper reflects on some lessons learned from these outreach efforts and outlines next steps.
  4. As genomic research continues to advance, sharing of genomic data and research outcomes has become increasingly important for fostering collaboration and accelerating scientific discovery. However, such data sharing must be balanced with the need to protect the privacy of individuals whose genetic information is being utilized. This paper presents a bidirectional framework for evaluating privacy risks associated with data shared (both in terms of summary statistics and research datasets) in genomic research papers, particularly focusing on re-identification risks such as membership inference attacks (MIA). The framework consists of a structured workflow that begins with a questionnaire designed to capture researchers’ (authors’) self-reported data sharing practices and privacy protection measures. Responses are used to calculate the risk of re-identification for their study (paper) when compared with the National Institutes of Health (NIH) genomic data sharing policy. Any gaps in compliance help us to identify potential vulnerabilities and encourage the researchers to enhance their privacy measures before submitting their research for publication. The paper also demonstrates the application of this framework, using published genomic research as case study scenarios to emphasize the importance of implementing bidirectional frameworks to support trustworthy open science and genomic data sharing practices.
  5. Objective: Visual cohort analysis utilizing electronic health record data has become an important tool in clinical assessment of patient outcomes. In this article, we introduce Composer, a visual analysis tool for orthopedic surgeons to compare changes in physical functions of a patient cohort following various spinal procedures. The goal of our project is to help researchers analyze outcomes of procedures and facilitate informed decision-making about treatment options between patient and clinician. Methods: In collaboration with orthopedic surgeons and researchers, we defined domain-specific user requirements to inform the design. We developed the tool in an iterative process with our collaborators to develop and refine functionality. With Composer, analysts can dynamically define a patient cohort using demographic information, clinical parameters, and events in patient medical histories and then analyze patient-reported outcome scores for the cohort over time, as well as compare it to other cohorts. Using Composer's current iteration, we provide a usage scenario for use of the tool in a clinical setting. Conclusion: We have developed a prototype cohort analysis tool to help clinicians assess patient treatment options by analyzing prior cases with similar characteristics. Although Composer was designed using patient data specific to orthopedic research, we believe the tool is generalizable to other healthcare domains. A long-term goal for Composer is to develop the application into a shared decision-making tool that allows translation of comparison and analysis from a clinician-facing interface into visual representations to communicate treatment options to patients.