Title: Sharing personal ECG time-series data privately
Abstract:
Objective: Emerging technologies (eg, wearable devices) have made it possible to collect data directly from individuals (eg, time-series), providing new insights into the health and well-being of individual patients. Broadening access to these data would facilitate integration with existing data sources (eg, clinical and genomic data) and advance medical research. Compared to traditional health data, these data are collected directly from individuals, are highly unique, and provide fine-grained information, posing new privacy challenges. In this work, we study the applicability of a novel privacy model to enable individual-level time-series data sharing while maintaining usability for data analytics.
Methods and materials: We propose a privacy-protecting method for sharing individual-level electrocardiography (ECG) time-series data, which leverages a dimensionality reduction technique and random sampling to achieve provable privacy protection. We show that our solution provides strong privacy protection against an informed adversarial model while enabling useful aggregate-level analysis.
Results: We conduct our evaluations on 2 real-world ECG datasets. Our empirical results show that the privacy risk is significantly reduced after sanitization while the data usability is retained for a variety of clinical tasks (eg, predictive modeling and clustering).
Discussion: Our study investigates the privacy risk in sharing individual-level ECG time-series data. We demonstrate that individual-level data can be highly unique, requiring new privacy solutions to protect data contributors.
Conclusion: The results suggest that our proposed privacy-protection method provides strong privacy protections while preserving the usefulness of the data.
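The abstract names two ingredients, dimensionality reduction and random sampling, without saying which reduction or sampling scheme is used. The sketch below only illustrates how two such steps could be composed; it assumes an SVD-based projection and independent Bernoulli sampling of records. The function name `reduce_and_sample` and its parameters are hypothetical, and the sketch carries no privacy guarantee of its own.

```python
import numpy as np

def reduce_and_sample(series, k, keep_prob, seed=0):
    """Illustrative only: project each record onto k principal directions,
    then keep each record independently with probability keep_prob."""
    rng = np.random.default_rng(seed)
    x = np.asarray(series, dtype=float)
    # Center each time index across records before the decomposition.
    centered = x - x.mean(axis=0)
    # Rows of vt are principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:k].T  # shape: (n_records, k)
    # Bernoulli sampling: one independent coin flip per record.
    keep = rng.random(x.shape[0]) < keep_prob
    return reduced[keep], keep

# Hypothetical input: 100 synthetic "recordings" of 250 samples each.
demo = np.random.default_rng(1).normal(size=(100, 250))
released, mask = reduce_and_sample(demo, k=8, keep_prob=0.5)
```

Here `released` holds one 8-dimensional row per retained record, and `mask` marks which of the original records were kept.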
Award ID(s):
2040727 1949217 1951430
PAR ID:
10405228
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Journal of the American Medical Informatics Association
Volume:
29
Issue:
7
ISSN:
1527-974X
Format(s):
Medium: X
Size(s):
p. 1152-1160
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background: Effective diabetes management requires precise glycemic control to prevent both hypoglycemia and hyperglycemia, yet existing machine learning (ML) and reinforcement learning (RL) approaches often fail to balance competing objectives. Traditional RL-based glucose regulation systems primarily focus on single-objective optimization, overlooking factors such as minimizing insulin overuse, reducing glycemic variability, and ensuring patient safety. Furthermore, these approaches typically rely on centralized data processing, which raises privacy concerns due to the sensitive nature of health care data. There is a critical need for a decentralized, privacy-preserving framework that can personalize blood glucose regulation while addressing the multiobjective nature of diabetes management. Objective: This study aimed to develop and validate PRIMO-FRL (Privacy-Preserving Reinforcement Learning for Individualized Multi-Objective Glycemic Management Using Federated Reinforcement Learning), a novel framework that optimizes clinical objectives, namely maximizing time in range (TIR), reducing hypoglycemia and hyperglycemia, and minimizing glycemic risk, while preserving patient privacy. Methods: We developed PRIMO-FRL, integrating multiobjective reward shaping to dynamically balance glucose stability, insulin efficiency, and risk reduction. The model was trained and tested using data from 30 simulated patients (10 children, 10 adolescents, and 10 adults) generated with the Food and Drug Administration (FDA)–approved UVA/Padova simulator. A comparative analysis was conducted against state-of-the-art RL and ML models, evaluating performance using metrics such as TIR, hypoglycemia (<70 mg/dL), hyperglycemia (>180 mg/dL), and glycemic risk scores. Results: The PRIMO-FRL model achieved a robust overall TIR of 76.54%, with adults demonstrating the highest TIR at 81.48%, followed by children at 77.78% and adolescents at 70.37%. Importantly, the approach eliminated hypoglycemia, with 0.0% of time spent below 70 mg/dL across all cohorts, significantly outperforming existing methods. Mild hyperglycemia (180-250 mg/dL) was observed in adolescents (29.63%), children (22.22%), and adults (18.52%), with adults exhibiting the best control. Furthermore, the PRIMO-FRL approach consistently reduced glycemic risk scores, demonstrating improved safety and long-term stability in glucose regulation. Conclusions: Our findings highlight the potential of PRIMO-FRL as a transformative, privacy-preserving approach to personalized glycemic management. By integrating federated RL, this framework eliminates hypoglycemia, improves TIR, and preserves data privacy by decentralizing model training. Unlike traditional centralized approaches that require sharing sensitive health data, PRIMO-FRL leverages federated learning to keep patient data local, significantly reducing privacy risks while enabling adaptive and personalized glucose control. This multiobjective optimization strategy offers a scalable, secure, and clinically viable solution for real-world diabetes care. The ability to train personalized models across diverse populations without exposing raw data makes PRIMO-FRL well-suited for deployment in privacy-sensitive health care environments. These results pave the way for future clinical adoption, demonstrating the potential of privacy-preserving artificial intelligence in optimizing glycemic regulation while maintaining security, adaptability, and personalization.
  2. Differential privacy concepts have been successfully used to protect the anonymity of individuals in population-scale analysis. Sharing mobile sensor data, especially physiological data, raises a different privacy challenge: protecting private behaviors that can be revealed from time series of sensor data. Existing privacy mechanisms rely on noise addition and data perturbation. But the accuracy requirement on inferences drawn from physiological data, together with the well-established limits within which these data values occur, renders traditional privacy mechanisms inapplicable. In this work, we define a new behavioral privacy metric based on differential privacy and propose a novel data substitution mechanism to protect behavioral privacy. We evaluate the efficacy of our scheme using 660 hours of ECG, respiration, and activity data collected from 43 participants and demonstrate that it is possible to retain meaningful utility, in terms of inference accuracy (90%), while simultaneously preserving the privacy of sensitive behaviors.
  3. A Mavragani (Ed.)
    Background: Posttraumatic stress disorder (PTSD) is a serious public health concern. However, individuals with PTSD often do not have access to adequate treatment. A conversational agent (CA) can help to bridge the treatment gap by providing interactive and timely interventions at scale. Toward this goal, we have developed PTSDialogue, a CA to support the self-management of individuals living with PTSD. PTSDialogue is designed to be highly interactive (eg, brief questions, ability to specify preferences, and quick turn-taking) and supports social presence to promote user engagement and sustain adherence. It includes a range of support features, including psychoeducation, assessment tools, and several symptom management tools. Objective: This paper focuses on the preliminary evaluation of PTSDialogue by clinical experts. Given that PTSDialogue focuses on a vulnerable population, it is critical to establish its usability and acceptance with clinical experts before deployment. Expert feedback is also important to ensure user safety and effective risk management in CAs aiming to support individuals living with PTSD. Methods: We conducted remote, one-on-one, semistructured interviews with clinical experts (N=10) to gather insight into the use of CAs. All participants had completed their doctoral degrees and had prior experience in PTSD care. The web-based PTSDialogue prototype was then shared with each participant so that they could interact with different functionalities and features. We encouraged them to “think aloud” as they interacted with the prototype. Participants also shared their screens throughout the interaction session. A semistructured interview script was also used to gather insights and feedback from the participants. The sample size is consistent with that of prior works. We analyzed interview data using a qualitative interpretivist approach resulting in a bottom-up thematic analysis. Results: Our data establish the feasibility and acceptance of PTSDialogue, a supportive tool for individuals with PTSD. Most participants agreed that PTSDialogue could be useful for supporting self-management of individuals with PTSD. We have also assessed how features, functionalities, and interactions in PTSDialogue can support different self-management needs and strategies for this population. These data were then used to identify design requirements and guidelines for a CA aiming to support individuals with PTSD. Experts specifically noted the importance of empathetic and tailored CA interactions for effective PTSD self-management. They also suggested steps to ensure safe and engaging interactions with PTSDialogue. Conclusions: Based on interviews with experts, we have provided design recommendations for future CAs aiming to support vulnerable populations. The study suggests that well-designed CAs have the potential to reshape effective intervention delivery and help address the treatment gap in mental health.
  4. Abstract Background: Atrial fibrillation (AF) is often asymptomatic and thus under-observed. Given the high risks of stroke and heart failure among patients with AF, early prediction and effective management are crucial. Importantly, obstructive sleep apnea is highly prevalent among AF patients (60–90%); therefore, electrocardiogram (ECG) analysis from polysomnography (PSG), a standard diagnostic tool for subjects with suspected sleep apnea, presents a unique opportunity for the early prediction of AF. Our goal is to identify individuals at a high risk of developing AF in the future from a single-lead ECG recorded during standard PSGs. Methods: We analyzed 18,782 single-lead ECG recordings from 13,609 subjects at Massachusetts General Hospital, identifying AF presence using ICD-9/10 codes in medical records. Our dataset comprises 15,913 recordings without a medical record for AF and 2,056 recordings from patients who were first diagnosed with AF between 1 day and 15 years after the PSG recording. The PSG data were partitioned into training, validation, and test cohorts. In the first phase, a signal quality index (SQI) was calculated in 30-second windows and those with SQI<0.95 were removed. From each remaining window, 150 hand-crafted features were extracted from time, frequency, time-frequency domains, and phase-space reconstructions of the ECG. A compilation of 12 statistical features summarized these window-specific features per recording, resulting in 1,800 features. We then used transfer learning with data from the PhysioNet Challenge 2021 to update a pre-trained deep neural network to discriminate between recordings with and without AF. The model was applied to the PSG ECGs in 16-second windows to generate the probability of AF for each window. From the resultant probability sequence, 13 statistical features were extracted. Subsequently, we trained a shallow neural network to predict future AF using the extracted ECG and probability features. Results: On the test set, our model demonstrated a sensitivity of 0.67, specificity of 0.81, and precision of 0.3 for predicting AF. Further, survival analysis for AF outcomes, using the log-rank test, revealed a hazard ratio of 8.36 (p-value of 1.93 × 10^-52). Conclusions: Our proposed ECG analysis method, utilizing overnight PSG data, shows promise in AF prediction despite a modest precision, which indicates the presence of false positive cases. This approach could potentially enable low-cost screening and proactive treatment for high-risk patients. Ongoing refinement, such as integrating additional physiological parameters, could significantly reduce false positives, enhancing its clinical utility and accuracy.
  5. Abstract Motivation: Analysis of time series transcriptomics data from clinical trials is challenging. Such studies usually profile very few time points from several individuals with varying response patterns and dynamics. Current methods for these datasets are mainly based on linear, global orderings using visit times which do not account for the varying response rates and subgroups within a patient cohort. Results: We developed a new method that utilizes multi-commodity flow algorithms for trajectory inference in large-scale clinical studies. Recovered trajectories satisfy individual-based timing restrictions while integrating data from multiple patients. Testing the method on multiple drug datasets demonstrated an improved performance compared to prior approaches suggested for this task, while identifying novel disease subtypes that correspond to heterogeneous patient response patterns. Availability and implementation: The source code and instructions to download the data have been deposited on GitHub at https://github.com/euxhenh/Truffle.
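Item 4 above reports a sensitivity of 0.67, a specificity of 0.81, and a precision of 0.3. The sketch below shows why the precision is modest: when AF is rare, even a fairly small false positive rate produces more false positives than true positives. It assumes the test-set prevalence matches the overall ratio of recordings given in that abstract (2,056 with later AF versus 15,913 without), which the abstract does not state. The helper name `precision_from_rates` is hypothetical.

```python
def precision_from_rates(sensitivity, specificity, prevalence):
    """Precision (positive predictive value) via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Assumed prevalence: share of recordings followed by an AF diagnosis.
prevalence = 2056 / (2056 + 15913)
ppv = precision_from_rates(0.67, 0.81, prevalence)
print(round(ppv, 2))  # roughly 0.31 under the assumed prevalence
```

Under that assumption the computed value is consistent with the reported precision of 0.3, and it falls further as the prevalence drops.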