skip to main content


Title: "AI in healthcare: data governance challenges"
AI applications are poised to transform health care, revolutionizing benefits for individuals, communities, and health-care systems. As the articles in this special issue aptly illustrate, AI innovations in healthcare are maturing from early success in medical imaging and robotic process automation, promising a broad range of new applications. This is evidenced by the rapid deployment of AI to address critical challenges related to the COVID-19 pandemic, including disease diagnosis and monitoring, drug discovery, and vaccine development. At the heart of these innovations is the health data required for deep learning applications. Rapid accumulation of data, along with improved data quality, data sharing, and standardization, enable development of deep learning algorithms in many healthcare applications. One of the great challenges for healthcare AI is effective governance of these data—ensuring thoughtful aggregation and appropriate access to fuel innovation and improve patient outcomes and healthcare system efficiency while protecting the privacy and security of data subjects. Yet the literature on data governance has rarely looked beyond important pragmatic issues related to privacy and security. Less consideration has been given to unexpected or undesirable outcomes of healthcare in AI, such as clinician deskilling, algorithmic bias, the “regulatory vacuum”, and lack of public engagement. Amidst growing calls for ethical governance of algorithms, Reddy et al. developed a governance model for AI in healthcare delivery, focusing on principles of fairness, accountability, and transparency (FAT), and trustworthiness, and calling for wider discussion. Winter and Davidson emphasize the need to identify underlying values of healthcare data and use, noting the many competing interests and goals for use of health data—such as healthcare system efficiency and reform, patient and community health, intellectual property development, and monetization. Beyond the important considerations of privacy and security, governance must consider who will benefit from healthcare AI, and who will not. Whose values drive health AI innovation and use? How can we ensure that innovations are not limited to the wealthiest individuals or nations? As large technology companies begin to partner with health care systems, and as personally generated health data (PGHD) (e.g., fitness trackers, continuous glucose monitors, health information searches on the Internet) proliferate, who has oversight of these complex technical systems, which are essentially a black box? To tackle these complex and important issues, it is important to acknowledge that we have entered a new technical, organizational, and policy environment due to linked data, big data analytics, and AI. Data governance is no longer the responsibility of a single organization. Rather, multiple networked entities play a role and responsibilities may be blurred. This also raises many concerns related to data localization and jurisdiction—who is responsible for data governance? In this emerging environment, data may no longer be effectively governed through traditional policy models or instruments.  more » « less
Award ID(s):
1827952
NSF-PAR ID:
10311415
Author(s) / Creator(s):
Editor(s):
Reddy, S.; Winter, J.S.; Padmanabhan, S.
Date Published:
Journal Name:
Journal of hospital management and health policy
Volume:
5
Issue:
8
ISSN:
2523-2533
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Patient-generated health data (PGHD), created and captured from patients via wearable devices and mobile apps, are proliferating outside of clinical settings. Examples include sleep tracking, fitness trackers, continuous glucose monitors, and RFID-enabled implants, with many additional biometric or health surveillance applications in development or envisioned. These data are included in growing stockpiles of personal health data being mined for insight via big data analytics and artificial intelligence/deep learning technologies. Governing these data resources to facilitate patient care and health research while preserving individual privacy and autonomy will be challenging, as PGHD are the least regulated domains of digitalized personal health data (U.S. Department of Health and Human Services, 2018). When patients themselves collect digitalized PGHD using “apps” provided by technology firms, these data fall outside of conventional health data regulation, such as HIPAA. Instead, PGHD are maintained primarily on the information technology infrastructure of vendors, and data are governed under the IT firm’s own privacy policies and within the firm’s intellectual property rights. Dominant narratives position these highly personal data as valuable resources to transform healthcare, stimulate innovation in medical research, and engage individuals in their health and healthcare. However, ensuring privacy, security, and equity of benefits from PGHD will be challenging. PGHD can be aggregated and, despite putative “deidentification,” be linked with other health, economic, and social data for predictive analytics. As large tech companies enter the healthcare sector (e.g., Google Health is partnering with Ascension Health to analyze the PHI of millions of people across 21 U.S. states), the lack of harmonization between regulatory regimes may render existing safeguards to preserve patient privacy and control over their PHI ineffective. While healthcare providers are bound to adhere to health privacy laws, Big Tech comes under more relaxed regulatory regimes that will facilitate monetizing PGHD. We explore three existing data protection regimes relevant to PGHD in the United States that are currently in tension with one another: federal and state health-sector laws, data use and reuse for research and innovation, and industry self-regulation by large tech companies We then identify three types of structures (organizational, regulatory, technological/algorithmic), which synergistically could help enact needed regulatory oversight while limiting the friction and economic costs of regulation. This analysis provides a starting point for further discussions and negotiations among stakeholders and regulators to do so. 
    more » « less
  2. Patient-generated health data (PGHD), created and captured from patients via wearable devices and mobile apps, are proliferating outside of clinical settings. Examples include sleep tracking, fitness trackers, continuous glucose monitors, and RFID-enabled implants, with many additional biometric or health surveillance applications in development or envisioned. These data are included in growing stockpiles of personal health data being mined for insight via big data analytics and artificial intelligence/deep learning technologies. Governing these data resources to facilitate patient care and health research while preserving individual privacy and autonomy will be challenging, as PGHD are the least regulated domains of digitalized personal health data (U.S. Department of Health and Human Services, 2018). When patients themselves collect digitalized PGHD using “apps” provided by technology firms, these data fall outside of conventional health data regulation, such as HIPAA. Instead, PGHD are maintained primarily on the information technology infrastructure of vendors, and data are governed under the IT firm’s own privacy policies and within the firm’s intellectual property rights. Dominant narratives position these highly personal data as valuable resources to transform healthcare, stimulate innovation in medical research, and engage individuals in their health and healthcare. However, ensuring privacy, security, and equity of benefits from PGHD will be challenging. PGHD can be aggregated and, despite putative “deidentification,” be linked with other health, economic, and social data for predictive analytics. As large tech companies enter the healthcare sector (e.g., Google Health is partnering with Ascension Health to analyze the PHI of millions of people across 21 U.S. states), the lack of harmonization between regulatory regimes may render existing safeguards to preserve patient privacy and control over their PHI ineffective. While healthcare providers are bound to adhere to health privacy laws, Big Tech comes under more relaxed regulatory regimes that will facilitate monetizing PGHD. We explore three existing data protection regimes relevant to PGHD in the United States that are currently in tension with one another: federal and state health-sector laws, data use and reuse for research and innovation, and industry self-regulation by large tech companies We then identify three types of structures (organizational, regulatory, technological/algorithmic), which synergistically could help enact needed regulatory oversight while limiting the friction and economic costs of regulation. This analysis provides a starting point for further discussions and negotiations among stakeholders and regulators to do so. 
    more » « less
  3. null (Ed.)
    Research and experimentation using big data sets, specifically large sets of electronic health records (EHR) and social media data, is demonstrating the potential to understand the spread of diseases and a variety of other issues. Applications of advanced algorithms, machine learning, and artificial intelligence indicate a potential for rapidly advancing improvements in public health. For example, several reports indicate that social media data can be used to predict disease outbreak and spread (Brown, 2015). Since real-world EHR data has complicated security and privacy issues preventing it from being widely used by researchers, there is a real need to synthetically generate EHR data that is realistic and representative. Current EHR generators, such as Syntheaä (Walonoski et al., 2018) only simulate and generate pure medical-related data. However, adding patients’ social media data with their simulated EHR data would make combined data more comprehensive and realistic for healthcare research. This paper presents a patients’ social media data generator that extends an EHR data generator. By adding coherent social media data to EHR data, a variety of issues can be examined for emerging interests, such as where a contagious patient may have been and others with whom they may have been in contact. Social media data, specifically Twitter data, is generated with phrases indicating the onset of symptoms corresponding to the synthetically generated EHR reports of simulated patients. This enables creation of an open data set that is scalable up to a big-data size, and is not subject to the security, privacy concerns, and restrictions of real healthcare data sets. This capability is important to the modeling and simulation community, such as scientists and epidemiologists who are developing algorithms to analyze the spread of diseases. It enables testing a variety of analytics without revealing real-world private patient information. 
    more » « less
  4. Patient health records(PHRs) are crucial and sensitive as they contain essential information and are frequently shared among healthcare entities. This information must remain correct, up to date, private and accessible only to the authorized entities. Moreover, access must also be assured during health emergency crises such as the recent outbreak, which represents the greatest test of the flexibility and the efficiency of PHR sharing among healthcare providers, which ended up an immense interruption to the healthcare industry. Moreover, the right to privacy is the most fundamental right for a patient. Hence, the patient health records in the healthcare sector have faced issues with privacy breaches, insider outside attacks, and unauthorized access to crucial patients’ records. As a result, it pushes more patients to demand more control, security, and a smoother experience when they want to access their health records. Furthermore, the lack of interoperability among the healthcare system and providers and the added weight of cyber-attacks on an already overwhelmed system have called for an immediate solution. In this work, we developed a secured blockchain framework that safeguards patients’ full control over their health data which can be stored in their private IPFS and later shared with an authorized provider. Furthermore, the system ensures privacy and security while handling patient data, which can only be shared with the patients. The proposed Security and privacy analysis show promising results in providing time savings, enhanced confidentiality, and less disruption in patient-provider interactions. 
    more » « less
  5. Covid-19 outbreak represents an exceptional test of the flexibility and the efficiency of patient medical records transfer among healthcare providers which ended up in boundless interruption to the healthcare industry. This public crisis has pushed for an urgent innovation of the patient medical records transference (PMRT) system to meet the needs and provide appropriate patient care. Moreover, the drawback effects of Covid-19 changed the healthcare system forever, more patients are requesting more control, secure, and smoother experience when they want access to their health records. However, the problems stem from the lack of interoperability among the healthcare system and providers and the added burden of cyber-attacks on an already stressed system call for an immediate solution. In this work, we present a secured blockchain framework that ensures patients full ownership over their medical data which can be stored in their private IPFS and later can be shared with an authorized provider. The analysis of the proposed security and privacy aspects shows promising results in providing time savings and resulted in enhanced confidentiality and less disruption in patient-provider interactions. 
    more » « less