Title: "AI in healthcare: data governance challenges"
AI applications are poised to transform health care, with benefits for individuals, communities, and health-care systems. As the articles in this special issue aptly illustrate, AI innovations in healthcare are maturing from early success in medical imaging and robotic process automation, promising a broad range of new applications. This is evidenced by the rapid deployment of AI to address critical challenges related to the COVID-19 pandemic, including disease diagnosis and monitoring, drug discovery, and vaccine development. At the heart of these innovations is the health data required for deep learning applications. Rapid accumulation of data, along with improved data quality, data sharing, and standardization, enables development of deep learning algorithms in many healthcare applications. One of the great challenges for healthcare AI is effective governance of these data: ensuring thoughtful aggregation and appropriate access to fuel innovation and improve patient outcomes and healthcare system efficiency while protecting the privacy and security of data subjects. Yet the literature on data governance has rarely looked beyond important pragmatic issues related to privacy and security. Less consideration has been given to unexpected or undesirable outcomes of AI in healthcare, such as clinician deskilling, algorithmic bias, the “regulatory vacuum”, and lack of public engagement. Amidst growing calls for ethical governance of algorithms, Reddy et al. developed a governance model for AI in healthcare delivery, focusing on principles of fairness, accountability, and transparency (FAT), and trustworthiness, and calling for wider discussion. Winter and Davidson emphasize the need to identify underlying values of healthcare data and use, noting the many competing interests and goals for use of health data, such as healthcare system efficiency and reform, patient and community health, intellectual property development, and monetization.
Beyond the important considerations of privacy and security, governance must consider who will benefit from healthcare AI, and who will not. Whose values drive health AI innovation and use? How can we ensure that innovations are not limited to the wealthiest individuals or nations? As large technology companies begin to partner with health care systems, and as personally generated health data (PGHD), such as data from fitness trackers, continuous glucose monitors, and health information searches on the Internet, proliferate, who has oversight of these complex technical systems, which are essentially a black box? To tackle these complex issues, we must acknowledge that we have entered a new technical, organizational, and policy environment due to linked data, big data analytics, and AI. Data governance is no longer the responsibility of a single organization. Rather, multiple networked entities play a role and responsibilities may be blurred. This also raises many concerns related to data localization and jurisdiction: who is responsible for data governance? In this emerging environment, data may no longer be effectively governed through traditional policy models or instruments.
Authors:
Editors:
Reddy, S.; Winter, J.S.; Padmanabhan, S.
Award ID(s):
1827952
Publication Date:
NSF-PAR ID:
10311415
Journal Name:
Journal of hospital management and health policy
Volume:
5
Issue:
8
ISSN:
2523-2533
Sponsoring Org:
National Science Foundation
More Like this
  1. Patient-generated health data (PGHD), created and captured from patients via wearable devices and mobile apps, are proliferating outside of clinical settings. Examples include sleep tracking, fitness trackers, continuous glucose monitors, and RFID-enabled implants, with many additional biometric or health surveillance applications in development or envisioned. These data are included in growing stockpiles of personal health data being mined for insight via big data analytics and artificial intelligence/deep learning technologies. Governing these data resources to facilitate patient care and health research while preserving individual privacy and autonomy will be challenging, as PGHD are the least regulated domains of digitalized personal health data (U.S. Department of Health and Human Services, 2018). When patients themselves collect digitalized PGHD using “apps” provided by technology firms, these data fall outside of conventional health data regulation, such as HIPAA. Instead, PGHD are maintained primarily on the information technology infrastructure of vendors, and data are governed under the IT firm’s own privacy policies and within the firm’s intellectual property rights. Dominant narratives position these highly personal data as valuable resources to transform healthcare, stimulate innovation in medical research, and engage individuals in their health and healthcare. However, ensuring privacy, security, and equity of benefits from PGHD will be challenging. PGHD can be aggregated and, despite putative “deidentification,” be linked with other health, economic, and social data for predictive analytics. As large tech companies enter the healthcare sector (e.g., Google Health is partnering with Ascension Health to analyze the PHI of millions of people across 21 U.S. states), the lack of harmonization between regulatory regimes may render existing safeguards to preserve patient privacy and control over their PHI ineffective.
While healthcare providers are bound to adhere to health privacy laws, Big Tech comes under more relaxed regulatory regimes that will facilitate monetizing PGHD. We explore three existing data protection regimes relevant to PGHD in the United States that are currently in tension with one another: federal and state health-sector laws, data use and reuse for research and innovation, and industry self-regulation by large tech companies. We then identify three types of structures (organizational, regulatory, technological/algorithmic), which synergistically could help enact needed regulatory oversight while limiting the friction and economic costs of regulation. This analysis provides a starting point for further discussions and negotiations among stakeholders and regulators to do so.
  3. Research and experimentation using big data sets, specifically large sets of electronic health records (EHR) and social media data, is demonstrating the potential to understand the spread of diseases and a variety of other issues. Applications of advanced algorithms, machine learning, and artificial intelligence indicate a potential for rapidly advancing improvements in public health. For example, several reports indicate that social media data can be used to predict disease outbreak and spread (Brown, 2015). Since real-world EHR data has complicated security and privacy issues preventing it from being widely used by researchers, there is a real need to synthetically generate EHR data that is realistic and representative. Current EHR generators, such as Synthea™ (Walonoski et al., 2018), only simulate and generate pure medical-related data. However, adding patients’ social media data with their simulated EHR data would make combined data more comprehensive and realistic for healthcare research. This paper presents a patients’ social media data generator that extends an EHR data generator. By adding coherent social media data to EHR data, a variety of issues can be examined for emerging interests, such as where a contagious patient may have been and others with whom they may have been in contact. Social media data, specifically Twitter data, is generated with phrases indicating the onset of symptoms corresponding to the synthetically generated EHR reports of simulated patients. This enables creation of an open data set that is scalable up to a big-data size, and is not subject to the security, privacy concerns, and restrictions of real healthcare data sets. This capability is important to the modeling and simulation community, such as scientists and epidemiologists who are developing algorithms to analyze the spread of diseases. It enables testing a variety of analytics without revealing real-world private patient information.
  4. E-healthcare has been envisaged as a major component of the infrastructure of modern healthcare, and has been developing rapidly in China. For healthcare, news media can play an important role in raising public interest and utilization of a particular service and complicating (and perhaps clouding) debate on public health policy issues. We conducted a linguistic analysis of news reports from January 2015 to June 2021 related to E-healthcare in mainland China, using a heterogeneous graphical modeling approach. This approach can simultaneously cluster the datasets and estimate the conditional dependence relationships of keywords. It was found that there were eight phases of media coverage. The focuses and main topics of media coverage were extracted based on the network hub and module detection. The temporal patterns of media reports were found to be mostly consistent with the policy trend. Specifically, in the policy embryonic period (2015–2016), two phases were obtained, industry management was the main topic, and policy and regulation were the focuses of media coverage. In the policy development period (2017–2019), four phases were discovered. All four main topics, namely industry development, health care, financial market, and industry management, were present. In 2017 Q3–2017 Q4, the major focuses of media coverage included social security, healthcare and reform, and others. In 2018 Q1, industry regulation and finance became the focuses. In the policy outbreak period (2020–), two phases were discovered. Financial market and industry management were the main topics. Medical insurance and healthcare for the elderly became the focuses. This analysis can offer insights into how the media responds to public policy for E-healthcare, which can be valuable for the government, public health practitioners, health care industry investors, and others.
  5. Nearly half of people prescribed medication to treat chronic or short-term conditions do not take their medicine as prescribed. This leads to worse treatment outcomes, higher hospital admission rates, increased healthcare costs, and increased morbidity and mortality rates. While some instances of medication non-adherence are a result of problems with the treatment plan or barriers caused by the health care provider, many are caused by patient-related factors such as forgetting, running out of medication, and not understanding the required dosages. This presents a clear need for patient-centered systems that can reliably increase medication adherence. To that end, in this work we describe an activity recognition system capable of recognizing when individuals take medication in an unconstrained, real-world environment. Our methodology uses a version of the Bagging ensemble method modified to suit unbalanced data, together with a classifier trained on the prediction probabilities of the Bagging classifier, to identify when individuals took medication during a full-day study. Using this methodology we are able to recognize when individuals took medication with an F-measure of 0.77. Our system is a first step towards developing personal health interfaces that are capable of providing personalized medication adherence interventions.