Importance: Virtual patient-physician communications have increased since 2020 and negatively impacted primary care physician (PCP) well-being. Generative artificial intelligence (GenAI) drafts of patient messages could potentially reduce health care professional (HCP) workload and improve communication quality, but only if the drafts are considered useful.
Objectives: To assess PCPs’ perceptions of GenAI drafts and to examine linguistic characteristics associated with equity and perceived empathy.
Design, Setting, and Participants: This cross-sectional quality improvement study tested the hypothesis that PCPs’ ratings of GenAI drafts (created using the electronic health record [EHR] standard prompts) would be equivalent to HCP-generated responses on 3 dimensions. The study was conducted at NYU Langone Health using private patient-HCP communications at 3 internal medicine practices piloting GenAI.
Exposures: Randomly assigned patient messages coupled with either an HCP message or the draft GenAI response.
Main Outcomes and Measures: PCPs rated responses’ information content quality (eg, relevance) and communication quality (eg, verbosity), each using a Likert scale, and indicated whether they would use the draft or start anew (usable vs unusable). Branching logic further probed for empathy, personalization, and professionalism of responses. Computational linguistics methods assessed content differences in HCP vs GenAI responses, focusing on equity and empathy.
Results: A total of 16 PCPs (8 [50.0%] female) reviewed 344 messages (175 GenAI drafted; 169 HCP drafted). Both GenAI and HCP responses were rated favorably. GenAI responses were rated higher for communication style than HCP responses (mean [SD], 3.70 [1.15] vs 3.38 [1.20]; P = .01; U = 12 568.5) but were similar to HCPs on information content (mean [SD], 3.53 [1.26] vs 3.41 [1.27]; P = .37; U = 13 981.0) and usable draft proportion (mean [SD], 0.69 [0.48] vs 0.65 [0.47]; P = .49; t = −0.6842). Usable GenAI responses were considered more empathetic than usable HCP responses (32 of 86 [37.2%] vs 13 of 79 [16.5%]; difference, 125.5%), possibly attributable to more subjective (mean [SD], 0.54 [0.16] vs 0.31 [0.23]; P < .001; difference, 74.2%) and positive (mean [SD] polarity, 0.21 [0.14] vs 0.13 [0.25]; P = .02; difference, 61.5%) language; they were also more linguistically complex (mean [SD] score, 125.2 [47.8] vs 95.4 [58.8]; P = .002; difference, 31.2%) and numerically longer (mean [SD] word count, 90.5 [32.0] vs 65.4 [62.6]; difference, 38.4%), although the difference in length was not statistically significant (P = .07).
Conclusions: In this cross-sectional study of PCP perceptions of an EHR-integrated GenAI chatbot, GenAI was found to communicate information better and with more empathy than HCPs, highlighting its potential to enhance patient-HCP communication. However, GenAI drafts were less readable than HCPs’, a significant concern for patients with low health or English literacy.
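The "difference" percentages reported in the Results appear to be relative differences, that is, the GenAI value minus the HCP value, divided by the HCP value. This is an inference from the published figures, not a formula stated in the abstract. A minimal Python sketch under that assumption (the function and dictionary names are illustrative) recomputes each percentage from the reported values:

```python
def relative_difference(genai: float, hcp: float) -> float:
    """Percentage by which the GenAI value exceeds the HCP value, relative to HCP.

    Assumed formula: (genai - hcp) / hcp * 100. The abstract does not state
    how its "difference" figures were computed; this is an inference.
    """
    return (genai - hcp) / hcp * 100.0


# (GenAI, HCP) pairs copied from the Results section of the abstract.
reported = {
    "empathetic share (%)": (37.2, 16.5),
    "subjectivity": (0.54, 0.31),
    "polarity": (0.21, 0.13),
    "word count": (90.5, 65.4),
    "linguistic complexity": (125.2, 95.4),
}

# Round to one decimal place, matching the precision used in the abstract.
differences = {
    name: round(relative_difference(genai, hcp), 1)
    for name, (genai, hcp) in reported.items()
}

for name, diff in differences.items():
    print(f"{name}: {diff}%")
```

Under this assumption the recomputed values match the reported 125.5%, 74.2%, 61.5%, 38.4%, and 31.2%. Note that the empathy figure is a relative increase in the proportion of responses, not a difference in percentage points (which would be 20.7).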
This content will become publicly available on July 23, 2025
The First Generative AI Prompt-A-Thon in Healthcare: A Novel Approach to Workforce Engagement with a Private Instance of ChatGPT
Background: Healthcare crowdsourcing events (e.g., hackathons) facilitate interdisciplinary collaboration and encourage innovation. Peer-reviewed research has not yet considered a healthcare crowdsourcing event focusing on generative artificial intelligence (GenAI), which generates text in response to detailed prompts and has vast potential for improving the efficiency of healthcare organizations. Our event, the New York University Langone Health (NYULH) Prompt-a-thon, primarily sought to inspire and build AI fluency within our diverse NYULH community, and foster collaboration and innovation. Secondarily, we sought to analyze how participants’ experience was influenced by their prior GenAI exposure and whether they received sample prompts during the workshop.
Methods: Executing the event required the assembly of an expert planning committee, who recruited diverse participants, anticipated technological challenges, and prepared the event. The event was composed of didactics and workshop sessions, which educated and allowed participants to experiment with using GenAI on real healthcare data. Participants were given novel “project cards” associated with each dataset that illuminated the tasks GenAI could perform and, for a random set of teams, sample prompts to help them achieve each task (the public repository of project cards can be found at https://github.com/smallw03/NYULH-Generative-AI-Prompt-a-thon-Project-Cards). Afterwards, participants were asked to fill out a survey with 7-point Likert-style questions.
Results: Our event was successful in educating and inspiring hundreds of enthusiastic in-person and virtual participants across our organization on the responsible use of GenAI in a low-cost and technologically feasible manner. All participants responded positively, on average, to each of the survey questions (e.g., confidence in their ability to use and trust GenAI). Critically, participants reported a self-perceived increase in their likelihood of using and promoting colleagues’ use of GenAI for their daily work. No significant differences were seen in the surveys of those who received sample prompts with their project task descriptions.
Conclusion: The first healthcare Prompt-a-thon was an overwhelming success, with minimal technological failures, positive responses from diverse participants and staff, and evidence of post-event engagement. These findings will be integral to planning future events at our institution, and to others looking to engage their workforce in utilizing GenAI.
- PAR ID: 10534951
- Editor(s): Hastings, Janna
- Publisher / Repository: PLOS
- Date Published:
- Journal Name: PLOS Digital Health
- Volume: 3
- Issue: 7
- ISSN: 2767-3170
- Page Range / eLocation ID: e0000394
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Mishra, S; Kothiyal, A; Iyer, S; Sahasrabudhe, S; Lingnau, A; Kuo, R (Ed.) This paper describes an experience report centered on high school mathematics teachers’ use of ALICE, a Generative AI (GenAI) module of the Edfinity homework system. Given natural language prompts (from teachers), ALICE generates the programming code (in WeBWorK format) for the corresponding interactive, isomorphic, auto-gradable problem along with hints and a solution. Writing such code would normally require programming skills. Working with teachers in high schools across a mid-western US state, this paper presents teachers’ experiences using ALICE, on prompt engineering, and the factors that influence these experiences. The implementation study also examines the impact of this experience on teachers’ classroom practice and their views about AI. Findings suggest that teachers’ experiences were largely very positive; however, these experiences are shaped by several factors including their context, their attitudes toward technology and AI use, and the perceived usefulness of the tool. These factors hold different levels of importance for individual teachers. The promising results contribute to the burgeoning field of GenAI in education and understanding teacher-AI teaming.
-
The introduction of generative artificial intelligence (GenAI) has been met with a mix of reactions by higher education institutions, ranging from consternation and resistance to wholehearted acceptance. Previous work has looked at the discourse and policies adopted by universities across the U.S. as well as educators, along with the inclusion of GenAI-related content and topics in higher education. Building on previous research, this study reports findings from a survey of engineering educators on their use of and perspectives toward generative AI. Specifically, we surveyed 98 educators from engineering, computer science, and education who participated in a workshop on GenAI in Engineering Education to learn about their perspectives on using these tools for teaching and research. We asked them about their use of and comfort with GenAI, their overall perspectives on GenAI, the challenges and potential harms of using it for teaching, learning, and research, and examined whether their approach to using and integrating GenAI in their classroom influenced their experiences with GenAI and perceptions of it. Consistent with other research in GenAI education, we found that while the majority of participants were somewhat familiar with GenAI, reported use varied considerably. We found that educators harbored mostly hopeful and positive views about the potential of GenAI. We also found that those who engaged more with their students on the topic of GenAI, both as communicators (those who spoke directly with their students) and as incorporators (those who included it in their syllabus), tend to be more positive about its contribution to learning, while also being more attuned to its potential abuses. These findings suggest that integrating and engaging with generative AI is essential to foster productive interactions between instructors and students around this technology. 
Our work ultimately contributes to the evolving discourse on GenAI use, integration, and avoidance within educational settings. Through exploratory quantitative research, we have identified specific areas for further investigation.
-
Website privacy policies are often lengthy and intricate. Privacy assistants help simplify policies and make them more accessible and user-friendly. The emergence of generative AI (genAI) offers new opportunities to build privacy assistants that can answer users’ questions about privacy policies. However, genAI’s reliability is a concern due to its potential for producing inaccurate information. This study introduces GenAIPABench, a benchmark for evaluating Generative AI-based Privacy Assistants (GenAIPAs). GenAIPABench includes: 1) A set of curated questions about privacy policies along with annotated answers for various organizations and regulations; 2) Metrics to assess the accuracy, relevance, and consistency of responses; and 3) A tool for generating prompts to introduce privacy policies and paraphrased variants of the curated questions. We evaluated 3 leading genAI systems (ChatGPT-4, Bard, and Bing AI) using GenAIPABench to gauge their effectiveness as GenAIPAs. Our results demonstrate significant promise in genAI capabilities in the privacy domain while also highlighting challenges in managing complex queries, ensuring consistency, and verifying source accuracy.