Purpose: Challenges in teaching the engineering design process (EDP) at the high-school level, such as promoting good documentation practices, are well-documented. While developments in educational artificial intelligence (AI) systems have the potential to assist in addressing these challenges, the open-ended nature of the EDP leads to challenges that often lack the specificity required for actionable AI development. In addition, conventional educational AI systems (e.g. intelligent tutoring systems) primarily target procedural domain tasks with well-defined outcomes and problem-solving strategies, while the EDP involves open-ended problems and multiple correct solutions, making AI intervention timing and appropriateness complex. Design/methodology/approach: Authors conducted a six-week-long Research through Co-Design (RtCD) process (i.e. a co-design process rooted in Research through Design) with two experienced high-school engineering teachers to co-construct actionable insight in the form of AI intervention points (AI-IPs) in engineering education where an AI system can effectively intervene to support them while highlighting their pedagogical practices. Findings: This paper leveraged the design of task models to iteratively refine our prior understanding of teachers’ experiences with teaching the EDP into three AI-IPs related to documentation, ephemeral interactions between teachers and students and disruptive failures that can serve as a focus for intelligent educational system designs. Originality/value: This paper discusses the implications of these AI-IPs for designing educational AI systems to support engineering education as well as the importance of leveraging RtCD methodologies to engage teachers in developing intelligent educational systems that align with their needs and afford them control over computational interventions in their classrooms.
Evaluation of GPT-4 ability to identify and generate patient instructions for actionable incidental radiology findings
Abstract. Objectives: To evaluate the proficiency of a HIPAA-compliant version of GPT-4 in identifying actionable, incidental findings from unstructured radiology reports of Emergency Department patients. To assess appropriateness of artificial intelligence (AI)-generated, patient-facing summaries of these findings. Materials and Methods: Radiology reports extracted from the electronic health record of a large academic medical center were manually reviewed to identify non-emergent, incidental findings with high likelihood of requiring follow-up, further sub-stratified as “definitely actionable” (DA) or “possibly actionable—clinical correlation” (PA-CC). Instruction prompts to GPT-4 were developed and iteratively optimized using a validation set of 50 reports. The optimized prompt was then applied to a test set of 430 unseen reports. GPT-4 performance was primarily graded on accuracy identifying either DA or PA-CC findings, then secondarily for DA findings alone. Outputs were reviewed for hallucinations. AI-generated patient-facing summaries were assessed for appropriateness via Likert scale. Results: For the primary outcome (DA or PA-CC), GPT-4 achieved 99.3% recall, 73.6% precision, and 84.5% F-1. For the secondary outcome (DA only), GPT-4 demonstrated 95.2% recall, 77.3% precision, and 85.3% F-1. No findings were “hallucinated” outright. However, 2.8% of cases included generated text about recommendations that were inferred without specific reference. The majority of True Positive AI-generated summaries required no or minor revision. Conclusion: GPT-4 demonstrates proficiency in detecting actionable, incidental findings after refined instruction prompting. AI-generated patient instructions were most often appropriate, but in rare cases included inferred recommendations. While this technology shows promise to augment diagnostics, active clinician oversight via “human-in-the-loop” workflows remains critical for clinical implementation.
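The reported F-1 values can be checked against the stated recall and precision. The following is a minimal sketch, assuming F-1 here means the standard harmonic mean of precision and recall (the abstract does not give the formula); the helper name `f1_score` is hypothetical and not taken from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, expressed as fractions.
primary = f1_score(precision=0.736, recall=0.993)    # DA or PA-CC findings
secondary = f1_score(precision=0.773, recall=0.952)  # DA findings only

print(f"Primary outcome F1:   {primary * 100:.1f}%")    # 84.5%
print(f"Secondary outcome F1: {secondary * 100:.1f}%")  # 85.3%
```

Both values match the figures in the Results, which suggests the reported F-1 scores are plain harmonic means rather than a weighted variant.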
- PAR ID: 10534839
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Journal of the American Medical Informatics Association
- Volume: 31
- Issue: 9
- ISSN: 1067-5027
- Format(s): Medium: X Size: p. 1983-1993
- Size(s): p. 1983-1993
- Sponsoring Org: National Science Foundation
More Like this
-
As large language models (LLMs) expand the power of natural language processing to handle long inputs, rigorous and systematic analyses are necessary to understand their abilities and behavior. A salient application is summarization, due to its ubiquity and controversy (e.g., researchers have declared the death of summarization). In this paper, we use financial report summarization as a case study because financial reports are not only long but also use numbers and tables extensively. We propose a computational framework for characterizing multimodal long-form summarization and investigate the behavior of Claude 2.0/2.1, GPT-4/3.5, and Cohere. We find that GPT-3.5 and Cohere fail to perform this summarization task meaningfully. For Claude 2 and GPT-4, we analyze the extractiveness of the summary and identify a position bias in LLMs. This position bias disappears after shuffling the input for Claude, which suggests that Claude seems to recognize important information. We also conduct a comprehensive investigation on the use of numeric data in LLM-generated summaries and offer a taxonomy of numeric hallucination. We employ prompt engineering to improve GPT-4's use of numbers with limited success. Overall, our analyses highlight the strong capability of Claude 2 in handling long multimodal inputs compared to GPT-4. The generated summaries and evaluation code are available at https://github.com/ChicagoHAI/characterizing-multimodal-long-form-summarization.
-
Mavrikis, M; Lalle, S; Azevedo, R; Biswas, G; Roll, I (Ed.) Exploratory learning environments (ELEs), such as simulation-based platforms and open-ended science curricula, promote hands-on exploration and problem-solving but make it difficult for teachers to gain timely insights into students' conceptual understanding. This paper presents LearnLens, a generative AI (GenAI)-enhanced teacher-facing dashboard designed to support problem-based instruction in middle school science. LearnLens processes students' open-ended responses from digital assessments to provide various insights, including sample responses, word clouds, bar charts, and AI-generated summaries. These features elucidate students' thinking, enabling teachers to adjust their instruction based on emerging patterns of understanding. The dashboard was informed by teacher input during professional development sessions and implemented within a middle school Earth science curriculum. We report insights from teacher interviews that highlight the dashboard's usability and potential to guide teachers' instruction in the classroom.
-
Abstract. The concentration of dopamine (DA) and tyrosine (Tyr) reflects the condition of patients with Parkinson's disease, whereas moderate paracetamol (PA) can help relieve their pain. Therefore, real‐time measurements of these bioanalytes have important clinical implications for patients with Parkinson's disease. However, previous sensors suffer from either limited sensitivity or complex fabrication and integration processes. This work introduces a simple and cost‐effective method to prepare high‐quality, flexible titanium dioxide (TiO2) thin films with highly reactive (001)‐facets. The as‐fabricated TiO2 film supported by a carbon cloth electrode (i.e., TiO2–CC) allows excellent electrochemical specificity and sensitivity to DA (1.390 µA µM−1 cm−2), Tyr (0.126 µA µM−1 cm−2), and PA (0.0841 µA µM−1 cm−2). More importantly, accurate DA concentration in varied pH conditions can be obtained by decoupling them within a single differential pulse voltammetry measurement without additional sensing units. The TiO2–CC electrochemical sensor can be integrated into a smart diaper to detect the trace amount of DA or an integrated skin‐interfaced patch with microfluidic sampling and wireless transmission units for real‐time detection of the sweat Tyr and PA concentration. The wearable sensor based on TiO2–CC prepared by facile manufacturing methods holds great potential in the daily health monitoring and care of patients with neurological disorders.
-
Objectives: To identify what patient-related characteristics have been reported to be associated with the occurrence of shared decision-making (SDM) about treatment. Design: Scoping review. Eligibility criteria: Peer-reviewed articles in English or Dutch reporting on associations between patient-related characteristics and the occurrence of SDM for actual treatment decisions. Information sources: COCHRANE Library, Embase, MEDLINE, PsycInfo, PubMed and Web of Science were systematically searched for articles published until 25 March 2019. Results: The search yielded 5289 hits of which 53 were retained. Multiple categories of patient characteristics were identified: (1) sociodemographic characteristics (eg, gender), (2) general health and clinical characteristics (eg, symptom severity), (3) psychological characteristics and coping with illness (eg, self-efficacy) and (4) SDM style or preference. Many characteristics showed no association or unclear relationships with SDM occurrence. For example, for female gender positive, negative and, most frequently, non-significant associations were seen. Conclusions: A large variety of patient-related characteristics have been studied, but for many the association with SDM occurrence remains unclear. The results will caution often-made assumptions about associations and provide an important step to target effective interventions to foster SDM with all patients.
