
This content will become publicly available on July 1, 2022

Title: Novel Computational Linguistic Measures, Dialogue System and the Development of SOPHIE: Standardized Online Patient for Healthcare Interaction Education
In this paper, we describe the iterative participatory design of SOPHIE, an online virtual patient for feedback-based practice of sensitive patient-physician conversations, and discuss an initial qualitative evaluation of the system by professional end users. The design of SOPHIE was motivated by a computational linguistic analysis of the transcripts of 383 patient-physician conversations from essential office visits of late-stage cancer patients with their oncologists. We developed methods for the automatic detection of two behavioral paradigms, lecturing and positive language usage patterns (the sentiment trajectory of a conversation), that are shown to be significantly associated with patient prognosis understanding. These automated metrics associated with effective communication were incorporated into SOPHIE, and in a pilot user study, a group of practicing physicians reviewed SOPHIE favorably.
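As a rough illustration of what a "sentiment trajectory" could look like, the sketch below scores each utterance and averages the scores over consecutive segments of a conversation. It is not the method used in the paper: the tiny word lists, the function names, and the choice of three segments are all assumptions made for this example.

```python
from statistics import mean

# Hypothetical toy lexicon (an assumption for illustration only);
# a real system would rely on a validated sentiment model.
POSITIVE = {"good", "hope", "better", "glad", "well"}
NEGATIVE = {"bad", "worse", "pain", "afraid", "sorry"}


def utterance_sentiment(text: str) -> float:
    """Return (positive - negative word count) / total words for one utterance."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    return score / len(words)


def sentiment_trajectory(utterances: list[str], n_segments: int = 3) -> list[float]:
    """Mean utterance sentiment in consecutive, roughly equal segments."""
    if not utterances:
        return []
    n_segments = min(n_segments, len(utterances))
    scores = [utterance_sentiment(u) for u in utterances]
    # Segment boundaries; every segment holds at least one utterance.
    bounds = [i * len(scores) // n_segments for i in range(n_segments + 1)]
    return [mean(scores[a:b]) for a, b in zip(bounds, bounds[1:])]


# Made-up utterances, not taken from the study transcripts.
example = [
    "I am afraid the pain is worse.",
    "We can talk about it.",
    "I hope you feel better.",
]
trajectory = sentiment_trajectory(example)
```

Under these assumptions, a trajectory that rises from the first segment to the last corresponds to a conversation that ends on more positive language than it began with.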
Journal Name:
IEEE Transactions on Affective Computing
Sponsoring Org:
National Science Foundation
More Like this
  1. Background Online physician reviews are an important source of information for prospective patients. In addition, they represent an untapped resource for studying the effects of gender on the doctor-patient relationship. Understanding gender differences in online reviews is important because it may impact the value of those reviews to patients. Documenting gender differences in patient experience may also help to improve the doctor-patient relationship. This is the first large-scale study of physician reviews to extensively investigate gender bias in online reviews or offer recommendations for improvements to online review systems to correct for gender bias and aid patients in selecting a physician. Objective This study examines 154,305 reviews from across the United States for all medical specialties. Our analysis includes a qualitative and quantitative examination of review content and physician rating with regard to doctor and reviewer gender. Methods A total of 154,305 reviews were sampled from Google Place reviews. Reviewer and doctor gender were inferred from names. Reviews were coded for overall patient experience (negative or positive) by collapsing a 5-star scale and coded for general categories (process, positive/negative soft skills), which were further subdivided into themes. Computational text processing methods were employed to apply this codebook to the entire data set, rendering it tractable to quantitative methods. Specifically, we estimated binary regression models to examine relationships between physician rating, patient experience themes, physician gender, and reviewer gender. Results Female reviewers wrote 60% more reviews than men. Male reviewers were more likely to give negative reviews (odds ratio [OR] 1.15, 95% CI 1.10-1.19; P<.001). Reviews of female physicians were considerably more negative than those of male physicians (OR 1.99, 95% CI 1.94-2.14; P<.001).
Soft skills were more likely to be mentioned in the reviews written by female reviewers and about female physicians. Negative reviews of female doctors were more likely to mention candor (OR 1.61, 95% CI 1.42-1.82; P<.001) and amicability (OR 1.63, 95% CI 1.47-1.90; P<.001). Disrespect was associated with both female physicians (OR 1.42, 95% CI 1.35-1.51; P<.001) and female reviewers (OR 1.27, 95% CI 1.19-1.35; P<.001). Female patients were less likely to report disrespect from female doctors than expected from the base ORs (OR 1.19, 95% CI 1.04-1.32; P=.008), but this effect overrode only the effect for female reviewers. Conclusions This work reinforces findings in the extensive literature on gender differences and gender bias in patient-physician interaction. Its novel contribution lies in highlighting gender differences in online reviews. These reviews inform patients’ choice of doctor and thus affect both patients and physicians. The evidence of gender bias documented here suggests review sites may be improved by providing information about gender differences, controlling for gender when presenting composite ratings for physicians, and helping users write less biased reviews.
  2. Background Increased work through electronic health record (EHR) messaging is frequently cited as a factor in physician burnout. However, studies to date have relied on anecdotal or self-reported measures, which limit the ability to match EHR use patterns with continuous stress patterns throughout the day. Objective The aim of this study is to collect EHR use and physiologic stress data through unobtrusive means that provide objective and continuous measures, cluster distinct patterns of EHR inbox work, identify physicians’ daily physiologic stress patterns, and evaluate the association between EHR inbox work patterns and physician physiologic stress. Methods Physicians were recruited from 5 medical centers. Participants (N=47) were given wrist-worn devices (Garmin Vivosmart 3) with heart rate sensors to wear for 7 days. The devices measured physiological stress throughout the day based on heart rate variability (HRV). Perceived stress was also measured with self-reports through experience sampling and a one-time survey. From the EHR system logs, the time attributed to different activities was quantified. By using a clustering algorithm, distinct inbox work patterns were identified and their associated stress measures were compared. The effects of EHR use on physician stress were examined using a generalized linear mixed effects model. Results Physicians spent an average of 1.08 hours doing EHR inbox work out of an average total EHR time of 3.5 hours. Patient messages accounted for most of the inbox work time (mean 37%, SD 11%). A total of 3 patterns of inbox work emerged: inbox work mostly outside work hours, inbox work mostly during work hours, and inbox work extending after hours that were mostly contiguous to work hours. Across these 3 groups, physiologic stress patterns showed 3 periods in which stress increased: in the first hour of work, early in the afternoon, and in the evening.
Physicians in group 1 had the longest average stress duration during work hours (80 out of 243 min of valid HRV data; P=.02), as measured by physiological sensors. Inbox work duration, the rate of EHR window switching (moving from one screen to another), the proportion of inbox work done outside of work hours, inbox work batching, and the day of the week were each independently associated with daily stress duration (marginal R²=15%). Individual-level random effects were significant and explained most of the variation in stress (conditional R²=98%). Conclusions This study is among the first to demonstrate associations between electronic inbox work and physiological stress. We identified 3 potentially modifiable factors associated with stress: EHR window switching, inbox work duration, and inbox work outside work hours. Organizations seeking to reduce physician stress may consider system-based changes to reduce EHR window switching or inbox work duration or the incorporation of inbox management time into work hours.
  3. Augmented Reality (AR) as a technology will improve the way we work and live in the future. The Microsoft HoloLens device allows for rendering of interactive virtual components into a real-world space. The HoloLens is an augmented reality headset and can display these virtual components in front of the user’s eyes, so the data needed to complete a real-world task will always be available. The nature of the HoloLens device lends itself to applications in a healthcare setting. Potential benefits come from transitioning to a more hands-free environment, for example, allowing data to be logged in sterile environments without the need to re-sterilize after touching paper or a tablet. This project developed an augmented reality (AR) application to include a care plan tracker established by a patient’s doctor to allow the patient to do daily tasks without a health care worker’s supervision. The application displays the medications that the patient needs to ingest, daily tasks to complete, and health data to record. The application allows the physician to retrieve useful patient information regularly without scheduled physicals. This project sets a baseline that will provide future developers with documentation, research, and this sample application to assist in the design and construction of more complex applications in the future at the University of New Hampshire.
  4. Doshi-Velez, Finale ; Fackler, Jim ; Jung, Ken ; Kale, David ; Ranganath, Rajesh ; Wallace, Byron ; Wiens, Jenna (Ed.)
    Electronic Health Records (EHRs) provide vital contextual information to radiologists and other physicians when making a diagnosis. Unfortunately, because a given patient’s record may contain hundreds of notes and reports, identifying relevant information within these in the short time typically allotted to a case is very difficult. We propose and evaluate models that extract relevant text snippets from patient records to provide a rough case summary intended to aid physicians considering one or more diagnoses. This is hard because direct supervision (i.e., physician annotations of snippets relevant to specific diagnoses in medical records) is prohibitively expensive to collect at scale. We propose a distantly supervised strategy in which we use groups of International Classification of Diseases (ICD) codes observed in ‘future’ records as noisy proxies for ‘downstream’ diagnoses. Using this we train a transformer-based neural model to perform extractive summarization conditioned on potential diagnoses. This model defines an attention mechanism that is conditioned on potential diagnoses (queries) provided by the diagnosing physician. We train (via distant supervision) and evaluate variants of this model on EHR data from Brigham and Women’s Hospital in Boston and MIMIC-III (the latter to facilitate reproducibility). Evaluations performed by radiologists demonstrate that these distantly supervised models yield better extractive summaries than do unsupervised approaches. Such models may aid diagnosis by identifying sentences in past patient reports that are clinically relevant to a potential diagnosis. Code is available at
  5. Background & Purpose: Deformational plagiocephaly and brachycephaly (DPB) is a cranial condition manifested in 20% of infants in the US. DPB affects children and their families through psychological pressure, social stigma, and significant medical costs. If detected between 0-3 months of age, there is strong potential for correction via aggressive repositioning and/or physical therapy if congenital muscular torticollis is present. At later stages, DPB is most effectively treated by more expensive treatments like helmet therapy. Two cranial parameters that can help with the early detection and tracking of DPB are the cranial index (CI) and cranial vault asymmetry index (CVAI). Currently, these measurements are performed with a hand caliper by a specialist, i.e., nurse practitioner (CRNP) or physician assistant who specializes in cleft-craniofacial diagnosis, physical therapist, pediatric plastic/neurosurgeons, or orthotist. To make the measurements frequent, accessible, and accurate at the point of care, i.e., in pediatric offices, we developed and evaluated a mobile app called Softspot™ to measure CI and CVAI, thus facilitating the early detection and monitoring of DPB. Method/Description: Sequences of bird’s eye-view head photos extracted from video were collected for 77 patients (aged 2 – 11 months, 51 females, 26 males) with an iPhone X (Apple Inc., Cupertino, CA). The head length, width, and diagonals were measured by a single CRNP via hand calipers at a large multidisciplinary cranio-facial center with IRB approval and patient consent. For each patient, five images were chosen by an analyst and segmented into head components, namely the head and nose, using quantitative imaging methods. For each image CI and CVAI were automatically measured, and these measurements were averaged for each patient.
Automated CI and CVAI measurements were compared to values obtained by the caliper measurements in terms of mean absolute error (MAE), and outliers were excluded beyond 3 standard deviations away from the average MAE. Results were further analyzed by the Bland-Altman method and Spearman Correlation Coefficient. Results: MAE was 2.18 ± 1.60 for CI and 1.57 ± 1.03 for CVAI measurements. Spearman Correlation Coefficients between measurements and ground truth were 0.93 for CI (p<0.001) and 0.91 for CVAI (p<0.001). Bland-Altman analysis revealed limits of agreement for CI and CVAI as [-4.59, 5.76] (mean = 0.59) and [-3.91, 3.40] (mean = -0.25) respectively. Conclusions: Digital smartphone-based methods for DPB assessment are feasible, and this study demonstrated significant correlation between automated digital measurements and ground truth clinical values. Smartphone-based measurements of DPB can be performed at the point of care to improve the early detection and treatment of DPB.
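The agreement statistics reported in the last abstract above (MAE and Bland-Altman limits of agreement) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's analysis code: the function names are hypothetical, differences are taken as automated minus caliper, the limits use the conventional bias ± 1.96 sample standard deviations, and the four paired values are made up.

```python
from statistics import mean, stdev


def mean_absolute_error(auto: list[float], manual: list[float]) -> float:
    """Average absolute difference between paired measurements."""
    return mean(abs(a - m) for a, m in zip(auto, manual))


def bland_altman(auto: list[float], manual: list[float], z: float = 1.96):
    """Return (bias, lower limit, upper limit) of agreement.

    Assumes differences are automated minus manual and uses the
    sample standard deviation of those differences.
    """
    diffs = [a - m for a, m in zip(auto, manual)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - z * sd, bias + z * sd


# Made-up paired values for illustration; not data from the study.
auto_ci = [80.0, 85.0, 90.0, 78.0]
caliper_ci = [79.0, 86.0, 88.0, 78.0]

mae = mean_absolute_error(auto_ci, caliper_ci)
bias, lower, upper = bland_altman(auto_ci, caliper_ci)
```

Because the limits are symmetric about the bias, their midpoint equals the mean difference. The reported intervals follow this pattern: the midpoint of [-4.59, 5.76] is about 0.59 and the midpoint of [-3.91, 3.40] is about -0.25.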