
Title: Physical-Virtual Agents for Healthcare Simulation
Conventional Intelligent Virtual Agents (IVAs) focus primarily on the visual and auditory channels for both the agent and the interacting human: the agent displays a visual appearance and speech as output, while processing the human's verbal and non-verbal behavior as input. However, some interactions, particularly those between a patient and a healthcare provider, inherently include tactile components. We introduce an Intelligent Physical-Virtual Agent (IPVA) head that occupies an appropriate physical volume; can be touched; and, via human-in-the-loop control, can change appearance, listen, speak, and react physiologically in response to human behavior. Compared to a traditional IVA, it provides a physical affordance, allowing for more realistic and compelling human-agent interactions. In a user study focusing on neurological assessment of a simulated patient showing stroke symptoms, we compared the IPVA head with a high-fidelity touch-aware mannequin that has a static appearance. Various measures indicated greater attention to, affinity for, and presence with the IPVA patient, all factors that can improve healthcare training.
Award ID(s):
1800961 1564065
Journal Name:
International Conference on Intelligent Virtual Agents
Page Range / eLocation ID:
99 to 106
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract
    Background

    Practitioner and family experiences of pediatric re/habilitation can be inequitable. The Young Children’s Participation and Environment Measure (YC-PEM) is an evidence-based and promising electronic patient-reported outcome measure that was designed with and for caregivers for research and practice. This study examined historically minoritized caregivers’ responses to revised YC-PEM content modifications and their perspectives on core intelligent virtual agent functionality needed to improve its reach for equitable service design.


    Methods

    Caregivers were recruited during a routine early intervention (EI) service visit and met five inclusion criteria: (1) were 18+ years old; (2) identified as the parent or legal guardian of a child 0–3 years old enrolled in EI services for 3+ months; (3) read, wrote, and spoke English; (4) had Internet and telephone access; and (5) identified as a parent or legal guardian of a Black, non-Hispanic child or as publicly insured. Three rounds of semi-structured cognitive interviews (55–90 min each) used videoconferencing to gather caregiver feedback on their responses to select content modifications while completing the YC-PEM, and their ideas for core intelligent virtual agent functionality. Interviews were transcribed verbatim, cross-checked for accuracy, and deductively and inductively content analyzed by multiple staff in three rounds.


    Results

    Eight Black, non-Hispanic caregivers from a single urban EI catchment and with diverse income levels (Mdn = $15,001–20,000) were enrolled, with children (M = 21.2 months, SD = 7.73) enrolled in EI. Caregivers proposed three ways to improve comprehension (clarify item wording, remove or simplify terms, add item examples). Environmental item edits prompted caregivers to share how they relate and respond to experiences with interpersonal and institutional discrimination impacting participation. Caregivers characterized three core functions of a virtual agent to strengthen YC-PEM navigation (read question aloud, visual and verbal prompts, more examples and/or definitions).


    Conclusions

    Results indicate four ways that YC-PEM content will be modified to strengthen how providers screen for unmet participation needs and determinants to design pediatric re/habilitation services that are responsive to family priorities. Results also motivate the need for user-centered design of an intelligent virtual agent to strengthen user navigation, prior to undertaking a community-based pragmatic trial of its implementation for equitable practice.

  2. Studying group dynamics requires fine-grained spatial and temporal understanding of human behavior. Social psychologists studying human interaction patterns in face-to-face group meetings often find themselves struggling with huge volumes of data that require many hours of tedious manual coding. There are only a few publicly available multi-modal datasets of face-to-face group meetings that enable the development of automated methods to study verbal and non-verbal human behavior. In this paper, we present a new, publicly available multi-modal dataset for group dynamics study that differs from previous datasets in its use of ceiling-mounted, unobtrusive depth sensors. These can be used for fine-grained analysis of head and body pose and gestures, without any concerns about participants' privacy or inhibited behavior. The dataset is complemented by synchronized and time-stamped meeting transcripts that allow analysis of spoken content. The dataset comprises 22 group meetings in which participants perform a standard collaborative group task designed to measure leadership and productivity. Participants' post-task questionnaires, including demographic information, are also provided as part of the dataset. We show the utility of the dataset in analyzing perceived leadership, contribution, and performance, by presenting results of multi-modal analysis using our sensor-fusion algorithms designed to automatically understand audio-visual interactions. 
  3.
    In recent years, reinforcement learning (RL), especially deep RL (DRL), has shown outstanding performance in video games from Atari and Mario to StarCraft. However, little evidence has shown that DRL can be successfully applied to real-life human-centric tasks such as education or healthcare. Unlike classic game playing, where the RL goal is to make an agent smart, in human-centric tasks the ultimate RL goal is to make the human-agent interactions productive and fruitful. Additionally, in many real-life human-centric tasks, data can be noisy and limited. As a sub-field of RL, batch RL is designed for handling situations where data is limited yet noisy, and building simulations is challenging. In two consecutive classroom studies, we investigated applying batch DRL to the task of pedagogical policy induction for an Intelligent Tutoring System (ITS) and empirically evaluated the effectiveness of the induced pedagogical policies. In Fall 2018 (F18), the DRL policy was compared against an expert-designed baseline policy, and in Spring 2019 (S19), we examined the impact of explaining the batch DRL-induced policy with student decisions and the expert baseline policy. Our results showed that 1) while no significant difference was found between the batch RL-induced policy and the expert policy in F18, the batch RL-induced policy with simple explanations significantly improved students' learning performance more than the expert policy alone in S19; and 2) no significant differences were found between student decision making and the expert policy. Overall, our results suggest that pairing simple explanations with induced RL policies can be an important and effective technique for applying RL to real-life human-centric tasks.
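The batch (offline) RL setting described above, learning a policy from a fixed log of transitions with no new data collection, can be illustrated with a minimal tabular sketch. This is an illustrative toy, not the paper's DRL pipeline; the function name, the averaging update, and the toy data are all assumptions:

```python
import numpy as np

def batch_q_iteration(transitions, n_states, n_actions, gamma=0.9, sweeps=50):
    """Tabular fitted Q-iteration over a fixed batch of logged transitions.

    transitions: list of (state, action, reward, next_state, done) tuples.
    No new data is collected -- only the logged batch is used, which is the
    defining constraint of batch RL. (Illustrative sketch, not the paper's
    neural-network-based method.)
    """
    q = np.zeros((n_states, n_actions))
    for _ in range(sweeps):
        # Accumulate Bellman targets per (state, action) pair from the batch.
        targets = {}
        for s, a, r, s2, done in transitions:
            target = r if done else r + gamma * q[s2].max()
            targets.setdefault((s, a), []).append(target)
        q_new = q.copy()
        for (s, a), ts in targets.items():
            q_new[s, a] = np.mean(ts)  # average target over logged visits
        q = q_new
    # Greedy policy induced purely from the logged data.
    return q, q.argmax(axis=1)

# Toy batch: one state, two actions; action 1 yields reward 1 and terminates.
log = [(0, 0, 0.0, 0, True), (0, 1, 1.0, 0, True)]
q, policy = batch_q_iteration(log, n_states=1, n_actions=2)
```

The induced greedy policy picks action 1 in state 0, since that is the only action the batch shows to be rewarding.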
  4. Stephanidis, Constantine; Chen, Jessie Y.; Fragomeni, Gino (Eds.)
    Post-traumatic stress disorder (PTSD) is a mental health condition affecting people who have experienced a traumatic event. In addition to the clinical diagnostic criteria for PTSD, behavioral changes in voice, language, facial expression, and head movement may occur. In this paper, we demonstrate how a machine learning model trained on a general population with self-reported PTSD scores can be used to provide behavioral metrics that could enhance the accuracy of clinical diagnosis with patients. Both datasets were collected from a clinical interview conducted by a virtual agent (SimSensei) [10]. The clinical data was recorded from PTSD patients, victims of sexual assault, undergoing VR exposure therapy. A recurrent neural network was trained on verbal, visual, and vocal features to recognize PTSD according to self-reported PCL-C scores [4]. We then performed decision-level fusion of the three modalities to recognize PTSD in patients with a clinical diagnosis, achieving an F1-score of 0.85. Our analysis demonstrates that machine-based PTSD assessment trained on self-reported PTSD scores can generalize across different groups and be deployed to assist in the diagnosis of PTSD.
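The decision fusion step, combining the verbal, visual, and vocal classifier outputs into one PTSD decision, can be sketched as a weighted average of per-modality probabilities. The abstract does not specify the exact fusion rule, so the uniform weighting, threshold, and function name below are illustrative assumptions:

```python
import numpy as np

def late_decision_fusion(modality_probs, weights=None, threshold=0.5):
    """Decision-level (late) fusion of per-modality classifier outputs.

    modality_probs: array of shape (n_samples, n_modalities), e.g. one column
    each for the verbal, visual, and vocal PTSD probabilities.
    Illustrative sketch: uniform weights and a 0.5 threshold are assumptions,
    not the paper's fusion rule.
    """
    p = np.asarray(modality_probs, dtype=float)
    if weights is None:
        weights = np.full(p.shape[1], 1.0 / p.shape[1])  # uniform weighting
    fused = p @ np.asarray(weights, dtype=float)         # per-sample score
    return (fused >= threshold).astype(int), fused

# Two hypothetical interviews: one with high scores across modalities, one low.
decisions, scores = late_decision_fusion([[0.9, 0.8, 0.7],
                                          [0.2, 0.1, 0.3]])
```

With uniform weights this reduces to averaging the three modality probabilities, so the first sample is flagged (mean 0.8) and the second is not (mean 0.2).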
  5. Background. Simulation has revolutionized teaching and learning. However, traditional manikins are limited in their ability to exhibit emotions, movements, and interactive eye gaze. As a result, students struggle with immersion and may be unable to authentically relate to the patient. Intervention. We developed a new type of patient simulator called the Physical-Virtual Patient (PVP), which combines the physicality of manikins with the richness of dynamic visuals. The PVP uses spatial Augmented Reality to rear-project dynamic imagery (e.g., facial expressions, ptosis, pupil reactions) onto a semi-transparent physical shell. The shell occupies space and matches the dimensions of a human head. Methods. We compared two groups of third-semester nursing students (N=59) from a baccalaureate program using a between-participant design: one group interacted with a traditional high-fidelity manikin, the other with a more realistic PVP head. The learners had to perform a neurological assessment. We measured authenticity, urgency, and learning. Results. Learners had a more realistic encounter with the PVP patient (p=0.046); they were more engaged with the PVP condition than with the manikin in terms of authenticity of encounter and cognitive strategies. The PVP also provoked a higher sense of urgency (p=0.002). There was increased learning for the PVP group compared to the manikin group on the pre- and post-simulation scores (p=0.027). Conclusion. The realism of the visuals in the PVP increases authenticity and engagement, which results in a greater sense of urgency and overall learning.
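The group comparisons reported above (PVP vs. manikin, between-participant design) are independent-samples tests. A minimal sketch of the Welch t statistic for such a comparison, using made-up illustrative scores rather than the study's data, is:

```python
import numpy as np

def welch_t(group_a, group_b):
    """Welch's t statistic for two independent groups with unequal variances,
    the kind of between-participant comparison the study reports.
    (Sketch only; the study's actual analysis and data are not reproduced.)
    """
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    # Standard error of the mean difference, allowing unequal variances.
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return (a.mean() - b.mean()) / se

# Hypothetical post-simulation learning scores for two groups.
t = welch_t([3, 4, 5], [1, 2, 3])
```

A p-value would then come from the t distribution with Welch-Satterthwaite degrees of freedom; the statistic alone is shown here for brevity.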