Background: Trust is a critical driver of technology usage and is essential for technology adoption. Nurses’ participation in software development therefore shapes their involvement, their competency, and their perceptions of software quality. Purpose: To engage nurses as subject matter experts in developing a machine learning (ML) Pain Recognition Automated Monitoring System. Method: Using the Human-centered Design for Embedded Machine Learning Solutions (HCDe-MLS) model, nurses informed the development of an intuitive data-labeling software solution, Human-to-Artificial Intelligence (H2AI). Findings: H2AI facilitated efficient data labeling, stored labeled data for training ML models, and tracked inter-rater reliability. OpenCV provided efficient video-to-image pre-processing for data labeling, and MobileFaceNet demonstrated superior results for default landmark placement on neonatal video images. Discussion: Nurses’ engagement in clinical decision support software development is critical for ensuring that the end product addresses nurses’ priorities, reflects their actual cognitive and decision-making processes, and earns the trust that drives technology adoption.
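The abstract names inter-rater reliability tracking as one of H2AI's functions but does not describe its internals. A minimal sketch of one common approach, mean pairwise percent agreement among raters (the function, rater names, and labels below are hypothetical, not H2AI's actual implementation):

```python
# Hedged sketch: tracking inter-rater reliability as mean pairwise
# percent agreement over labeled frames. All names and labels are
# illustrative; the abstract does not specify H2AI's method.
from itertools import combinations

def percent_agreement(labels_by_rater):
    """Mean pairwise agreement across raters.

    labels_by_rater: dict mapping rater id -> list of labels,
    one label per frame; all lists must have the same length.
    """
    raters = list(labels_by_rater)
    pair_scores = []
    for a, b in combinations(raters, 2):
        la, lb = labels_by_rater[a], labels_by_rater[b]
        agree = sum(x == y for x, y in zip(la, lb))
        pair_scores.append(agree / len(la))
    return sum(pair_scores) / len(pair_scores)

labels = {
    "rater1": ["pain", "pain", "no_pain", "pain"],
    "rater2": ["pain", "no_pain", "no_pain", "pain"],
    "rater3": ["pain", "pain", "no_pain", "no_pain"],
}
print(round(percent_agreement(labels), 2))  # -> 0.67
```

Percent agreement is the simplest such measure; chance-corrected statistics (e.g., Cohen's or Fleiss' kappa) are often preferred when label distributions are skewed.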
Performance Evaluation of a Supervised Machine Learning Pain Classification Model Developed by Neonatal Nurses
Background: Early-life pain is associated with adverse neurodevelopmental consequences, and current pain assessment practices are discontinuous, inconsistent, and highly dependent on nurses’ availability. Furthermore, the facial expressions in commonly used pain assessment tools are not associated with brain-based evidence of pain. Purpose: To develop and validate a machine learning (ML) model to classify pain. Methods: In this retrospective validation study, using a Human-centered Design for Embedded Machine Learning Solutions approach and the Neonatal Facial Coding System (NFCS), 6 experienced neonatal intensive care unit (NICU) nurses labeled data from randomly assigned iCOPEvid (infant Classification Of Pain Expression video) sequences of 49 neonates undergoing heel lance. NFCS is the only observational pain assessment tool associated with brain-based evidence of pain. A standard 70% training and 30% testing split of the data was used to train and test several ML models. NICU nurses’ interrater reliability was evaluated, and their area under the receiver operating characteristic curve (AUC) was compared with the ML models’ AUC. Results: Nurses’ weighted mean interrater reliability was 68% (63%-79%) for NFCS tasks, 77.7% (74%-83%) for pain intensity, 48.6% (15%-59%) for frame-level pain classification, and 78.4% (64%-100%) for video-level pain classification, with an AUC of 0.68. The best-performing ML model had 97.7% precision, 98% accuracy, 98.5% recall, and an AUC of 0.98. Implications for Practice and Research: The pain classification ML model’s AUC far exceeded that of NICU nurses for identifying neonatal pain. These findings will inform the development of a continuous, unbiased, brain-based, nurse-in-the-loop Pain Recognition Automated Monitoring System (PRAMS) for neonates and infants.
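The precision, accuracy, and recall figures reported above all derive from a binary confusion matrix. A self-contained sketch of how those metrics are computed (the toy labels and predictions are made up for illustration, not the study's data):

```python
# Hedged sketch: computing precision / recall / accuracy from a
# confusion matrix for a binary pain classifier. The toy labels
# below are illustrative only.
def classification_metrics(y_true, y_pred, positive="pain"):
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return {
        "precision": tp / (tp + fp),   # of predicted pain, how much was pain
        "recall": tp / (tp + fn),      # of true pain, how much was caught
        "accuracy": (tp + tn) / len(y_true),
    }

y_true = ["pain", "pain", "no_pain", "no_pain", "pain"]
y_pred = ["pain", "pain", "no_pain", "pain", "pain"]
m = classification_metrics(y_true, y_pred)
print(m)  # precision 0.75, recall 1.0, accuracy 0.8
```

AUC, by contrast, is computed from ranked classifier scores rather than hard labels, which is why it can be compared directly between nurses' ratings and model outputs.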
- Award ID(s):
- 2205472
- PAR ID:
- 10531569
- Publisher / Repository:
- Wolters Kluwer
- Date Published:
- Journal Name:
- Advances in Neonatal Care
- Volume:
- 24
- Issue:
- 3
- ISSN:
- 1536-0903
- Page Range / eLocation ID:
- 301-310
- Subject(s) / Keyword(s):
- computer vision, interrater reliability, neonatal pain, pain classification, supervised machine learning
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
Existing pain assessment methods in the intensive care unit rely on patient self-report or visual observation by nurses. Patient self-report is subjective and can suffer from poor recall. For non-verbal patients, behavioral pain assessment methods provide limited granularity, are subjective, and place additional burden on already overworked staff. Previous studies have shown the feasibility of autonomous pain expression assessment by detecting Facial Action Units (AUs), but these approaches have been limited to controlled environments. In this study, for the first time, we collected and annotated a pain-related AU dataset, Pain-ICU, containing 55,085 images from critically ill adult patients. We evaluated the performance of OpenFace, an open-source facial behavior analysis tool, and a trained AU R-CNN model on our Pain-ICU dataset. Variables such as assisted breathing devices, environmental lighting, and patient orientation with respect to the camera make AU detection harder than in controlled settings. Although OpenFace has shown state-of-the-art results in general-purpose AU detection tasks, it could not accurately detect AUs in our Pain-ICU dataset (F1-score 0.42). To address this problem, we trained the AU R-CNN model on our Pain-ICU dataset, resulting in a satisfactory average F1-score of 0.77. In this study, we show the feasibility of detecting facial pain AUs in uncontrolled ICU settings.
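The F1-scores quoted above (0.42 for OpenFace, 0.77 for AU R-CNN) are the harmonic mean of precision and recall, typically averaged across the action units of interest. A minimal sketch, with hypothetical AU names and toy detections (not the Pain-ICU data):

```python
# Hedged sketch: per-AU F1-score and its average across AUs, as used
# to compare AU detectors. Inputs are binary presence/absence per
# frame; the AU names and values below are illustrative only.
def f1_score(y_true, y_pred):
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Average over the AUs evaluated (e.g., brow lowerer, cheek raiser, ...)
per_au = {
    "AU4": f1_score([1, 1, 0, 0], [1, 0, 0, 0]),
    "AU6": f1_score([1, 0, 1, 0], [1, 0, 1, 1]),
}
avg_f1 = sum(per_au.values()) / len(per_au)
```

F1 is preferred over raw accuracy here because AU activations are rare per frame, so a detector that predicts "absent" everywhere would still score high accuracy.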
Timely detection of horse pain is important for equine welfare. Horses express pain through their facial and body behavior, but may hide signs of pain from unfamiliar human observers. In addition, collecting visual data with detailed annotation of horse behavior and pain state is cumbersome and does not scale. Consequently, a pragmatic equine pain classification system would use video of the unobserved horse and weak labels. This paper proposes such a method for equine pain classification, using multi-view surveillance video footage of unobserved horses with induced orthopaedic pain and temporally sparse video-level pain labels. To ensure that pain is learned from horse body language alone, we first train a self-supervised generative model to disentangle horse pose from its appearance and background, before using the disentangled horse pose latent representation for pain classification. To make best use of the pain labels, we develop a novel loss that formulates pain classification as a multi-instance learning problem. Our method achieves 60% pain classification accuracy, exceeding human expert performance. The learned latent horse pose representation is shown to be viewpoint covariant and disentangled from horse appearance. Qualitative analysis of pain-classified segments shows correspondence between the pain symptoms identified by our model and the equine pain scales used in veterinary practice.
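The multi-instance learning (MIL) framing above treats each video as a bag of clip-level instances carrying only a weak bag-level label. One common MIL aggregation (a sketch under that assumption; the paper's actual loss may differ, and the scores and threshold below are made up):

```python
# Hedged sketch of the multi-instance learning framing: a video (bag)
# is labeled painful if any of its clips (instances) looks painful,
# so a video-level score is pooled from clip-level scores.
def video_pain_score(clip_scores):
    # Max-pooling is a common MIL aggregation: the bag score is the
    # strongest instance score.
    return max(clip_scores)

def classify_video(clip_scores, threshold=0.5):
    return video_pain_score(clip_scores) >= threshold

print(classify_video([0.1, 0.2, 0.8]))  # one painful clip -> True
print(classify_video([0.1, 0.2, 0.3]))  # no painful clip -> False
```

This pooling matches temporally sparse labels well: a "pain" video label only guarantees that pain is visible somewhere in the video, not in every frame.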
Purpose: Magnetic Resonance Imaging (MRI) enables non-invasive assessment of brain abnormalities during early life development. Permanent magnet scanners operating in the neonatal intensive care unit (NICU) facilitate MRI of sick infants, but have long scan times due to lower signal-to-noise ratios (SNR) and limited receive coils. This work accelerates in-NICU MRI with diffusion probabilistic generative models by developing a training pipeline that accounts for these challenges. Methods: We establish a novel training dataset of clinical, 1 Tesla neonatal MR images in collaboration with Aspect Imaging and Sha'are Zedek Medical Center. We propose a pipeline to handle the low quantity and SNR of our real-world dataset by (1) modifying existing network architectures to support varying resolutions; (2) training a single model on all data with learned class embedding vectors; (3) applying self-supervised denoising before training; and (4) reconstructing by averaging posterior samples. Retrospective under-sampling experiments, accounting for signal decay, evaluated each item of our proposed methodology. A clinical reader study with practicing pediatric neuroradiologists evaluated our proposed images reconstructed from under-sampled data. Results: Combining all data, denoising pre-training, and averaging posterior samples yields quantitative improvements in reconstruction. The generative model decouples the learned prior from the measurement model and functions at two acceleration rates without re-training. The reader study suggests that the proposed images reconstructed from under-sampled data are adequate for clinical use. Conclusion: Diffusion probabilistic generative models, applied with the proposed pipeline for handling challenging real-world datasets, could reduce the scan time of in-NICU neonatal MRI.
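Step (4), reconstructing by averaging posterior samples, exploits the fact that averaging several draws from the model's posterior reduces per-pixel variance relative to a single draw. A toy one-dimensional sketch of that effect (the Gaussian "sampler" below is a stand-in for the diffusion model, not the paper's implementation):

```python
# Hedged sketch: averaging posterior samples reduces reconstruction
# variance. The sampler here is a stand-in Gaussian, purely
# illustrative of the averaging step.
import random

def posterior_sample(clean_value, noise=0.2, rng=random):
    # Stand-in for one draw from the model's posterior around the
    # true value.
    return clean_value + rng.gauss(0.0, noise)

def averaged_reconstruction(clean_value, n_samples=32, rng=None):
    rng = rng or random.Random(0)
    samples = [posterior_sample(clean_value, rng=rng) for _ in range(n_samples)]
    return sum(samples) / len(samples)

avg = averaged_reconstruction(1.0, n_samples=64, rng=random.Random(1))
```

For independent samples, the standard error of the average shrinks as 1/sqrt(n), which is the statistical rationale for this step.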
Although pain is widely recognized to be a multidimensional experience, it is typically measured with a unidimensional patient self-reported visual analog scale (VAS). However, self-reported pain is subjective, difficult to interpret, and sometimes impossible to obtain. Machine learning models have been developed to automatically recognize pain at both the frame level and the sequence (or video) level. Many methods use or learn facial action units (AUs) defined by the Facial Action Coding System (FACS) for describing facial expressions with muscle movement. In this paper, we analyze the relationship between sequence-level multidimensional pain measurements and frame-level AUs, along with an AU-derived pain-related measure, the Prkachin and Solomon Pain Intensity (PSPI). We study methods that learn sequence-level metrics from frame-level metrics. Specifically, we explore an extended multitask learning model to predict VAS from human-labeled AUs with the help of other sequence-level pain measurements during training. This model consists of two parts: a multitask learning neural network model to predict multidimensional pain scores, and an ensemble learning model that linearly combines the multidimensional pain scores to best approximate VAS. Starting from human-labeled AUs, the model achieves a mean absolute error (MAE) on VAS of 1.73, outperforming the provided human sequence-level estimates, which have an MAE of 1.76. Combining our machine learning model with the human estimates gives the best performance, an MAE on VAS of 1.48.
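The comparison above rests on mean absolute error and on combining model and human estimates. A minimal sketch of both (the toy VAS values are made up, and the simple average stands in for whatever combination rule the paper actually uses):

```python
# Hedged sketch: MAE on VAS, and combining model and human
# sequence-level estimates (here a plain average; the paper's
# combination may differ). All values below are illustrative.
def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

vas_true   = [2.0, 5.0, 7.0, 3.0]
model_pred = [3.0, 4.0, 8.0, 2.5]
human_pred = [1.0, 6.0, 6.0, 4.0]
combined   = [(m + h) / 2 for m, h in zip(model_pred, human_pred)]

print(mae(vas_true, model_pred))  # model alone
print(mae(vas_true, human_pred))  # humans alone
print(mae(vas_true, combined))    # average of the two
```

In this toy example the combination beats either source alone, mirroring the abstract's finding: when model and human errors are not perfectly correlated, averaging tends to cancel part of each.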