Title: Near-Infrared Imaging Photoplethysmography During Driving
Imaging photoplethysmography (iPPG) could greatly improve driver safety systems by enabling capabilities ranging from identifying driver fatigue to unobtrusive early heart failure detection. Unfortunately, the driving context poses unique challenges to iPPG, including illumination and motion. First, the drastic illumination variations present during driving can overwhelm the small intensity-based iPPG signals. Second, significant driver head motion, as well as camera motion (e.g., vibration), makes it challenging to recover iPPG signals. To address these two challenges, we present two innovations. First, we demonstrate that we can suppress most outside light variations using narrow-band near-infrared (NIR) video recordings and obtain reliable heart rate estimates. Second, we present a novel optimization algorithm, which we call AutoSparsePPG, that leverages the quasi-periodicity of iPPG signals and achieves better performance than state-of-the-art methods. In addition, we release the first publicly available driving dataset that contains both NIR and RGB video recordings of a passenger's face with simultaneous ground-truth pulse oximeter recordings.
Award ID(s):
1652633 1801372
PAR ID:
10217883
Author(s) / Creator(s):
Date Published:
Journal Name:
IEEE Transactions on Intelligent Transportation Systems
ISSN:
1524-9050
Page Range / eLocation ID:
1 to 12
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
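
AutoSparsePPG itself is an optimization algorithm described in the paper and is not reproduced here. The property it exploits is simple to illustrate, however: a quasi-periodic pulse concentrates its spectral energy at a fundamental frequency and its harmonics. The sketch below is a minimal, assumption-laden illustration of harmonic-aware heart rate estimation from a 1-D iPPG trace; the function name, the 0.5 harmonic weight, and the synthetic trace are illustrative choices, not the paper's method.

import numpy as np

def estimate_heart_rate(trace, fs, f_lo=0.7, f_hi=3.0):
    # Estimate heart rate (bpm) from a 1-D iPPG intensity trace sampled at
    # fs Hz, searching the plausible pulse band (42-180 bpm).
    x = trace - np.mean(trace)                      # remove DC offset
    spectrum = np.abs(np.fft.rfft(x * np.hanning(x.size))) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    best_f, best_score = None, -np.inf
    for f in freqs[(freqs >= f_lo) & (freqs <= f_hi)]:
        # Score each candidate fundamental by energy at f plus energy at its
        # harmonic 2f: a quasi-periodic pulse concentrates power in both.
        score = spectrum[np.argmin(np.abs(freqs - f))] \
            + 0.5 * spectrum[np.argmin(np.abs(freqs - 2 * f))]
        if score > best_score:
            best_f, best_score = f, score
    return 60.0 * best_f                            # Hz -> beats per minute

# Usage: a synthetic 30 fps trace with a 1.2 Hz (72 bpm) pulse plus noise.
fs = 30.0
t = np.arange(0, 30, 1 / fs)
trace = 0.02 * np.sin(2 * np.pi * 1.2 * t) + 0.05 * np.random.randn(t.size)
print(f"Estimated heart rate: {estimate_heart_rate(trace, fs):.1f} bpm")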
More Like this
  1.
    Attention networks perform well on diverse computer vision tasks. The core idea is that the signal of interest is stronger in some pixels ("foreground"), and by selectively focusing computation on these pixels, networks can extract subtle information buried in noise and other sources of corruption. Our paper is based on one key observation: in many real-world applications, sources of corruption such as illumination and motion are often shared between the "foreground" and the "background" pixels. Can we utilize this to our advantage? We propose inverse attention networks, which focus on extracting information about these shared sources of corruption. We show that this helps to effectively suppress shared covariates and amplify signal information, resulting in improved performance. We illustrate this on the task of camera-based physiological measurement, where the signal of interest is weak and global illumination variations and motion act as significant shared sources of corruption. We perform experiments on three datasets and show that our approach of inverse attention produces state-of-the-art results, increasing the signal-to-noise ratio by up to 5.8 dB, reducing heart rate and breathing rate estimation errors by as much as 30%, recovering subtle waveform dynamics, and generalizing from RGB to NIR videos without retraining.
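    As a rough illustration of the inverse-attention idea, the sketch below assumes a soft foreground mask is already available (e.g., from some attention network) and uses its complement to pool a background trace that estimates the shared corruption, which is then projected out of the foreground trace. The shapes, names, and the least-squares removal step are assumptions for illustration, not the paper's architecture:

    import numpy as np

    def inverse_attention_denoise(frames, attn):
        # frames: (T, H, W) pixel intensities; attn: (H, W) soft foreground
        # mask in [0, 1], assumed to come from some attention network.
        fg_w = attn / attn.sum()                  # foreground pooling weights
        bg_w = (1 - attn) / (1 - attn).sum()      # inverse-attention weights
        fg = (frames * fg_w).sum(axis=(1, 2))     # (T,) signal + corruption
        bg = (frames * bg_w).sum(axis=(1, 2))     # (T,) corruption estimate
        # Project the shared-corruption estimate out of the foreground trace.
        alpha = np.dot(fg, bg) / np.dot(bg, bg)
        return fg - alpha * bg

    # Usage: random frames and a centered blob mask as stand-ins.
    rng = np.random.default_rng(0)
    frames = rng.random((300, 36, 36))
    yy, xx = np.mgrid[0:36, 0:36]
    attn = np.exp(-((yy - 18) ** 2 + (xx - 18) ** 2) / 50.0)
    clean = inverse_attention_denoise(frames, attn)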
  2. Louis, Matthieu (Ed.)
    Imaging neural activity in a behaving animal presents unique challenges, in part because motion from the animal's movement creates artifacts in fluorescence intensity time series that are difficult to distinguish from the neural signals of interest. One approach to mitigating these artifacts is to image two channels simultaneously: one that captures an activity-dependent fluorophore, such as GCaMP, and another that captures an activity-independent fluorophore, such as RFP. Because the activity-independent channel contains the same motion artifacts as the activity-dependent channel, but no neural signals, the two together can be used to identify and remove the artifacts. However, existing approaches for this correction, such as taking the ratio of the two channels, do not account for channel-independent noise in the measured fluorescence. Here, we present Two-channel Motion Artifact Correction (TMAC), a method that removes artifacts by specifying a generative model of the two-channel fluorescence that incorporates motion artifacts, neural activity, and noise. We use Bayesian inference to infer latent neural activity under this model, thus reducing the motion artifact present in the measured fluorescence traces. We further present a novel method for evaluating the ground-truth performance of motion correction algorithms by comparing the decodability of behavior from two types of neural recordings: one in which one fluorophore is activity-dependent and the other activity-independent (GCaMP and RFP), and one in which both fluorophores are activity-independent (GFP and RFP). A successful motion correction method should decode behavior from the first type of recording but not the second. We use this metric to systematically compare five models for removing motion artifacts from fluorescence time traces. Using TMAC-inferred activity, we decode locomotion from a GCaMP-expressing animal 20x more accurately on average than from control recordings, outperforming all other motion correction methods tested, the best of which were ~8x more accurate than control.
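    The sketch below contrasts the naive ratio correction with a simple regularized regression that removes whatever the activity-independent channel can explain. This captures the spirit, though not the specifics, of TMAC's generative approach (the actual method specifies a full generative model and performs Bayesian inference); the ridge prior, noise levels, and synthetic traces are assumptions for illustration:

    import numpy as np

    def ratio_correction(gcamp, rfp, eps=1e-6):
        # Naive baseline: divide the channels; amplifies channel-independent
        # noise because the RFP noise ends up in the denominator.
        return gcamp / (rfp + eps)

    def regression_correction(gcamp, rfp, lam=1.0):
        # Remove the component of the GCaMP trace explained by the RFP trace.
        # Equivalent to a MAP estimate of a single regression weight under a
        # Gaussian (ridge) prior; the residual is the inferred activity.
        r = rfp - rfp.mean()
        g = gcamp - gcamp.mean()
        w = np.dot(r, g) / (np.dot(r, r) + lam)
        return g - w * r

    # Usage: shared motion artifact m(t), slow activity a(t), channel noise.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 60.0, 1200)
    m = 1.0 + 0.3 * np.sin(2 * np.pi * 0.2 * t)         # motion artifact
    a = 0.2 * (np.sin(2 * np.pi * 0.05 * t) > 0)        # activity bouts
    rfp = m + 0.05 * rng.standard_normal(t.size)        # artifact only
    gcamp = m + a + 0.05 * rng.standard_normal(t.size)  # artifact + activity
    activity = regression_correction(gcamp, rfp)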
    Camera-based heart rate measurement is becoming an attractive option as a non-contact modality for continuous remote health and engagement monitoring. However, reliable heart rate extraction from camera-based measurement is challenging in realistic scenarios, especially when the subject is moving. In this work, we develop a motion-robust algorithm, labeled RobustPPG, for extracting photoplethysmography (PPG) signals from face video and estimating the heart rate. Our key innovation is to explicitly model and generate motion distortions due to the movements of the person's face. We use inverse rendering to obtain the 3D shape and albedo of the face and the environment lighting from the video frames and then render the human face for each frame. The rendered face is similar to the original face but does not contain the heart rate signal; facial movements alone cause pixel intensity variation in the generated video frames. Finally, we use the generated motion distortion to filter out the motion-induced measurements. We demonstrate that our approach performs better than state-of-the-art methods in extracting a clean blood volume signal, with over 2 dB signal quality improvement and a 30% reduction in the RMSE of the estimated heart rate in intense motion scenarios.
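    The inverse-rendering step is out of scope here, but once a motion-only reference trace is available from the rendered face, the filtering stage can be illustrated with a standard adaptive noise canceller. The LMS filter below is a generic stand-in, not necessarily the filtering used in RobustPPG; measured and rendered are assumed to be 1-D mean-intensity traces:

    import numpy as np

    def lms_motion_cancel(measured, rendered, taps=8, mu=0.01):
        # Adaptive noise cancellation: learn a short FIR filter that predicts
        # the measured trace from the motion-only rendered reference; the
        # prediction error is the cleaned pulse signal.
        m = measured - np.mean(measured)
        r = rendered - np.mean(rendered)
        r = r / (np.std(r) + 1e-12)           # normalize for stable LMS steps
        w = np.zeros(taps)                    # adaptive filter weights
        out = np.zeros_like(m)
        for n in range(taps, m.size):
            ref = r[n - taps:n][::-1]         # most recent reference samples
            pred = np.dot(w, ref)             # predicted motion component
            e = m[n] - pred                   # error = cleaned sample
            w += mu * e * ref                 # LMS weight update
            out[n] = e
        return out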
  4.
    Emotion regulation can be characterized by different activities that attempt to alter an emotional response, whether behavioral, physiological, or neurological. The two most widely adopted strategies, cognitive reappraisal and expressive suppression, are explored in this study, specifically in the context of disgust. Study participants (N = 21) experienced disgust via video exposure and were instructed to either regulate their emotions or express them freely. If regulating, they were required to either cognitively reappraise or suppress their emotional experiences while viewing the videos. Video recordings of the participants' faces were taken during the experiment, and electrocardiogram (ECG), electromyography (EMG), and galvanic skin response (GSR) readings were also collected for further analysis. We compared the participants' behavioral (facial musculature movements) and physiological (GSR and heart rate) responses as they aimed to alter their emotional responses, and we computationally determined that, when responding to disgust stimuli, the signals recorded during suppression and free expression were very similar, whereas those recorded during cognitive reappraisal were significantly different. Thus, in the context of this study, from a signal-analysis perspective, we conclude that emotion regulation via cognitive reappraisal significantly alters participants' physiological responses to disgust, unlike regulation via suppression.
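    The kind of signal-level comparison described above can be illustrated with a simple two-sample test on per-participant features. The feature choice (mean GSR response) and the synthetic numbers below are hypothetical stand-ins, not the study's data or analysis pipeline:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    # Hypothetical per-participant mean GSR responses (N = 21 per condition).
    free_expression = rng.normal(0.80, 0.10, 21)
    suppression = rng.normal(0.78, 0.10, 21)   # similar to free expression
    reappraisal = rng.normal(0.55, 0.10, 21)   # noticeably altered response

    # Suppression vs. free expression: expect no significant difference.
    print(stats.ttest_ind(suppression, free_expression, equal_var=False))
    # Reappraisal vs. free expression: expect a significant difference.
    print(stats.ttest_ind(reappraisal, free_expression, equal_var=False))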
    Automated interpretation of ultrasound imaging of the heart (echocardiograms) could improve the detection and treatment of aortic stenosis (AS), a deadly heart disease. However, existing deep learning pipelines for assessing AS from echocardiograms have two key limitations. First, most methods rely on limited 2D cineloops, thereby ignoring widely available Spectral Doppler imaging, which contains important complementary information about the pressure gradients and blood flow abnormalities associated with AS. Second, obtaining labeled data is difficult: far more unlabeled echocardiogram recordings are often available, but these remain underutilized by existing methods. To overcome these limitations, we introduce Semi-supervised Multimodal Multiple-Instance Learning (SMMIL), a new deep learning framework for the automatic interpretation of echocardiograms for structural heart diseases like AS. During training, SMMIL combines a smaller labeled set with an abundant unlabeled set of both 2D and Doppler modalities to improve its classifier. When deployed, SMMIL combines information from all available images to produce an accurate study-level diagnosis of this life-threatening condition. Experiments demonstrate that SMMIL outperforms recent alternatives, including two medical foundation models.
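    SMMIL's full framework is not reproduced here, but one building block of such multiple-instance pipelines, attention-based pooling of many per-image embeddings into a single study-level prediction, can be sketched with random stand-in parameters. The shapes, names, and weights below are illustrative assumptions, not the paper's architecture:

    import numpy as np

    def softmax(z):
        z = z - z.max()
        e = np.exp(z)
        return e / e.sum()

    def mil_study_score(instances, W_attn, v_attn, w_clf, b_clf):
        # instances: (N, D) embeddings of all images in one echo study (2D
        # frames and Doppler images alike). Attention decides how much each
        # image contributes; the weighted bag embedding is classified into
        # a single study-level score.
        logits = np.tanh(instances @ W_attn) @ v_attn   # (N,) attention logits
        alpha = softmax(logits)                         # instance weights
        bag = alpha @ instances                         # (D,) bag embedding
        z = bag @ w_clf + b_clf
        return 1.0 / (1.0 + np.exp(-z))                 # study-level probability

    # Usage with random embeddings and parameters (untrained stand-ins).
    rng = np.random.default_rng(2)
    N, D, H = 12, 32, 16          # images per study, embed dim, attention dim
    emb = rng.standard_normal((N, D))
    p = mil_study_score(emb, rng.standard_normal((D, H)),
                        rng.standard_normal(H), rng.standard_normal(D), 0.0)
    print(f"Study-level AS probability: {p:.2f}")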