

Search for: All records

Award ID contains: 1801372


  1. Confocal microscopy is a standard approach for obtaining volumetric images of a sample with high axial and lateral resolution, especially when dealing with scattering samples. Unfortunately, a confocal microscope is quite expensive compared to traditional microscopes. In addition, the point scanning in confocal microscopy leads to slow imaging speed and photobleaching due to the high dose of laser energy. In this paper, we demonstrate how advances in machine learning can be exploited to teach a traditional wide-field microscope, one available in every lab, to produce 3D volumetric images like a confocal microscope. The key idea is to obtain multiple images with different focus settings using a wide-field microscope and use a neural network based on a 3D generative adversarial network (GAN) to learn the mapping between the blurry, low-contrast image stacks obtained with a wide-field microscope and the sharp, high-contrast image stacks obtained with a confocal microscope. After training on wide-field-confocal stack pairs, the network can reliably and accurately reconstruct 3D volumetric images that rival confocal images in terms of lateral resolution, z-sectioning, and image contrast. Our experimental results demonstrate the ability to generalize to unseen data, stability in the reconstruction results, and high spatial resolution even when imaging thick (∼40 microns), highly scattering samples. We believe that such learning-based microscopes have the potential to bring confocal imaging quality to every lab that has a wide-field microscope.

     
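
The mapping described here is a 3D image-to-image translation problem. Below is a minimal PyTorch sketch of that setup in the spirit of a pix2pix-style conditional GAN; the layer widths, depths, and loss weighting are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class Generator3D(nn.Module):
    """Maps a blurry wide-field z-stack (1 channel) to a confocal-like stack."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(32, 1, 3, padding=1),
        )

    def forward(self, x):  # x: (B, 1, D, H, W)
        return self.net(x)

class Discriminator3D(nn.Module):
    """PatchGAN-style critic over concatenated (wide-field, stack) pairs."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(2, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv3d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv3d(64, 1, 3, padding=1),  # per-patch real/fake logits
        )

    def forward(self, widefield, stack):
        return self.net(torch.cat([widefield, stack], dim=1))

def training_step(G, D, g_opt, d_opt, widefield, confocal, l1_weight=100.0):
    """One update: adversarial loss plus an L1 term pulling the generated
    stack toward the registered confocal ground truth."""
    bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

    # Discriminator: real pairs -> 1, generated pairs -> 0.
    d_opt.zero_grad()
    fake = G(widefield)
    real_logits = D(widefield, confocal)
    fake_logits = D(widefield, fake.detach())
    d_loss = (bce(real_logits, torch.ones_like(real_logits))
              + bce(fake_logits, torch.zeros_like(fake_logits)))
    d_loss.backward()
    d_opt.step()

    # Generator: fool the discriminator while staying close to the target.
    g_opt.zero_grad()
    fake_logits = D(widefield, fake)
    g_loss = (bce(fake_logits, torch.ones_like(fake_logits))
              + l1_weight * l1(fake, confocal))
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```
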
  2. Attention networks perform well on diverse computer vision tasks. The core idea is that the signal of interest is stronger in some pixels ("foreground"), and by selectively focusing computation on these pixels, networks can extract subtle information buried in noise and other sources of corruption. Our paper is based on one key observation: in many real-world applications, sources of corruption such as illumination and motion are often shared between the "foreground" and the "background" pixels. Can we utilize this to our advantage? We propose inverse attention networks, which focus on extracting information about these shared sources of corruption. We show that this helps to effectively suppress shared covariates and amplify signal information, resulting in improved performance. We illustrate this on the task of camera-based physiological measurement, where the signal of interest is weak and global illumination variations and motion act as significant shared sources of corruption. We perform experiments on three datasets and show that our approach of inverse attention produces state-of-the-art results, increasing the signal-to-noise ratio by up to 5.8 dB, reducing heart rate and breathing rate estimation errors by as much as 30%, recovering subtle waveform dynamics, and generalizing from RGB to NIR videos without retraining.
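
As a rough illustration of the idea, the sketch below pools features under the inverted attention map to summarize the "background", predicts the shared corruption from that summary, and removes it from the attended features. The module names and fusion scheme are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class InverseAttentionBlock(nn.Module):
    """Sketch: use 1 - attention to pool 'background' pixels, estimate the
    shared corruption (e.g., global illumination or motion), and remove it
    from the attended foreground features."""
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())
        self.noise_model = nn.Linear(channels, channels)  # bg summary -> corruption estimate

    def forward(self, feats):                # feats: (B, C, H, W)
        a = self.attn(feats)                 # soft foreground mask in [0, 1]
        inv = 1.0 - a                        # inverse attention: background mask
        # Weighted average of background pixels -> per-channel summary (B, C).
        bg = (feats * inv).sum(dim=(2, 3)) / inv.sum(dim=(2, 3)).clamp_min(1e-6)
        corruption = self.noise_model(bg)    # shared covariates estimate
        cleaned = feats * a - corruption[:, :, None, None]
        return cleaned, a
```
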
  3. Over the last few years, camera-based estimation of vital signs, referred to as imaging photoplethysmography (iPPG), has garnered significant attention due to the relative simplicity, unobtrusiveness, and flexibility of such measurements. It is expected that iPPG may be integrated into a host of emerging applications in areas as diverse as autonomous cars, neonatal monitoring, and telemedicine. In spite of this potential, the primary challenge of non-contact camera-based measurement is the relative motion between the camera and the subject. Current techniques employ 2D feature tracking to reduce the effect of subject and camera motion, but they are limited to handling translational and in-plane motion. In this paper, we study, for the first time, the utility of 3D face tracking to allow iPPG to retain robust performance even in the presence of out-of-plane and large relative motions. We use an RGB-D camera to obtain 3D information from the subjects and use the spatial and depth information to fit a 3D face model and track the model over the video frames. This allows us to estimate correspondence over the entire video with pixel-level accuracy, even in the presence of out-of-plane or large motions. We then estimate iPPG from the warped video data, which ensures per-pixel correspondence over the entire window length used for estimation. Our experiments demonstrate improved robustness when head motion is large.
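
The final stage of such a pipeline reduces, roughly, to averaging skin pixels in a motion-stabilized view and band-passing to the pulse band. A minimal NumPy/SciPy sketch follows; it assumes the 3D-model-based warping to a canonical face view has already been done upstream (the warping itself is not shown).

```python
import numpy as np
from scipy.signal import butter, filtfilt

def ippg_from_registered_frames(frames_uv, skin_mask, fps):
    """frames_uv: (T, H, W, 3) video already warped into a canonical face
    view using the fitted 3D model's per-pixel correspondence.
    skin_mask: (H, W) boolean mask of skin pixels in that view.
    Averages skin pixels per frame, then band-passes to the pulse band."""
    trace = frames_uv[:, skin_mask, 1].mean(axis=1)   # green channel carries the strongest PPG signal
    trace = (trace - trace.mean()) / (trace.std() + 1e-8)
    # ~0.7-4.0 Hz (about 42-240 BPM) plausible heart-rate band.
    b, a = butter(3, [0.7 / (fps / 2), 4.0 / (fps / 2)], btype="band")
    return filtfilt(b, a, trace)
```
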
  4. Deep learning approaches currently achieve state-of-the-art results in camera-based vital signs measurement. One of the main challenges with using neural models for these applications is the lack of sufficiently large and diverse datasets. Limited data increases the chance of overfitting to the available data, which in turn can harm generalization. In this paper, we show that the generalizability of imaging photoplethysmography models can be improved by augmenting the training set with "magnified" videos. These augmentations are specifically designed to reveal useful features for recovering the photoplethysmogram. We show that augmentations of this form are more effective at improving model robustness than other commonly used data augmentation approaches, yielding better within-dataset and especially cross-dataset performance on three publicly available datasets.
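
A simplified, Eulerian-style version of such a magnification augmentation can be written in a few lines: temporally band-pass each pixel in the pulse band and add the amplified result back. The amplification factor and band limits below are illustrative assumptions, and a full implementation would operate per level of a spatial pyramid rather than on raw pixels.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def magnify_video(frames, fps, alpha=10.0, low=0.7, high=4.0):
    """Eulerian-style magnification used as a data augmentation (sketch).
    Amplifies subtle temporal variations in the pulse band so the network
    sees exaggerated PPG features during training. alpha and the band
    limits are assumptions, not the paper's settings."""
    b, a = butter(2, [low / (fps / 2), high / (fps / 2)], btype="band")
    x = frames.astype(np.float32)              # (T, H, W, C)
    subtle = filtfilt(b, a, x, axis=0)         # temporal band-pass per pixel
    out = x + alpha * subtle                   # amplify and add back
    return np.clip(out, 0, 255).astype(np.uint8)
```
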
  5. Camera-based physiological measurement enables vital signs to be captured unobtrusively, without contact with the body. Remote, or imaging, photoplethysmography involves recovering peripheral blood flow from subtle variations in video pixel intensities. While the pulse signal may be easy to obtain from high-quality uncompressed videos, the signal-to-noise ratio drops dramatically with video bitrate. Uncompressed videos incur large file storage and data transfer costs, making analysis, manipulation, and sharing challenging. To help address these challenges, we use compression-specific supervised models to mitigate the effect of temporal video compression on heart rate estimates. We perform a systematic evaluation of the performance of state-of-the-art algorithms across different levels and formats of compression. We demonstrate that networks trained on compressed videos consistently outperform other benchmark methods, both on stationary videos and on videos with significant rigid head motion. By training on videos with the same, or a higher, compression factor than the test videos, we achieve improvements in signal-to-noise ratio (SNR) of up to 3 dB and reductions in mean absolute error (MAE) of up to 6 beats per minute (BPM).

     
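
The practical recipe suggested by this result, namely re-encoding training videos at the same or a higher compression factor than the expected test videos, can be scripted with ffmpeg. The file names and CRF values below are hypothetical.

```python
import subprocess

def compress(src, dst, crf):
    """Re-encode a training video at a given H.264 constant rate factor.
    Higher CRF means heavier compression; per the finding above, pick a
    CRF at or above the compression level expected at test time."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-c:v", "libx264", "-crf", str(crf), dst],
        check=True,
    )

# e.g., if test videos arrive around CRF 23, train on CRF 23 and 28 copies:
for crf in (23, 28):
    compress("train_uncompressed.avi", f"train_crf{crf}.mp4", crf)
```
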
  6. Imaging photoplethysmography (iPPG) could greatly improve driver safety systems by enabling capabilities ranging from identifying driver fatigue to unobtrusive early heart failure detection. Unfortunately, the driving context poses unique challenges to iPPG, including illumination and motion. First, the drastic illumination variations present during driving can overwhelm the small intensity-based iPPG signals. Second, significant driver head motion, as well as camera motion (e.g., vibration), makes it challenging to recover iPPG signals. To address these two challenges, we present two innovations. First, we demonstrate that we can reduce most outside light variations using narrow-band near-infrared (NIR) video recordings and obtain reliable heart rate estimates. Second, we present a novel optimization algorithm, which we call AutoSparsePPG, that leverages the quasi-periodicity of iPPG signals and achieves better performance than state-of-the-art methods. In addition, we release the first publicly available driving dataset that contains both NIR and RGB video recordings of a passenger's face with simultaneous ground-truth pulse oximeter recordings.
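
AutoSparsePPG itself is not reproduced here, but the underlying intuition, that a quasi-periodic pulse signal has a sparse frequency spectrum, can be sketched with a simple in-band spectral thresholding step:

```python
import numpy as np

def sparse_spectral_fit(trace, fps, n_components=3, fmin=0.7, fmax=4.0):
    """Sketch of exploiting quasi-periodicity: keep only the few strongest
    in-band frequency components, since a nearly periodic iPPG signal has a
    sparse spectrum. This is an illustration, not the AutoSparsePPG algorithm."""
    n = len(trace)
    spectrum = np.fft.rfft(trace * np.hanning(n))    # windowed FFT
    freqs = np.fft.rfftfreq(n, d=1.0 / fps)
    in_band = (freqs >= fmin) & (freqs <= fmax)      # plausible heart-rate band
    kept = np.zeros_like(spectrum)
    idx = np.argsort(np.abs(spectrum) * in_band)[-n_components:]  # top in-band peaks
    kept[idx] = spectrum[idx]
    return np.fft.irfft(kept, n=n)                   # denoised quasi-periodic estimate
```
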
  7. It is well established that many datasets used for computer vision tasks are not representative and may be biased. As a result, evaluation metrics may not reflect real-world performance and may expose some groups (often minorities) to greater risks than others. Imaging photoplethysmography is a set of techniques that enables non-contact measurement of vital signs using imaging devices. While these methods hold great promise for low-cost and scalable physiological monitoring, it is important that performance be characterized accurately over diverse populations. We perform a meta-analysis across three datasets, including 73 people and over 400 videos featuring a broad range of skin types, to study how skin type and gender affect the measurements. While heart rate measurement can be performed on all skin types under certain conditions, we find that average performance drops significantly for the darkest skin type. We also observe a slight drop in performance for females. We compare supervised and unsupervised learning algorithms and find that skin type does not impact all methods equally. The imaging photoplethysmography community should devote greater efforts to addressing these disparities and collecting representative datasets.
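
The grouped evaluation behind such a meta-analysis reduces to computing error metrics per demographic subgroup. A pandas sketch is below; the results file and column names are hypothetical.

```python
import pandas as pd

# Hypothetical results table: one row per video, with ground-truth and
# estimated heart rate plus Fitzpatrick skin type and gender labels.
df = pd.read_csv("ippg_results.csv")  # assumed columns: hr_true, hr_est, skin_type, gender
df["abs_err"] = (df["hr_est"] - df["hr_true"]).abs()

# Mean absolute error per skin type and per gender, as in the meta-analysis.
print(df.groupby("skin_type")["abs_err"].agg(["mean", "count"]))
print(df.groupby("gender")["abs_err"].agg(["mean", "count"]))
```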