Title: Learned Compressive Representations for Single-Photon 3D Imaging
Single-photon 3D cameras can record the time of arrival of billions of photons per second with picosecond accuracy. One common approach to summarizing the photon data stream is to build a per-pixel timestamp histogram, resulting in a 3D histogram tensor that encodes distances along the time axis. As the spatio-temporal resolution of the histogram tensor increases, the in-pixel memory requirements and output data rates can quickly become impractical. To overcome this limitation, we propose a family of linear compressive representations of histogram tensors that can be computed efficiently, in an online fashion, as a matrix operation. We design practical lightweight compressive representations that are amenable to an in-pixel implementation and that take into account the spatio-temporal information of each timestamp. Furthermore, we implement our proposed framework as the first layer of a neural network, which enables joint end-to-end optimization of the compressive representations and a downstream SPAD data processing model. We find that a well-designed compressive representation can reduce in-sensor memory and data rates by up to two orders of magnitude without significantly reducing 3D imaging quality. Finally, we analyze the power consumption implications through an on-chip implementation.
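The online property follows from linearity: compressing a B-bin histogram h with a K x B coding matrix C gives c = C h, and since each photon timestamp increments exactly one histogram bin, the K coefficients can be updated per photon by adding one column of C, without ever storing the full histogram. A minimal numpy sketch of this idea, using a truncated Fourier basis as a stand-in for the learned codes (all sizes and names here are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

B = 1024   # number of histogram time bins
K = 16     # number of compressive coefficients (K << B)

# Coding matrix: a truncated Fourier basis as a stand-in for the
# learned codes (the paper optimizes these end-to-end).
t = np.arange(B) / B
C = np.concatenate(
    [np.cos(2 * np.pi * k * t)[None] for k in range(1, K // 2 + 1)]
    + [np.sin(2 * np.pi * k * t)[None] for k in range(1, K // 2 + 1)]
)

# Simulated photon timestamps (bin indices) for one pixel.
timestamps = rng.integers(0, B, size=5000)

# Offline reference: build the full histogram, then compress.
h = np.bincount(timestamps, minlength=B)
c_offline = C @ h

# Online: update the K coefficients one timestamp at a time,
# never materializing the B-bin histogram in memory.
c_online = np.zeros(K)
for ts in timestamps:
    c_online += C[:, ts]

assert np.allclose(c_offline, c_online)   # both paths agree
```

The per-pixel memory here drops from B counters to K accumulators, a 64x reduction in this toy configuration; the paper's reported two-orders-of-magnitude savings depend on the actual code design and histogram resolution.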
Award ID(s):
1846884 2138471 1943149
PAR ID:
10499096
Publisher / Repository:
IEEE
Date Published:
Journal Name:
IEEE International Conference on Computer Vision
ISBN:
979-8-3503-0718-4
Page Range / eLocation ID:
10722 to 10732
Format(s):
Medium: X
Location:
Paris, France
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Single-photon avalanche diodes (SPADs) are a rapidly developing image sensing technology with extreme low-light sensitivity and picosecond timing resolution. These unique capabilities have enabled SPADs to be used in applications like LiDAR, non-line-of-sight imaging and fluorescence microscopy that require imaging in photon-starved scenarios. In this work we harness these capabilities for dealing with motion blur in a passive imaging setting in low illumination conditions. Our key insight is that the data captured by a SPAD array camera can be represented as a 3D spatio-temporal tensor of photon detection events which can be integrated along arbitrary spatio-temporal trajectories with dynamically varying integration windows, depending on scene motion. We propose an algorithm that estimates pixel motion from photon timestamp data and dynamically adapts the integration windows to minimize motion blur. Our simulation results show the applicability of this algorithm to a variety of motion profiles including translation, rotation and local object motion. We also demonstrate the real-world feasibility of our method on data captured using a 32x32 SPAD camera. 
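The trajectory-integration idea can be illustrated with a toy example: given pixel motion (assumed known here, whereas the paper estimates it from the photon timestamps), shifting each temporal slice of the photon tensor back along the trajectory before summing concentrates counts at the object's position, while fixed-pixel integration smears them. The scene, motion model, and sizes below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

H, W, T = 8, 32, 100         # SPAD array height, width, number of frames
base_col, velocity = 5, 0.2  # horizontal motion (pixels/frame), assumed known

# Simulate binary photon detections of a bright vertical bar moving right
# across an otherwise dark scene.
photons = np.zeros((H, W, T), dtype=np.uint8)
for t in range(T):
    x = (base_col + int(velocity * t)) % W
    photons[:, x, t] = rng.random(H) < 0.5

# Naive integration: summing frames at fixed pixels smears the bar.
naive = photons.sum(axis=2)

# Motion-compensated integration: shift each frame back along the
# trajectory before summing, so counts accumulate at the bar's position.
compensated = np.zeros((H, W))
for t in range(T):
    compensated += np.roll(photons[:, :, t], -int(velocity * t), axis=1)

assert compensated[:, base_col].sum() == photons.sum()  # all counts realigned
assert compensated.max() > naive.max()                  # blur removed
```

The dynamically varying integration windows described in the abstract generalize this: the shift per slice, and the span of slices summed, both adapt to the estimated scene motion.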
  2.
    Real-world spatio-temporal data is often incomplete or inaccurate due to various data loading delays. For example, a location-disease-time tensor of case counts can have multiple delayed updates of recent temporal slices for some locations or diseases. Recovering such missing or noisy (under-reported) elements of the input tensor can be viewed as a generalized tensor completion problem. Existing tensor completion methods usually assume that i) missing elements are randomly distributed and ii) noise for each tensor element is i.i.d. zero-mean. Both assumptions can be violated for spatio-temporal tensor data. We often observe multiple versions of the input tensor with different under-reporting noise levels. The amount of noise can be time- or location-dependent as more updates are progressively introduced to the tensor. We model such dynamic data as a multi-version tensor with an extra tensor mode capturing the data updates. We propose a low-rank tensor model to predict the updates over time. We demonstrate that our method can accurately predict the ground-truth values of many real-world tensors. We obtain up to 27.2% lower root mean-squared-error compared to the best baseline method. Finally, we extend our method to track the tensor data over time, leading to significant computational savings. 
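A simplified two-mode version of the low-rank idea: treat the location x time counts as a low-rank matrix, mask out under-reported recent entries, and fit factors by alternating least squares on the observed entries only. This sketch omits the extra multi-version update mode and everything else specific to the paper's model; sizes, rank, and the masking pattern are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

n_loc, n_time, rank = 20, 30, 3

# Ground-truth low-rank count matrix (locations x time).
U_true = rng.random((n_loc, rank))
V_true = rng.random((n_time, rank))
X_true = U_true @ V_true.T

# Observed version: the most recent time slices are partly missing,
# a crude stand-in for progressively arriving data updates.
mask = np.ones_like(X_true, dtype=bool)
mask[:, -5:] = rng.random((n_loc, 5)) < 0.5
X_obs = np.where(mask, X_true, 0.0)

# Alternating least squares over observed entries only (small ridge
# term keeps the per-row solves well conditioned).
U = rng.random((n_loc, rank))
V = rng.random((n_time, rank))
lam = 1e-3
for _ in range(50):
    for i in range(n_loc):
        m = mask[i]
        A = V[m].T @ V[m] + lam * np.eye(rank)
        U[i] = np.linalg.solve(A, V[m].T @ X_obs[i, m])
    for j in range(n_time):
        m = mask[:, j]
        A = U[m].T @ U[m] + lam * np.eye(rank)
        V[j] = np.linalg.solve(A, U[m].T @ X_obs[m, j])

X_hat = U @ V.T
rmse = np.sqrt(np.mean((X_hat[~mask] - X_true[~mask]) ** 2))
baseline = np.sqrt(np.mean(X_true[~mask] ** 2))   # error of predicting zero
assert rmse < 0.2 * baseline   # missing entries recovered far better than zero-fill
```

The paper's key departures from this sketch are the extra tensor mode that indexes data versions and a noise model in which under-reporting varies with time and location, neither of which standard completion assumes.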
  3. We consider the task of 3D pose estimation and tracking of multiple people seen in an arbitrary number of camera feeds. We propose TesseTrack, a novel top-down approach that simultaneously reasons about multiple individuals’ 3D body joint reconstructions and associations in space and time in a single end-to-end learnable framework. At the core of our approach is a novel spatio-temporal formulation that operates in a common voxelized feature space aggregated from single or multiple camera views. After a person detection step, a 4D CNN produces short-term person-specific representations which are then linked across time by a differentiable matcher. The linked descriptions are then merged and deconvolved into 3D poses. This joint spatio-temporal formulation contrasts with previous piecewise strategies that treat 2D pose estimation, 2D-to-3D lifting, and 3D pose tracking as independent sub-problems that are error-prone when solved in isolation. Furthermore, unlike previous methods, TesseTrack is robust to changes in the number of camera views and achieves very good results even if a single view is available at inference time. Quantitative evaluation of 3D pose reconstruction accuracy on standard benchmarks shows significant improvements over the state of the art. Evaluation of multi-person articulated 3D pose tracking in our novel evaluation framework demonstrates the superiority of TesseTrack over strong baselines.
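The cross-time linking step amounts to a minimum-cost assignment between per-frame person descriptors; the paper's differentiable matcher replaces exactly this discrete operation so it can be trained end-to-end. A non-differentiable sketch with synthetic descriptors (all names and sizes are illustrative; brute force is used here for clarity, where the Hungarian algorithm would scale better):

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(3)

# Synthetic per-frame person descriptors, standing in for the short-term
# person-specific representations produced by the 4D CNN.
n_people, d_feat = 4, 32
frame_t = rng.standard_normal((n_people, d_feat))
# Next frame: the same people in a different order, with small feature drift.
perm = rng.permutation(n_people)
frame_t1 = frame_t[perm] + 0.01 * rng.standard_normal((n_people, d_feat))

# Pairwise cost: negative cosine similarity between descriptors.
a = frame_t / np.linalg.norm(frame_t, axis=1, keepdims=True)
b = frame_t1 / np.linalg.norm(frame_t1, axis=1, keepdims=True)
cost = -(a @ b.T)   # cost[i, j]: person i at time t vs. slot j at time t+1

# Minimum-cost assignment by brute force over permutations.
best = min(permutations(range(n_people)),
           key=lambda p: sum(cost[p[j], j] for j in range(n_people)))
recovered = np.array(best)   # recovered[j]: identity at t of slot j at t+1

assert np.array_equal(recovered, perm)   # identities linked across frames
```

In TesseTrack the matcher additionally propagates gradients, so descriptor quality and association are optimized jointly rather than in isolation.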
  4. In time-correlated single-photon counting (TCSPC), photons that arrive during the detector and timing electronics dead times are missed, causing distortion of the detection time distribution. Conventional wisdom holds that TCSPC should be performed with detections in fewer than 5% of illumination cycles to avoid substantial distortion. This requires attenuation and leads to longer acquisition times if the incident flux is too high. Through the example of ranging with a single-photon lidar system, this work demonstrates that accurately modeling the sequence of detection times as a Markov chain allows for measurements at much higher incident flux without attenuation. Our probabilistic model is validated by the close match between the limiting distribution of the Markov chain and both simulated and experimental data, so long as issues of calibration and afterpulsing are minimal. We propose an algorithm that corrects for the distortion in detection histograms caused by dead times without assumptions on the form of the transient light intensity. Our histogram correction yields substantially improved depth imaging performance, and modest additional improvement is achieved with a parametric model assuming a single depth per pixel. We show results for depth and flux estimation with up to 5 photoelectrons per illumination cycle on average, facilitating an increase in time efficiency of more than two orders of magnitude. The use of identical TCSPC equipment in other fields suggests that our modeling and histogram correction could likewise enable high-flux acquisitions in fluorescence lifetime microscopy or quantum optics applications. 
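For the special case where the dead time spans the remainder of each illumination cycle (so only the first photon per cycle is recorded), the classic Coates correction already removes pile-up distortion by dividing each bin's count by the number of cycles still "live" at that bin. This is a much simpler dead-time model than the Markov chain in the abstract, which also handles dead times shorter than a cycle; the simulation below is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

B, N = 64, 200_000   # time bins per cycle, number of illumination cycles

# True per-bin photon detection probabilities: a Gaussian return pulse
# on a small background (flux high enough that pile-up is severe).
t = np.arange(B)
q = 0.002 + 0.15 * np.exp(-0.5 * ((t - 40) / 3.0) ** 2)

# Simulate TCSPC where the dead time covers the rest of each cycle:
# only the FIRST photon per cycle is recorded.
arrivals = rng.random((N, B)) < q      # candidate photons per cycle
first = arrivals.argmax(axis=1)        # bin of first photon (if any)
detected = arrivals.any(axis=1)
h = np.bincount(first[detected], minlength=B).astype(float)

# The raw histogram is skewed toward early bins. The Coates correction
# divides each bin by the number of cycles not yet "consumed" by an
# earlier detection.
live = N - np.concatenate(([0.0], np.cumsum(h)[:-1]))
q_hat = h / live

assert np.allclose(q_hat, q, atol=0.01)      # distortion removed
assert abs(int(q_hat.argmax()) - 40) <= 1    # peak restored at the true bin
```

The paper's Markov-chain model generalizes this kind of correction to dead times that can end mid-cycle, where a detection in one cycle blocks bins in the next, and does so without assuming a parametric form for the transient intensity.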
  5. We present a novel architecture for the design of single-photon detecting arrays that captures relative intensity or timing information from a scene, rather than absolute. The proposed method for capturing relative information between pixels or groups of pixels requires very little circuitry, and thus allows for a significantly higher pixel packing factor than is possible with per-pixel TDC approaches. The inherently compressive nature of the differential measurements also reduces data throughput and lends itself to physical implementations of compressed sensing, such as Haar wavelets. We demonstrate this technique for HDR imaging and LiDAR, and describe possible future applications. 
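The connection between differential measurements and Haar wavelets: one Haar level records pairwise averages and differences, so a cascade of purely relative (difference) measurements plus a single overall mean suffices to reconstruct absolute intensities. A software sketch of the idea (the paper realizes such relative measurements in pixel circuitry; the data here is synthetic):

```python
import numpy as np

rng = np.random.default_rng(5)

# Intensities from a row of 8 pixels (any power-of-two length works).
x = rng.integers(0, 256, size=8).astype(float)

def haar_forward(v):
    """Full Haar decomposition via pairwise averages and differences."""
    coeffs = []
    while len(v) > 1:
        avg = (v[0::2] + v[1::2]) / 2.0    # group averages (passed up a level)
        diff = (v[0::2] - v[1::2]) / 2.0   # relative (differential) measurements
        coeffs.append(diff)
        v = avg
    coeffs.append(v)                       # single overall mean
    return coeffs

def haar_inverse(coeffs):
    """Rebuild absolute intensities from differences plus the mean."""
    v = coeffs[-1]
    for diff in reversed(coeffs[:-1]):
        out = np.empty(2 * len(v))
        out[0::2] = v + diff
        out[1::2] = v - diff
        v = out
    return v

coeffs = haar_forward(x)
x_rec = haar_inverse(coeffs)
assert np.allclose(x_rec, x)   # relative measurements + one mean recover the scene
```

Keeping only the largest-magnitude difference coefficients gives the compressive behavior the abstract alludes to: most of the circuitry measures relative quantities, and absolute scale enters through a single reference.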