skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Pushing Point Cloud Compression to the Edge
As Point Clouds (PCs) gain popularity in processing millions of data points for 3D rendering in many applications, efficient data compression becomes a critical issue. This is because compression is the primary bottleneck in minimizing the latency and energy consumption of existing PC pipelines. Data compression becomes even more critical as PC processing is pushed to edge devices with limited compute and power budgets. In this paper, we propose and evaluate two complementary schemes, intra-frame compression and inter-frame compression, to speed up the PC compression, without losing much quality or compression efficiency. Unlike existing techniques that use sequential algorithms, our first design, intra-frame compression, exploits parallelism for boosting the performance of both geometry and attribute compression. The proposed parallelism brings around 43.7× performance improvement and 96.6% energy savings at a cost of 1.01× larger compressed data size. To further improve the compression efficiency, our second scheme, inter-frame compression, considers the temporal similarity among the video frames and reuses the attribute data from the previous frame for the current frame. We implement our designs on an NVIDIA Jetson AGX Xavier edge GPU board. Experimental results with six videos show that the combined compression schemes provide 34.0× speedup compared to a state-of-the-art scheme, with minimal impact on quality and compression ratio.  more » « less
Award ID(s):
1763681
PAR ID:
10440765
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
International Symposium on Microarchitecture (MICRO) 2022
Page Range / eLocation ID:
282 to 299
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We present a first-of-its-kind ultra-compact intelligent camera system, dubbed i-FlatCam, including a lensless camera with a computational (Comp.) chip. It highlights (1) a predict-then-focus eye tracking pipeline for boosted efficiency without compromising the accuracy, (2) a unified compression scheme for single-chip processing and improved frame rate per second (FPS), and (3) dedicated intra-channel reuse design for depth-wise convolutional layers (DW-CONV) to increase utilization. i-FlatCam demonstrates the first eye tracking pipeline with a lensless camera and achieves 3.16 degrees of accuracy, 253 FPS, 91.49 µJ/Frame, and 6.7mm×8.9mm×1.2mm camera form factor, paving the way for next-generation Augmented Reality (AR) and Virtual Reality (VR) devices. 
    more » « less
  2. Deconvolution is a key component in contemporary neural networks, especially generative adversarial networks (GANs) and fully convolutional networks (FCNs). Due to extra operations of deconvolution compared to convolution, considerable degradation of performance as well as energy efficiency is incurred when implementing deconvolution on the existing resistive random access memory (ReRAM)-based processing-in-memory (PIM) accelerators. In this work, we propose a ReRAM-based accelerator design, RED, for providing high-performance and low-energy deconvolution. We analyze the deconvolution execution on the existing ReRAM-based PIMs and utilize its interior computation pattern for design optimization. RED includes two major contributions: pixel-wise mapping scheme and zero-skipping data flow. Pixel-wise mapping scheme removes the zero insertion and performs convolutions over several ReRAM arrays and thus enables parallel computations with non-zero inputs. Zero-skipping data flow, assisted with customized input buffers design, enhances the computation parallelism and input data reuse. In evaluation, we compare RED against the existing ReRAM-based PIMs and CMOS-based counterpart with a variety of GAN and FCN models, each of which contains multiple deconvolution layers. The experimental results show that RED achieves a 4.0×-56.16× speedup and a 1.05×-18.17× energy efficiency improvement over previous related accelerator designs. 
    more » « less
  3. In point cloud compression, exploiting temporal redundancy for inter predictive coding is challenging because of the irregular geometry. This paper proposes an efficient block-based inter-coding scheme for color attribute compression. The scheme includes integer-precision motion estimation and an adaptive graph based in-loop filtering scheme for improved attribute prediction. The proposed block-based motion estimation scheme consists of an initial motion search that exploits geometric and color attributes, followed by a motion refinement that only minimizes color prediction error. To further improve color prediction, we propose a vertex-domain low-pass graph filtering scheme that can adaptively remove noise from predictors computed from motion estimation with different accuracy. Our experiments demonstrate significant coding gain over state-of-the-art coding methods. 
    more » « less
  4. In order to meet the performance/privacy require- ments of future data-intensive mobile applications, e.g., self- driving cars, mobile data analytics, and AR/VR, service providers are expected to draw on shared storage/computation/connectivity resources at the network “edge”. To be cost-effective, a key functional requirement for such infrastructure is enabling the shar- ing of heterogeneous resources amongst tenants/service providers supporting spatially varying and dynamic user demands. This paper proposes a resource allocation criterion, namely, Share Constrained Slicing (SCS), for slices allocated predefined shares of the network’s resources, which extends traditional α−fairness criterion, by striking a balance among inter- and intra-slice fairness vs. overall efficiency. We show that SCS has several desirable properties including slice-level protection, envyfreeness, and load- driven elasticity. In practice, mobile users’ dynamics could make the cost of implementing SCS high, so we discuss the feasibility of using a simpler (dynamically) weighted max-min as a surrogate resource allocation scheme. For a setting with stochastic loads and elastic user requirements, we establish a sufficient condition for the stability of the associated coupled network system. Finally, and perhaps surprisingly, we show via extensive simulations that while SCS (and/or the surrogate weighted max-min allocation) provides inter-slice protection, they can achieve improved job delay and/or perceived throughput, as compared to other weighted max- min based allocation schemes whose intra-slice weight allocation is not share-constrained, e.g., traditional max-min or discriminatory processor sharing. 
    more » « less
  5. Abstract. Phase correlation (PC) is a well-known method for estimating cloud motion vectors (CMVs) from infrared and visible spectrum images. Commonly, phase shift is computed in the small blocks of the images using the fast Fourier transform. In this study, we investigate the performance and the stability of the blockwise PC method by changing the block size, the frame interval, and combinations of red, green, and blue (RGB) channels from the total sky imager (TSI) at the United States Atmospheric Radiation Measurement user facility's Southern Great Plains site. We find that shorter frame intervals, followed by larger block sizes, are responsible for stable estimates of the CMV, as suggested by the higher autocorrelations. The choice of RGB channels has a limited effect on the quality of CMVs, and the red and the grayscale images are marginally more reliable than the other combinations during rapidly evolving low-level clouds. The stability of CMVs was tested at different image resolutions with an implementation of the optimized algorithm on the Sage cyberinfrastructure test bed. We find that doubling the frame rate outperforms quadrupling the image resolution in achieving CMV stability. The correlations of CMVs with the wind data are significant in the range of 0.38–0.59 with a 95 % confidence interval, despite the uncertainties and limitations of both datasets. A comparison of the PC method with constructed data and the optical flow method suggests that the post-processing of the vector field has a significant effect on the quality of the CMV. The raindrop-contaminated images can be identified by the rotation of the TSI mirror in the motion field. The results of this study are critical to optimizing algorithms for edge-computing sensor systems. 
    more » « less