Title: Joint neural phase retrieval and compression for energy- and computation-efficient holography on the edge
Recent deep learning approaches have shown remarkable promise for enabling high-fidelity holographic displays. However, lightweight wearable display devices cannot afford the computation and energy that hologram generation demands, because of their limited onboard compute capability and battery life. On the other hand, if the computation is conducted entirely remotely on a cloud server, transmitting lossless hologram data is not only challenging but also results in prohibitively high latency and storage. In this work, by distributing the computation and optimizing the transmission, we propose the first framework that jointly generates and compresses high-quality phase-only holograms. Specifically, our framework asymmetrically separates the hologram generation process into a high-compute remote encoding stage (on the server) and a low-compute decoding stage (on the edge). Our encoding produces lightweight latent-space data, enabling faster and more efficient transmission to the edge device. With our framework, we observed a 76% reduction in computation and consequently an 83% reduction in energy cost on edge devices, compared to existing hologram generation methods. Our framework is robust to transmission and decoding errors, approaches high image fidelity at as low as 2 bits per pixel, and further reduces average bit rates and decoding time for holographic videos.
Award ID(s):
2107454
NSF-PAR ID:
10465404
Journal Name:
ACM Transactions on Graphics
Volume:
41
Issue:
4
ISSN:
0730-0301
Page Range / eLocation ID:
1 to 16
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Dual-connectivity streaming is a key enabler of next-generation six Degrees Of Freedom (6DOF) Virtual Reality (VR) scene immersion. Indeed, conventional sub-6 GHz WiFi can reliably stream only a low-quality baseline representation of the VR content, while emerging high-frequency communication technologies can stream in parallel a high-quality, user viewport-specific enhancement representation that synergistically integrates with the baseline representation to deliver high-quality VR immersion. As part of an entire future VR streaming system, we holistically investigate two such candidate emerging technologies, Free Space Optics (FSO) and millimeter-Wave (mmWave), which benefit from a large available spectrum to deliver unprecedented data rates. We analytically characterize the key components of the envisioned dual-connectivity 6DOF VR streaming system, which additionally integrates edge computing and scalable 360° video tiling, and we formulate an optimization problem to maximize the immersion fidelity delivered by the system, given the WiFi and mmWave/FSO link rates and the computing capabilities of the edge server and the users’ VR headsets. This optimization problem is a mixed-integer program of high complexity, and we formulate a geometric programming framework to compute the optimal solution at low complexity. We carry out simulation experiments to assess the performance of the proposed system using actual 6DOF navigation traces that we collected from multiple mobile VR users. Our results demonstrate that our system considerably advances the traditional state-of-the-art and enables streaming of 8K-120 frames-per-second (fps) 6DOF content at high fidelity.
  2. Holography is a promising avenue for high-quality displays without requiring bulky, complex optical systems. While recent work has demonstrated accurate hologram generation of 2D scenes, high-quality holographic projections of 3D scenes have been out of reach until now. Existing multiplane 3D holography approaches fail to model wavefronts in the presence of partial occlusion, while holographic stereogram methods have to make a fundamental tradeoff between spatial and angular resolution. In addition, existing 3D holographic display methods rely on heuristic encoding of complex amplitude into phase-only pixels, which results in holograms with severe artifacts. Fundamental limitations of the input representation, wavefront modeling, and optimization methods prohibit artifact-free 3D holographic projections in today’s displays. To lift these limitations, we introduce hogel-free holography, which optimizes for true 3D holograms, supporting both depth- and view-dependent effects for the first time. Our approach overcomes the fundamental spatio-angular resolution tradeoff typical of stereogram approaches. Moreover, it avoids heuristic encoding schemes to achieve high image fidelity over a 3D volume. We validate that the proposed method achieves 10 dB PSNR improvement on simulated holographic reconstructions. We also validate our approach on an experimental prototype with accurate parallax and depth focus effects.
  3. With the proliferation of low-cost sensors and the Internet of Things, the rate of producing data far exceeds the compute and storage capabilities of today’s infrastructure. Much of this data takes the form of time series, and in response, there has been increasing interest in the creation of time series archives in the last decade, along with the development and deployment of novel analysis methods to process the data. The general strategy has been to apply a plurality of similarity search mechanisms to various subsets and subsequences of time series data in order to identify repeated patterns and anomalies; however, the computational demands of these approaches render them incompatible with today’s power-constrained embedded CPUs. To address this challenge, we present FA-LAMP, an FPGA-accelerated implementation of the Learned Approximate Matrix Profile (LAMP) algorithm, which predicts the correlation between streaming data sampled in real time and a representative time series dataset used for training. FA-LAMP lends itself as a real-time solution for time series analysis problems such as classification. We present the implementation of FA-LAMP on both edge- and cloud-based prototypes. On the edge devices, FA-LAMP integrates accelerated computation as close as possible to IoT sensors, thereby eliminating the need to transmit and store data in the cloud for posterior analysis. On the cloud-based accelerators, FA-LAMP can execute multiple LAMP models on the same board, allowing simultaneous processing of incoming data from multiple data sources across a network. LAMP employs a Convolutional Neural Network (CNN) for prediction. This work investigates the challenges and limitations of deploying CNNs on FPGAs using the Xilinx Deep Learning Processor Unit (DPU) and the Vitis AI development environment. We expose several technical limitations of the DPU, while providing a mechanism to overcome them by attaching custom IP block accelerators to the architecture.
We evaluate FA-LAMP using a low-cost Xilinx Ultra96-V2 FPGA as well as a cloud-based Xilinx Alveo U280 accelerator card and measure their performance against a prototypical LAMP deployment running on a Raspberry Pi 3, an Edge TPU, a GPU, a desktop CPU, and a server-class CPU. In the edge scenario, the Ultra96-V2 FPGA improved performance and energy consumption compared to the Raspberry Pi; in the cloud scenario, the server CPU and GPU outperformed the Alveo U280 accelerator card, while the desktop CPU achieved comparable performance; however, the Alveo card offered an order of magnitude lower energy consumption compared to the other four platforms. Our implementation is publicly available at https://github.com/aminiok1/lamp-alveo. 
  4. We present a foveated rendering method to accelerate the amplitude-only computer-generated hologram (AO-CGH) calculation in a holographic near-eye 3D display. For a given target image, we compute a high-resolution foveal region and a low-resolution peripheral region with dramatically reduced pixel numbers. Our technique significantly improves the computation speed of the AO-CGH while maintaining the perceived image quality in the fovea. Moreover, to accommodate the eye gaze angle change, we develop an algorithm to laterally shift the foveal image with negligible extra computational cost. Our technique holds great promise in advancing the holographic 3D display in real-time use.
  5. With the advent of 5G, supporting high-quality game streaming applications on edge devices has become a reality. This is evidenced by a recent surge in cloud gaming applications on mobile devices. In contrast to video streaming applications, interactive games require much more compute power to support improved rendering (such as 4K streaming) within the stipulated frames-per-second (FPS) constraints. This in turn consumes more battery power in a power-constrained mobile device. Thus, state-of-the-art gaming applications suffer from lower video quality (QoS) and/or energy efficiency. While there has been a plethora of recent works on optimizing game streaming applications, to our knowledge, there is no study that systematically investigates the design pairs on the end-to-end game streaming pipeline across the cloud, network, and edge devices to understand the individual contributions of the different stages of the pipeline for improving the overall QoS and energy efficiency. In this context, this paper presents a comprehensive performance and power analysis of the entire game streaming pipeline consisting of the server/cloud side, network, and edge. Using a high-end workstation mimicking the cloud end, an open-source platform (Moonlight-GameStreaming) emulating the edge device/mobile platform, and two network settings (WiFi and 5G), we conduct a detailed measurement-based study with seven representative games with different characteristics. We characterize the performance in terms of frame latency, QoS, bitrate, and energy consumption for different stages of the gaming pipeline. Our study shows that the rendering stage and the encoding stage at the cloud end are the bottlenecks to supporting 4K streaming. While 5G is certainly more suitable for supporting enhanced video quality with 4K streaming, it is more expensive in terms of power consumption compared to WiFi.
Further, fluctuations in 5G network quality can lead to huge frame drops, thus affecting QoS, which needs to be addressed by a coordinated design between the edge device and the server. Finally, the network interface and the decoder units in a mobile platform need more energy-efficient design to support high-quality games at a lower cost. These observations should help in designing more cost-effective future cloud gaming platforms.