skip to main content

Search for: All records

Creators/Authors contains: "Heide, Felix"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Tasks across diverse application domains can be posed as large-scale optimization problems, these include graphics, vision, machine learning, imaging, health, scheduling, planning, and energy system forecasting. Independently of the application domain, proximal algorithms have emerged as a formal optimization method that successfully solves a wide array of existing problems, often exploiting problem-specific structures in the optimization. Although model-based formal optimization provides a principled approach to problem modeling with convergence guarantees, at first glance, this seems to be at odds with black-box deep learning methods. A recent line of work shows that, when combined with learning-based ingredients, model-based optimization methods are effective, interpretable, and allow for generalization to a wide spectrum of applications with little or no extra training data. However, experimenting with such hybrid approaches for different tasks by hand requires domain expertise in both proximal optimization and deep learning, which is often error-prone and time-consuming. Moreover, naively unrolling these iterative methods produces lengthy compute graphs, which when differentiated via autograd techniques results in exploding memory consumption, making batch-based training challenging. In this work, we introduce ∇-Prox, a domain-specific modeling language and compiler for large-scale optimization problems using differentiable proximal algorithms. ∇-Prox allows users to specify optimization objective functions of unknowns concisely at a high level, and intelligently compiles the problem into compute and memory-efficient differentiable solvers. One of the core features of ∇-Prox is its full differentiability, which supports hybrid model- and learning-based solvers integrating proximal optimization with neural network pipelines. Example applications of this methodology include learning-based priors and/or sample-dependent inner-loop optimization schedulers, learned with deep equilibrium learning or deep reinforcement learning. With a few lines of code, we show ∇-Prox can generate performant solvers for a range of image optimization problems, including end-to-end computational optics, image deraining, and compressive magnetic resonance imaging. We also demonstrate ∇-Prox can be used in a completely orthogonal application domain of energy system planning, an essential task in the energy crisis and the clean energy transition, where it outperforms state-of-the-art CVXPY and commercial Gurobi solvers. 
    more » « less
    Free, publicly-accessible full text available August 1, 2024
  2. The Visual Turing Test is the ultimate goal to evaluate the realism of holographic displays. Previous studies have focused on addressing challenges such as limited e ́tendue and image quality over a large focal volume, but they have not investigated the effect of pupil sampling on the viewing experience in full 3D holograms. In this work, we tackle this problem with a novel hologram generation algorithm motivated by matching the projection operators of incoherent (Light Field) and coherent (Wigner Function) light transport. To this end, we supervise hologram computation using synthesized photographs, which are rendered on-the-fly using Light Field refocusing from stochastically sampled pupil states during optimization. The proposed method produces holograms with correct parallax and focus cues, which are important for passing the Visual Turing Test. We validate that our approach compares favorably to state-of-the-art CGH algorithms that use Light Field and Focal Stack supervision. Our experiments demonstrate that our algorithm improves the viewing experience when evaluated under a large variety of different pupil states. 
    more » « less
    Free, publicly-accessible full text available July 28, 2024
  3. Holography is a promising avenue for high-quality displays without requiring bulky, complex optical systems. While recent work has demonstrated accurate hologram generation of 2D scenes, high-quality holographic projections of 3D scenes has been out of reach until now. Existing multiplane 3D holography approaches fail to model wavefronts in the presence of partial occlusion while holographic stereogram methods have to make a fundamental tradeoff between spatial and angular resolution. In addition, existing 3D holographic display methods rely on heuristic encoding of complex amplitude into phase-only pixels which results in holograms with severe artifacts. Fundamental limitations of the input representation, wavefront modeling, and optimization methods prohibit artifact-free 3D holographic projections in today’s displays. To lift these limitations, we introduce hogel-free holography which optimizes for true 3D holograms, supporting both depth- and view-dependent effects for the first time. Our approach overcomes the fundamental spatio-angular resolution tradeoff typical to stereogram approaches. Moreover, it avoids heuristic encoding schemes to achieve high image fidelity over a 3D volume. We validate that the proposed method achieves 10 dB PSNR improvement on simulated holographic reconstructions. We also validate our approach on an experimental prototype with accurate parallax and depth focus effects. 
    more » « less
  4. Holographic displays promise to deliver unprecedented display capabilities in augmented reality applications, featuring a wide field of view, wide color gamut, spatial resolution, and depth cues all in a compact form factor. While emerging holographic display approaches have been successful in achieving large étendue and high image quality as seen by a camera, the large étendue also reveals a problem that makes existing displays impractical: the sampling of the holographic field by the eye pupil. Existing methods have not investigated this issue due to the lack of displays with large enough étendue, and, as such, they suffer from severe artifacts with varying eye pupil size and location. We show that the holographic field as sampled by the eye pupil is highly varying for existing display setups, and we propose pupil-aware holography that maximizes the perceptual image quality irrespective of the size, location, and orientation of the eye pupil in a near-eye holographic display. We validate the proposed approach both in simulations and on a prototype holographic display and show that our method eliminates severe artifacts and significantly outperforms existing approaches. 
    more » « less
  5. null (Ed.)
  6. Abstract

    Nano-optic imagers that modulate light at sub-wavelength scales could enable new applications in diverse domains ranging from robotics to medicine. Although metasurface optics offer a path to such ultra-small imagers, existing methods have achieved image quality far worse than bulky refractive alternatives, fundamentally limited by aberrations at large apertures and low f-numbers. In this work, we close this performance gap by introducing a neural nano-optics imager. We devise a fully differentiable learning framework that learns a metasurface physical structure in conjunction with a neural feature-based image reconstruction algorithm. Experimentally validating the proposed method, we achieve an order of magnitude lower reconstruction error than existing approaches. As such, we present a high-quality, nano-optic imager that combines the widest field-of-view for full-color metasurface operation while simultaneously achieving the largest demonstrated aperture of 0.5 mm at an f-number of 2.

    more » « less
  7. null (Ed.)
  8. null (Ed.)
  9. null (Ed.)
    Most modern commodity imaging systems we use directly for photography—or indirectly rely on for downstream applications—employ optical systems of multiple lenses that must balance deviations from perfect optics, manufacturing constraints, tolerances, cost, and footprint. Although optical designs often have complex interactions with downstream image processing or analysis tasks, today’s compound optics are designed in isolation from these interactions. Existing optical design tools aim to minimize optical aberrations, such as deviations from Gauss’ linear model of optics, instead of application-specific losses, precluding joint optimization with hardware image signal processing (ISP) and highly parameterized neural network processing. In this article, we propose an optimization method for compound optics that lifts these limitations. We optimize entire lens systems jointly with hardware and software image processing pipelines, downstream neural network processing, and application-specific end-to-end losses. To this end, we propose a learned, differentiable forward model for compound optics and an alternating proximal optimization method that handles function compositions with highly varying parameter dimensions for optics, hardware ISP, and neural nets. Our method integrates seamlessly atop existing optical design tools, such as Zemax . We can thus assess our method across many camera system designs and end-to-end applications. We validate our approach in an automotive camera optics setting—together with hardware ISP post processing and detection—outperforming classical optics designs for automotive object detection and traffic light state detection. For human viewing tasks, we optimize optics and processing pipelines for dynamic outdoor scenarios and dynamic low-light imaging. We outperform existing compartmentalized design or fine-tuning methods qualitatively and quantitatively, across all domain-specific applications tested. 
    more » « less