

Search for: All records

Creators/Authors contains: "Boominathan, Vivek"


  1. Traditional miniaturized fluorescence microscopes are critical tools for modern biology. Invariably, they struggle to simultaneously image with a high spatial resolution and a large field of view (FOV). Lensless microscopes offer a solution to this limitation. However, real-time visualization of samples is not possible with lensless imaging, as image reconstruction can take minutes to complete. This poses a challenge for usability, as real-time visualization is a crucial feature that assists users in identifying and locating the imaging target. The issue is particularly pronounced in lensless microscopes that operate at close imaging distances. Imaging at close distances requires shift-varying deconvolution to account for the variation of the point spread function (PSF) across the FOV. Here, we present a lensless microscope that achieves real-time image reconstruction by eliminating the use of an iterative reconstruction algorithm. The neural network-based reconstruction method we present here achieves a more than 10,000-fold increase in reconstruction speed compared to iterative reconstruction. The increased reconstruction speed allows us to visualize the results of our lensless microscope at more than 25 frames per second (fps), while achieving better than 7 µm resolution over a FOV of 10 mm². This ability to reconstruct and visualize samples in real time enables a more user-friendly interaction with lensless microscopes: users can operate them much as they currently do conventional microscopes.
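
A minimal sketch of the imaging model involved here, assuming the common practice of approximating a shift-varying lensless system as a weighted sum of convolutions with locally calibrated PSFs (illustrative names; this is not the paper's code). Iterative reconstruction repeatedly inverts such a model per frame, whereas a trained feed-forward network amortizes that cost into a single inference pass, which is what makes >25 fps visualization feasible.

```python
# Illustrative sketch only: shift-varying forward model approximated as a
# weighted sum of convolutions with PSFs calibrated at different field points.
import numpy as np
from scipy.signal import fftconvolve

def shift_varying_forward(scene, local_psfs, blend_masks):
    """scene: (H, W); local_psfs: list of small PSF arrays; blend_masks:
    list of (H, W) weights that partition (or smoothly blend) the FOV."""
    meas = np.zeros_like(scene)
    for psf, w in zip(local_psfs, blend_masks):
        meas += fftconvolve(scene * w, psf, mode="same")
    return meas

# Iterative reconstruction inverts this model over many gradient steps per
# frame; a trained feed-forward network replaces that loop with one pass.
scene = np.random.rand(256, 256)
psfs = [np.random.rand(15, 15) for _ in range(4)]
masks = [np.ones((256, 256)) / 4 for _ in range(4)]
measurement = shift_varying_forward(scene, psfs, masks)
```
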

     
  2. Foveated imaging provides a better tradeoff between situational awareness (field of view) and resolution, and is critical in long wavelength infrared regimes because of the size, weight, power, and cost of thermal sensors. We demonstrate computational foveated imaging by exploiting the ability of a meta-optical frontend to discriminate between different polarization states and a computational backend to reconstruct the captured image/video. The frontend is a three-element optic: the first element, which we call the “foveal” element, is a metalens that focuses s-polarized light at a distance of f1 without affecting the p-polarized light; the second element, which we call the “perifovea” element, is another metalens that focuses p-polarized light at a distance of f2 without affecting the s-polarized light. The third element is a freely rotating polarizer that dynamically changes the mixing ratios between the two polarization states. Both the foveal element (focal length = 150 mm; diameter = 75 mm) and the perifoveal element (focal length = 25 mm; diameter = 25 mm) were fabricated as polarization-sensitive, all-silicon metasurfaces, resulting in a large-aperture, 1:6 foveal expansion, thermal imaging capability. A computational backend then utilizes a deep image prior to separate the resultant multiplexed image or video into a foveated image consisting of a high-resolution center and a lower-resolution large field-of-view context. We build a prototype system and demonstrate 12 frames per second real-time, thermal, foveated image and video capture.
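
The image formation in this polarization-multiplexed design can be summarized with Malus's law: the rotating polarizer sets the mixing ratio between the two polarization-dependent focal states. A toy model is sketched below (illustrative variable names, not the authors' code); the deep-image-prior backend then has to undo exactly this mixing to recover the foveated image.

```python
# Toy sensor model (illustrative): the rotating polarizer transmits
# cos^2(theta) of the s-polarized (foveal) path and sin^2(theta) of the
# p-polarized (perifoveal) path, so the sensor sees a convex mixture.
import numpy as np

def multiplexed_capture(foveal_img, perifoveal_img, polarizer_angle_rad):
    alpha = np.cos(polarizer_angle_rad) ** 2      # weight of the foveal path
    return alpha * foveal_img + (1.0 - alpha) * perifoveal_img
```
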

     
  3. Diffraction-limited optical imaging through scattering media has the potential to transform many applications such as airborne and space-based imaging (through the atmosphere), bioimaging (through skin and human tissue), and fiber-based imaging (through fiber bundles). Existing wavefront shaping methods can image through scattering media and other obscurants by optically correcting wavefront aberrations using high-resolution spatial light modulators, but these methods generally require (i) guidestars, (ii) controlled illumination, (iii) point scanning, and/or (iv) static scenes and aberrations. We propose neural wavefront shaping (NeuWS), a scanning-free wavefront shaping technique that integrates maximum likelihood estimation, measurement modulation, and neural signal representations to reconstruct diffraction-limited images through strong static and dynamic scattering media without guidestars, sparse targets, controlled illumination, or specialized image sensors. We experimentally demonstrate guidestar-free, wide field-of-view, high-resolution, diffraction-limited imaging of extended, nonsparse, and static/dynamic scenes captured through static/dynamic aberrations.
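
A heavily simplified sketch of the joint-estimation idea behind this kind of approach (illustrative only, not the NeuWS implementation): the scene and an unknown aberration phase are treated as optimization variables and fit by gradient descent to measurements taken under known phase modulations. NeuWS itself uses a neural signal representation and a maximum likelihood formulation; here both are reduced to plain pixel grids and a least-squares loss for brevity.

```python
import torch

# Toy forward model: PSF_k = |IFFT(exp(i*(phi_ab + phi_mod_k)))|^2,
# measurement_k = scene (*) PSF_k (circular convolution via FFT).
def render(scene, phi_ab, phi_mod):
    pupil = torch.exp(1j * (phi_ab + phi_mod))
    psf = torch.fft.ifft2(pupil).abs() ** 2
    psf = psf / psf.sum()
    return torch.fft.ifft2(torch.fft.fft2(scene) * torch.fft.fft2(psf)).real

N = 64
true_scene = torch.rand(N, N)
true_ab = 2 * torch.pi * torch.rand(N, N)                    # unknown aberration
mods = [2 * torch.pi * torch.rand(N, N) for _ in range(8)]   # known modulations
meas = [render(true_scene, true_ab, m) for m in mods]

scene = torch.zeros(N, N, requires_grad=True)    # NeuWS would use a neural field
phi_ab = torch.zeros(N, N, requires_grad=True)
opt = torch.optim.Adam([scene, phi_ab], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = sum(((render(scene, phi_ab, m) - y) ** 2).mean()
               for m, y in zip(mods, meas))
    loss.backward()
    opt.step()
```
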

     
  4. Lensless cameras are ultra-thin imaging systems that replace the lens with a thin passive optical mask and computation. Passive mask-based lensless cameras encode depth information in their measurements for a certain depth range. Early works have shown that this encoded depth can be used to perform 3D reconstruction of close-range scenes. However, these approaches to 3D reconstruction are typically optimization-based and require strong hand-crafted priors and hundreds of iterations to reconstruct. Moreover, the reconstructions suffer from low resolution, noise, and artifacts. In this work, we propose FlatNet3D, a feed-forward deep network that can estimate both depth and intensity from a single lensless capture. FlatNet3D is an end-to-end trainable deep network that directly reconstructs depth and intensity from a lensless measurement using an efficient physics-based 3D mapping stage and a fully convolutional network. Our algorithm is fast and produces high-quality results, which we validate using both simulated and real scenes captured using PhlatCam.
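
A rough sketch of what a physics-based 3D mapping stage can look like (an assumption-laden illustration, not FlatNet3D itself): the single lensless measurement is mapped against the PSF calibrated at each candidate depth, producing a depth-indexed stack that a downstream convolutional network could fuse into intensity and depth maps.

```python
# Illustrative physics-based mapping stage (not FlatNet3D itself).
import numpy as np
from scipy.signal import fftconvolve

def adjoint_map(measurement, psf):
    # Correlation with the flipped PSF = adjoint of the convolutional
    # forward model; cheap, and differentiable if ported to a DL framework.
    return fftconvolve(measurement, psf[::-1, ::-1], mode="same")

def physics_stage(measurement, psfs_by_depth):
    # Depth-indexed feature stack for a fully convolutional network to
    # turn into intensity and depth estimates.
    return np.stack([adjoint_map(measurement, p) for p in psfs_by_depth])
```
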

     
  5. Lensless imaging provides opportunities to design imaging systems free from the constraints imposed by traditional camera architectures. Due to advances in imaging hardware, fabrication techniques, and new algorithms, researchers have recently developed lensless imaging systems that are extremely compact and lightweight or able to image higher-dimensional quantities. Here we review these recent advances and describe the design principles and their effects that one should consider when developing and using lensless imaging systems.

     
  6. We present a first-of-its-kind ultra-compact intelligent camera system, dubbed i-FlatCam, comprising a lensless camera with a computational (Comp.) chip. It highlights (1) a predict-then-focus eye tracking pipeline for boosted efficiency without compromising accuracy, (2) a unified compression scheme for single-chip processing and an improved frame rate (FPS), and (3) a dedicated intra-channel reuse design for depth-wise convolutional layers (DW-CONV) to increase utilization. i-FlatCam demonstrates the first eye tracking pipeline with a lensless camera and achieves 3.16 degrees of accuracy, 253 FPS, 91.49 µJ/frame, and a 6.7 mm × 8.9 mm × 1.2 mm camera form factor, paving the way for next-generation Augmented Reality (AR) and Virtual Reality (VR) devices.
  7. Conventional continuous-wave amplitude-modulated time-of-flight (CWAM ToF) cameras suffer from a fundamental trade-off between light throughput and depth of field (DoF): a larger lens aperture allows more light collection but suffers from significantly lower DoF. However, both high light throughput, which increases signal-to-noise ratio, and a wide DoF, which enlarges the system’s applicable depth range, are valuable for CWAM ToF applications. In this work, we propose EDoF-ToF, an algorithmic method to extend the DoF of large-aperture CWAM ToF cameras by using a neural network to deblur objects outside of the lens’s narrow focal region and thus produce an all-in-focus measurement. A key component of our work is the proposed large-aperture ToF training data simulator, which models the depth-dependent blurs and partial occlusions caused by such apertures. Contrary to conventional image deblurring where the blur model is typically linear, ToF depth maps are nonlinear functions of scene intensities, resulting in a nonlinear blur model that we also derive for our simulator. Unlike extended DoF for conventional photography where depth information needs to be encoded (or made depth-invariant) using additional hardware (phase masks, focal sweeping, etc.), ToF sensor measurements naturally encode depth information, allowing a completely software solution to extended DoF. We experimentally demonstrate EDoF-ToF increasing the DoF of a conventional ToF system by 3.6×, effectively achieving the DoF of a smaller lens aperture that allows 22.1× less light. Ultimately, EDoF-ToF enables CWAM ToF cameras to enjoy the benefits of both high light throughput and a wide DoF.
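
The nonlinearity mentioned above can be made concrete with the standard four-bucket CWAM ToF model (a generic textbook formulation, not the paper's simulator): depth comes from the arctangent of correlation samples, so blurring the raw samples and then computing depth is not the same as blurring the depth map.

```python
# Generic four-phase CWAM ToF model (illustrative values and one common
# sign convention; not the paper's simulator).
import numpy as np
from scipy.ndimage import uniform_filter

C = 3e8          # speed of light, m/s
F_MOD = 30e6     # modulation frequency, Hz (illustrative value)

def depth_from_buckets(a0, a90, a180, a270):
    # Correlation samples at 0/90/180/270 degrees -> phase -> depth.
    phase = np.mod(np.arctan2(a90 - a270, a0 - a180), 2 * np.pi)
    return C * phase / (4 * np.pi * F_MOD)

def defocused_depth(a_buckets, blur_size=5):
    # Defocus blur mixes the *raw* correlation samples of neighboring pixels,
    # so depth(blur(samples)) != blur(depth(samples)): the effective blur
    # model on the depth map is nonlinear.
    return depth_from_buckets(*(uniform_filter(a, blur_size) for a in a_buckets))
```
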

     
  8. Eye tracking has become an essential human-machine interaction modality for providing immersive experiences in numerous virtual and augmented reality (VR/AR) applications desiring high throughput (e.g., 240 FPS), a small form factor, and enhanced visual privacy. However, existing eye tracking systems are still limited by: (1) a large form factor, largely due to the adopted bulky lens-based cameras; (2) the high communication cost required between the camera and backend processor; and (3) potential visual privacy concerns, prohibiting their more extensive application. To this end, we propose, develop, and validate a lensless FlatCam-based eye tracking algorithm and accelerator co-design framework dubbed EyeCoD to enable eye tracking systems with a much-reduced form factor and boosted system efficiency without sacrificing tracking accuracy, paving the way for next-generation eye tracking solutions. On the system level, we advocate the use of lensless FlatCams instead of lens-based cameras to facilitate the small form factor needed in mobile eye tracking systems, which also leaves room for a dedicated sensing-processor co-design to reduce the required camera-processor communication latency. On the algorithm level, EyeCoD integrates a predict-then-focus pipeline that first predicts the region-of-interest (ROI) via segmentation and then focuses only on the ROI to estimate gaze directions, greatly reducing redundant computations and data movements. On the hardware level, we further develop a dedicated accelerator that (1) integrates a novel workload orchestration between the aforementioned segmentation and gaze estimation models, (2) leverages intra-channel reuse opportunities for depth-wise layers, (3) utilizes input feature-wise partition to save activation memory size, and (4) develops a sequential-write-parallel-read input buffer to alleviate the bandwidth requirement for the activation global buffer. On-silicon measurement and extensive experiments validate that EyeCoD consistently reduces both the communication and computation costs, leading to an overall system speedup of 10.95×, 3.21×, and 12.85× over general computing platforms (CPUs and GPUs) and a prior-art eye tracking processor called CIS-GEP, respectively, while maintaining tracking accuracy. Codes are available at https://github.com/RICE-EIC/EyeCoD.
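
The predict-then-focus idea can be sketched in a few lines (placeholder modules; EyeCoD's actual segmentation and gaze networks, and its FlatCam front end, are not reproduced here): a cheap first stage localizes the region of interest so the heavier gaze estimator only touches a small crop, which is where the computation and data-movement savings come from.

```python
import torch
import torch.nn as nn

class PredictThenFocus(nn.Module):
    """Stage 1 predicts a coarse ROI; stage 2 runs only on the cropped ROI."""
    def __init__(self, roi_size=64):
        super().__init__()
        self.roi_size = roi_size
        self.segmenter = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                                       nn.Conv2d(8, 1, 3, padding=1))
        self.gaze_head = nn.Sequential(nn.Flatten(),
                                       nn.Linear(roi_size * roi_size, 2))

    def forward(self, x):                           # x: (B, 1, H, W), H, W >= roi_size
        heat = self.segmenter(x)                    # coarse eye-region heatmap
        B, _, H, W = heat.shape
        flat_idx = heat.view(B, -1).argmax(dim=1)
        cy = torch.div(flat_idx, W, rounding_mode="floor")
        cx = flat_idx % W
        s = self.roi_size
        crops = []
        for b in range(B):                          # crop an ROI around each peak
            top = int(cy[b].clamp(0, H - s))
            left = int(cx[b].clamp(0, W - s))
            crops.append(x[b:b+1, :, top:top+s, left:left+s])
        return self.gaze_head(torch.cat(crops, 0))  # (B, 2) gaze angles
```
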
  9. There has been a booming demand for integrating Convolutional Neural Network (CNN)-powered functionalities into Internet-of-Things (IoT) devices to enable ubiquitous intelligent "IoT cameras". However, more extensive applications of such IoT systems are still limited by two challenges. First, some applications, especially medicine- and wearable-related ones, impose stringent requirements on the camera form factor. Second, powerful CNNs often require considerable storage and energy cost, whereas IoT devices often suffer from limited resources. PhlatCam, with its form factor potentially reduced by orders of magnitude, has emerged as a promising solution to the first challenge, while the second one remains a bottleneck. Existing compression techniques, which can potentially tackle the second challenge, are far from realizing the full potential in storage and energy reduction because they mostly focus on the CNN algorithm itself. To this end, this work proposes SACoD, a Sensor Algorithm Co-Design framework to develop more efficient CNN-powered PhlatCam systems. In particular, the mask coded in the PhlatCam sensor and the backend CNN model are jointly optimized in terms of both model parameters and architectures via differentiable neural architecture search. Extensive experiments, including both simulation and physical measurement on manufactured masks, show that the proposed SACoD framework achieves aggressive model compression and energy savings while maintaining or even boosting task accuracy, when benchmarking over two state-of-the-art (SOTA) designs with six datasets across four different vision tasks including classification, segmentation, image translation, and face recognition. Our codes are available at: https://github.com/RICE-EIC/SACoD.
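
The sensor–algorithm co-design can be illustrated with a minimal joint-optimization sketch (illustrative only; SACoD additionally searches the backend architecture with differentiable NAS, which is omitted here): the mask is modeled as a trainable PSF whose convolution with the scene simulates the PhlatCam measurement, and a single optimizer updates both the mask and the backend CNN.

```python
# Minimal joint sensor + algorithm optimization sketch (not SACoD's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoDesignedCamera(nn.Module):
    def __init__(self, psf_size=11, num_classes=10):
        super().__init__()
        # Trainable stand-in for the mask's point spread function (sensor side).
        self.psf = nn.Parameter(torch.rand(1, 1, psf_size, psf_size))
        # Small backend CNN (algorithm side).
        self.backend = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, num_classes))

    def forward(self, scene):                          # scene: (B, 1, H, W)
        psf = torch.sigmoid(self.psf)                  # keep mask response bounded
        meas = F.conv2d(scene, psf, padding=psf.shape[-1] // 2)  # simulated capture
        return self.backend(meas)

model = CoDesignedCamera()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)  # updates mask and CNN together
```
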
  10. The simple and compact optics of lensless microscopes and the associated computational algorithms allow for large fields of view and the refocusing of the captured images. However, existing lensless techniques cannot accurately reconstruct the typical low-contrast images of optically dense biological tissue. Here we show that lensless imaging of tissue in vivo can be achieved via an optical phase mask designed to create a point spread function consisting of high-contrast contours with a broad spectrum of spatial frequencies. We built a prototype lensless microscope incorporating the ‘contour’ phase mask and used it to image calcium dynamics in the cortex of live mice (over a field of view of about 16 mm²) and in freely moving Hydra vulgaris, as well as microvasculature in the oral mucosa of volunteers. The low cost, small form factor and computational refocusing capability of in vivo lensless microscopy may open it up to clinical uses, especially for imaging difficult-to-reach areas of the body.
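
Computational refocusing in mask-based lensless imaging is commonly done by deconvolving the same capture with the PSF calibrated at different depths; a minimal sketch under that assumption (not the paper's reconstruction code) is below.

```python
# Illustrative post-capture refocusing for a mask-based lensless microscope.
import numpy as np

def refocus(measurement, psf, reg=1e-2):
    H = np.fft.fft2(psf, s=measurement.shape)
    filt = np.conj(H) / (np.abs(H) ** 2 + reg)         # Wiener-style inverse filter
    return np.real(np.fft.ifft2(np.fft.fft2(measurement) * filt))

def autofocus(measurement, psfs_by_depth):
    stack = [refocus(measurement, p) for p in psfs_by_depth]
    sharpness = [np.var(s) for s in stack]             # simple focus metric
    return stack[int(np.argmax(sharpness))]            # keep the sharpest plane
```
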