We introduce the first end-to-end learning-based solution to near-field Photometric Stereo (PS), where the light sources are close to the object of interest. This setup is especially useful for reconstructing large immobile objects. Our method is fast, producing a mesh from 52 images at 512x384 resolution in about 1 second on a commodity GPU, thus potentially unlocking several AR/VR applications. Existing approaches rely on optimization coupled with a far-field PS network operating on pixels or small patches. Using optimization makes these approaches slow and memory intensive (requiring 17 GB of GPU and 27 GB of CPU memory), while operating only on pixels or patches makes them highly susceptible to noise and calibration errors. To address these issues, we develop a recursive multi-resolution scheme that estimates surface normal and depth maps of the whole image at each step. The predicted depth map at each scale is then used to estimate 'per-pixel lighting' for the next scale. This design makes our approach almost 45x faster and 2 degrees more accurate (11.3 vs. 13.3 degrees Mean Angular Error) than the state-of-the-art near-field PS reconstruction technique, which relies on iterative optimization.
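To make the 'per-pixel lighting' step concrete, here is a minimal sketch in Python/NumPy of one way to compute it: back-project a coarse-scale depth map through the camera intrinsics, then take the direction and inverse-square attenuation toward a calibrated point light. The function name, array shapes, intrinsics, and light position are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (not the paper's code) of per-pixel lighting for
# near-field PS: lighting direction and attenuation from a depth map.
import numpy as np

def per_pixel_lighting(depth, K, light_pos):
    """depth: (H, W) depth map; K: (3, 3) intrinsics; light_pos: (3,) point light."""
    H, W = depth.shape
    # Back-project each pixel to a 3D surface point: X = depth * K^{-1} [u, v, 1]^T
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    rays = np.linalg.inv(K) @ np.stack([u, v, np.ones_like(u)], 0).reshape(3, -1)
    points = (rays * depth.reshape(1, -1)).T                 # (H*W, 3)
    # Vector from each surface point to the near-field light source
    to_light = light_pos[None, :] - points                   # (H*W, 3)
    dist = np.linalg.norm(to_light, axis=1, keepdims=True)
    direction = to_light / np.clip(dist, 1e-8, None)         # unit lighting direction
    attenuation = 1.0 / np.clip(dist ** 2, 1e-8, None)       # inverse-square falloff
    return direction.reshape(H, W, 3), attenuation.reshape(H, W)

# Example: a coarse-scale depth estimate feeding the next scale's lighting.
depth = np.full((384, 512), 2.0)                             # toy flat depth of 2 m
K = np.array([[500.0, 0, 256], [0, 500.0, 192], [0, 0, 1]])  # assumed intrinsics
dirs, att = per_pixel_lighting(depth, K, np.array([0.1, 0.0, 0.0]))
print(dirs.shape, att.shape)                                 # (384, 512, 3) (384, 512)
```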
High-Resolution Stereo Matching based on Sampled Photoconsistency Computation
We propose an approach to binocular stereo that avoids exhaustive photoconsistency computations at every pixel, since these are redundant and computationally expensive, especially for high-resolution images. We argue that developing scalable stereo algorithms is critical as image resolution is expected to continue increasing rapidly. Our approach relies on oversegmentation of the images into superpixels, followed by photoconsistency computation for only a random subset of the pixels of each superpixel. This generates sparse reconstructed points, which are used to fit planes. Plane hypotheses are propagated among neighboring superpixels and evaluated at each superpixel by selecting a random subset of pixels on which to aggregate photoconsistency scores for the competing planes. We performed extensive tests to characterize the accuracy and speed of this algorithm on the full-resolution stereo pairs of the 2014 Middlebury benchmark, which contains images of up to 6 megapixels. Our results show that very large computational savings can be achieved at a small loss of accuracy. A multi-threaded implementation of our method is faster than other methods that achieve similar accuracy, providing a useful accuracy-speed tradeoff.
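The following is a minimal, self-contained sketch of the sampling idea: score competing plane hypotheses over a random subset of a superpixel's pixels, here with a toy sum-of-absolute-differences cost. The function names, the fronto-parallel plane hypotheses, and the synthetic stereo pair are invented for illustration and are not the paper's implementation.

```python
# A sketch of sampled photoconsistency: evaluate plane hypotheses on a
# small random pixel subset per superpixel instead of every pixel.
import numpy as np

rng = np.random.default_rng(0)

def sad_cost(left, right, u, v, disp):
    """Sampled absolute-difference photoconsistency at integer pixels."""
    ur = np.clip(np.round(u - disp).astype(int), 0, right.shape[1] - 1)
    return np.abs(left[v, u] - right[v, ur])

def score_plane(left, right, pixels, plane, n_samples=32):
    """plane = (a, b, c) induces disparity d(u, v) = a*u + b*v + c."""
    idx = rng.choice(len(pixels), size=min(n_samples, len(pixels)), replace=False)
    u, v = pixels[idx, 0], pixels[idx, 1]
    a, b, c = plane
    return sad_cost(left, right, u, v, a * u + b * v + c).mean()

# Toy stereo pair and one superpixel: pick the best of two propagated planes.
left = rng.random((100, 120)).astype(np.float32)
right = np.roll(left, -3, axis=1)               # ground-truth disparity = 3
pixels = np.array([(u, v) for v in range(20, 40) for u in range(30, 60)])
planes = [(0.0, 0.0, 3.0), (0.0, 0.0, 7.0)]     # fronto-parallel hypotheses
best = min(planes, key=lambda p: score_plane(left, right, pixels, p))
print("selected plane:", best)                   # expect (0.0, 0.0, 3.0)
```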
- Award ID(s): 1637761
- PAR ID: 10041310
- Date Published:
- Journal Name: British Machine Vision Conference
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Hyperspectral cameras collect detailed spectral information at each image pixel, contributing to the identification of image features. The rich spectral content of hyperspectral imagery has led to its application in diverse fields of study. This study focused on cloud classification using a dataset of hyperspectral sky images captured by a Resonon PIKA XC2 camera. The camera records images using 462 spectral bands, ranging from 400 to 1000 nm, with a spectral resolution of 1.9 nm. Our preliminary, unlabeled dataset comprised 33 parent hyperspectral images (HSI), each a substantial image measuring 4402-by-1600 pixels. Drawing on the meteorological expertise within our team, we manually labeled pixels by extracting 10 to 20 sample patches from each parent image, each patch a 50-by-50 pixel field. This process yielded a collection of 444 patches, each labeled with one of seven cloud and sky condition categories. To embed the inherent data structure while classifying individual pixels, we introduced a technique that boosts classification accuracy by incorporating patch-specific information into each pixel's feature vector: the posterior probabilities generated by patch-level classifiers, which capture the unique attributes of each patch, are concatenated with the pixel's original spectral data to form an augmented feature vector. We then applied a final classifier to map the augmented vectors to the seven cloud/sky categories. The results compared favorably to a baseline model without patch-origin embedding, showing that incorporating spatial context along with the spectral information inherent in hyperspectral images enhances classification accuracy in hyperspectral cloud classification. The dataset is available on IEEE DataPort.
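A minimal sketch of the patch-origin embedding described above, on synthetic data: a patch-level classifier's posterior probabilities are concatenated onto each pixel's 462-band spectrum before the final per-pixel classifier. The two-stage logistic-regression setup and all shapes are assumptions for illustration, not the study's code.

```python
# Sketch: augment each pixel's spectrum with its patch's class posteriors.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_patches, pixels_per_patch, n_bands, n_classes = 40, 50, 462, 7

# Synthetic stand-ins: per-pixel spectra grouped into labeled patches.
spectra = rng.random((n_patches * pixels_per_patch, n_bands))
patch_id = np.repeat(np.arange(n_patches), pixels_per_patch)
labels = np.repeat(rng.integers(0, n_classes, n_patches), pixels_per_patch)

# Stage 1: a patch-level classifier on mean spectra yields per-patch posteriors.
patch_features = np.stack([spectra[patch_id == p].mean(0) for p in range(n_patches)])
patch_labels = labels[::pixels_per_patch]
stage1 = LogisticRegression(max_iter=1000).fit(patch_features, patch_labels)
posteriors = stage1.predict_proba(patch_features)        # (n_patches, classes seen)

# Stage 2: concatenate each pixel's spectrum with its patch's posteriors.
augmented = np.hstack([spectra, posteriors[patch_id]])   # (N, 462 + posteriors)
stage2 = LogisticRegression(max_iter=1000).fit(augmented, labels)
print(stage2.score(augmented, labels))
```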
We use neural radiance fields (NeRFs) to build interactive 3D environments from large-scale visual captures spanning buildings or even multiple city blocks collected primarily from drones. In contrast to single object scenes (on which NeRFs are traditionally evaluated), our scale poses multiple challenges including (1) the need to model thousands of images with varying lighting conditions, each of which capture only a small subset of the scene, (2) prohibitively large model capacities that make it infeasible to train on a single GPU, and (3) significant challenges for fast rendering that would enable interactive fly-throughs. To address these challenges, we begin by analyzing visibility statistics for large-scale scenes, motivating a sparse network structure where parameters are specialized to different regions of the scene. We introduce a simple geometric clustering algorithm for data parallelism that partitions training images (or rather pixels) into different NeRF sub-modules that can be trained in parallel. We evaluate our approach on existing datasets (Quad 6k and UrbanScene3D) as well as against our own drone footage, improving training speed by 3x and PSNR by 12%. We also evaluate recent NeRF fast renderers on top of Mega-NeRF and introduce a novel method that exploits temporal coherence. Our technique achieves a 40x speedup over conventional NeRF rendering while remaining within 0.8 dB in PSNR quality, exceeding the fidelity of existing fast renderers.
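As a sketch of the geometric clustering for data parallelism, the snippet below assigns each training ray to the spatial cell whose centroid is nearest a sample point along the ray, producing one data shard per NeRF sub-module. This simplified nearest-centroid rule and all names are assumptions for illustration; the paper's actual partitioning differs in detail.

```python
# Sketch: shard training rays across NeRF sub-modules by scene geometry.
import numpy as np

def assign_rays_to_submodules(origins, directions, centroids, t=10.0):
    """Assign each ray to the centroid nearest its sample point at depth t.
    origins, directions: (N, 3); centroids: (M, 3) sub-module centers."""
    samples = origins + t * directions                       # (N, 3) points along rays
    d2 = ((samples[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)  # (N, M)
    return d2.argmin(1)                                      # sub-module index per ray

rng = np.random.default_rng(0)
origins = rng.uniform(-50, 50, (1000, 3))                    # toy drone camera positions
directions = rng.normal(size=(1000, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
centroids = np.array([[-25.0, 0, 0], [25.0, 0, 0]])          # two spatial cells
shard = assign_rays_to_submodules(origins, directions, centroids)
print(np.bincount(shard))                                    # rays per sub-module shard
```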
High-resolution optical imaging systems are quickly becoming universal tools to characterize and quantify microbial diversity in marine ecosystems. Automated detection systems such as convolutional neural networks (CNN) are often developed to identify the immense number of images collected. The goal of our study was to develop a CNN to classify phytoplankton images collected with an Imaging FlowCytobot for the Palmer Antarctica Long-Term Ecological Research project. A medium complexity CNN was developed using a subset of manually identified images, resulting in an overall accuracy, recall, and f1-score of 93.8%, 93.7%, and 93.7%, respectively. The f1-score dropped to 46.5% when tested on a new random subset of 10,269 images, likely due to highly imbalanced class distributions, high intraclass variance, and interclass morphological similarities of cells in naturally occurring phytoplankton assemblages. Our model was then used to predict taxonomic classifications of phytoplankton at Palmer Station, Antarctica over the 2017-2018 and 2018-2019 summer field seasons. The CNN was generally able to capture important seasonal dynamics such as the shift from large centric diatoms to small pennate diatoms in both seasons, which is thought to be driven by increases in glacial meltwater from January to March. Moving forward, we hope to further increase the accuracy of our model to better characterize coastal phytoplankton communities threatened by rapidly changing environmental conditions.
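The metrics mentioned above can be reproduced in outline with scikit-learn; the toy example below uses synthetic labels to illustrate how a heavily imbalanced test set can keep overall accuracy high while pulling a macro-averaged f1-score down, as observed on the new random subset. The class counts and error model are invented.

```python
# Sketch: accuracy vs. macro f1 on an imbalanced test set (synthetic data).
import numpy as np
from sklearn.metrics import accuracy_score, recall_score, f1_score

rng = np.random.default_rng(0)
# Imbalanced test set: one dominant class, several rare ones.
y_true = rng.choice(5, size=10000, p=[0.80, 0.10, 0.05, 0.03, 0.02])
y_pred = y_true.copy()
# The classifier confuses half of the rare-class samples with the majority class.
rare = np.flatnonzero(y_true > 0)
confused = rng.choice(rare, size=rare.size // 2, replace=False)
y_pred[confused] = 0

print("accuracy:    ", accuracy_score(y_true, y_pred))             # stays high
print("macro recall:", recall_score(y_true, y_pred, average="macro"))
print("macro f1:    ", f1_score(y_true, y_pred, average="macro"))  # drops sharply
```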
Hyperdimensional vector processing is a nascent computing approach that mimics the brain structure and offers lightweight, robust, and efficient hardware solutions for different learning and cognitive tasks. For image recognition and classification, hyperdimensional computing (HDC) utilizes the intensity values of captured images and the positions of image pixels. Traditional HDC systems represent the intensity and positions with binary hypervectors of 1K–10K dimensions. The intensity hypervectors are cross-correlated for closer values and uncorrelated for distant values in the intensity range. The position hypervectors are pseudo-random binary vectors generated iteratively for the best classification performance. In this study, we propose a radically new approach for encoding image data in HDC systems. Position hypervectors are no longer needed by encoding pixel intensities using a deterministic approach based on quasi-random sequences. The proposed approach significantly reduces the number of operations by eliminating the position hypervectors and the multiplication operations in the HDC system. Additionally, we suggest a hybrid technique for generating hypervectors by combining two deterministic sequences, achieving higher classification accuracy. Our experimental results show up to 102× reduction in runtime and significant memory-usage savings with improved accuracy compared to a baseline HDC system with conventional hypervector encoding.
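Below is a minimal sketch, under assumed details, of deterministic intensity encoding: correlated level hypervectors are built by flipping bit positions taken in quasi-random (van der Corput) order, and an image is encoded by majority-bundling its pixels' level vectors, with no position hypervectors and no multiplications. This illustrates the idea, not the paper's exact scheme.

```python
# Sketch: quasi-random, deterministic intensity hypervectors for HDC.
import numpy as np

D, LEVELS = 4096, 256                               # hypervector dimension, intensity levels

def van_der_corput(n, base=2):
    """First n terms of the base-2 van der Corput quasi-random sequence."""
    seq = []
    for i in range(1, n + 1):
        q, denom, x = i, 1.0, 0.0
        while q:
            q, r = divmod(q, base)
            denom *= base
            x += r / denom
        seq.append(x)
    return np.array(seq)

# Deterministic flip order: quasi-random positions spread uniformly over D.
flip_order = np.argsort(van_der_corput(D))
levels = np.zeros((LEVELS, D), dtype=np.int8)       # level 0 = all zeros
per_level = D // LEVELS
for k in range(1, LEVELS):                          # each level flips the next chunk,
    levels[k] = levels[k - 1]                       # so nearby intensities stay correlated
    levels[k, flip_order[(k - 1) * per_level : k * per_level]] ^= 1

def encode_image(img):
    """Majority-bundle the level hypervectors of all pixels (binary output)."""
    votes = levels[img.ravel()].sum(0)
    return (votes > img.size // 2).astype(np.int8)

img = np.arange(64, dtype=np.uint8).reshape(8, 8) * 4       # toy 8x8 gradient image
print(encode_image(img)[:16])
```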