Abstract: We consider semantic image segmentation. Our method is inspired by Bayesian deep learning, which improves image segmentation accuracy by modeling the uncertainty of the network output. Rather than modeling uncertainty, our method directly learns to predict the erroneous pixels of a segmentation network, which we formulate as a binary classification problem. This can speed up training compared to the Monte Carlo integration often used in Bayesian deep learning. It also allows us to train a branch to correct the labels of erroneous pixels. Our method consists of three stages: (i) predict the pixel-wise error probability of the initial result, (ii) redetermine new labels for pixels with high error probability, and (iii) fuse the initial result and the redetermined result with respect to the error probability. An error-prediction branch in the network predicts the pixel-wise error probabilities, and a detail branch focuses the training process on the erroneous pixels. We have experimentally validated our method on the Cityscapes and ADE20K datasets. Our model can be easily added to various advanced segmentation networks to improve their performance. Taking DeepLabv3+ as an example, our network achieves 82.88% mIoU on the Cityscapes testing dataset and 45.73% on the ADE20K validation dataset, improving the corresponding DeepLabv3+ results by 0.74% and 0.13%, respectively.
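As a rough illustration of stage (iii), the sketch below blends two per-pixel class distributions using the predicted error probability as the weight. The abstract does not give the fusion formula, so the convex combination used here, the function name `fuse_predictions`, and the array shapes are all assumptions made for illustration only.

```python
import numpy as np

def fuse_predictions(initial_probs, refined_probs, error_prob):
    """Blend two per-pixel class distributions, weighted by error probability.

    initial_probs, refined_probs: arrays of shape (H, W, C).
    error_prob: array of shape (H, W) with values in [0, 1].
    Assumed rule (not taken from the paper): a convex combination, so pixels
    with high error probability lean on the redetermined prediction.
    """
    w = error_prob[..., np.newaxis]              # broadcast over the class axis
    fused = (1.0 - w) * initial_probs + w * refined_probs
    return fused.argmax(axis=-1)                 # final label per pixel

# Toy 1x2 image with two classes.
initial = np.array([[[0.9, 0.1], [0.6, 0.4]]])
refined = np.array([[[0.2, 0.8], [0.1, 0.9]]])
error = np.array([[0.05, 0.9]])                  # second pixel is likely wrong
labels = fuse_predictions(initial, refined, error)
```

In this toy case the first pixel keeps its initial label because its error probability is low, while the second pixel takes the redetermined label.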
This content will become publicly available on January 1, 2026
Predictive Sampling in Image Sensing for Sparse Image Processing
This paper presents a pixel-level predictive sampling method for image sensing and processing that reduces the computing overhead of power-limited image sensing systems. The method scans through rows and columns to identify the location and value of the critical pixels, which are the turning points in the row and column arrays. Each pixel is predicted from the values of prior pixels and compared against a pre-defined error threshold. When the prediction is successful, the pixel is marked as non-critical and is skipped for recording and processing. Only the critical pixels are selected for further processing. We propose reconstruction methods that recover the raw image from the selected critical pixels using interpolation. The experimental results show that the proposed method can reduce the data throughput by 72% with an error of 1.6% for sparse images. A convolutional neural network model applied with this method can achieve detection accuracy similar to that of a standard method while using only 27.1% of the data size.
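The row-scanning idea can be sketched in a few lines. The abstract does not specify the predictor, so this example assumes a simple linear extrapolation from the two prior pixels and handles a single row only; the function names `select_critical_pixels` and `reconstruct_row` are hypothetical, and reconstruction uses plain linear interpolation via `np.interp`.

```python
import numpy as np

def select_critical_pixels(row, threshold):
    """Return indices and values of the critical pixels in a non-empty row.

    Assumed predictor (not taken from the paper): the next pixel is
    extrapolated linearly from the two prior pixels. If the actual value
    misses the prediction by more than `threshold`, the current pixel is a
    turning point and is kept. The first and last pixels are always kept so
    the row can be interpolated end to end.
    """
    row = np.asarray(row, dtype=float)           # avoid unsigned overflow
    n = row.size
    critical = [0]
    for i in range(1, n - 1):
        predicted_next = row[i] + (row[i] - row[i - 1])
        if abs(row[i + 1] - predicted_next) > threshold:
            critical.append(i)
    if n > 1:
        critical.append(n - 1)
    idx = np.array(critical)
    return idx, row[idx]

def reconstruct_row(indices, values, length):
    """Recover a full row from its critical pixels by linear interpolation."""
    return np.interp(np.arange(length), indices, values)

row = [10, 20, 30, 40, 30, 20, 10, 10, 10]
idx, vals = select_critical_pixels(row, threshold=2)
restored = reconstruct_row(idx, vals, len(row))
```

For this piecewise-linear row, only four of the nine pixels are kept and the interpolated row matches the original. Real images would also need the column pass described in the abstract, and the reconstruction error would depend on the chosen threshold.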
- Award ID(s): 2015573
- PAR ID: 10644651
- Publisher / Repository: IEEE
- Date Published:
- Journal Name: IEEE Sensors Letters
- Volume: 9
- Issue: 1
- ISSN: 2475-1472
- Page Range / eLocation ID: 1 to 4
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Parallel-laser photogrammetry is growing in popularity as a way to collect non-invasive body size data from wild mammals. Despite its many appeals, this method requires researchers to hand-measure (i) the pixel distance between the parallel laser spots (inter-laser distance) to produce a scale within the image, and (ii) the pixel distance between the study subject’s body landmarks (inter-landmark distance). This manual effort is time-consuming and introduces human error: a researcher measuring the same image twice will rarely return the same values both times (resulting in within-observer error), as is also the case when two researchers measure the same image (resulting in between-observer error). Here, we present two independent methods that automate the inter-laser distance measurement of parallel-laser photogrammetry images. One method uses machine learning and image processing techniques in Python, and the other uses image processing techniques in ImageJ. Both of these methods reduce labor and increase precision without sacrificing accuracy. We first introduce the workflow of the two methods. Then, using two parallel-laser datasets of wild mountain gorilla and wild savannah baboon images, we validate the precision of these two automated methods relative to manual measurements and to each other. We also estimate the reduction of variation in final body size estimates in centimeters when adopting these automated methods, as these methods have no human error. Finally, we highlight the strengths of each method, suggest best practices for adopting either of them, and propose future directions for the automation of parallel-laser photogrammetry data.
-
Hemanth, Jude (Ed.)
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causes coronavirus disease 2019 (COVID-19). Imaging tests such as chest X-ray (CXR) and computed tomography (CT) can provide useful information to clinical staff for facilitating a diagnosis of COVID-19 in a more efficient and comprehensive manner. As a breakthrough of artificial intelligence (AI), deep learning has been applied to perform COVID-19 infection region segmentation and disease classification by analyzing CXR and CT data. However, prediction uncertainty of deep learning models for these tasks, which is very important to safety-critical applications like medical image processing, has not been comprehensively investigated. In this work, we propose a novel ensemble deep learning model through integrating bagging deep learning and model calibration to not only enhance segmentation performance, but also reduce prediction uncertainty. The proposed method has been validated on a large dataset that is associated with CXR image segmentation. Experimental results demonstrate that the proposed method can improve the segmentation performance, as well as decrease prediction uncertainty.
-
Besides per-pixel accuracy, topological correctness is also crucial for the segmentation of images with fine-scale structures, e.g., satellite images and biomedical images. In this paper, by leveraging the theory of digital topology, we identify pixels in an image that are critical for topology. By focusing on these critical pixels, we propose a new homotopy warping loss to train deep image segmentation networks for better topological accuracy. To efficiently identify these topologically critical pixels, we propose a new algorithm exploiting the distance transform. The proposed algorithm, as well as the loss function, naturally generalize to different topological structures in both 2D and 3D settings. The proposed loss function helps deep nets achieve better performance in terms of topology-aware metrics, outperforming state-of-the-art structure/topology-aware segmentation methods.
-
We present a parallelized optimization method based on fast Neural Radiance Fields (NeRF) for estimating 6-DoF pose of a camera with respect to an object or scene. Given a single observed RGB image of the target, we can predict the translation and rotation of the camera by minimizing the residual between pixels rendered from a fast NeRF model and pixels in the observed image. We integrate a momentum-based camera extrinsic optimization procedure into Instant Neural Graphics Primitives, a recent exceptionally fast NeRF implementation. By introducing parallel Monte Carlo sampling into the pose estimation task, our method overcomes local minima and improves efficiency in a more extensive search space. We also show the importance of adopting a more robust pixel-based loss function to reduce error. Experiments demonstrate that our method can achieve improved generalization and robustness on both synthetic and real-world benchmarks.
