Removing photobombing elements from images is a challenging task that requires sophisticated image inpainting techniques. Despite the availability of various methods, their effectiveness depends on the complexity of the image and the nature of the distracting element. To address this issue, we conducted a benchmark study to evaluate 10 state-of-the-art photobombing removal methods on a dataset of over 300 images. Our study focused on identifying the most effective image inpainting techniques for removing unwanted regions from images. We annotated the photobombed regions that require removal and evaluated the performance of each method using peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and Fréchet inception distance (FID). The results show that image inpainting techniques can effectively remove photobombing elements, but more robust and accurate methods are needed to handle various image complexities. Our benchmarking study provides a valuable resource for researchers and practitioners to select the most suitable method for their specific photobombing removal task.
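Of the three metrics named above, PSNR is the simplest to state: it is ten times the base-10 logarithm of the squared maximum pixel value divided by the mean squared error between the reference image and the result. The sketch below is only an illustration and is not taken from the study's code. It assumes 8-bit grayscale images stored as nested lists, and the function names `mse` and `psnr` are hypothetical. Pure Python is used to keep the example self-contained; an array library would be the usual choice in practice.

```python
import math
from typing import Sequence

Image = Sequence[Sequence[float]]  # assumed layout: rows of grayscale pixel values


def mse(reference: Image, result: Image) -> float:
    """Mean squared error between two equally sized grayscale images."""
    same_rows = len(reference) == len(result)
    if not same_rows or any(len(a) != len(b) for a, b in zip(reference, result)):
        raise ValueError("images must have the same shape")
    total = 0.0
    count = 0
    for ref_row, res_row in zip(reference, result):
        for ref_px, res_px in zip(ref_row, res_row):
            total += (ref_px - res_px) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must be non-empty")
    return total / count


def psnr(reference: Image, result: Image, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in decibels; higher means closer to the reference."""
    error = mse(reference, result)
    if error == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / error)


# Toy 2x2 example: one pixel of the inpainted result differs by 10 gray levels.
ground_truth = [[100, 100], [100, 100]]
inpainted = [[110, 100], [100, 100]]
score = psnr(ground_truth, inpainted)
```

A higher score means the inpainted result is closer, pixel by pixel, to the reference. SSIM and FID capture structural and perceptual similarity, which a pixel-wise measure like this does not.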
- Award ID(s):
- 2025234
- NSF-PAR ID:
- 10499855
- Publisher / Repository:
- Springer Science + Business Media
- Date Published:
- Journal Name:
- Multimedia Tools and Applications
- Volume:
- 83
- Issue:
- 29
- ISSN:
- 1573-7721
- Format(s):
- Medium: X Size: p. 73553-73568
- Size(s):
- p. 73553-73568
- Sponsoring Org:
- National Science Foundation
More Like this
-
Photobombing occurs very often in photography and inconveniences the main subject(s) of a photo. There is therefore a legitimate need to remove photobombing elements from captured images to produce a pleasing result. In this paper, we conduct a benchmark on this problem. To this end, we first collect a dataset of images with undesired and distracting elements that require removal. Then, we annotate the photobombed regions that should be removed. Next, different image inpainting methods are used to remove the photobombed regions and reconstruct the image. We further invited professional photoshoppers to remove the unwanted regions; these photoshopped images are treated as the ground truth. In our benchmark, several performance metrics are used to compare the results of the different methods with the ground truth. The experiments provide insightful results that demonstrate the effectiveness of inpainting methods for this particular problem.
-
Existing image inpainting methods typically fill holes by borrowing information from surrounding pixels. They often produce unsatisfactory results when the holes overlap with or touch foreground objects due to lack of information about the actual extent of foreground and background regions within the holes. These scenarios, however, are very important in practice, especially for applications such as the removal of distracting objects. To address the problem, we propose a foreground-aware image inpainting system that explicitly disentangles structure inference and content completion. Specifically, our model learns to predict the foreground contour first, and then inpaints the missing region using the predicted contour as guidance. We show that by such disentanglement, the contour completion model predicts reasonable contours of objects, and further substantially improves the performance of image inpainting. Experiments show that our method significantly outperforms existing methods and achieves superior inpainting results on challenging cases with complex compositions.
-
In this paper, we present a new inpainting framework for recovering missing regions of video frames. Compared with image inpainting, performing this task on video presents new challenges, such as how to preserve temporal consistency and spatial details, and how to handle arbitrary input video size and length quickly and efficiently. To this end, we propose a novel deep learning architecture that incorporates ConvLSTM and optical flow to model the spatial-temporal consistency in videos. It also saves considerable computational resources, so that our method can handle videos with larger frame sizes and arbitrary length in a streaming fashion in real time. Furthermore, to generate an accurate optical flow from corrupted frames, we propose a robust flow generation module, in which two sources of flow are fed in and a flow blending network is trained to fuse them. We conduct extensive experiments to evaluate our method in various scenarios and on different datasets, both qualitatively and quantitatively. The experimental results demonstrate the superiority of our method over state-of-the-art inpainting approaches.
-
Psychologists recognize Raven’s Progressive Matrices as a useful test of general human intelligence. While many computational models investigate various forms of top-down, deliberative reasoning on the test, there has been less research on bottom-up perceptual processes, like Gestalt image completion, that are also critical in human test performance. In this work, we investigate how Gestalt visual reasoning on the Raven’s test can be modeled using generative image inpainting techniques from computer vision. We demonstrate that a reasoning agent that has access to an off-the-shelf inpainting model trained only on photorealistic images of objects achieves a score of 27/36 on the Colored Progressive Matrices, which corresponds to average performance for nine-year-old children. We also show that when our agent uses inpainting models trained on other datasets (faces, places, and textures), it does not perform as well. Our results illustrate how learning visual regularities in real-world images can translate into successful reasoning about artificial test stimuli. On the flip side, our results also highlight the limitations of such transfer, which may contribute to explanations for why intelligence tests like the Raven’s are often sensitive to people’s individual sociocultural backgrounds.
-
Image security is becoming an increasingly important issue due to advances in deep-learning-based image manipulations, such as deep image inpainting and deepfakes. There has been considerable work to date on detecting such image manipulations using improved algorithms, with little attention paid to the possible role that hardware advances may have in improving security. We propose to use a focal stack camera as a secure imaging device, novel to the best of our knowledge, that facilitates localizing modified regions in manipulated images. We show that applying convolutional neural network detection methods to focal stack images achieves significantly better detection accuracy than single-image-based forgery detection. This work demonstrates that focal stack images could be used as a novel secure image file format and opens up a new direction for secure imaging.