Defending Adversarial Patches: Detect & Blur
Computer vision models are increasingly embedded in video software to recognize and classify objects in the physical world. While this is useful, it also opens the door to physical attacks. Using a printed-out generated image (an adversarial patch), individuals can exploit computer vision models to disguise their true intentions. One possible mitigation is to detect the patch and blur the entire image so that the model can still run inference on it.
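The abstract names neither the detector nor the blur, so the following is only a minimal sketch of the detect-and-blur idea under stated assumptions. The function names (`box_blur`, `detect_patch`, `detect_and_blur`), the high-variance-window heuristic used as a stand-in detector, and the mean (box) filter are all hypothetical choices for illustration, not the authors' method. Images are plain 2D lists of grayscale values so that the example needs only the standard library.

```python
from statistics import pvariance


def box_blur(img, radius=1):
    """Return a mean-filtered copy of a 2D grayscale image (list of lists)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp the neighborhood to the image borders.
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [img[j][i] for j in ys for i in xs]
            out[y][x] = sum(vals) / len(vals)
    return out


def detect_patch(img, size=4):
    """Stand-in detector (assumption): pick the size x size window with the
    highest pixel variance and return its (top, left) corner."""
    h, w = len(img), len(img[0])
    best_var, best_pos = -1.0, (0, 0)
    for top in range(h - size + 1):
        for left in range(w - size + 1):
            vals = [img[top + j][left + i]
                    for j in range(size) for i in range(size)]
            var = pvariance(vals)
            if var > best_var:
                best_var, best_pos = var, (top, left)
    return best_pos


def detect_and_blur(img, size=4, radius=1, whole_image=False):
    """Locate the suspected patch, then blur either the whole image or only
    the detected region. Returns (processed_image, (top, left))."""
    top, left = detect_patch(img, size)
    blurred = box_blur(img, radius)
    if whole_image:
        return blurred, (top, left)
    out = [row[:] for row in img]
    for j in range(top, top + size):
        for i in range(left, left + size):
            out[j][i] = blurred[j][i]
    return out, (top, left)
```

Calling `detect_and_blur(img, whole_image=True)` corresponds to blurring the entire image as the abstract describes. The default blurs only the detected region and leaves the rest of the frame untouched. In a real pipeline the processed image would then be passed to the classifier or detector.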
- PAR ID: 10677925
- Publisher / Repository: The 2025 ADMI Symposium
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Given the ability to directly manipulate image pixels in the digital input space, an adversary can easily generate imperceptible perturbations to fool a Deep Neural Network (DNN) image classifier, as demonstrated in prior work. In this work, we propose ShapeShifter, an attack that tackles the more challenging problem of crafting physical adversarial perturbations to fool image-based object detectors like Faster R-CNN. Attacking an object detector is more difficult than attacking an image classifier, as it needs to mislead the classification results in multiple bounding boxes with different scales. Extending the digital attack to the physical world adds another layer of difficulty, because it requires the perturbation to be robust enough to survive real-world distortions due to different viewing distances and angles, lighting conditions, and camera limitations. We show that the Expectation over Transformation technique, which was originally proposed to enhance the robustness of adversarial perturbations in image classification, can be successfully adapted to the object detection setting. ShapeShifter can generate adversarially perturbed stop signs that are consistently mis-detected by Faster R-CNN as other objects, posing a potential threat to autonomous vehicles and other safety-critical computer vision systems.
- The deep neural network (DNN) model for computer vision tasks (object detection and classification) is widely used in autonomous vehicles, such as driverless cars and unmanned aerial vehicles. However, DNN models are shown to be vulnerable to adversarial image perturbations. The generation of adversarial examples against inferences of DNNs has been actively studied recently. The generation typically relies on optimizations taking an entire image frame as the decision variable. Hence, given a new image, the computationally expensive optimization needs to start over as there is no learning between the independent optimizations. Very few approaches have been developed for attacking online image streams while taking into account the underlying physical dynamics of autonomous vehicles, their mission, and the environment. The article presents a multi-level reinforcement learning framework that can effectively generate adversarial perturbations to misguide autonomous vehicles’ missions. In the existing image attack methods against autonomous vehicles, optimization steps are repeated for every image frame. This framework removes the need for fully converged optimization at every frame. Using multi-level reinforcement learning, we integrate a state estimator and a generative adversarial network that generates the adversarial perturbations. Due to the reinforcement learning agent consisting of state estimator, actor, and critic that only uses image streams, the proposed framework can misguide the vehicle to increase the adversary’s reward without knowing the states of the vehicle and the environment. Simulation studies and a robot demonstration are provided to validate the proposed framework’s performance.
- As paper ballots and post-election audits gain increased adoption in the United States, election technology vendors are offering products that allow jurisdictions to review ballot images—digital scans produced by optical-scan voting machines—in their post-election audit procedures. Jurisdictions including the state of Maryland rely on such image audits as an alternative to inspecting the physical paper ballots. We show that image audits can be reliably defeated by an attacker who can run malicious code on the voting machines or election management system. Using computer vision techniques, we develop an algorithm that automatically and seamlessly manipulates ballot images, moving voters’ marks so that they appear to be votes for the attacker’s preferred candidate. Our implementation is compatible with many widely used ballot styles, and we show that it is effective using a large corpus of ballot images from a real election. We also show that the attack can be delivered in the form of a malicious Windows scanner driver, which we test with a scanner that has been certified for use in vote tabulation by the U.S. Election Assistance Commission. These results demonstrate that post-election audits must inspect physical ballots, not merely ballot images, if they are to strongly defend against computer-based attacks on widely used voting systems.
- Marine scientists have been leveraging supervised machine learning algorithms to analyze image and video data for nearly two decades. There have been many advances, but the cost of generating expert human annotations to train new models remains extremely high. There is broad recognition both in computer and domain sciences that generating training data remains the major bottleneck when developing ML models for targeted tasks. Increasingly, computer scientists are not attempting to produce highly-optimized models from general annotation frameworks, instead focusing on adaptation strategies to tackle new data challenges. Taking inspiration from large language models, computer vision researchers are now thinking in terms of “foundation models” that can yield reasonable zero- and few-shot detection and segmentation performance with human prompting. Here we consider the utility of this approach for ocean imagery, leveraging Meta’s Segment Anything Model to enrich ocean image annotations based on existing labels. This workflow yields promising results, especially for modernizing existing data repositories. Moreover, it suggests that future human annotation efforts could use foundation models to speed progress toward a sufficient training set to address domain specific problems.

