Title: Tracking objects using QR codes and deep learning
Despite recent advances in deep learning, object detection and tracking still require considerable manual and computational effort. First, we must collect and create a database of hundreds or thousands of images of the target objects. Next, we must annotate or curate those images to indicate the presence and position of the target objects within them. Finally, we must train a CNN (convolutional neural network) model to detect and locate the target objects in new images. This training is usually computationally intensive, consists of thousands of epochs, and can take tens of hours for each target object. Even after training is completed, the system can still fail if the real-time tracking and object detection phases lack the accuracy, precision, and/or speed required by many important applications. Here we present a system and approach that minimizes the computational expense of the training and real-time tracking steps outlined above, for applications in the development of mixed-reality science laboratory experiences, by using non-intrusive object-encoding 2D QR codes mounted directly onto the surfaces of the lab tools to be tracked. This system can begin detecting and tracking a new lab tool immediately and eliminates the laborious process of acquiring and annotating a new training dataset for every tool to be tracked.
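The key idea can be illustrated with a minimal sketch: once a QR code mounted on a tool is decoded, its payload identifies the tool and its corner points give a 2D pose, with no per-object training. Everything here is hypothetical (the registry entries, payload format, and function are illustrative, not from the paper); a real system would obtain the payload and corners from a detector such as OpenCV's QRCodeDetector.

```python
import math

# Hypothetical registry mapping QR payloads to lab tools (names are
# illustrative only).
TOOL_REGISTRY = {
    "TOOL:beaker_250ml": "250 mL beaker",
    "TOOL:multimeter_01": "digital multimeter",
}

def track_from_qr(payload, corners):
    """Resolve a decoded QR payload to a tool and estimate its 2D pose.

    `corners` are the four QR corner points (top-left, top-right,
    bottom-right, bottom-left) in image coordinates, in the order a
    detector such as OpenCV's QRCodeDetector returns them.
    """
    tool = TOOL_REGISTRY.get(payload)
    if tool is None:
        return None  # unknown code: ignore rather than mis-track
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    center = (sum(xs) / 4.0, sum(ys) / 4.0)
    # In-plane rotation from the top edge of the code.
    (x0, y0), (x1, y1) = corners[0], corners[1]
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0))
    return {"tool": tool, "center": center, "angle_deg": angle}

# Axis-aligned code: top edge points along +x, so the angle is 0.
pose = track_from_qr("TOOL:beaker_250ml",
                     [(10, 10), (110, 10), (110, 110), (10, 110)])
```

Because identity comes from the decoded payload rather than a learned appearance model, adding a new tool means adding one registry entry, not collecting and annotating a new dataset.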
Award ID(s):
1918045
PAR ID:
10432261
Author(s) / Creator(s):
; ; ;
Editor(s):
Zhou, Jianhong ; Osten, Wolfgang; Nikolaev, Dmitry P.
Date Published:
Journal Name:
Proc. SPIE 12701, Fifteenth International Conference on Machine Vision (ICMV 2022)
Volume:
12701
Issue:
06
Page Range / eLocation ID:
43
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent work in adversarial machine learning has started to focus on visual perception in autonomous driving and has studied Adversarial Examples (AEs) for object detection models. However, in such a visual perception pipeline the detected objects must also be tracked, in a process called Multiple Object Tracking (MOT), to build the moving trajectories of surrounding obstacles. Since MOT is designed to be robust against errors in object detection, it poses a general challenge to existing attack techniques that blindly target object detection: we find that a success rate of over 98% is needed for them to actually affect the tracking results, a requirement that no existing attack technique can satisfy. In this paper, we are the first to study adversarial machine learning attacks against the complete visual perception pipeline in autonomous driving, and we discover a novel attack technique, tracker hijacking, that can effectively fool MOT using AEs on object detection. Using our technique, successful AEs on as few as one single frame can move an existing object into or out of the headway of an autonomous vehicle to cause potential safety hazards. We perform an evaluation using the Berkeley Deep Drive dataset and find that, on average, when 3 frames are attacked, our attack can achieve a nearly 100% success rate, while attacks that blindly target object detection reach only up to 25%.
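The MOT robustness that forces such high per-frame attack success rates can be seen in a toy sketch (a simplified stand-in, not the tracker evaluated in the paper): detections are associated to existing tracks by intersection-over-union (IoU), and a track is only deleted after several consecutive misses, so erasing a detection in a single frame leaves the trajectory intact.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

class Track:
    def __init__(self, box, max_missed=3):
        self.box = box
        self.missed = 0
        self.max_missed = max_missed  # frames a track survives unmatched

    def update(self, detections, iou_thresh=0.3):
        """Greedy association: adopt the best-overlapping detection."""
        best = max(detections, key=lambda d: iou(self.box, d), default=None)
        if best is not None and iou(self.box, best) >= iou_thresh:
            self.box = best
            self.missed = 0
        else:
            self.missed += 1  # no match: coast on the last known box
        return self.missed <= self.max_missed  # False once the track dies

track = Track((100, 100, 200, 200))
alive_after_attack = track.update([])  # one attacked frame: detection erased
alive_after_recovery = track.update([(102, 101, 203, 202)])  # track resumes
```

A detection-erasing attack must therefore succeed on more consecutive frames than the tracker's miss budget, which is why tracker hijacking instead feeds the tracker carefully placed fake boxes rather than simply deleting real ones.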
  2. Multiobject tracking provides situational awareness that enables new applications for modern convenience, applied ocean sciences, public safety, and homeland security. In many multiobject tracking applications, including radar and sonar tracking, after coherent prefiltering of the received signal, measurement data is typically structured in cells, where each cell represents, e.g., a different range and bearing value. While conventional detect-then-track (DTT) multiobject tracking approaches convert the cell-structured data within a detection phase into so-called point measurements to reduce the amount of data, track-before-detect (TBD) methods process the cell-structured data directly, avoiding a potential information loss. However, many TBD tracking methods are computationally intensive or achieve reduced tracking accuracy when objects interact, i.e., when they come into close proximity. In contrast, the TBD tracking method presented here achieves good tracking accuracy even for interacting objects by introducing the new concept of probabilistic object-to-cell contributions. More specifically, our approach uses a probabilistic association of objects to data cells and a new object contribution model to further link cell contributions to objects occupying the same data cell. To keep computational complexity and filter runtimes low, we use an efficient Poisson-multi-Bernoulli filter approach in combination with belief propagation for fast probabilistic data assignment. We demonstrate numerically that our method achieves significantly increased tracking performance compared to state-of-the-art TBD tracking approaches, with performance differences particularly pronounced when multiple objects interact.
  3. Recently, adversarial examples against object detection have been widely studied. However, it is difficult for these attacks to affect visual perception in autonomous driving because the complete visual pipeline of real-world autonomous driving systems includes not only object detection but also object tracking. In this paper, we present a novel tracker hijacking attack against the multi-target tracking algorithm employed by real-world autonomous driving systems, which manipulates the bounding boxes produced by object detection to spoof the multiple object tracking process. Our approach exploits the detection-box generation process of anchor-based object detection algorithms and designs new optimization methods to generate adversarial patches that can successfully perform tracker hijacking attacks, causing security risks. The evaluation results show that our approach achieves an 85% attack success rate on two detection models employed by real-world autonomous driving systems. We discuss our potential next steps for this work.
  4. Real-time detection of 3D obstacles and recognition of humans and other objects is essential for blind or low-vision people to travel not only safely and independently but also confidently and interactively, especially in a cluttered indoor environment. Most existing 3D obstacle detection techniques that are widely applied in robotic applications and outdoor environments often require high-end devices to ensure real-time performance. There is a strong need to develop a low-cost and highly efficient technique for 3D obstacle detection and object recognition in indoor environments. This paper proposes an integrated 3D obstacle detection system implemented on a smartphone, utilizing deep-learning-based pre-trained 2D object detectors and ARKit-based point cloud data acquisition to predict and track the 3D positions of multiple objects (obstacles, humans, and other objects), and then provide alerts to users in real time. The system consists of four modules: 3D obstacle detection, 3D object tracking, 3D object matching, and information filtering. Preliminary tests in a small house setting indicated that this application could reliably detect large obstacles, including their 3D positions and sizes in the real world, as well as the positions of small obstacles, without any expensive device besides an iPhone.
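The step of lifting a 2D detection to a 3D obstacle position can be sketched in a few lines. This is a simplified stand-in for the ARKit-based pipeline described above (the data layout and function name are assumptions): each 2D detection box is mapped to 3D by averaging the point-cloud points whose image projection falls inside it.

```python
def box_3d_position(points, box):
    """Estimate an obstacle's 3D position by averaging the point-cloud
    points whose image projection falls inside its 2D detection box.

    `points` is a list of (u, v, (x, y, z)) tuples: pixel coordinates
    plus the corresponding 3D point (e.g. from a smartphone depth map).
    `box` is (u_min, v_min, u_max, v_max) in pixels.
    """
    u0, v0, u1, v1 = box
    inside = [p for (u, v, p) in points if u0 <= u <= u1 and v0 <= v <= v1]
    if not inside:
        return None  # no depth support for this detection
    n = len(inside)
    return tuple(sum(pt[i] for pt in inside) / n for i in range(3))

# Two depth points fall inside the detection box; a third belongs to
# another object and is excluded by the 2D gate.
points = [
    (120, 80, (0.4, -0.1, 1.5)),
    (130, 90, (0.6, 0.1, 1.7)),
    (400, 300, (3.0, 0.0, 6.0)),
]
position = box_3d_position(points, (100, 60, 160, 120))
```

A production system would use a robust statistic (e.g. the median depth) instead of the mean to suppress background points that leak into the box.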
  5. This paper focuses on vision-based pose estimation for multiple rigid objects placed in clutter, especially in cases involving occlusions and objects resting on each other. Progress has been achieved recently in object recognition given advancements in deep learning. Nevertheless, such tools typically require a large amount of training data and significant manual effort to label objects. This limits their applicability in robotics, where solutions must scale to a large number of objects and a variety of conditions. Moreover, the combinatorial nature of the scenes that could arise from the placement of multiple objects is hard to capture in the training dataset. Thus, the learned models might not produce the level of precision required for tasks such as robotic manipulation. This work proposes an autonomous process for pose estimation that spans from data generation to scene-level reasoning and self-learning. In particular, the proposed framework first generates a labeled dataset for training a Convolutional Neural Network (CNN) for object detection in clutter. These detections are used to guide a scene-level optimization process, which considers the interactions between the different objects present in the clutter to output pose estimates of high precision. Furthermore, confident estimates are used to label online real images from multiple views and re-train the process in a self-learning pipeline. Experimental results indicate that this process quickly identifies physically consistent object poses in cluttered scenes, poses that are more precise than those found by reasoning over individual instances of objects. Furthermore, the quality of pose estimates increases over time given the self-learning process.