skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Tracking objects using QR codes and deep learning
Despite recent advances in deep learning, object detection and tracking still require considerable manual and computational effort. First, we need to collect and create a database of hundreds or thousands of images of the target objects. Next we must annotate or curate the images to indicate the presence and position of the target objects within those images. Finally, we must train a CNN (convolution neural network) model to detect and locate the target objects in new images. This training is usually computationally intensive, consists of thousands of epochs, and can take tens of hours for each target object. Even after the model training in completed, there is still a chance of failure if the real-time tracking and object detection phases lack sufficient accuracy, precision, and/or speed for many important applications. Here we present a system and approach which minimizes the computational expense of the various steps in the training and real-time tracking process outlined above of for applications in the development of mixed-reality science laboratory experiences by using non-intrusive object-encoding 2D QR codes that are mounted directly onto the surfaces of the lab tools to be tracked. This system can start detecting and tracking it immediately and eliminates the laborious process of acquiring and annotating a new training dataset for every new lab tool to be tracked.  more » « less
Award ID(s):
1918045
PAR ID:
10432261
Author(s) / Creator(s):
; ; ;
Editor(s):
Zhou, Jianhong ; Osten, Wolfgang; Nikolaev, Dmitry P.
Date Published:
Journal Name:
Proc. SPIE 12701, Fifteenth International Conference on Machine Vision (ICMV 2022)
Volume:
12701
Issue:
06
Page Range / eLocation ID:
43
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent work in adversarial machine learning started to focus on the visual perception in autonomous driving and studied Adversarial Examples (AEs) for object detection models. However, in such visual perception pipeline the detected objects must also be tracked, in a process called Multiple Object Tracking (MOT), to build the moving trajectories of surrounding obstacles. Since MOT is designed to be robust against errors in object detection, it poses a general challenge to existing attack techniques that blindly target objection detection: we find that a success rate of over 98% is needed for them to actually affect the tracking results, a requirement that no existing attack technique can satisfy. In this paper, we are the first to study adversarial machine learning attacks against the complete visual perception pipeline in autonomous driving, and discover a novel attack technique, tracker hijacking, that can effectively fool MOT using AEs on object detection. Using our technique, successful AEs on as few as one single frame can move an existing object in to or out of the headway of an autonomous vehicle to cause potential safety hazards. We perform evaluation using the Berkeley Deep Drive dataset and find that on average when 3 frames are attacked, our attack can have a nearly 100% success rate while attacks that blindly target object detection only have up to 25%. 
    more » « less
  2. Recently, adversarial examples against object detection have been widely studied. However, it is difficult for these attacks to have an impact on visual perception in autonomous driving because the complete visual pipeline of real-world autonomous driving systems includes not only object detection but also object tracking. In this paper, we present a novel tracker hijacking attack against the multi-target tracking algorithm employed by real-world autonomous driving systems, which controls the bounding box of object detection to spoof the multiple object tracking process. Our approach exploits the detection box generation process of the anchor-based object detection algorithm and designs new optimization methods to generate adversarial patches that can successfully perform tracker hijacking attacks, causing security risks. The evaluation results show that our approach has 85% attack success rate on two detection models employed by real-world autonomous driving systems. We discuss our potential next step for this work. 
    more » « less
  3. Real-time detection of 3D obstacles and recognition of humans and other objects is essential for blind or low- vision people to travel not only safely and independently but also confidently and interactively, especially in a cluttered indoor environment. Most existing 3D obstacle detection techniques that are widely applied in robotic applications and outdoor environments often require high-end devices to ensure real-time performance. There is a strong need to develop a low-cost and highly efficient technique for 3D obstacle detection and object recognition in indoor environments. This paper proposes an integrated 3D obstacle detection system implemented on a smartphone, by utilizing deep-learning-based pre-trained 2D object detectors and ARKit- based point cloud data acquisition to predict and track the 3D positions of multiple objects (obstacles, humans, and other objects), and then provide alerts to users in real time. The system consists of four modules: 3D obstacle detection, 3D object tracking, 3D object matching, and information filtering. Preliminary tests in a small house setting indicated that this application could reliably detect large obstacles and their 3D positions and sizes in the real world and small obstacles’ positions, without any expensive devices besides an iPhone. 
    more » « less
  4. This paper focuses on vision-based pose estimation for multiple rigid objects placed in clutter, especially in cases involving occlusions and objects resting on each other. Progress has been achieved recently in object recognition given advancements in deep learning. Nevertheless, such tools typically require a large amount of training data and significant manual effort to label objects. This limits their applicability in robotics, where solutions must scale to a large number of objects and variety of conditions. Moreover, the combinatorial nature of the scenes that could arise from the placement of multiple objects is hard to capture in the training dataset. Thus, the learned models might not produce the desired level of precision required for tasks, such as robotic manipulation. This work proposes an autonomous process for pose estimation that spans from data generation to scene-level reasoning and self-learning. In particular, the proposed framework first generates a labeled dataset for training a Convolutional Neural Network (CNN) for object detection in clutter. These detections are used to guide a scene-level optimization process, which considers the interactions between the different objects present in the clutter to output pose estimates of high precision. Furthermore, confident estimates are used to label online real images from multiple views and re-train the process in a self-learning pipeline. Experimental results indicate that this process is quickly able to identify in cluttered scenes physically-consistent object poses that are more precise than the ones found by reasoning over individual instances of objects. Furthermore, the quality of pose estimates increases over time given the self-learning process. 
    more » « less
  5. The presence of fog in the background can prevent small and distant objects from being detected, let alone tracked. Under safety-critical conditions, multi-object tracking models require faster tracking speed while maintaining high object-tracking accuracy. The original DeepSORT algorithm used YOLOv4 for the detection phase and a simple neural network for the deep appearance descriptor. Consequently, the feature map generated loses relevant details about the track being matched with a given detection in fog. Targets with a high degree of appearance similarity on the detection frame are more likely to be mismatched, resulting in identity switches or track failures in heavy fog. We propose an improved multi-object tracking model based on the DeepSORT algorithm to improve tracking accuracy and speed under foggy weather conditions. First, we employed our camera-radar fusion network (CR-YOLOnet) in the detection phase for faster and more accurate object detection. We proposed an appearance feature network to replace the basic convolutional neural network. We incorporated GhostNet to take the place of the traditional convolutional layers to generate more features and reduce computational complexities and costs. We adopted a segmentation module and fed the semantic labels of the corresponding input frame to add rich semantic information to the low-level appearance feature maps. Our proposed method outperformed YOLOv5 + DeepSORT with a 35.15% increase in multi-object tracking accuracy, a 32.65% increase in multi-object tracking precision, a speed increase by 37.56%, and identity switches decreased by 46.81%. 
    more » « less