Title: A region-based deep learning algorithm for detecting and tracking objects in manufacturing plants
In today's competitive production environment, manufacturers moving toward streamlined production greatly desire the ability to identify and track important objects in near real time. Manually keeping track of every object in a complex manufacturing plant is infeasible; therefore, an automatic system with that functionality is greatly needed. This study developed a Mask Region-based Convolutional Neural Network (Mask RCNN) model to semantically segment objects and important zones in manufacturing plants. The Mask RCNN was trained through transfer learning, which used a neural network (NN) pre-trained on the MS-COCO dataset as the starting point and fine-tuned that NN using a limited number of annotated images. The Mask RCNN model was then modified to produce consistent detection results from videos, which was realized through the use of a two-staged detection threshold and the analysis of the temporal coherence information of detected objects. An object-tracking function was added to the system to identify the misplacement of objects. The effectiveness and efficiency of the proposed system were demonstrated by analyzing sample video footage.
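The abstract does not spell out how the two-staged detection threshold and the temporal coherence analysis work, so the following is only a minimal sketch of one plausible reading. It assumes a detection is accepted outright when its score clears a high threshold, and that a weaker detection is retained only if it clears a lower threshold and overlaps a same-label detection kept in the previous frame. The names `Detection`, `filter_frame`, `filter_video`, and the values `high=0.9`, `low=0.5`, and `min_iou=0.5` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Detection:
    label: str
    score: float
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_frame(
    current: List[Detection],
    previous_kept: List[Detection],
    high: float = 0.9,      # assumed value: accept outright at or above this
    low: float = 0.5,       # assumed value: accept only with temporal support
    min_iou: float = 0.5,   # assumed value: overlap needed to count as support
) -> List[Detection]:
    """Keep confident detections, plus weaker ones supported by the last frame."""
    kept = []
    for det in current:
        if det.score >= high:
            kept.append(det)
        elif det.score >= low and any(
            p.label == det.label and iou(p.box, det.box) >= min_iou
            for p in previous_kept
        ):
            kept.append(det)
    return kept


def filter_video(frames: List[List[Detection]], **thresholds) -> List[List[Detection]]:
    """Apply filter_frame to each frame, carrying kept detections forward."""
    previous: List[Detection] = []
    result = []
    for detections in frames:
        previous = filter_frame(detections, previous, **thresholds)
        result.append(previous)
    return result
```

Under these assumptions, an object whose score dips briefly between frames stays detected as long as it remains near its previous position, while an isolated low-score detection with no history is discarded.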
Award ID(s):
1646162
NSF-PAR ID:
10129789
Journal Name:
The 25th International Conference on Production Research (ICPR’19)
Sponsoring Org:
National Science Foundation
More Like this
  1. Deep learning (DL) convolutional neural networks (CNNs) have been rapidly adopted in very high spatial resolution (VHSR) satellite image analysis. DLCNN-based computer vision (CV) applications primarily aim for everyday object detection from standard red, green, blue (RGB) imagery, while earth science remote sensing applications focus on geo object detection and classification from multispectral (MS) imagery. MS imagery includes RGB and narrow spectral channels from near- and/or middle-infrared regions of reflectance spectra. The central objective of this exploratory study is to understand to what degree MS band statistics govern DLCNN model predictions. We scaffold our analysis on a case study that uses Arctic tundra permafrost landform features called ice-wedge polygons (IWPs) as candidate geo objects. We choose Mask RCNN as the DLCNN architecture to detect IWPs from eight-band Worldview-02 VHSR satellite imagery. A systematic experiment was designed to understand the impact of choosing the optimal three-band combination on model prediction. We tasked five cohorts of three-band combinations coupled with statistical measures to gauge the spectral variability of input MS bands. The candidate scenes produced high model detection accuracies for the F1 score, ranging from 0.89 to 0.95, for two different band combinations (coastal blue, blue, green (1,2,3) and green, yellow, red (3,4,5)). The mapping workflow discerned the IWPs by exhibiting low random and systematic error on the order of 0.17–0.19 and 0.20–0.21, respectively, for band combinations (1,2,3). Results suggest that the prediction accuracy of the Mask-RCNN model is significantly influenced by the input MS bands. Overall, our findings accentuate the importance of considering the image statistics of input MS bands and careful selection of optimal bands for DLCNN predictions when DLCNN architectures are restricted to three spectral channels.
  2. Recent efforts in deploying Deep Neural Networks for object detection in real-world applications, such as autonomous driving, assume that all relevant object classes have been observed during training. Quantifying the performance of these models in settings where the test data are not represented in the training set has mostly focused on pixel-level uncertainty estimation techniques of models trained for semantic segmentation. This paper proposes to exploit additional predictions of semantic segmentation models and quantify their confidences, followed by classification of object hypotheses as known vs. unknown (out-of-distribution) objects. We use object proposals generated by a Region Proposal Network (RPN) and adapt distance-aware uncertainty estimation of semantic segmentation using Radial Basis Function Networks (RBFN) for class-agnostic object mask prediction. The augmented object proposals are then used to train a classifier for known vs. unknown object categories. Experimental results demonstrate that the proposed method achieves performance on par with state-of-the-art methods for unknown object detection and can also be used effectively for reducing object detectors' false positive rate. Our method is well suited for applications where prediction of non-object background categories obtained by semantic segmentation is reliable.
  3. Traditional deep neural networks (NNs) have significantly contributed to the state-of-the-art performance in the task of classification under various application domains. However, NNs have not considered inherent uncertainty in data associated with the class probabilities, where misclassification under uncertainty may easily introduce high risk in decision making in real-world contexts (e.g., misclassification of objects on roads can lead to serious accidents). Unlike Bayesian NNs, which indirectly infer uncertainty through weight uncertainties, evidential NNs (ENNs) have been recently proposed to explicitly model the uncertainty of class probabilities and use them for classification tasks. An ENN offers the formulation of the predictions of NNs as subjective opinions and learns the function by collecting an amount of evidence that can form the subjective opinions by a deterministic NN from data. However, the ENN is trained as a black box without explicitly considering inherent uncertainty in data with their different root causes, such as vacuity (i.e., uncertainty due to a lack of evidence) or dissonance (i.e., uncertainty due to conflicting evidence). By considering the multidimensional uncertainty, we proposed a novel uncertainty-aware evidential NN called WGAN-ENN (WENN) for solving an out-of-distribution (OOD) detection problem. We took a hybrid approach that combines a Wasserstein Generative Adversarial Network (WGAN) with ENNs to jointly train a model with prior knowledge of a certain class, which has high vacuity for OOD samples. Via extensive empirical experiments based on both synthetic and real-world datasets, we demonstrated that the estimation of uncertainty by WENN can significantly help distinguish OOD samples from boundary samples. WENN outperformed other competitive counterparts in OOD detection.
  4. In this article, we describe a modified implementation of Mask Region-based Convolutional Neural Networks (Mask-RCNN) for cosmic ray muon clustering in a liquid argon TPC, applied to MicroBooNE neutrino data. Our implementation of this network, called sMask-RCNN, uses sparse submanifold convolutions to increase processing speed on sparse datasets, and is compared to the original dense version in several metrics. The networks are trained to use wire readout images from the MicroBooNE liquid argon time projection chamber as input and produce individually labeled particle interactions within the image. These outputs are identified as either cosmic ray muon or electron neutrino interactions. We find that sMask-RCNN has an average pixel clustering efficiency of 85.9% compared to the dense network's average pixel clustering efficiency of 89.1%. We demonstrate the ability of sMask-RCNN used in conjunction with MicroBooNE's state-of-the-art Wire-Cell cosmic tagger to veto events containing only cosmic ray muons. The addition of sMask-RCNN to the Wire-Cell cosmic tagger removes 70% of the remaining cosmic ray muon background events at the same electron neutrino event signal efficiency. This event veto can provide 99.7% rejection of cosmic ray-only background events while maintaining an electron neutrino event-level signal efficiency of 80.1%. In addition to cosmic ray muon identification, sMask-RCNN could be used to extract features and identify different particle interaction types in other 3D-tracking detectors.
  5. A widely regarded approach in Printed Circuit Board (PCB) reverse engineering (RE) uses non-destructive X-ray computed tomography (CT) to produce three-dimensional volumes with several slices of data corresponding to multi-layered PCBs. The noise sources specific to X-ray CT and variability from designers make it difficult to acquire the features needed for the RE process. Hence, these X-ray CT images require specialized image processing techniques to examine the various features of a single PCB so that they can later be translated to a readable CAD format. Previously, we presented an approach where the Hough Circle Transform was used for initial feature detection, and then an iterative false positive removal process was developed specifically for detecting vias on PCBs. Its performance was compared to an off-the-shelf application of the Mask Region-based Convolutional Network (M-RCNN). M-RCNN is a deep learning approach that is able to localize and classify numerous objects of different scales within a single image. In this paper, we present a version of M-RCNN that is fine-tuned for via detection. Changes include polygon boundary annotations on the single X-ray images of vias for training and transfer learning to leverage the full potential of the network. We discuss the challenges of detecting vias using deep learning, our working solution, and our experimental procedure. Additionally, we provide a qualitative evaluation of our approach and use quantitative metrics to compare the proposed approach with the previous iterative one.