Title: Learning from counting: Leveraging temporal classification for weakly supervised object localization and detection
This paper presents a new approach that leverages temporal classification to support weakly supervised object detection (WSOD). Specifically, we introduce raster scan-order techniques to serialize 2D images into 1D sequence data, and then leverage a combined LSTM (Long Short-Term Memory) and CTC (Connectionist Temporal Classification) network to achieve object localization based on a total count of objects of interest. We term our proposed network LSTM-CCTC (Count-based CTC). This "learning from counting" strategy differs from existing WSOD methods in that our approach automatically identifies critical points on or near a target object, which significantly reduces the need to generate a large number of candidate proposals for object localization. Experiments show that our method yields state-of-the-art performance in an evaluation on the PASCAL VOC datasets.
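As a rough illustration of the serialization step described above (a minimal sketch, not the authors' implementation; the map shape and values are hypothetical), a 2D feature map can be flattened in raster scan order, row by row, so that each spatial location becomes one timestep of the 1D sequence fed to the LSTM:

```python
import numpy as np

# Hypothetical 4x4 feature map with 3 channels; values are arbitrary.
h, w, c = 4, 4, 3
feature_map = np.arange(h * w * c).reshape(h, w, c)

# Raster scan order: traverse rows left-to-right, top-to-bottom.
# Each spatial location becomes one length-c timestep for the LSTM,
# giving a sequence of shape (h*w, c).
sequence = feature_map.reshape(h * w, c)

print(sequence.shape)        # (16, 3)
print(sequence[5].tolist())  # features at pixel (row 1, col 1)
```

Because numpy arrays are row-major (C order) by default, a plain `reshape` already yields raster scan order with no explicit loop.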
Award ID(s):
1853864
PAR ID:
10276655
Author(s) / Creator(s):
Date Published:
Journal Name:
2020 British Machine Vision Conference (BMVC2020)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The need to efficiently find the video content a user wants is growing because of the explosion of user-generated videos on the Web. Existing keyword-based or content-based video retrieval methods usually determine what occurs in a video, but not when and where. In this paper, we answer the question of when and where by formulating a new task, namely spatio-temporal video re-localization. Specifically, given a query video and a reference video, spatio-temporal video re-localization aims to localize tubelets in the reference video such that the tubelets semantically correspond to the query. To accurately localize the desired tubelets in the reference video, we propose a novel warp LSTM network, which propagates spatio-temporal information over a long period and thereby captures the corresponding long-term dependencies. Another issue for spatio-temporal video re-localization is the lack of properly labeled video datasets. Therefore, we reorganize the videos in the AVA dataset to form a new dataset for spatio-temporal video re-localization research. Extensive experimental results show that the proposed model achieves superior performance over the designed baselines on the spatio-temporal video re-localization task.
  2. ABSTRACT Microfluidic devices (MDs) present a novel method for detecting circulating tumor cells (CTCs), enhancing the process through targeted techniques and visual inspection. However, current approaches often yield heterogeneous CTC populations, necessitating additional processing for comprehensive analysis and phenotype identification. These procedures are often expensive and time-consuming, and must be performed by skilled technicians. In this study, we investigate the potential of a cost-effective and efficient hyperuniform micropost MD approach for CTC classification. Our approach combines mathematical modeling of fluid–structure interactions in a simulated microfluidic channel with machine learning techniques. Specifically, we developed a cell-based modeling framework to assess CTC dynamics in erythrocyte-laden plasma flow, generating a large dataset of CTC trajectories that account for two distinct CTC phenotypes. A convolutional neural network (CNN) and a recurrent neural network (RNN) were then employed to analyze the dataset and classify these phenotypes. The results demonstrate the potential effectiveness of the hyperuniform micropost MD design and analysis approach in distinguishing between different CTC phenotypes based on cell trajectory, offering a promising avenue for early cancer detection.
  3. There is considerable interest in AI systems that can assist a cardiologist in diagnosing echocardiograms, and that can also be used to train residents in classifying echocardiograms. Prior work has focused on the analysis of a single frame. Classifying echocardiograms at the video level is challenging due to intra-frame and inter-frame noise. We propose a two-stream deep network that learns from the spatial context and optical flow for the classification of echocardiography videos. Each stream contains two parts: a Convolutional Neural Network (CNN) for spatial features and a bi-directional Long Short-Term Memory (LSTM) network with Attention for temporal features. The features from these two streams are fused for classification. We verify our experimental results on a dataset of 170 videos (80 normal and 90 abnormal) that have been manually labeled by trained cardiologists. Our method provides an overall accuracy of 91.18%, with a sensitivity of 94.11% and a specificity of 88.24%.
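For reference, the sensitivity and specificity figures reported above follow directly from a binary confusion matrix; a minimal sketch with hypothetical counts (illustrative numbers, not the paper's actual confusion matrix):

```python
# Hypothetical confusion-matrix counts for a binary (abnormal vs. normal)
# video classifier; these numbers are illustrative, not from the paper.
tp, fn = 45, 5   # abnormal videos correctly / incorrectly classified
tn, fp = 40, 10  # normal videos correctly / incorrectly classified

sensitivity = tp / (tp + fn)                 # true positive rate
specificity = tn / (tn + fp)                 # true negative rate
accuracy = (tp + tn) / (tp + tn + fp + fn)   # overall fraction correct

print(f"sensitivity={sensitivity:.2%}, "
      f"specificity={specificity:.2%}, accuracy={accuracy:.2%}")
```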
  4. In this work, we investigate the problem of level curve tracking in unknown scalar fields using a limited number of mobile robots. We design and implement a long short-term memory (LSTM) enabled control strategy for a mobile sensor network to detect and track desired level curves. Building on existing work on the cooperative Kalman filter, we design an LSTM-enhanced Kalman filter that uses the sensor measurements and a sequence of past fields and gradients to estimate the current field value and gradient. We also design an LSTM model to estimate the Hessian of the field. The LSTM-enabled strategy offers several benefits: it can be trained offline on a collection of level curves in known fields prior to deployment, so that the trained model enables the mobile sensor network to track level curves in unknown fields for various applications. Another benefit is that we can train with larger computational resources to obtain more accurate models, while using only limited resources when the mobile sensor network is deployed in production. Simulation results show that this LSTM-enabled control strategy successfully tracks the level curve using a mobile multi-robot sensor network.
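For context on the base filter being enhanced, the scalar Kalman measurement update fuses a prior field estimate with a noisy sensor reading; a minimal textbook sketch (generic form with illustrative numbers, not the paper's cooperative or LSTM-enhanced formulation):

```python
# Minimal scalar Kalman measurement update: fuse a prior field estimate
# with a noisy sensor reading. Generic textbook form, not the paper's
# cooperative/LSTM-enhanced filter; all numbers are illustrative.
def kalman_update(x_prior, p_prior, z, r):
    """x_prior: prior estimate; p_prior: prior variance;
    z: measurement; r: measurement noise variance."""
    k = p_prior / (p_prior + r)           # Kalman gain in [0, 1]
    x_post = x_prior + k * (z - x_prior)  # estimate moves toward z
    p_post = (1.0 - k) * p_prior          # uncertainty shrinks
    return x_post, p_post

x, p = kalman_update(x_prior=1.0, p_prior=4.0, z=2.0, r=1.0)
print(x, p)  # 1.8 0.8: estimate pulled toward the measurement
```

With a confident prior (small `p_prior`) the gain is small and the measurement is largely ignored; with an uncertain prior the estimate tracks the measurement closely.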
  5. Messinger, David W.; Velez-Reyes, Miguel (Ed.)
    Recently, multispectral and hyperspectral data fusion models based on deep learning have been proposed to generate images with high spatial and spectral resolution. The general objective is to obtain images that improve spatial resolution while preserving high spectral content. In this work, two deep learning data fusion techniques are characterized in terms of classification accuracy. These methods fuse a high spatial resolution multispectral image with a lower spatial resolution hyperspectral image to generate a high spatial-spectral hyperspectral image. The first model is based on a multi-scale long short-term memory (LSTM) network. The LSTM approach performs the fusion using a multi-step process that transitions from low to high spatial resolution via an intermediate step capable of reducing spatial information loss while preserving spectral content. The second fusion model is based on a convolutional neural network (CNN) data fusion approach. We present fused images using four multi-source datasets with different spatial and spectral resolutions. Both models provide fused images with spatial resolution increased from 8 m to 1 m. The fused images obtained with the two models are evaluated in terms of classification accuracy with several classifiers: Minimum Distance, Support Vector Machines, Class-Dependent Sparse Representation, and CNN classification. The classification results show better performance in both overall and average accuracy for the images generated with the multi-scale LSTM fusion over the CNN fusion.
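Of the classifiers listed, the Minimum Distance classifier is the simplest: each pixel spectrum is assigned to the class whose mean spectrum is nearest in Euclidean distance. A minimal sketch with illustrative class means and pixel spectra (not the datasets used in the paper):

```python
import numpy as np

# Minimum-distance classifier: assign each pixel spectrum to the class
# whose mean spectrum is nearest in Euclidean distance.
# Class means and pixels below are illustrative, not from the paper.
class_means = np.array([[0.1, 0.2, 0.3],   # class 0 mean spectrum
                        [0.8, 0.7, 0.6]])  # class 1 mean spectrum

pixels = np.array([[0.15, 0.25, 0.35],
                   [0.75, 0.65, 0.55]])

# Broadcast to a (n_pixels, n_classes) distance matrix; argmin picks
# the nearest class mean per pixel.
dists = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)
labels = dists.argmin(axis=1)
print(labels.tolist())  # [0, 1]
```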