MultiCLU: Multi-stage Context Learning and Utilization for Storefront Accessibility Detection and Evaluation

Wang, Xuan; Chen, Jiajun; Tang, Hao; Zhu, Zhigang

doi:10.1145/3512527.3531361

In this work, a storefront accessibility image dataset is collected from Google Street View and labeled with three main object categories for storefront accessibility: doors (store entrances), doorknobs (for accessing the entrances) and stairs (leading to the entrances). MultiCLU, a new multi-stage context learning and utilization approach, is then proposed with four stages: Context in Labeling (CIL), Context in Training (CIT), Context in Detection (CID) and Context in Evaluation (CIE). The CIL stage automatically extends the label for each knob to include more local contextual information. In the CIT stage, a deep learning method projects the visual information extracted by a Faster R-CNN based object detector into a semantic space generated by a Graph Convolutional Network. The CID stage uses spatial relation reasoning between categories to refine the confidence scores. Finally, in the CIE stage, a new loose evaluation metric for storefront accessibility, especially for the knob category, is proposed to help blind or low-vision (BLV) users efficiently find estimated knob locations. Our experimental results show that the proposed MultiCLU framework achieves significantly better performance than the baseline Faster R-CNN detector, with +13.4% on mAP and +15.8% on recall. Our new evaluation metric also introduces a new way to evaluate storefront accessibility objects, which could benefit the BLV community in real life.
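
The abstract does not state the rule used to extend knob labels in the CIL stage or the exact definition of the loose metric in the CIE stage. Purely as an illustration of the general idea, the sketch below grows a small bounding box about its center and then counts a prediction as a hit when its center falls inside the extended label. The names `extend_box` and `loose_hit`, the `scale` parameter, and the center-inside criterion are all assumptions made for this example, not the authors' method.

```python
from typing import Tuple

# (x_min, y_min, x_max, y_max) in pixels; a hypothetical box convention.
Box = Tuple[float, float, float, float]


def extend_box(box: Box, scale: float, img_w: float, img_h: float) -> Box:
    """Grow a box about its center by `scale`, clipped to the image bounds.

    Assumption: a single uniform scale factor; the paper may use another rule.
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * scale / 2
    half_h = (y1 - y0) * scale / 2
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(img_w, cx + half_w),
        min(img_h, cy + half_h),
    )


def loose_hit(pred: Box, gt_extended: Box) -> bool:
    """Return True if the prediction's center lies inside the extended label.

    Assumption: a center-inside test stands in for the paper's loose metric.
    """
    cx, cy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    x0, y0, x1, y1 = gt_extended
    return x0 <= cx <= x1 and y0 <= cy <= y1


if __name__ == "__main__":
    knob = (10.0, 10.0, 20.0, 20.0)
    extended = extend_box(knob, scale=2.0, img_w=100.0, img_h=100.0)
    print(extended)
    print(loose_hit((6.0, 6.0, 8.0, 8.0), extended))
```

A criterion of this kind tolerates predictions that miss a tiny knob by a few pixels but still point a user to the right region, which is the motivation the abstract gives for a looser metric.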

Award ID(s): 1827505; 2131186; 1737533

PAR ID: 10346695

Author(s) / Creator(s): Wang, Xuan; Chen, Jiajun; Tang, Hao; Zhu, Zhigang

Date Published: 2022-06-27

Journal Name: ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval

Page Range / eLocation ID: 304 to 312

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper:
https://doi.org/10.1145/3512527.3531361
