Cross-Modal Center Loss for 3D Cross-Modal Retrieval

Jing, L; Vahdani, E; Tan, J; Tian, Y.


Cross-modal retrieval aims to learn discriminative and modal-invariant features for data from different modalities. Unlike existing methods, which usually learn from features extracted by offline networks, in this paper we propose an approach to jointly train the components of the cross-modal retrieval framework with metadata, enabling the network to find optimal features. The proposed end-to-end framework is updated with three loss functions: 1) a novel cross-modal center loss to eliminate cross-modal discrepancy, 2) a cross-entropy loss to maximize inter-class variations, and 3) a mean-square-error loss to reduce modality variations. In particular, our proposed cross-modal center loss minimizes the distances of features from objects belonging to the same class across all modalities. Extensive experiments have been conducted on retrieval tasks across multiple modalities, including 2D images, 3D point clouds, and mesh data. The proposed framework significantly outperforms the state-of-the-art methods for both cross-modal and in-domain retrieval for 3D objects on the ModelNet10 and ModelNet40 datasets.
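To make the idea of the cross-modal center loss concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common center-loss form (half the squared Euclidean distance between each feature and its class center, averaged over all samples) with a single set of class centers shared by every modality; the abstract does not give the exact formula. The function name `cross_modal_center_loss` and the modality keys `"image"` and `"mesh"` are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cross_modal_center_loss(
    features: Dict[str, List[Vector]],
    labels: Sequence[int],
    centers: Sequence[Vector],
) -> float:
    """Average half squared distance from each feature to its class center.

    Assumption (not from the paper's text): one center per class is shared
    by all modalities, so features of the same class are pulled toward the
    same point regardless of which modality produced them.

    features: modality name -> list of feature vectors, one per object,
              in the same object order as `labels`.
    labels:   class index of each object.
    centers:  one center vector per class.
    """
    total = 0.0
    count = 0
    for modality_feats in features.values():
        for feat, label in zip(modality_feats, labels):
            center = centers[label]
            total += 0.5 * sum((f - c) ** 2 for f, c in zip(feat, center))
            count += 1
    return total / count


# Toy example: two objects (classes 0 and 1), two modalities, 2-D features.
centers = [[0.0, 0.0], [1.0, 1.0]]
labels = [0, 1]
features = {
    "image": [[0.0, 0.0], [1.0, 1.0]],  # both features sit on their centers
    "mesh": [[1.0, 0.0], [1.0, 1.0]],   # first feature is 1 unit off its center
}
loss = cross_modal_center_loss(features, labels, centers)
```

In a trained system the centers would be learned alongside the networks and this term would be combined with the cross-entropy and mean-square-error losses mentioned in the abstract; how those terms are weighted is not stated here and is left out of the sketch.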

Award ID(s): 2041307

PAR ID: 10279248

Author(s) / Creator(s): Jing, L; Vahdani, E; Tan, J; Tian, Y.

Date Published: 2021-06-19

Journal Name: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text: Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
