This content will become publicly available on October 2, 2024

Title: SUMMIT: Source-Free Adaptation of Uni-Modal Models to Multi-Modal Targets
Award ID(s): 1724341
NSF-PAR ID: 10479100
Author(s) / Creator(s):
Publisher / Repository: IEEE
Date Published:
Journal Name: IEEE/CVF Int. Conf. on Computer Vision
Format(s): Medium: X
Location: Paris, France
Sponsoring Org: National Science Foundation
More Like this
  1.
    Cross-modal retrieval aims to learn discriminative and modal-invariant features for data from different modalities. Unlike existing methods, which usually learn from features extracted by offline networks, in this paper we propose an approach to jointly train the components of the cross-modal retrieval framework with metadata, enabling the network to find optimal features. The proposed end-to-end framework is updated with three loss functions: 1) a novel cross-modal center loss to eliminate cross-modal discrepancy, 2) a cross-entropy loss to maximize inter-class variation, and 3) a mean-square-error loss to reduce modality variation. In particular, our proposed cross-modal center loss minimizes the distances between features of objects belonging to the same class across all modalities (a minimal sketch of this loss follows below). Extensive experiments have been conducted on retrieval tasks across multiple modalities, including 2D images, 3D point clouds, and mesh data. The proposed framework significantly outperforms state-of-the-art methods for both cross-modal and in-domain retrieval of 3D objects on the ModelNet10 and ModelNet40 datasets.
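    The abstract above describes the cross-modal center loss only in words. Below is a minimal PyTorch-style sketch of one way such a loss could be implemented: a learnable center per class, shared across modalities, with same-class features pulled toward it. The class name, tensor shapes, and the weighting terms in the usage comment are illustrative assumptions, not the authors' actual code.

    ```python
    import torch
    import torch.nn as nn

    class CrossModalCenterLoss(nn.Module):
        """Sketch of a cross-modal center loss: one learnable center per class,
        shared by every modality, so same-class features from images, point
        clouds, and meshes are pulled toward the same point in feature space."""

        def __init__(self, num_classes: int, feat_dim: int):
            super().__init__()
            # One center per class, shared across all modalities.
            self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

        def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
            # features: (batch, feat_dim) embeddings gathered from all modalities
            # labels:   (batch,) integer class ids
            centers = self.centers[labels]  # center of each sample's class
            return ((features - centers) ** 2).sum(dim=1).mean()

    # Illustrative combination with the other two terms described in the abstract;
    # lambda_center and lambda_mse are placeholder weights, not the paper's values.
    # total_loss = ce_loss + lambda_center * center_loss(feats, labels) \
    #            + lambda_mse * mse_loss(img_feat, pcd_feat)
    ```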