Concept-Centric Transformers: Enhancing Model Interpretability through Object-Centric Concept Learning within a Shared Global Workspace

Hong, Jinyung; Park, Keun Hee; Pavlic, Theodore P

doi:10.1109/WACV57701.2024.00481

Many interpretable AI approaches have been proposed to provide plausible explanations for a model’s decision-making. However, configuring an explainable model that effectively communicates among computational modules has received less attention. A recently proposed shared global workspace theory showed that networks of distributed modules can benefit from sharing information with a bottlenecked memory because the communication constraints encourage specialization, compositionality, and synchronization among the modules. Inspired by this, we propose Concept-Centric Transformers, a simple yet effective configuration of the shared global workspace for interpretability, consisting of: i) an object-centric-based memory module for extracting semantic concepts from input features, ii) a cross-attention mechanism between the learned concept and input embeddings, and iii) standard classification and explanation losses to allow human analysts to directly assess an explanation for the model’s classification reasoning. We test our approach against other existing concept-based methods on classification tasks for various datasets, including CIFAR100, CUB-200-2011, and ImageNet, and we show that our model not only achieves better classification accuracy than all baselines across all problems but also generates more consistent concept-based explanations of classification output.
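
To make component ii) concrete, the following is a minimal sketch of cross-attention between input embeddings and concept embeddings. It is not the authors' implementation. It assumes that input embeddings act as queries and concept embeddings act as both keys and values, and it omits learned projection matrices; the abstract specifies neither. The names `cross_attention`, `make_embeddings`, `patches`, and `concepts`, along with the sizes and the random data, are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(inputs, concepts):
    """Scaled dot-product cross-attention (illustrative assumption:
    inputs are queries, concepts are keys and values, no projections).

    Returns (outputs, weights), where weights[i][k] is how strongly
    input i attends to concept k.
    """
    dim = len(concepts[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for query in inputs:
        scores = [dot(query, concept) * scale for concept in concepts]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of concept embeddings for this input.
        outputs.append([
            sum(row[k] * concepts[k][j] for k in range(len(concepts)))
            for j in range(dim)
        ])
    return outputs, weights


def make_embeddings(count, dim, rng):
    """Hypothetical random embeddings standing in for learned ones."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
patches = make_embeddings(6, 8, rng)   # e.g. 6 input (patch) embeddings
concepts = make_embeddings(3, 8, rng)  # e.g. 3 concept embeddings
attended, attn = cross_attention(patches, concepts)
```

Each row of `attn` is a probability distribution over concepts, so under these assumptions it can be read as a per-input score of how much each concept contributes, which is the kind of quantity an explanation loss could supervise.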

Award ID(s): 2223839

PAR ID: 10547690

Author(s) / Creator(s): Hong, Jinyung; Park, Keun Hee; Pavlic, Theodore P

Publisher / Repository: IEEE

Date Published: 2024-01-03

ISBN: 979-8-3503-1892-0

Page Range / eLocation ID: 4868–4879

Format(s): Medium: X

Location: Waikoloa, HI, USA

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1109/WACV57701.2024.00481
