ConceptX: A Framework for Latent Concept Analysis

Alam, Firoj; Dalvi, Fahim; Durrani, Nadir; Sajjad, Hassan; Khan, Abdul Rafae; Xu, Jia

doi:10.1609/aaai.v37i13.27057

The opacity of deep neural networks remains a challenge in deploying solutions where explanation is as important as precision. We present ConceptX, a human-in-the-loop framework for interpreting and annotating the latent representational space of pre-trained Language Models (pLMs). We use an unsupervised method to discover the concepts learned in these models and provide a graphical interface through which humans can generate explanations for those concepts. To facilitate the process, we provide auto-annotations of the concepts (based on traditional linguistic ontologies). Such annotations enable the development of a linguistic resource that directly represents the latent concepts learned within deep NLP models. These include not just traditional linguistic concepts, but also task-specific or sensitive concepts (words grouped by gender or religious connotation) that help annotators mark bias in the model. The framework consists of two parts: (i) concept discovery and (ii) an annotation platform.

Award ID(s): 2113906

PAR ID: 10514665

Author(s) / Creator(s): Alam, Firoj; Dalvi, Fahim; Durrani, Nadir; Sajjad, Hassan; Khan, Abdul Rafae; Xu, Jia

Publisher / Repository: The AAAI Conference on Artificial Intelligence

Date Published: 2023-06-27

Journal Name: Proceedings of the AAAI Conference on Artificial Intelligence

Volume: 37

Issue: 13

ISSN: 2159-5399

Page Range / eLocation ID: 16395 to 16397

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1609/aaai.v37i13.27057
