Concept Detection and Caption Prediction in ImageCLEFmedical Caption 2023 with Convolutional Neural Networks; Vision and Text-to-Text Transfer Transformers

Hasan M; Layode O; Rahman M.


This work describes the participation of CS_Morgan in the Concept Detection and Caption Prediction tasks of the ImageCLEFmedical 2023 Caption benchmark evaluation campaign. The goal of these tasks is to automatically identify relevant concepts and their locations in images and to generate coherent captions for the images. The dataset is a subset of the extended Radiology Objects in Context (ROCO) dataset. Our approach used pre-trained Convolutional Neural Network (CNN), Vision Transformer (ViT), and Text-to-Text Transfer Transformer (T5) architectures for concept detection and caption generation. In the Concept Detection task, the objective was to classify the multiple concepts associated with each image. We used several deep learning architectures with ‘sigmoid’ activation in the Keras framework to enable multilabel classification. We submitted five runs for this task, and the best run achieved an F1 score of 0.4834. For the Caption Prediction task, we submitted eight runs that combined the ViT and T5 models to generate captions. This task is ranked by BERTScore, and our best run achieved a score of 0.5819 by generating captions with the fine-tuned T5 model from keywords produced by the pre-trained ViT used as the encoder.
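As an illustration of the multilabel setup mentioned in the abstract, the following is a minimal sketch in plain Python, not the authors' Keras implementation. It shows why a sigmoid output suits multilabel classification: each concept receives an independent probability, so any number of concepts can be assigned to one image. The concept identifiers, the logits, and the 0.5 threshold are hypothetical values chosen for the example, and the per-image F1 shown here is one common variant; the abstract does not state which F1 definition the campaign uses.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_concepts(logits, concept_ids, threshold=0.5):
    """Keep every concept whose sigmoid probability reaches the threshold.

    Unlike softmax, the probabilities are not forced to sum to 1,
    so several concepts can be selected for the same image.
    The 0.5 threshold is an assumption for this sketch.
    """
    return {c for c, z in zip(concept_ids, logits) if sigmoid(z) >= threshold}


def f1(predicted: set, truth: set) -> float:
    """F1 between the predicted and ground-truth concept sets of one image."""
    if not predicted and not truth:
        return 1.0
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical concept identifiers and model outputs for a single image.
concept_ids = ["C_A", "C_B", "C_C", "C_D"]
logits = [2.0, -1.0, 0.3, -3.0]

predicted = predict_concepts(logits, concept_ids)
truth = {"C_A", "C_B", "C_C"}
score = f1(predicted, truth)
print(sorted(predicted), round(score, 4))
```

In a Keras model, the same idea would typically correspond to a final dense layer with one sigmoid unit per concept, trained with a binary cross-entropy loss; that pairing is a standard choice and is assumed here rather than taken from the record.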

Award ID(s): 2131207

PAR ID: 10476132

Author(s) / Creator(s): Hasan M; Layode O; Rahman M.

Editor(s): Aliannejadi, M; Faggioli, G; Ferro, N; Vlachos, M.

Publisher / Repository: https://ceur-ws.org/Vol-3497/

Date Published: 2023-10-04

Journal Name: Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023)

ISSN: 1613-0073

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text: Accepted Manuscript 1.0
Conference Proceeding: The DOI is not currently available.
