Publishers are increasingly using graphical abstracts to facilitate scientific search, especially across disciplinary boundaries. They are presented on various media, easily shared, and information rich. However, only a small fraction of scientific publications are equipped with graphical abstracts. What can we do with the vast majority of papers that have no selected graphical abstract? In this paper, we first hypothesize that scientific papers actually include a "central figure" that serves as a graphical abstract. These figures convey the key results and provide a visual identity for the paper. Using survey data collected from 6,263 authors regarding 8,353 papers over 15 years, we find that over 87% of papers are considered to contain a central figure, and that these central figures are primarily used to summarize important results, explain the key methods, or provide additional discussion. We then train a model to automatically recognize the central figure, achieving top-3 accuracy of 78% and exact match accuracy of 34%. We find that the primary boost in accuracy comes from figure captions that resemble the abstract. We make all our data and results publicly available at https://github.com/viziometrics/centraul_figure. Our goal is to automate central figure identification to improve search engine performance and to help scientists connect ideas across the literature.
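Since the reported accuracy gain comes mainly from captions that resemble the abstract, a minimal sketch of that idea is shown below: ranking a paper's figures by TF-IDF cosine similarity between each caption and the abstract. This is an illustrative baseline, not the authors' released model; the function name and scoring choice are assumptions.

```python
# Illustrative baseline: rank a paper's figures as central-figure candidates by
# how closely each caption resembles the abstract (TF-IDF cosine similarity).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_central_figure_candidates(abstract: str, captions: list[str]) -> list[tuple[int, float]]:
    """Return (figure_index, similarity) pairs sorted from most to least abstract-like."""
    vectorizer = TfidfVectorizer(stop_words="english")
    vectors = vectorizer.fit_transform([abstract] + captions)
    scores = cosine_similarity(vectors[0], vectors[1:]).ravel()
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)

# The top-ranked index would be the predicted central figure.
```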
MedICaT: A Dataset of Medical Images, Captions, and Textual References
Understanding the relationship between figures and text is key to scientific document understanding. Medical figures in particular are quite complex, often consisting of several subfigures (75% of figures in our dataset), with detailed text describing their content. Previous work studying figures in scientific papers focused on classifying figure content rather than understanding how images relate to the text. To address challenges in figure retrieval and figure-to-text alignment, we introduce MedICaT, a dataset of medical images in context. MedICaT consists of 217K images from 131K open access biomedical papers, and includes captions, inline references for 74% of figures, and manually annotated subfigures and subcaptions for a subset of figures. Using MedICaT, we introduce the task of subfigure to subcaption alignment in compound figures and demonstrate the utility of inline references in image-text matching. Our data and code can be accessed at https://github.com/allenai/medicat.
- Award ID(s): 1817183
- PAR ID: 10462672
- Date Published:
- Journal Name: Findings of the Association for Computational Linguistics: EMNLP
- Page Range / eLocation ID: 2112 to 2120
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
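As a hedged illustration of the subfigure-to-subcaption alignment task introduced with MedICaT (not the dataset's released code), the sketch below recovers a one-to-one assignment from a matrix of subfigure/subcaption similarity scores; in practice those scores would come from an image-text matching model trained on the data.

```python
# Align subfigures to subcaptions of one compound figure given pairwise similarity
# scores, by solving the assignment problem that maximizes total similarity.
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_subfigures(similarity: np.ndarray) -> list[tuple[int, int]]:
    """similarity[i, j] scores subfigure i against subcaption j (higher = better).
    Returns (subfigure_index, subcaption_index) pairs maximizing total similarity."""
    rows, cols = linear_sum_assignment(-similarity)  # negate: the solver minimizes cost
    return list(zip(rows.tolist(), cols.tolist()))

# Toy example with three subfigures and three subcaptions.
print(align_subfigures(np.array([[0.9, 0.1, 0.2],
                                 [0.2, 0.8, 0.3],
                                 [0.1, 0.4, 0.7]])))
```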
More Like this
Biomedical images are crucial for diagnosing and planning treatments, as well as advancing scientific understanding of various ailments. To effectively highlight regions of interest (RoIs) and convey medical concepts, annotation markers like arrows, letters, or symbols are employed. However, annotating these images with appropriate medical labels poses a significant challenge. In this study, we propose a framework that leverages multimodal input features, including text/label features and visual features, to facilitate accurate annotation of biomedical images with multiple labels. Our approach integrates state-of-the-art models such as ResNet50 and Vision Transformers (ViT) to extract informative features from the images. Additionally, we employ the distilled GPT-2 model (DistilGPT2, a Transformer-based natural language processing architecture) to extract textual features, leveraging its natural language understanding capabilities. This combination of image and text modalities allows for a more comprehensive representation of the biomedical data, leading to improved annotation accuracy. By combining the features extracted from both image and text modalities, we trained a simplified Convolutional Neural Network (CNN) based multi-classifier to learn the image-text relations and predict multiple labels for multimodal radiology images. We used the ImageCLEFmedical 2022 and 2023 datasets to demonstrate the effectiveness of our framework. These datasets contain a diverse range of biomedical images, enabling evaluation of the framework's performance under realistic conditions. We achieved promising results with an F1 score of 0.508. Our proposed framework shows potential for annotating biomedical images with multiple labels, contributing to improved image understanding and analysis in the medical image processing domain.
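The fusion idea described above, image features from a pretrained backbone concatenated with Transformer text features and fed to a small multi-label head, can be sketched as follows. This is a minimal, assumed architecture for illustration (the paper's own classifier is described as CNN-based and also uses ViT features), not the authors' implementation.

```python
# Minimal sketch: ResNet50 image features + DistilGPT2 text features -> multi-label logits.
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights
from transformers import AutoModel

class MultimodalAnnotator(nn.Module):
    def __init__(self, num_labels: int):
        super().__init__()
        backbone = resnet50(weights=ResNet50_Weights.DEFAULT)
        self.image_encoder = nn.Sequential(*list(backbone.children())[:-1])  # 2048-d pooled features
        self.text_encoder = AutoModel.from_pretrained("distilgpt2")          # 768-d hidden states
        self.classifier = nn.Sequential(
            nn.Linear(2048 + 768, 512), nn.ReLU(), nn.Linear(512, num_labels)
        )

    def forward(self, pixel_values, input_ids, attention_mask):
        img = self.image_encoder(pixel_values).flatten(1)                    # (B, 2048)
        hidden = self.text_encoder(input_ids=input_ids,
                                   attention_mask=attention_mask).last_hidden_state
        txt = hidden.mean(dim=1)                                             # (B, 768), mean-pooled
        return self.classifier(torch.cat([img, txt], dim=1))                 # multi-label logits

# Training would apply nn.BCEWithLogitsLoss to these logits against multi-hot label vectors.
```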
Research summary: To what extent do firms rely on basic science in their R&D efforts? Several scholars have sought to answer this and related questions, but progress has been impeded by the difficulty of matching unstructured references in patents to published papers. We introduce an open-access dataset of references from the front pages of patents granted worldwide to scientific papers published since 1800. Each patent-paper linkage is assigned a confidence score, which is characterized in a random sample by false negatives versus false positives. All matches are available for download at http://relianceonscience.org. We outline several avenues for strategy research enabled by these new data.
Managerial summary: To what extent do firms rely on basic science in their R&D efforts? Several scholars have sought to answer this and related questions, but progress has been impeded by the difficulty of matching unstructured references in patents to published papers. We introduce an open-access dataset of references from the front pages of patents granted worldwide to scientific papers published since 1800. Each patent-paper linkage is assigned a confidence score, and we check a random sample of these confidence scores by hand in order to estimate both coverage (i.e., of the matches we should have found, what percentage did we find?) and accuracy (i.e., of the matches we found, what percentage are correct?). We outline several avenues for strategy research enabled by these new data.
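The two sample-based quality estimates described above, accuracy and coverage of the patent-paper matches, reduce to simple proportions; the sketch below uses hypothetical numbers purely for illustration.

```python
# Illustrative calculation of accuracy (precision of reported matches) and
# coverage (recall against matches that should have been found) from a hand-checked sample.
def accuracy_from_sample(checked_matches: list[bool]) -> float:
    """Of the matches we found, what fraction are correct?"""
    return sum(checked_matches) / len(checked_matches)

def coverage_from_sample(found: int, missed: int) -> float:
    """Of the matches that should have been found, what fraction did we find?"""
    return found / (found + missed)

# Hypothetical sample: 92 of 100 checked links verified correct; 85 of 100 true links recovered.
print(accuracy_from_sample([True] * 92 + [False] * 8))  # 0.92
print(coverage_from_sample(found=85, missed=15))         # 0.85
```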
Biocuration is the process of analyzing biological or biomedical articles to organize biological data into data repositories using taxonomies and ontologies. Due to the expanding number of articles and the relatively small number of biocurators, automation is desired to improve the workflow of assessing which articles are worth curating. As figures convey essential information, automatically integrating images may improve curation. In this work, we instantiate and evaluate a first-in-kind, hybrid image+text document search system for biocuration. The system, MouseScholar, leverages an image modality taxonomy derived in collaboration with biocurators, in addition to figure segmentation and classifier components as a back-end, and a streamlined front-end interface to search and present document results. We formally evaluated the system with ten biocurators on a mouse genome informatics biocuration dataset and collected feedback. The results demonstrate the benefits of blending text and image information when presenting scientific articles for biocuration.
The ability to understand visual concepts and to replicate and compose these concepts from images is a central goal for computer vision. Recent advances in text-to-image (T2I) models have led to high-definition, realistic image generation by learning from large databases of images and their descriptions. However, the evaluation of T2I models has focused on photorealism and limited qualitative measures of visual understanding. To quantify the ability of T2I models to learn and synthesize novel visual concepts (a.k.a. personalized T2I), we introduce ConceptBed, a large-scale dataset that consists of 284 unique visual concepts and 33K composite text prompts. Along with the dataset, we propose an evaluation metric, Concept Confidence Deviation (CCD), that uses the confidence of oracle concept classifiers to measure the alignment between concepts generated by T2I generators and concepts contained in target images. We evaluate visual concepts that are either objects, attributes, or styles, and also evaluate four dimensions of compositionality: counting, attributes, relations, and actions. Our human study shows that CCD is highly correlated with human understanding of concepts. Our results point to a trade-off between learning the concepts and preserving the compositionality, which existing approaches struggle to overcome. The data, code, and interactive demo are available at: https://conceptbed.github.io/
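A hedged sketch of the intuition behind Concept Confidence Deviation (CCD) follows: compare an oracle concept classifier's confidence on real target images of a concept with its confidence on images generated for that concept. The exact formulation may differ from the paper's; the released implementation is at https://conceptbed.github.io/.

```python
# Sketch of the CCD intuition: the drop in oracle-classifier confidence between
# target images of a concept and images generated for that concept.
import numpy as np

def concept_confidence_deviation(conf_target: np.ndarray, conf_generated: np.ndarray) -> float:
    """Mean drop in oracle-classifier confidence from target to generated images;
    values near zero suggest the generator preserved the concept."""
    return float(np.mean(conf_target) - np.mean(conf_generated))

# Toy example: the oracle is ~95% confident on real images but ~70% on generations.
print(concept_confidence_deviation(np.array([0.96, 0.94, 0.95]),
                                   np.array([0.72, 0.65, 0.73])))
```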