MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems

Zhu, Zifeng; Jia, Mengzhao; Zhang, Zhihan; Li, Lang; Jiang, Meng

Citation Details

Multimodal Large Language Models (MLLMs) have demonstrated impressive abilities across various tasks, including visual question answering and chart comprehension, yet existing benchmarks for chart-related tasks fall short in capturing the complexity of real-world multi-chart scenarios. Current benchmarks primarily focus on single-chart tasks, neglecting the multi-hop reasoning required to extract and integrate information from multiple charts, which is essential in practical applications. To fill this gap, we introduce MultiChartQA, a benchmark that evaluates MLLMs’ capabilities in four key areas: direct question answering, parallel question answering, comparative reasoning, and sequential reasoning. Our evaluation of a wide range of MLLMs reveals significant performance gaps compared to humans. These results highlight the challenges in multi-chart comprehension and the potential of MultiChartQA to drive advancements in this field. Our code and data are available at https://github.com/Zivenzhu/Multi-chart-QA. more »

Award ID(s):: 2234058 2137396 2142827 2119531 1901059

PAR ID:: 10590837

Author(s) / Creator(s):: Zhu, Zifeng; Jia, Mengzhao; Zhang, Zhihan; Li, Lang; Jiang, Meng

Publisher / Repository:: Association for Computational Linguistics

Date Published:: 2025-04-27

ISBN:: 979-8-89176-189-6

Format(s):: Medium: X

Location:: Albuquerque, New Mexico

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper:
The DOI is not currently available.

More Like this