chartqa

Description

A new dataset for chart question answering (CQA) constructed from visualization notebooks is a benchmark featuring real-world, multi-view charts paired with natural language questions grounded in analytical narratives. It reflects ecologically valid reasoning workflows and reveals a significant performance gap for state-of-the-art multimodal large language models (GPT-4.1 scored 69.3%), underscoring the challenges of this more authentic CQA setting.

Leaderboard
Loading leaderboard...
Implementations

No implementations linked yet. Add one to showcase related work.

arXiv/chartqa | OpenReward