verisciqa

Description

VeriSciQA is a benchmark for Scientific Visual Question Answering (SVQA). It is built with a verification-centric Generate-then-Verify framework, which yields a dataset of 20,351 QA pairs covering 20 scientific domains and 12 figure types. The framework enforces cross-modal consistency to filter out errors in model-synthesized QA pairs. The benchmark reveals a large accuracy gap between leading open-source models (64%) and a proprietary model (82%), and it serves as a scalable resource for improving the SVQA performance of large vision-language models (LVLMs) via fine-tuning.
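The Generate-then-Verify idea can be illustrated with a minimal sketch: candidate QA pairs are generated first, and only those that pass every verification check are kept. Everything named here is hypothetical and for illustration only: the `QAPair` fields, the `FIGURE_LABELS` lookup standing in for figure content, and both checks are assumptions, not the benchmark's actual pipeline or its cross-modal consistency test.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class QAPair:
    """A candidate QA pair tied to one figure (hypothetical schema)."""
    figure_id: str
    question: str
    answer: str


# Toy stand-in for what is visible in each figure (assumed data).
FIGURE_LABELS = {"fig1": {"A", "B", "C"}}

Verifier = Callable[[QAPair], bool]


def has_question(qa: QAPair) -> bool:
    """Reject candidates whose question text is empty."""
    return bool(qa.question.strip())


def answer_in_figure(qa: QAPair) -> bool:
    """Toy consistency check: the answer must be a label present in the figure."""
    return qa.answer in FIGURE_LABELS.get(qa.figure_id, set())


def generate_then_verify(
    candidates: Iterable[QAPair], verifiers: List[Verifier]
) -> List[QAPair]:
    """Keep only the candidates that pass every verifier."""
    return [qa for qa in candidates if all(check(qa) for check in verifiers)]


# Two synthesized candidates; the second names a label absent from the figure.
candidates = [
    QAPair("fig1", "Which method has the highest bar?", "B"),
    QAPair("fig1", "Which method has the highest bar?", "D"),
]
kept = generate_then_verify(candidates, [has_question, answer_in_figure])
```

In this toy run only the first candidate survives, because its answer refers to something the figure actually contains. A real verifier would presumably compare the QA pair against the figure image and its accompanying text rather than a lookup table.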

Implementations (1)
Environment: GeneralReasoning/VeriSciQA
Stars: 0
Last updated: 1 month ago