superchem

Description

SUPERChem is a benchmark of 500 expert-curated, reasoning-intensive chemistry problems provided in both multimodal and text-only formats to evaluate LLMs' expert-level, multi-step chemical reasoning. Each problem is paired with an expert-authored solution path enabling Reasoning Path Fidelity (RPF) scoring, and the dataset uses an iterative curation pipeline to eliminate flawed items and reduce data contamination.

Implementations (1)
Environment                   Stars   Last Updated
GeneralReasoning/SuperCHEM    0       1 month ago