superchem
Description
SUPERChem is a benchmark of 500 expert-curated, reasoning-intensive chemistry problems, provided in both multimodal and text-only formats, for evaluating expert-level, multi-step chemical reasoning in LLMs. Each problem is paired with an expert-authored solution path, which enables Reasoning Path Fidelity (RPF) scoring. The dataset is built with an iterative curation pipeline intended to eliminate flawed items and reduce data contamination.
Implementations (1)
| Environment | Stars | Last Updated |
|---|---|---|
| | 0 | 1 month ago |