SUPERChem

Description

SUPERChem is an environment for evaluating multimodal chemistry reasoning with 500 expert-curated problems. Questions feature molecular structure images and cover four core domains: Structure and Properties, Reaction and Synthesis, Principles and Calculations, and Experimental Design and Analysis. Answers are multiple choice (A-H).

Capabilities

  • Multimodal chemistry reasoning with molecular structure images
  • Multiple-choice evaluation across 4 chemistry domains
  • Expert-curated questions from non-public examinations

Compute Requirements

Agents are given a standard environment with no sandbox or file system access.

License

MIT.

Tasks

There is one split in this environment:

  • test: 500 tasks

Questions span four chemistry domains: Structure and Properties, Reaction and Synthesis, Principles and Calculations, and Experimental Design and Analysis.

Reward Structure

Evaluation is single-turn with deterministic grading. The agent submits a single-letter answer (A-H) via the submit_answer tool, and the submission is compared against the ground truth by exact match. The reward is 1.0 if the answer is correct and 0.0 otherwise.
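The grading rule can be sketched in a few lines of Python. This is an illustration only, not the environment's actual code: the function name `grade` and the handling of malformed input (lowercase letters, multi-letter strings) are assumptions, since the description above only states exact match against the ground truth.

```python
# Hypothetical sketch of the exact-match reward; names are illustrative.
VALID_LETTERS = frozenset("ABCDEFGH")


def grade(submitted: str, ground_truth: str) -> float:
    """Return 1.0 when the submitted letter equals the ground truth, else 0.0."""
    if submitted not in VALID_LETTERS:
        # Assumption: anything other than a single letter A-H scores zero.
        return 0.0
    return 1.0 if submitted == ground_truth else 0.0


if __name__ == "__main__":
    print(grade("C", "C"))  # matching letter
    print(grade("D", "C"))  # wrong letter
    print(grade("c", "C"))  # lowercase is not normalized in this sketch
```

Whether the real environment normalizes case or whitespace before comparing is not stated, so this sketch keeps the comparison strict.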

Data

The data file is SUPERChem-500.parquet (40.5 MB, 500 problems), sourced from the Hugging Face dataset ZehuaZhao/SUPERChem and stored on the OpenReward platform.

Tools

Tool           Description
submit_answer  Submit a single letter answer (A-H). Deterministic evaluation via exact match. Ends the episode.
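To make the "ends the episode" behavior concrete, here is a minimal sketch of a single-turn episode. The class `SingleTurnEpisode` and its fields are hypothetical and do not come from the OpenReward API; only the tool name `submit_answer` and its one-shot, exact-match semantics are taken from the table above.

```python
from dataclasses import dataclass


@dataclass
class SingleTurnEpisode:
    """Hypothetical model of one task: a single submission, then done."""

    ground_truth: str
    done: bool = False
    reward: float = 0.0

    def submit_answer(self, answer: str) -> float:
        # Assumption: a second submission is rejected rather than re-graded.
        if self.done:
            raise RuntimeError("episode already ended")
        self.reward = 1.0 if answer == self.ground_truth else 0.0
        self.done = True  # the first submission terminates the episode
        return self.reward


if __name__ == "__main__":
    episode = SingleTurnEpisode(ground_truth="B")
    print(episode.submit_answer("B"), episode.done)
```

Because the first call sets `done`, an agent gets no opportunity to revise its answer, which matches the single-turn design.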

Time Horizon

Single-turn. The agent reads the multimodal chemistry question (text and molecular images) and submits one answer.

Environment Difficulty

SUPERChem evaluates multimodal chemistry reasoning at expert level:

Model                              Accuracy
GPT-5 (High)                       38.5%
Human (2nd-year chemistry majors)  40.3%

Frontier models struggle most on higher-order reasoning tasks, particularly predicting product structures, elucidating reaction mechanisms, and analyzing structure-activity relationships.
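With binary rewards, an accuracy figure is just the mean reward over the evaluated tasks, and a per-domain breakdown helps locate where a model is weakest. The sketch below assumes results are available as (domain, reward) pairs; the function names and the sample data are made up for illustration and are not real SUPERChem results.

```python
from collections import defaultdict


def overall_accuracy(results):
    """Mean reward over all (domain, reward) pairs; 0.0 for an empty list."""
    rewards = [reward for _, reward in results]
    return sum(rewards) / len(rewards) if rewards else 0.0


def accuracy_by_domain(results):
    """Mean reward per domain for (domain, reward) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for domain, reward in results:
        totals[domain] += reward
        counts[domain] += 1
    return {domain: totals[domain] / counts[domain] for domain in counts}


if __name__ == "__main__":
    # Illustrative rewards only, not measured results.
    sample = [
        ("Reaction and Synthesis", 1.0),
        ("Reaction and Synthesis", 0.0),
        ("Structure and Properties", 1.0),
        ("Principles and Calculations", 0.0),
    ]
    print(overall_accuracy(sample))
    print(accuracy_by_domain(sample))
```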

Other Environment Requirements

There are no further environment requirements; SUPERChem works out of the box with the OpenReward endpoint without any external API keys.

Safety

Agents in SUPERChem solve chemistry reasoning problems in a standard environment. The environment does not present direct safety risks.

Citation

@article{zhao2025superchem,
  title={SUPERChem: A Multimodal Reasoning Benchmark in Chemistry},
  author={Zhao, Zehua and Huang, Zhixian and Li, Junren and Lin, Siyu and Zhou, Junting and Cao, Fengqi and Zhou, Kun and Ge, Rui and Long, Tingting and Zhu, Yuexiang and Liu, Yan and Zheng, Jie and Wei, Junnian and Zhu, Rong and Zou, Peng and Li, Wenyu and Cheng, Zekai and Ding, Tian and Wang, Yaxuan and Yan, Yizhao and Wei, Tingru and Ming, Haowei and Mao, Weijie and Sun, Chen and Liu, Yiming and Wang, Zichen and Zhang, Zuo and Yang, Tong and Ma, Hao and Gao, Zhen and Pei, Jian},
  journal={arXiv preprint arXiv:2512.01274},
  year={2025}
}