MMCircuitEval

Description

MMCircuitEval is the first multimodal benchmark for evaluating LLMs on Electronic Design Automation (EDA) tasks. It contains 3,614 question-answer pairs across 2,871 circuit problems, covering both digital and analog circuits. Questions span four stages (general knowledge, specifications, front-end design, and back-end design) and are accompanied by circuit diagram images.

Capabilities

  • Multimodal circuit understanding with schematic images
  • Evaluation across 4 EDA workflow stages
  • Multiple question types: single-choice, multi-choice, fill-in-blank, open-ended
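
Because single-choice and multi-choice answers can be written many ways ("b)", "A, C", "CA"), a grader typically normalizes them before comparison. A minimal sketch, assuming nothing about the environment's actual code (the helper names below are illustrative):

```python
# Illustrative answer normalization for choice questions; helper names are
# assumptions, not the environment's actual implementation.

def normalize_choice(answer: str) -> str:
    """Reduce a single-choice answer like ' b) ' to the bare letter 'B'."""
    return answer.strip().strip("().").upper()[:1]

def normalize_multi_choice(answer: str) -> str:
    """Sort selected option letters so 'CA' and 'A, C' compare equal."""
    return "".join(sorted(c.upper() for c in answer if c.isalpha()))
```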

Compute Requirements

Agents are given a standard environment with no sandbox or file system access.

Tasks

This environment provides four splits:

  • general: 905 tasks (general circuit knowledge)
  • spec: 877 tasks (circuit specifications)
  • frontend: 922 tasks (front-end design)
  • backend: 910 tasks (back-end design)

Total: 3,614 question-answer pairs across 2,871 circuit problems.
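
As a quick consistency check, the per-split counts listed above sum to the stated total (counts copied from this README):

```python
# Per-split task counts as listed in this README.
SPLIT_SIZES = {"general": 905, "spec": 877, "frontend": 922, "backend": 910}

total = sum(SPLIT_SIZES.values())
assert total == 3614  # matches the stated number of question-answer pairs
```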

Reward Structure

Single-turn evaluation with LLM-graded rewards. The agent submits an answer via the answer tool. Answers are graded by gpt-5-mini using question-type-specific prompts for single-choice, multi-choice, fill-in-blank, and open-ended questions. Reward is 1.0 if correct, 0.0 if incorrect.
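
The question-type-specific grading described above can be sketched as a prompt dispatch table. This is a minimal sketch under stated assumptions: the prompt wording, function names, and judge interface are illustrative, not the environment's actual prompts, and the judge is passed in as a plain callable so any model wrapper can be used.

```python
# Illustrative sketch of question-type-specific LLM grading; prompt wording
# and the judge interface are assumptions, not the environment's actual code.

GRADING_PROMPTS = {
    "single-choice": "Does the submitted option match the reference option?",
    "multi-choice": "Does the submitted option set exactly match the reference set?",
    "fill-in-blank": "Is the submitted value equivalent to the reference value?",
    "open-ended": "Is the submission factually consistent with the reference answer?",
}

def grade(question_type: str, submission: str, reference: str, judge) -> float:
    """Return 1.0 if the judge deems the answer correct, else 0.0.

    `judge` is any callable taking a prompt string and returning the judge
    model's reply, e.g. a wrapper around a chat-completion API.
    """
    prompt = (
        f"{GRADING_PROMPTS[question_type]}\n"
        f"Submission: {submission}\n"
        f"Reference: {reference}\n"
        "Reply with yes or no."
    )
    verdict = judge(prompt)
    return 1.0 if verdict.strip().lower().startswith("yes") else 0.0
```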

Data

Four parquet files (~192 MB total) sourced from the Hugging Face dataset charlie314159/MMCircuitEval. Stored on the OpenReward platform.

Tools

  • answer: Submit an answer. LLM-graded with question-type-specific evaluation. Ends the episode.

Time Horizon

Single-turn. The agent reads the multimodal circuit question (text and schematic images) and submits one answer.

Environment Difficulty

MMCircuitEval evaluates circuit understanding across digital and analog domains. Extensive evaluations reveal significant performance gaps among existing LLMs, particularly in back-end design and complex computations. Current models are generally underdeveloped in circuit and EDA fields compared to general-purpose tasks.

Other Environment Requirements

OpenAI API key required for LLM-based grading. Pass via secrets={"openai_api_key": "..."} when creating a session.
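
A minimal sketch of wiring the key in from the environment; the session-creation call shown in the comment is hypothetical, so consult the OpenReward client documentation for the real API:

```python
import os

# Read the key from the conventional OPENAI_API_KEY environment variable.
secrets = {"openai_api_key": os.environ.get("OPENAI_API_KEY", "")}

# Hypothetical call shape; the actual OpenReward client API may differ:
# session = client.create_session("GeneralReasoning/MMCircuitEval", secrets=secrets)
```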

Safety

Agents in MMCircuitEval solve circuit analysis problems in a standard environment. The environment does not present direct safety risks.

Citation

@inproceedings{zhao2025mmcircuiteval,
  title={MMCircuitEval: A Comprehensive Multimodal Circuit-Focused Benchmark for Evaluating LLMs},
  author={Zhao, Chenchen and Shi, Zhengyuan and Wen, Xiangyu and Liu, Chengjie and Liu, Yi and Zhou, Yunhao and Zhao, Yuxiang and Feng, Hefei and Zhu, Yinan and Wan, Gwok-Waa and Cheng, Xin and Chen, Weiyu and Fu, Yongqi and Chen, Chujie and Xue, Chenhao and Sun, Guangyu and Wang, Ying and Lin, Yibo and Yang, Jun and Xu, Ning and Wang, Xi and Xu, Qiang},
  booktitle={ICCAD},
  year={2025}
}