MMCircuitEval
Description
MMCircuitEval is the first multimodal benchmark for evaluating LLMs on Electronic Design Automation (EDA) tasks. It contains 3,614 question-answer pairs across 2,871 circuit problems, covering both digital and analog circuits. Questions cover four workflow stages (general knowledge, specifications, front-end design, and back-end design) and include circuit diagram images.
Capabilities
- Multimodal circuit understanding with schematic images
- Evaluation across 4 EDA workflow stages
- Multiple question types: single-choice, multi-choice, fill-in-blank, open-ended
Compute Requirements
Agents are given a standard environment with no sandbox or file system access.
Tasks
Four splits in this environment:
- general: 905 tasks (general circuit knowledge)
- spec: 877 tasks (circuit specifications)
- frontend: 922 tasks (front-end design)
- backend: 910 tasks (back-end design)
Total: 3,614 question-answer pairs across 2,871 circuit problems.
Reward Structure
Single-turn evaluation with LLM-graded rewards. The agent submits an answer via the answer tool. Answers are graded by gpt-5-mini using question-type-specific prompts for single-choice, multi-choice, fill-in-blank, and open-ended questions. Reward is 1.0 if correct, 0.0 if incorrect.
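As a rough illustration of the binary reward scheme (not the benchmark's actual grader), a minimal sketch might dispatch a per-question-type grading prompt and map the grader's verdict to 1.0 or 0.0; the prompt texts and the YES/NO verdict convention here are assumptions:

```python
# Hypothetical sketch of the binary LLM-graded reward. The prompt templates
# and the YES/NO verdict format are assumptions, not MMCircuitEval's code.

GRADING_PROMPTS = {
    "single-choice": "Does the answer select the single correct option? Reply YES or NO.",
    "multi-choice": "Does the answer select exactly the correct set of options? Reply YES or NO.",
    "fill-in-blank": "Does the answer fill the blank with the expected value? Reply YES or NO.",
    "open-ended": "Is the answer substantively correct? Reply YES or NO.",
}

def reward(question_type: str, grader_verdict: str) -> float:
    """Map a grader verdict ('YES'/'NO') for a given question type to a binary reward."""
    if question_type not in GRADING_PROMPTS:
        raise ValueError(f"unknown question type: {question_type}")
    return 1.0 if grader_verdict.strip().upper() == "YES" else 0.0
```

In the real environment the verdict would come from a gpt-5-mini call using the question-type-specific prompt; here it is passed in directly so the mapping is easy to see.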
Data
Four parquet files (~192 MB total) sourced from HuggingFace charlie314159/MMCircuitEval. Stored on the OpenReward platform.
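Assuming the four splits ship as one parquet file each at the top level of the Hugging Face repo (a guess at the layout, not documented above), the files could be read directly with pandas via an `hf://` URI:

```python
# Hypothetical loading sketch. The per-split file names are an assumption
# about the charlie314159/MMCircuitEval repo layout.

SPLITS = ("general", "spec", "frontend", "backend")

def parquet_path(split: str) -> str:
    """Build an hf:// URI that pandas.read_parquet can open (needs huggingface_hub)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split}")
    return f"hf://datasets/charlie314159/MMCircuitEval/{split}.parquet"

# e.g. df = pd.read_parquet(parquet_path("general"))  # requires network access
```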
Tools
| Tool | Description |
|---|---|
| answer | Submit an answer. LLM-graded with question-type-specific evaluation. Ends the episode. |
Time Horizon
Single-turn. The agent reads the multimodal circuit question (text and schematic images) and submits one answer.
Environment Difficulty
MMCircuitEval evaluates circuit understanding across digital and analog domains. Extensive evaluations reveal significant performance gaps among existing LLMs, particularly in back-end design and complex computations. Current models are generally underdeveloped in circuit and EDA fields compared with their performance on general-purpose tasks.
Other Environment Requirements
OpenAI API key required for LLM-based grading. Pass via secrets={"openai_api_key": "..."} when creating a session.
Safety
Agents in MMCircuitEval solve circuit analysis problems in a standard environment. The environment does not present direct safety risks.
Citation
@inproceedings{zhao2025mmcircuiteval,
title={MMCircuitEval: A Comprehensive Multimodal Circuit-Focused Benchmark for Evaluating LLMs},
author={Zhao, Chenchen and Shi, Zhengyuan and Wen, Xiangyu and Liu, Chengjie and Liu, Yi and Zhou, Yunhao and Zhao, Yuxiang and Feng, Hefei and Zhu, Yinan and Wan, Gwok-Waa and Cheng, Xin and Chen, Weiyu and Fu, Yongqi and Chen, Chujie and Xue, Chenhao and Sun, Guangyu and Wang, Ying and Lin, Yibo and Yang, Jun and Xu, Ning and Wang, Xi and Xu, Qiang},
booktitle={ICCAD},
year={2025}
}