frontierscience
Description
FrontierScience is a benchmark for evaluating expert-level scientific reasoning in frontier language models. It comprises two complementary tracks: Olympiad, with problems at the level of the international olympiads (IPhO, IChO, and IBO) authored by medalists and national-team coaches, and Research, with PhD-level, open-ended research sub-tasks assessed via granular rubric-based evaluation. The benchmark contains several hundred questions (including 160 in the open-sourced gold set) spanning physics, chemistry, and biology, on topics ranging from quantum electrodynamics to synthetic organic chemistry.
Implementations (1)
| Environment | Stars | Last Updated |
|---|---|---|
| | 0 | 1 month ago |