frontierscience

Description

FrontierScience is a benchmark for evaluating expert-level scientific reasoning in frontier language models. It comprises two complementary tracks: Olympiad, which contains international-olympiad-level problems (IPhO, IChO, and IBO) authored by medalists and national-team coaches, and Research, which contains PhD-level, open-ended research sub-tasks graded against granular rubrics. The benchmark contains several hundred questions, including 160 in the open-sourced gold set, spanning physics, chemistry, and biology, with topics ranging from quantum electrodynamics to synthetic organic chemistry.
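To make the idea of rubric-based grading concrete, here is a minimal sketch of how a response might be scored against a weighted rubric. It is an illustration only, not the benchmark's actual grader: the `RubricItem` type, the `rubric_score` function, the point values, and the example criteria are all hypothetical, and the step that decides whether each criterion is satisfied (for example, a human or model judge) is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RubricItem:
    """One gradable criterion (hypothetical structure, not the benchmark's schema)."""
    description: str
    points: float


def rubric_score(items: Sequence[RubricItem], satisfied: Sequence[bool]) -> float:
    """Return the fraction of rubric points earned, in the range [0, 1]."""
    if len(items) != len(satisfied):
        raise ValueError("need exactly one satisfied flag per rubric item")
    total = sum(item.points for item in items)
    if total <= 0:
        raise ValueError("rubric must carry a positive number of points")
    earned = sum(item.points for item, ok in zip(items, satisfied) if ok)
    return earned / total


# Hypothetical rubric for a single research sub-task.
example_rubric = [
    RubricItem("Identifies the relevant physical mechanism", 4.0),
    RubricItem("Sets up the governing equations correctly", 3.0),
    RubricItem("Reaches a quantitatively correct conclusion", 3.0),
]

# Assumed judge output: the first and third criteria were met.
example_score = rubric_score(example_rubric, [True, False, True])
```

Scoring each criterion separately, rather than marking the whole answer right or wrong, is what lets partial progress on an open-ended task show up in the score.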

Implementations (1)
Environment                         Stars   Last Updated
GeneralReasoning/FrontierScience    0       1 month ago
OpenAI/frontierscience | OpenReward