FrontierMath

Name: EpochAI/FrontierMath
Author: EpochAI

Description

FrontierMath is a benchmark of hundreds of original, expert-vetted, exceptionally challenging mathematics problems spanning the major branches of modern mathematics. The problems are new and unpublished, and answers are checked by automated verification, which minimizes the risk of data contamination. A typical problem takes an expert hours to days to solve, and state-of-the-art models at the time of the benchmark's release solved under 2% of the problems. FrontierMath therefore provides a rigorous testbed for quantifying AI progress toward expert-level mathematical ability.
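To make "automated verification" and the solve-rate figure concrete, here is a minimal sketch in Python. It is not the benchmark's actual grader: it assumes each answer is an exact integer or rational written as a string, and the names `verify` and `solve_rate` are hypothetical helpers introduced only for illustration.

```python
from fractions import Fraction


def verify(submitted: str, reference: str) -> bool:
    """Return True only if the submitted answer equals the reference exactly.

    Assumption: answers are integers or rationals such as "42" or "3/7".
    Exact rational arithmetic avoids floating-point tolerance issues.
    """
    try:
        return Fraction(submitted.strip()) == Fraction(reference.strip())
    except (ValueError, ZeroDivisionError):
        # Unparseable or malformed answers count as incorrect.
        return False


def solve_rate(results: list[bool]) -> float:
    """Fraction of problems solved; 0.0 for an empty result list."""
    return sum(results) / len(results) if results else 0.0


if __name__ == "__main__":
    # Hypothetical (submitted, reference) pairs, not real benchmark data.
    pairs = [("3/6", "1/2"), ("41", "42"), ("n/a", "7"), ("10", "11")]
    results = [verify(s, r) for s, r in pairs]
    print(f"solve rate: {solve_rate(results):.0%}")
```

Because the comparison is exact, a response is either right or wrong with no partial credit, which is what lets a score be computed without human grading.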

arXiv
