imo-bench
Description
IMO-Bench (International Mathematical Olympiad Benchmark) is a suite of advanced mathematical reasoning benchmarks, vetted by top specialists, that targets IMO-level problems to push foundation models beyond easy or short-answer evaluations. It comprises three parts: IMO-AnswerBench (400 diverse Olympiad problems with verifiable short answers), IMO-Proof Bench (basic and advanced proof-writing problems with detailed grading guidelines for automatic grading), and IMO-GradingBench (1,000 human gradings and autograder validation). Together they enable robust testing and automatic evaluation of long-form mathematical reasoning.
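To illustrate what "verifiable short answer" scoring can look like, here is a minimal sketch of an exact-match scorer. It is not the benchmark's autograder: the function names `normalize` and `accuracy` and the normalization rules are assumptions made up for this example, and a real grader would need to handle far more answer formats.

```python
from fractions import Fraction


def normalize(answer: str) -> str:
    """Canonicalize a short answer (hypothetical rules, for illustration only)."""
    # Trim, lowercase, and drop inner spaces so "X + 1" and "x+1" compare equal.
    text = answer.strip().lower().replace(" ", "")
    try:
        # Fraction parses integers, decimals, and "a/b" strings into a reduced
        # exact value, so "2/4" and "0.5" both become "1/2".
        return str(Fraction(text))
    except (ValueError, ZeroDivisionError):
        # Not a plain number: fall back to the cleaned-up string.
        return text


def accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions that match their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)
```

For example, `accuracy(["0.5", "7", "x+1"], ["1/2", "8", "X + 1"])` counts the first and third answers as matches. Plain string and fraction matching cannot recognize equivalent algebraic forms, which is one reason proof-style and long-form answers need grading guidelines instead.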
Implementations (1)
| Environment | Stars | Last Updated |
|---|---|---|
| | 0 | 1 month ago |