polymath
Description
PolyMath is a multilingual mathematical reasoning benchmark covering 18 languages and four difficulty levels, ordered from easy to hard. It is designed for comprehensive difficulty coverage, language diversity, and high-quality translations, making it a highly discriminative testbed for evaluating the reasoning capabilities and language consistency of large language models.
Implementations (1)
| Environment | Stars | Last Updated | |
|---|---|---|---|
| | 0 | 1 month ago | |