mmlu-prox
Description
MMLU-ProX is a benchmark for assessing cross-linguistic reasoning in LLMs across 29 languages. It builds on an English benchmark, and each language version contains the same 11,829 questions (plus a lite version of 658 questions per language), which enables direct comparison across languages. The translations were produced by multiple powerful LLMs and then reviewed by experts. An evaluation of 36 state-of-the-art LLMs on the benchmark reveals substantial performance drops in low-resource languages.
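Because every language version shares the same questions, per-language accuracy can be compared directly against English. The following is a minimal sketch of that comparison, not the benchmark's official scoring code: the function names, the `(language, is_correct)` record format, the language codes, and the toy results are all assumptions made for illustration.

```python
from collections import defaultdict

# Sizes stated in the description above.
NUM_LANGUAGES = 29
QUESTIONS_PER_LANGUAGE = 11_829
LITE_QUESTIONS_PER_LANGUAGE = 658

# Total question counts across all language versions.
TOTAL_QUESTIONS = NUM_LANGUAGES * QUESTIONS_PER_LANGUAGE
TOTAL_LITE_QUESTIONS = NUM_LANGUAGES * LITE_QUESTIONS_PER_LANGUAGE


def accuracy_by_language(records):
    """Compute accuracy per language.

    `records` is an iterable of (language, is_correct) pairs; this record
    format is hypothetical and chosen only for the example.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, is_correct in records:
        total[language] += 1
        correct[language] += int(is_correct)
    return {language: correct[language] / total[language] for language in total}


def gap_to_reference(accuracy, reference="en"):
    """Return the accuracy drop of each language relative to `reference`."""
    return {
        language: accuracy[reference] - value
        for language, value in accuracy.items()
        if language != reference
    }


# Toy results (made up): the same four questions answered in two languages.
records = [
    ("en", True), ("en", True), ("en", True), ("en", False),
    ("sw", True), ("sw", False), ("sw", False), ("sw", False),
]

accuracy = accuracy_by_language(records)
gaps = gap_to_reference(accuracy)
print(accuracy)
print(gaps)
```

A positive gap means the model scores lower in that language than in the reference language, which is the kind of drop the description reports for low-resource languages.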
Leaderboard
Implementations (1)
| Environment | Stars | Last Updated |
|---|---|---|
| | 0 | 1 month ago |