usaco
Description
The USACO benchmark evaluates language models on competitive programming using 307 problems from the USA Computing Olympiad, accompanied by high-quality unit tests, reference code, and official analyses. It enables systematic testing of LM inference methods and baselines, on which current models achieve low pass@1, and it supports human-in-the-loop studies showing that small, targeted hints can dramatically improve model performance.
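The pass@1 figure mentioned above can be illustrated with a minimal sketch. It assumes the widely used unbiased pass@k estimator (1 - C(n-c, k) / C(n, k) for n samples of which c pass all unit tests), averaged over problems. The function names and the sample counts are hypothetical and are not taken from the benchmark's own harness.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of solutions sampled for the problem
    c: number of those solutions that pass every unit test
    k: number of attempts the metric allows
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k failing samples: every size-k draw contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds (n, c) per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical counts for three problems, for illustration only.
    illustrative = [(10, 3), (5, 0), (5, 5)]
    print(benchmark_pass_at_k(illustrative, k=1))
```

With k = 1 the estimator reduces to c / n, so pass@1 is the average fraction of sampled solutions that pass all unit tests for a problem.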
Implementations (1)
| Environment | Stars | Last Updated |
|---|---|---|
| | 0 | 2 months ago |