usaco

Name: arXiv/usaco
Author: arXiv


Description

The USACO benchmark evaluates language models on competitive programming using 307 problems from the USA Computing Olympiad, accompanied by high-quality unit tests, reference code, and official analyses. It enables systematic testing of LM inference methods and baselines, which show low pass@1 for current models, and supports human-in-the-loop studies showing that small, targeted hints can dramatically improve model performance.
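For readers unfamiliar with the pass@1 metric mentioned above, the following is a minimal sketch of how pass@k is commonly computed, using the standard unbiased estimator 1 - C(n-c, k) / C(n, k) for n samples of which c pass. It assumes a sample counts as passing only if it passes every unit test for its problem. The function names and the `results` layout are hypothetical and are not taken from the benchmark's own code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of sampled solutions, c: number that passed, k: budget.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k failing samples: every size-k draw contains a pass.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: dict[str, list[bool]], k: int = 1) -> float:
    """Average pass@k over problems.

    `results` (hypothetical layout) maps a problem id to one boolean per
    sampled solution: True if that sample passed all unit tests.
    """
    scores = [pass_at_k(len(r), sum(r), k) for r in results.values()]
    return sum(scores) / len(scores)
```

For k = 1 the estimator reduces to c / n, the fraction of sampled solutions that pass, averaged over problems.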

arXiv


Implementations (1)

Environment	Stars	Last Updated
GeneralReasoning/USACO	0	3 months ago