frontiercs
Description
FrontierCS is a benchmark of 156 expert-designed, open-ended computer science problems spanning diverse areas. Models must produce executable programs rather than direct answers, and each problem comes with an expert reference solution and an automatic evaluator. The benchmark targets tasks whose optimal solutions are unknown, including NP-hard algorithmic variants and research-style problems, so progress is measured by objective solution quality with partial scoring rather than by binary correctness.
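To illustrate the difference between partial scoring and binary correctness, here is a minimal sketch for a minimization problem. The function names `partial_score` and `binary_score` and the ratio-based formula are assumptions made for illustration only; they are not FrontierCS's actual evaluator or scoring rule, which is defined per problem.

```python
def partial_score(candidate_cost: float, reference_cost: float, feasible: bool = True) -> float:
    """Hypothetical partial score in [0, 1] for a minimization problem.

    Assumes costs are positive. An infeasible output scores 0; a solution that
    matches or beats the reference scores 1; anything worse gets proportional credit.
    """
    if not feasible or candidate_cost <= 0:
        return 0.0
    return min(1.0, reference_cost / candidate_cost)


def binary_score(candidate_cost: float, reference_cost: float, feasible: bool = True) -> float:
    """Hypothetical all-or-nothing score: 1 only if the reference is matched or beaten."""
    return 1.0 if feasible and candidate_cost <= reference_cost else 0.0


if __name__ == "__main__":
    # A solution 20% worse than the reference still earns partial credit,
    # while the binary score gives it nothing.
    print(partial_score(120.0, 100.0))
    print(binary_score(120.0, 100.0))
```

Under this assumed formula, a program that improves from a cost of 150 to 120 sees its score rise even though it never matches the reference, which is the kind of progress a pass/fail metric cannot register.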
Implementations (1)
| Environment | Stars | Last Updated | |
|---|---|---|---|
| | 1 | 2 months ago | |