swe-perf

Description

SWE-Perf is the first benchmark for systematically evaluating LLMs on code performance optimization tasks at the repository level. It comprises 140 instances sourced from real performance-improving GitHub pull requests, each including the relevant codebase, target functions, performance tests, expert-authored patches, and executable environments.

Implementations (1)
Environment                 Stars  Last Updated
GeneralReasoning/SWE-Perf   0      1 month ago