GDPval

Description

GDPval is a benchmark for evaluating AI model capabilities on real-world, economically valuable tasks. It covers the majority of U.S. Bureau of Labor Statistics Work Activities for 44 occupations across the nine sectors that contribute most to U.S. GDP. Tasks are constructed from the representative work of experienced industry professionals. The benchmark includes an open-source gold subset of 220 tasks and provides a public automated grading service.

Implementations (1)
Environment              Stars  Last Updated
GeneralReasoning/GDPVal  0      1 month ago