researchrubrics

Description

ResearchRubrics is a standardized benchmark for Deep Research (DR) that pairs realistic, domain-diverse prompts with 2,500+ expert-written, fine-grained rubrics (built with over 2,800+ hours of human labor) to assess factual grounding, reasoning soundness, and clarity. It also provides a complexity framework across conceptual breadth, logical nesting, and exploration, along with human and model-based evaluation protocols and release of all prompts, rubrics, and evaluation code.

Leaderboard
Loading leaderboard...
Implementations

No implementations linked yet. Add one to showcase related work.

ScaleAI/researchrubrics | OpenReward