spreadsheetbench

Description

SpreadsheetBench is a challenging spreadsheet-manipulation benchmark built exclusively from real-world scenarios, so that LLMs are evaluated on the workflows of actual spreadsheet users. It comprises 912 real questions collected from online Excel forums, each paired with complex, diverse spreadsheets that contain multiple tables, non-standard relational tables, and abundant non-textual elements. Evaluation follows an online-judge style: every instruction comes with multiple test-case files, so a solution is assessed for robustness across varying values.
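The online-judge idea can be illustrated with a minimal sketch. This is not the benchmark's actual evaluation code: the names `Sheet`, `judge`, `robust_solution`, and `hardcoded_solution` are hypothetical, spreadsheets are simplified to plain dictionaries mapping cell addresses to values, and the toy instruction "write the sum of A1 and A2 into A3" is an assumption made for illustration only. The point is that a solution which only fits one set of values fails once the same instruction is checked against several test cases.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical simplification: a sheet is a mapping from cell address to value.
Sheet = Dict[str, object]
TestCase = Tuple[Sheet, Sheet]  # (input sheet, expected output sheet)


def judge(solution: Callable[[Sheet], Sheet], test_cases: List[TestCase]) -> dict:
    """Run one solution against every test case and summarize the outcome."""
    if not test_cases:
        raise ValueError("at least one test case is required")
    # Pass a copy of each input so a solution cannot mutate the test data.
    results = [solution(dict(inp)) == expected for inp, expected in test_cases]
    passed = sum(results)
    return {
        "passed": passed,
        "total": len(results),
        "pass_rate": passed / len(results),
        "all_passed": passed == len(results),
    }


def robust_solution(sheet: Sheet) -> Sheet:
    """Computes the answer from the input values."""
    out = dict(sheet)
    out["A3"] = out["A1"] + out["A2"]
    return out


def hardcoded_solution(sheet: Sheet) -> Sheet:
    """Writes the answer that happens to be right for the first test case only."""
    out = dict(sheet)
    out["A3"] = 3
    return out


# Toy instruction (assumed): "write the sum of A1 and A2 into A3".
TEST_CASES: List[TestCase] = [
    ({"A1": 1, "A2": 2}, {"A1": 1, "A2": 2, "A3": 3}),
    ({"A1": 10, "A2": 5}, {"A1": 10, "A2": 5, "A3": 15}),
    ({"A1": 0, "A2": 7}, {"A1": 0, "A2": 7, "A3": 7}),
]

if __name__ == "__main__":
    print(judge(robust_solution, TEST_CASES))
    print(judge(hardcoded_solution, TEST_CASES))
```

In this sketch the robust solution passes all three cases, while the hard-coded one passes only the first, which is the kind of brittleness that multiple test-case files per instruction are meant to expose.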

Implementations (1)
Environment: GeneralReasoning/SpreadsheetBench
Stars: 0
Last Updated: 2 months ago