secque

Description

SECQUE is a benchmark for evaluating large language models on financial analysis tasks. It consists of 565 expert-written questions about SEC filings in four categories: comparison analysis, ratio calculation, risk assessment, and financial insight generation. The benchmark also includes SECQUE-Judge, an LLM-based multi-judge evaluation mechanism shown to align strongly with human evaluation.

Implementations (1)
Environment | Stars | Last Updated
GeneralReasoning/SECQUE | 0 | 1 month ago