FinQA

OpenReward Environment

Description

FinQA is an environment for evaluating numerical reasoning over financial data. Agents must analyze financial documents containing tables and contextual text, perform multi-step calculations, and produce accurate numerical answers. Tasks are derived from real earnings reports and SEC filings, and they require domain-specific financial reasoning.

Capabilities

  • Numerical reasoning over financial tables
  • Multi-step calculation and computation
  • Understanding financial document context
  • Extracting and combining values from tables and text
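
The capabilities above can be illustrated with a small worked calculation. This is a minimal sketch: the table and its figures are made up for illustration and are not taken from any FinQA task, and the helper names `lookup` and `percent_change` are hypothetical.

```python
# Illustrative table with made-up figures (not from any FinQA task).
table = [
    ["", "2019", "2018"],
    ["net revenue", "1250.0", "1000.0"],
    ["operating expenses", "900.0", "800.0"],
]


def lookup(table, row_label, column_label):
    """Return the numeric cell at the given row and column labels."""
    column = table[0].index(column_label)
    for row in table[1:]:
        if row[0] == row_label:
            # Strip thousands separators before converting to a float.
            return float(row[column].replace(",", ""))
    raise KeyError(row_label)


def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


# Step 1: extract both values. Step 2: combine them into one answer.
revenue_2018 = lookup(table, "net revenue", "2018")
revenue_2019 = lookup(table, "net revenue", "2019")
growth = percent_change(revenue_2018, revenue_2019)
print(f"{growth:.1f}%")  # prints 25.0%
```

A question such as "what was the percentage change in net revenue from 2018 to 2019?" needs both steps: locating the right cells, then applying the right formula.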

Compute Requirements

Agents are given a sandboxed environment with 0.5 CPU and 0.5 GB RAM, with access to CLI tools for computation.

License

MIT.

Tasks

There are three splits in this environment:

  • train: 6,251 tasks
  • dev: 883 tasks
  • test: 1,147 tasks

Each task presents a financial document with pre-table context, a data table, post-table context, and a numerical question requiring calculation.
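
The four parts of a task can be pictured as a simple record. The sketch below is an assumption for illustration only: the field names (`pre_text`, `table`, `post_text`, `question`) and the `render_task` helper are hypothetical, and the environment's actual schema and prompt format may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FinQATask:
    # Hypothetical field names; the real schema may differ.
    pre_text: List[str]      # sentences that precede the table
    table: List[List[str]]   # rows of cells, header row first
    post_text: List[str]     # sentences that follow the table
    question: str            # numerical question to answer


def render_task(task):
    """Join the four parts into a single plain-text prompt."""
    table_lines = [" | ".join(row) for row in task.table]
    parts = [
        " ".join(task.pre_text),
        "\n".join(table_lines),
        " ".join(task.post_text),
        "Question: " + task.question,
    ]
    return "\n\n".join(parts)
```

Keeping the table as rows of cells, rather than pre-rendered text, lets an agent look up values by row and column label before computing.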

Reward Structure

This is a multi-turn environment. Agents can use CLI tools (bash, read, write, grep) to analyze data and perform calculations, then submit a final numerical answer via the submit_answer tool. Answers are validated by numerical comparison, with tolerance for percentage and decimal variations. Reward is binary: 1.0 if correct, 0.0 if incorrect.
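
One way a tolerant numerical comparison with a binary reward could look is sketched below. This is not the environment's actual validator: the parsing rules, the 1% relative tolerance, and the function names `parse_number` and `score` are all assumptions chosen for illustration.

```python
import math


def parse_number(text):
    """Parse strings such as '12.5%', '$1,234.5', or '-3'.

    Returns (value, is_percent). Raises ValueError on non-numeric input.
    """
    cleaned = text.strip().replace(",", "").replace("$", "")
    is_percent = cleaned.endswith("%")
    if is_percent:
        cleaned = cleaned[:-1]
    return float(cleaned), is_percent


def score(submitted, expected, rel_tol=1e-2):
    """Return 1.0 if submitted matches expected within tolerance, else 0.0.

    rel_tol is an assumed value; the real tolerance is not documented here.
    """
    try:
        sub_value, sub_pct = parse_number(submitted)
        exp_value, exp_pct = parse_number(expected)
    except ValueError:
        return 0.0
    candidates = [sub_value]
    # Accept a percentage written as a fraction (0.125 for 12.5%) and vice versa.
    if exp_pct and not sub_pct:
        candidates.append(sub_value * 100.0)
    elif sub_pct and not exp_pct:
        candidates.append(sub_value / 100.0)
    for candidate in candidates:
        if math.isclose(candidate, exp_value, rel_tol=rel_tol, abs_tol=1e-6):
            return 1.0
    return 0.0
```

Under these assumptions, `score("0.125", "12.5%")` and `score("1,234.5", "1234.50")` both return 1.0, while `score("13.5", "12.5%")` returns 0.0.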

Data

Data consists of JSON files (train.json, dev.json, test.json) containing financial documents with tables, context text, and QA pairs. Data is stored on the OpenReward platform.
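
If a copy of the split files were available locally, reading one could be sketched as follows. The data lives on the OpenReward platform, so local access is an assumption, as are the `load_split` helper and the guess that each file holds a top-level JSON list of tasks.

```python
import json
from pathlib import Path

SPLITS = ("train", "dev", "test")


def load_split(data_dir, split):
    """Load one split, assuming <data_dir>/<split>.json is a JSON list."""
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split}")
    path = Path(data_dir) / f"{split}.json"
    with path.open(encoding="utf-8") as handle:
        tasks = json.load(handle)
    if not isinstance(tasks, list):
        raise TypeError(f"{path} does not contain a JSON list")
    return tasks
```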

Tools

Tool            Description
bash            Execute shell commands for computation
read            Read file contents
write           Write content to files
grep            Search for patterns in files
submit_answer   Submit your final numerical answer. Ends the episode.

Time Horizon

Multi-turn. Agents can perform multiple computation steps before submitting a final answer.

Environment Difficulty

FinQA evaluates financial numerical reasoning that requires both table understanding and multi-step calculation.

Other Environment Requirements

There are no further environment requirements; FinQA works out of the box with the OpenReward endpoint without any external API keys.

Safety

Agents in FinQA perform financial calculations in a sandboxed environment. The environment does not present direct safety risks.

Citation

@inproceedings{chen2021finqa,
  title={FinQA: A Dataset of Numerical Reasoning over Financial Data},
  author={Chen, Zhiyu and Chen, Wenhu and Smiley, Charese and Shah, Sameena and Borova, Iana and Langdon, Dylan and Moussa, Reema and Beane, Matt and Huang, Ting-Hao and Routledge, Bryan and Wang, William Yang},
  booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
  pages={7672--7685},
  year={2021}
}