PrincipiaCollection

Description

PrincipiaCollection is a large-scale RL training environment for mathematical derivation in STEM subjects. It contains about 554K synthetic problems (554,399 in total) in two splits with different grading modes: problems whose answers are mathematical objects (LLM-judged) and problems with numerical answers (exact match).

Capabilities

Mathematical derivation and symbolic reasoning
Numerical computation
STEM knowledge across diverse mathematical topics

Compute Requirements

Mathematical object split: requires OpenAI API access for LLM-based equivalence judging
Numerical split: no external API needed (exact match grading)

Tasks

train: 248,743 mathematical object problems (LLM-judged)
train_numerical: 305,656 numerical problems (exact match)
Each task has: id, problem_statement, topic, answer_type, split
Answer types include: Set, Interval, Equation, Inequality, Matrix, Integer, Decimal, Fraction
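
The task fields listed above can be modelled as a small record type. This is a minimal sketch rather than the environment's actual schema: the class name Task, the helper needs_llm_judge, and the field types (all strings) are assumptions made for illustration, and the answer-type set contains only the types named in this README.

```python
from dataclasses import dataclass

# Answer types named in the README; the dataset may contain others.
KNOWN_ANSWER_TYPES = frozenset({
    "Set", "Interval", "Equation", "Inequality",
    "Matrix", "Integer", "Decimal", "Fraction",
})


@dataclass(frozen=True)
class Task:
    """Hypothetical record mirroring the five documented task fields."""
    id: str                 # assumed to be a string; may differ in the real data
    problem_statement: str
    topic: str
    answer_type: str
    split: str              # "train" or "train_numerical"


def needs_llm_judge(task: Task) -> bool:
    """Only the mathematical object split ("train") is LLM-judged."""
    return task.split == "train"


example = Task(
    id="example-0",
    problem_statement="Solve x^2 - 1 < 0 for real x.",
    topic="algebra",
    answer_type="Interval",
    split="train",
)
```

The example values are invented for illustration and are not taken from the dataset.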

Reward Structure

Binary reward (0.0 or 1.0).

train split: single LLM equivalence judge call
train_numerical split: numeric match against the reference answer, allowing a small tolerance
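
One way the binary numerical reward could work is sketched below. The README does not state the tolerance or how answers are parsed, so the function names, the tolerance values, and the use of Fraction for parsing are all assumptions, not the environment's implementation.

```python
import math
from fractions import Fraction


def parse_number(text: str) -> float:
    """Parse an integer, decimal, or 'a/b' fraction string into a float."""
    return float(Fraction(text.strip()))


def numerical_reward(submitted: str, reference: str,
                     rel_tol: float = 1e-6, abs_tol: float = 1e-9) -> float:
    """Return 1.0 if the two answers agree within tolerance, else 0.0.

    The tolerance defaults are placeholders; the real values are not documented.
    """
    try:
        got = parse_number(submitted)
        want = parse_number(reference)
    except (ValueError, ZeroDivisionError):
        # Unparseable submissions simply earn no reward.
        return 0.0
    close = math.isclose(got, want, rel_tol=rel_tol, abs_tol=abs_tol)
    return 1.0 if close else 0.0
```

Under these assumptions, numerical_reward("0.75", "3/4") yields 1.0, while a malformed submission such as "abc" yields 0.0 instead of raising.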

Data

Source: facebook/principia-collection on HuggingFace. Two parquet files (mathematical_object and numerical splits). Mounted at /orwd_data in production.

Tools

submit(answer: str) — Submit an answer for grading. Ends the episode.

Time Horizon

Single-turn. One tool call per episode.
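
The single-turn contract, in which the one submit call both grades the answer and ends the episode, can be illustrated with a small sketch. The class SingleTurnEpisode and the injected grader callable are hypothetical names; the environment's real interface is not described in this README beyond submit(answer: str).

```python
from typing import Callable, Optional


class SingleTurnEpisode:
    """Hypothetical episode wrapper: exactly one submit() call is allowed."""

    def __init__(self, reference: str, grader: Callable[[str, str], float]):
        self._reference = reference
        self._grader = grader          # returns 0.0 or 1.0
        self.done = False
        self.reward: Optional[float] = None

    def submit(self, answer: str) -> float:
        """Grade the answer and end the episode."""
        if self.done:
            raise RuntimeError("episode already ended; only one submit is allowed")
        self.reward = self._grader(answer, self._reference)
        self.done = True
        return self.reward


# Toy grader for illustration only: exact string equality.
episode = SingleTurnEpisode("42", lambda got, want: 1.0 if got == want else 0.0)
first_reward = episode.submit("42")
```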

Environment Difficulty

Ranges from introductory to advanced undergraduate across diverse mathematical topics.

Other Environment Requirements

OpenAI API key required for train split (passed via secrets["openai_api_key"])
No API key needed for train_numerical split
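
The split-dependent key requirement could be checked up front, as in the sketch below. Only the secrets["openai_api_key"] lookup comes from this README; the function name resolve_judge_key and the error-handling choices are assumptions.

```python
from typing import Mapping, Optional


def resolve_judge_key(split: str, secrets: Mapping[str, str]) -> Optional[str]:
    """Return the OpenAI key for the LLM-judged split, or None if no key is needed."""
    if split == "train_numerical":
        # Exact-match grading: no external API call is made.
        return None
    if split == "train":
        key = secrets.get("openai_api_key")
        if not key:
            raise KeyError('secrets["openai_api_key"] is required for the train split')
        return key
    raise ValueError(f"unknown split: {split!r}")
```

Failing early on a missing key avoids starting an episode that cannot be graded.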

Safety

No safety concerns: the environment only grades mathematical derivations.

Citations

@misc{aggarwal2026reasoningmathematicalobjects,
      title={Reasoning over mathematical objects: on-policy reward modeling and test time aggregation},
      author={Pranjal Aggarwal and Marjan Ghazvininejad and Seungone Kim and Ilia Kulikov and Jack Lanchantin and Xian Li and Tianjian Li and Bo Liu and Graham Neubig and Anaelia Ovalle and Swarnadeep Saha and Sainbayar Sukhbaatar and Sean Welleck and Jason Weston and Chenxi Whitehouse and Adina Williams and Jing Xu and Ping Yu and Weizhe Yuan and Jingyu Zhang and Wenting Zhao},
      year={2026},
      eprint={2603.18886},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
}

Repository

Source repository

EnvCommons/PrincipiaCollection

Compute Configuration

Resource allocation for this environment.

Component	Configuration
Environment Server	2 vCPUs / 8 GB RAM
Sandbox Machine	Not configured

Estimated Cost

Pay per second of active session usage. Billing starts when your session begins and stops when it ends.

Component	Cost / second
Environment	$0.0000640
Sandbox	Not configured
Total	$0.0000640

Examples

5-minute session: $0.0192

1-hour session: $0.2304
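
The example figures follow directly from the per-second rate, as this short sketch shows. The function name session_cost is illustrative, and only the Environment rate is included because the Sandbox is not configured.

```python
from decimal import Decimal

# Per-second Environment rate from the cost table above.
ENVIRONMENT_COST_PER_SECOND = Decimal("0.0000640")


def session_cost(seconds: int) -> Decimal:
    """Cost in dollars of an active session lasting the given number of seconds."""
    return ENVIRONMENT_COST_PER_SECOND * seconds


five_minutes = session_cost(5 * 60)    # equals Decimal("0.0192")
one_hour = session_cost(60 * 60)       # equals Decimal("0.2304")
```

Decimal is used instead of float so that the small per-second rate multiplies out exactly.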
