PatentQATrain
Description
PatentQATrain is an ORS training environment for patent question answering, based on the PatentQA task from LAB-Bench-2. Agents are given questions about specific details from patents across diverse technology domains and must use web search to find and verify answers from Google Patents.
Capabilities
- Researching patent details using web search
- Extracting specific information from patent claims and specifications
- Navigating patent documents to find numerical thresholds, compositions, method steps, and claim elements
- Cross-domain patent understanding spanning pharmaceutical, biotech, chemistry, electronics, software, mechanical, materials, energy, medical devices, and telecom
Compute Requirements
No sandbox or special compute requirements. Uses external web search (Tavily API) for patent retrieval.
License
Tasks
There are 994 training tasks distributed across 10 patent domains:
| Domain | Count | Description |
|---|---|---|
| pharmaceutical | 100 | Drug compositions, formulations, drug delivery |
| biotech | 100 | Genetic engineering, antibodies, cell therapy |
| chemistry | 99 | Chemical compounds, synthesis, catalysis |
| electronics | 100 | Semiconductors, circuits, displays, sensors |
| software | 100 | Algorithms, data processing, networking |
| mechanical | 100 | Engines, mechanisms, manufacturing |
| materials | 100 | Polymers, composites, coatings, nanomaterials |
| energy | 99 | Solar cells, batteries, fuel cells |
| medical_devices | 99 | Surgical instruments, imaging, prosthetics |
| telecom | 97 | Wireless protocols, signal processing |
Each task presents a question about a specific patent that requires distinctive specificity: the question includes the patent number and asks about a verifiable fact uniquely traceable to that patent.
Reward Structure
Sparse, binary reward:
- 1.0 for correct answers (as judged by LLM grader)
- 0.0 for incorrect or unsure answers
Grading uses semantic equivalence checking: answers that are numerically or semantically equivalent to the expected answer are accepted, even if phrased differently. The grader is based on LAB-Bench-2's structured evaluation prompt.
We do not use exact string matching. The LLM grader (gpt-5-mini) evaluates whether the submitted answer captures the core factual content of the expected answer.
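The binary mapping can be sketched as follows. This is a minimal illustration, assuming the LLM grader returns a verdict string; the grader call itself is stubbed out and the verdict labels are assumptions, not the platform's actual API.

```python
# Hypothetical sketch of the sparse binary reward. The verdict string is
# assumed to come from the LLM grader; label names are illustrative.
def reward_from_verdict(verdict: str) -> float:
    # 1.0 only for a confirmed-correct verdict; "incorrect" and "unsure"
    # both score 0.0, matching the sparse reward structure above.
    return 1.0 if verdict == "correct" else 0.0

print(reward_from_verdict("correct"))  # 1.0
print(reward_from_verdict("unsure"))   # 0.0
```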
Data
Ground-truth data consists of QA pairs derived from Google Patents documents. Each task includes:
- A question referencing a specific patent number
- A concise expected answer (under 200 characters)
- The source patent URL
- A key passage from the patent supporting the answer
Data is stored on the OpenReward platform.
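The four fields above can be modeled as a simple record. This is a hypothetical schema sketch; the field names, patent number, and values are illustrative assumptions, not the platform's actual storage format.

```python
from dataclasses import dataclass

# Hypothetical task record mirroring the four fields listed above.
@dataclass
class PatentQATask:
    question: str         # references a specific patent number
    expected_answer: str  # concise, under 200 characters
    patent_url: str       # source document on Google Patents
    key_passage: str      # excerpt from the patent supporting the answer

# Illustrative example with made-up content.
task = PatentQATask(
    question="In patent US1234567, what operating temperature range is claimed?",
    expected_answer="150 to 300 degrees C",
    patent_url="https://patents.google.com/patent/US1234567",
    key_passage="...the reactor is operated between 150 and 300 degrees C...",
)
assert len(task.expected_answer) < 200
```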
Tools
Agents have access to three tools:
| Tool | Description |
|---|---|
| web_search | Search the web using Tavily. Returns titles, URLs, and snippets. |
| fetch_url | Fetch full text content from a URL using Tavily extract. Supports pagination for long documents. |
| submit_answer | Submit a final answer with explanation. Triggers LLM grading and ends the episode. |
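A typical episode chains these tools together. The sketch below is a simplified single-pass loop under stated assumptions: the three tool functions are stand-ins passed in as callables (the real environment dispatches them through the agent API), and the extraction step is a placeholder.

```python
# Illustrative research loop; tool signatures and the extraction logic
# are assumptions, not the environment's actual interface.
def research_loop(question, web_search, fetch_url, submit_answer):
    """Search for the patent, fetch the first Google Patents hit, answer."""
    for hit in web_search(question):
        if "patents.google.com" in hit["url"]:
            text = fetch_url(hit["url"])
            # Placeholder extraction; a real agent would locate the
            # requested threshold, composition, or claim element in `text`.
            answer = text.split("Answer:")[-1].strip()
            return submit_answer(answer=answer,
                                 explanation="Found in patent full text")
    return submit_answer(answer="unsure",
                         explanation="No Google Patents page found")
```

In practice agents iterate: several searches and fetches, with pagination on long patent documents, before committing to `submit_answer`.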
Time Horizon
PatentQATrain is a multi-turn environment. Agents typically perform several web searches and URL fetches before submitting an answer.
Environment Difficulty
[Statistics on environment difficulty here]
Other Environment Requirements
This environment requires the following API keys passed via the secrets parameter:
- openai_api_key: For LLM-based answer grading
- tavily_api_key: For web search and URL content extraction
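A minimal secrets payload might look like the following. The dict shape is an assumption for illustration; only the two key names come from the list above, and the values are placeholders.

```python
# Hypothetical secrets payload; key names match the requirements above,
# values are placeholders and must be supplied by the operator.
secrets = {
    "openai_api_key": "sk-...",    # LLM-based answer grading
    "tavily_api_key": "tvly-...",  # web search and URL content extraction
}

# Sanity check that both required keys are present before launching.
required = {"openai_api_key", "tavily_api_key"}
assert required <= set(secrets)
```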
Safety
PatentQATrain focuses on factual information retrieval from publicly available patent records. The environment does not involve intellectual property creation, legal advice, or access to non-public information. All source patents are publicly accessible via Google Patents.
Citations
This environment is inspired by the PatentQA task from LAB-Bench-2:
@misc{labbench2,
author = {Laurent, Jon M. and Bou, Albert and Pieler, Michael and Igoe, Conor and Andonian, Alex and Narayanan, Siddharth and Braza, James and Vassopoulos, Alexandros Sanchez and Steenwyk, Jacob L. and Lash, Blake and White, Andrew D. and Rodriques, Samuel G.},
title = {LABBench2: An Improved Benchmark for AI Systems Performing Biology Research},
year = {2026},
url = {https://github.com/EdisonScientific/labbench2}
}