PatentQATrain

⭐ OpenReward Environment

Description

PatentQATrain is an ORS training environment for patent question answering, based on the PatentQA task from LABBench2. Agents receive questions about specific details of patents across diverse technology domains and must use web search to find and verify the answers on Google Patents.

Capabilities

  • Researching patent details using web search
  • Extracting specific information from patent claims and specifications
  • Navigating patent documents to find numerical thresholds, compositions, method steps, and claim elements
  • Cross-domain patent understanding spanning pharmaceutical, biotech, chemistry, electronics, software, mechanical, materials, energy, medical devices, and telecom

Compute Requirements

No sandbox or special compute requirements. Uses external web search (Tavily API) for patent retrieval.

License

MIT

Tasks

There are 994 training tasks distributed across 10 patent domains:

Domain           Count  Description
pharmaceutical   100    Drug compositions, formulations, drug delivery
biotech          100    Genetic engineering, antibodies, cell therapy
chemistry        99     Chemical compounds, synthesis, catalysis
electronics      100    Semiconductors, circuits, displays, sensors
software         100    Algorithms, data processing, networking
mechanical       100    Engines, mechanisms, manufacturing
materials        100    Polymers, composites, coatings, nanomaterials
energy           99     Solar cells, batteries, fuel cells
medical_devices  99     Surgical instruments, imaging, prosthetics
telecom          97     Wireless protocols, signal processing
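
As a quick sanity check, the per-domain counts can be summed and compared with the stated total. The snippet below is only an illustration: the dictionary is copied by hand from the table above, and the names `DOMAIN_COUNTS` and `total_tasks` are made up for this example rather than taken from the environment.

```python
# Per-domain task counts, copied by hand from the table above.
# DOMAIN_COUNTS and total_tasks are illustrative names, not part of the environment.
DOMAIN_COUNTS = {
    "pharmaceutical": 100,
    "biotech": 100,
    "chemistry": 99,
    "electronics": 100,
    "software": 100,
    "mechanical": 100,
    "materials": 100,
    "energy": 99,
    "medical_devices": 99,
    "telecom": 97,
}


def total_tasks(counts):
    """Return the total number of tasks across all domains."""
    return sum(counts.values())


print(len(DOMAIN_COUNTS))          # 10 domains
print(total_tasks(DOMAIN_COUNTS))  # 994 tasks
```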

Each task asks about a specific patent and is written to be unambiguous: the question includes the patent number and asks about a verifiable fact that can be traced uniquely to that patent.

Reward Structure

Sparse, binary reward:

  • 1.0 for correct answers (as judged by LLM grader)
  • 0.0 for incorrect or unsure answers

Grading checks semantic equivalence rather than exact string matching: answers that are numerically or semantically equivalent to the expected answer are accepted, even if phrased differently.

The LLM grader (gpt-5-mini) uses a prompt based on LABBench2's structured evaluation prompt and judges whether the submitted answer captures the core factual content of the expected answer.
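
The mapping from a grader verdict to the sparse reward can be sketched as follows. This is a minimal illustration under stated assumptions: the verdict labels ("correct", "incorrect", "unsure") and the function name `reward_from_verdict` are hypothetical, and the actual output format of the grader is not specified here.

```python
def reward_from_verdict(verdict: str) -> float:
    """Map a grader verdict to the sparse binary reward.

    Assumption: the grader emits a short label such as "correct",
    "incorrect", or "unsure". Only "correct" earns a reward.
    """
    return 1.0 if verdict.strip().lower() == "correct" else 0.0


print(reward_from_verdict("correct"))    # 1.0
print(reward_from_verdict("incorrect"))  # 0.0
print(reward_from_verdict("unsure"))     # 0.0
```

Because the reward is binary, an "unsure" response is scored the same as a wrong one.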

Data

Ground-truth data consists of QA pairs derived from Google Patents documents. Each task includes:

  • A question referencing a specific patent number
  • A concise expected answer (under 200 characters)
  • The source patent URL
  • A key passage from the patent supporting the answer

Data is stored on the OpenReward platform.
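
One way to picture a task record is as a small typed structure with the four fields listed above. The sketch below is an assumption for illustration only: the class name `PatentQATask`, the field names, and the sample values are hypothetical and do not describe the stored schema.

```python
from dataclasses import dataclass

# "under 200 characters", per the list above.
MAX_ANSWER_CHARS = 200


@dataclass(frozen=True)
class PatentQATask:
    """Hypothetical record shape; field names are illustrative."""

    question: str         # references a specific patent number
    expected_answer: str  # concise ground-truth answer
    patent_url: str       # source patent URL
    key_passage: str      # passage from the patent supporting the answer

    def __post_init__(self):
        # Enforce the length limit on the expected answer.
        if len(self.expected_answer) >= MAX_ANSWER_CHARS:
            raise ValueError(
                f"expected_answer must be under {MAX_ANSWER_CHARS} characters"
            )


# Placeholder values only; this is not a real task from the dataset.
task = PatentQATask(
    question="In patent <PATENT_NUMBER>, what is the claimed threshold?",
    expected_answer="5 percent",
    patent_url="https://example.com/patent",
    key_passage="...the threshold is 5 percent...",
)
print(task.expected_answer)
```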

Tools

Agents have access to three tools:

Tool           Description
web_search     Search the web using Tavily. Returns titles, URLs, and snippets.
fetch_url      Fetch full text content from a URL using Tavily extract. Supports pagination for long documents.
submit_answer  Submit a final answer with an explanation. Triggers LLM grading and ends the episode.

Time Horizon

PatentQATrain is a multi-turn environment. Agents typically perform several web searches and URL fetches before submitting an answer.
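
The typical search, fetch, submit flow can be sketched with a stand-in tool backend. Everything here beyond the three tool names is an assumption: the `call_tool` callable, the argument keys (`query`, `url`, `answer`, `explanation`), the return shapes, and the fake backend are hypothetical, and the "answer" is a placeholder where a real agent would reason over the fetched text.

```python
from typing import Any, Callable, Dict, List

# Hypothetical tool interface: (tool_name, arguments) -> result.
ToolCall = Callable[[str, Dict[str, Any]], Any]


def run_episode(call_tool: ToolCall, question: str, max_fetches: int = 3) -> Any:
    """Search once, fetch up to max_fetches pages, then submit one answer."""
    hits = call_tool("web_search", {"query": question})
    pages: List[str] = []
    for hit in hits[:max_fetches]:
        pages.append(call_tool("fetch_url", {"url": hit["url"]}))
    # Placeholder for model reasoning: take the start of the first page.
    answer = pages[0][:200] if pages else "unsure"
    # submit_answer ends the episode, so it is called exactly once.
    return call_tool(
        "submit_answer",
        {"answer": answer, "explanation": f"Read {len(pages)} page(s)."},
    )


def make_fake_tools(log: List[str]) -> ToolCall:
    """Build an in-memory stand-in backend that records which tools ran."""

    def call_tool(name: str, args: Dict[str, Any]) -> Any:
        log.append(name)
        if name == "web_search":
            return [{"title": "Example patent",
                     "url": "https://example.com/patent",
                     "snippet": "..."}]
        if name == "fetch_url":
            return "The claimed threshold is 5 percent."
        if name == "submit_answer":
            return {"submitted": args["answer"]}
        raise ValueError(f"unknown tool: {name}")

    return call_tool


calls: List[str] = []
result = run_episode(make_fake_tools(calls), "What threshold does the patent claim?")
print(calls)   # ['web_search', 'fetch_url', 'submit_answer']
print(result)  # {'submitted': 'The claimed threshold is 5 percent.'}
```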

Environment Difficulty

[Statistics on environment difficulty here]

Other Environment Requirements

This environment requires the following API keys passed via the secrets parameter:

  • openai_api_key: For LLM-based answer grading
  • tavily_api_key: For web search and URL content extraction
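
A minimal sketch of assembling the two secrets before starting the environment is shown below. The key names `openai_api_key` and `tavily_api_key` come from the list above; reading them from upper-case environment variables and the helper name `load_secrets` are assumptions for illustration, and how the resulting dictionary is handed to the secrets parameter depends on the client you use.

```python
import os

# Key names required by the environment, per the list above.
REQUIRED_SECRETS = ("openai_api_key", "tavily_api_key")


def load_secrets(env=None):
    """Collect required secrets from upper-case environment variables.

    Assumption: OPENAI_API_KEY and TAVILY_API_KEY hold the values.
    Raises KeyError listing any variables that are missing or empty.
    """
    env = os.environ if env is None else env
    secrets = {}
    missing = []
    for key in REQUIRED_SECRETS:
        value = env.get(key.upper())
        if value:
            secrets[key] = value
        else:
            missing.append(key.upper())
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return secrets


# Example with an explicit mapping instead of the real process environment.
example = load_secrets({"OPENAI_API_KEY": "sk-placeholder",
                        "TAVILY_API_KEY": "tvly-placeholder"})
print(sorted(example))  # ['openai_api_key', 'tavily_api_key']
```

Failing early on a missing key avoids starting an episode that cannot search or be graded.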

Safety

PatentQATrain focuses on factual information retrieval from publicly available patent records. The environment does not involve intellectual property creation, legal advice, or access to non-public information. All source patents are publicly accessible via Google Patents.

Citations

This environment is inspired by the PatentQA task from LABBench2:

@misc{labbench2,
  author    = {Laurent, Jon M. and Bou, Albert and Pieler, Michael and Igoe, Conor and Andonian, Alex and Narayanan, Siddharth and Braza, James and Vassopoulos, Alexandros Sanchez and Steenwyk, Jacob L. and Lash, Blake and White, Andrew D. and Rodriques, Samuel G.},
  title     = {LABBench2: An Improved Benchmark for AI Systems Performing Biology Research},
  year      = {2026},
  url       = {https://github.com/EdisonScientific/labbench2}
}