RefSeqTrain

OpenReward Environment

Description

RefSeqTrain is an ORS training environment for genomics question answering about NCBI RefSeq and Gene database records. Each question asks about a specific verifiable fact from a gene, transcript, or protein record (e.g. sequence lengths, exon counts, chromosomal locations, CDS ranges, protein domains). Questions are designed to be specific enough that they can only be answered by looking up the correct NCBI record, and answers require navigating the RefSeq and Gene databases via web search.

Capabilities

  • Question answering from NCBI RefSeq and Gene database records
  • Web search and information retrieval from genomic databases
  • Multi-step research: searching, reading NCBI records, and extracting precise facts
  • Cross-species genomic queries across human, mouse, rat, zebrafish, fruit fly, and nematode

Compute Requirements

Agents are given a standard environment with no sandbox or file system access.

License

MIT.

Tasks

There is one split: train with 1,000 tasks spanning 10 genomics domains:

Domain                         Count  Description
transcript_metadata              100  mRNA lengths, accession types (NM/XM/NR)
protein_metadata                 100  Protein lengths, accession types (NP/XP)
gene_transcript_relationships    100  Isoform counts, transcript variants
coding_sequence                  100  CDS ranges, reading frames
exon_structure                   100  Exon counts, exon architecture
chromosomal_location             100  Chromosome, band, coordinates
gene_nomenclature                100  Full names, symbols, aliases
cross_species                    100  Orthologs across model organisms
protein_features                 100  Domains, signal peptides, annotations
functional_annotation            100  Gene summaries, pathways, map locations

Each task provides a question and metadata (accession, source NCBI URL, domain, question type). The agent prompt contains only the question; the agent must find the answer through web search and NCBI record retrieval.

Reward Structure

Reward is sparse and binary, emitted only when the agent calls submit_answer (which ends the episode). The web_search and fetch_url tools always return reward 0.0 and do not end the episode.

On submission, the agent's answer is evaluated by an LLM grader (gpt-5-mini) that checks semantic equivalence against the reference answer. The grader accounts for equivalent numeric formats, abbreviations, and minor formatting differences. Empty or whitespace-only submissions receive reward 0.0 without invoking the grader.

  • 1.0: Submitted answer is semantically equivalent to the reference answer
  • 0.0: Submitted answer is incorrect, missing, or empty
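The reward logic above can be sketched in a few lines. Here `grade_fn` stands in for the LLM grader (gpt-5-mini) and is an assumption for illustration, not the environment's actual grading interface:

```python
def compute_reward(submitted: str, reference: str, grade_fn) -> float:
    """Sparse binary reward, mirroring the rules described above.

    grade_fn(submitted, reference) -> bool is a placeholder for the
    LLM grader, which checks semantic equivalence of the two answers.
    """
    # Empty or whitespace-only submissions get 0.0 without invoking the grader.
    if not submitted.strip():
        return 0.0
    return 1.0 if grade_fn(submitted, reference) else 0.0
```

In practice the grader call is the expensive step, so short-circuiting on empty submissions avoids an unnecessary API request.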

Data

Data consists of a single JSONL file containing 1,000 QA pairs generated from NCBI RefSeq and Gene database records. Each row contains a question, answer, source NCBI URL, accession, key passage from the record, domain, and question type. Data is stored on the OpenReward platform.
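As a sketch, the file can be read one JSON object per line; the field names in the example row are guesses based on the description above, not the platform's actual schema:

```python
import json

def load_tasks(path):
    """Read QA tasks from a JSONL file, one JSON object per line."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

# Illustrative row; field names and values are assumptions for this sketch.
example_row = {
    "question": "How many exons does human RefSeq transcript NM_000546.6 have?",
    "answer": "11",
    "url": "https://www.ncbi.nlm.nih.gov/nuccore/NM_000546.6",
    "accession": "NM_000546.6",
    "key_passage": "The mRNA is assembled from 11 exons.",
    "domain": "exon_structure",
    "question_type": "exon_count",
}
```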

Tools

Tool           Description
web_search     Search the web using the Tavily API. Returns up to 5 results with titles, URLs, and snippets.
fetch_url      Fetch full text content from a URL. Supports pagination for long documents.
submit_answer  Submit a final answer with explanation for LLM grading. Ends the episode.

Note that the web_search and fetch_url tools require a Tavily API key but are optional: if you want to use a different search provider, you can exclude these tools and supply external tools instead.

Time Horizon

Multi-turn. Agents can perform multiple web searches and URL fetches before submitting a final answer.

Environment Difficulty

[To be determined]

Other Environment Requirements

  • OpenAI API key required for LLM-based grading. Pass via secrets={"openai_api_key": "..."}.
  • Tavily API key required for web search and URL fetching. Pass via secrets={"tavily_api_key": "..."}.
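A minimal sketch of assembling the secrets mapping; reading the keys from environment variables is a convention assumed here, not a platform requirement:

```python
import os

# Both keys are needed for a full run: OpenAI for LLM grading,
# Tavily for web_search and fetch_url.
secrets = {
    "openai_api_key": os.environ.get("OPENAI_API_KEY", ""),
    "tavily_api_key": os.environ.get("TAVILY_API_KEY", ""),
}
```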

Safety

Agents in RefSeqTrain answer genomics questions using web search in a standard environment. The environment focuses on factual information retrieval from publicly available NCBI genomic records and does not involve access to non-public data or sensitive personal genomic information. The environment does not present direct safety risks.

Citations

RefSeqTrain uses data derived from the NCBI RefSeq database. Please cite the original RefSeq publication:

@article{oleary2016refseq,
  title={Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation},
  author={O'Leary, Nuala A and Wright, Mathew W and Brister, J Rodney and others},
  journal={Nucleic Acids Research},
  volume={44},
  number={D1},
  pages={D733--D745},
  year={2016},
  publisher={Oxford University Press}
}

@dataset{GRRefSeqTrain,
  author    = {General Reasoning Inc. Team},
  title     = {RefSeqTrain},
  year      = {2026},
  publisher = {OpenReward},
  url       = {https://openreward.ai/GeneralReasoning/RefSeqTrain}
}