NCBIGenomeTrain

Description

NCBIGenomeTrain is an ORS training environment for genome-level question answering about the hg38 human reference genome. Each question requires retrieving or computing verifiable facts from the GRCh38/hg38 assembly, such as reference DNA sequences at specific coordinates, GC content of genes, chromosome sizes, cytoband mappings, and restriction enzyme site counts. Answers are structured JSON objects, and questions are designed to require querying NCBI or UCSC Genome Browser databases via web search.

This environment complements RefSeqTrain, which focuses on gene/transcript/protein metadata, by instead focusing on genome-level coordinate-based queries and sequence computations.

Capabilities

Retrieving reference DNA sequences from specific hg38 genomic coordinates
Computing sequence properties (GC content, nucleotide frequencies, motif counts)
Looking up gene genomic coordinates and spans in the hg38 assembly
Querying chromosome sizes, cytobands, and assembly-level statistics
Multi-step research: searching databases, fetching genomic data, and computing derived values
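
The sequence-property computations listed above can be sketched in a few lines of Python. This is a minimal illustration, not the environment's own code: the function names are hypothetical, and two conventions are assumptions, namely that GC content is taken over A/C/G/T bases only (ignoring N) and that motif occurrences may overlap.

```python
from collections import Counter


def nucleotide_counts(seq: str) -> dict:
    """Count A, C, G and T in a sequence, case-insensitively."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_content(seq: str) -> float:
    """Fraction of G and C among the A/C/G/T bases (assumption: N is ignored)."""
    counts = nucleotide_counts(seq)
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total if total else 0.0


def motif_count(seq: str, motif: str) -> int:
    """Count occurrences of a motif (assumption: overlapping hits are counted)."""
    if not motif:
        raise ValueError("motif must be non-empty")
    seq, motif = seq.upper(), motif.upper()
    count, start = 0, seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


# Made-up toy region containing two EcoRI sites (GAATTC); not real hg38 data.
region = "GAATTCGCGCGAATTCAT"
print(nucleotide_counts(region))
print(round(gc_content(region), 3))
print(motif_count(region, "GAATTC"))
```

In practice the sequence would first have to be fetched from a genome database, since the environment provides no local copy of hg38.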

Compute Requirements

Agents are given a standard environment with no sandbox or file system access.

License

MIT.

Tasks

There is one split, train, with 1,000 tasks spanning 10 genome-level domains (100 tasks per domain):

Domain	Count	Description
`reference_sequence`	100	Direct DNA sequence retrieval from hg38 coordinates
`gc_content`	100	GC content of genes or genomic regions
`gene_genomic_span`	100	Total genomic span (bp) of genes in hg38
`chromosome_stats`	100	Chromosome lengths, size comparisons, rankings
`cytoband_mapping`	100	Map genomic positions to cytobands and vice versa
`intergenic_distance`	100	Distance between neighboring genes
`exon_properties`	100	Exon counts and individual exon lengths
`nucleotide_composition`	100	A/T/G/C counts in specific genomic regions
`coding_noncoding_ratio`	100	Intronic percentage of gene spans
`sequence_motif`	100	Restriction enzyme site counts in genomic regions

Each task provides a question about the hg38 genome. Answers are JSON objects (e.g., {"reference_sequence": "ATCG..."}, {"gc_content": "0.61"}, {"motif_count": "15"}). The agent must find the answer through web search and database queries.

Reward Structure

Reward is sparse and binary, emitted only when the agent calls submit_answer (which ends the episode). The web_search and fetch_url tools always return reward 0.0 and do not end the episode.

On submission, the agent's answer is evaluated using programmatic grading tailored to the question domain:

Sequence match: Case-insensitive exact DNA sequence comparison
Exact match: Exact string/numeric comparison for counts, coordinates, and names
Numeric tolerance: Accepts values within a specified tolerance for computed quantities (e.g., GC content +/- 0.02, intronic percentage +/- 1.0)

If programmatic grading fails (e.g., because of a non-standard answer format), an LLM grader (gpt-5-mini) is used as a fallback.

1.0: Submitted answer matches the reference answer within the domain-specific criteria
0.0: Submitted answer is incorrect, missing, or malformed
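
The three programmatic grading modes can be illustrated with a short Python sketch. It is an approximation under stated assumptions, not the environment's actual grader: the function name `grade` and the mode labels "sequence", "exact", and "numeric" are hypothetical, and the LLM fallback is omitted, so unparseable input simply scores 0.0 here.

```python
import json


def grade(submitted: str, reference: str, grading_type: str,
          tolerance: float = 0.0) -> float:
    """Return 1.0 if the submitted JSON answer matches the reference, else 0.0."""
    try:
        sub, ref = json.loads(submitted), json.loads(reference)
    except json.JSONDecodeError:
        return 0.0  # the real environment would try an LLM grader here
    if not isinstance(sub, dict) or not isinstance(ref, dict):
        return 0.0
    if set(sub) != set(ref):
        return 0.0
    for key, ref_val in ref.items():
        sub_val, ref_val = str(sub[key]).strip(), str(ref_val).strip()
        if grading_type == "sequence":
            # Case-insensitive exact DNA sequence comparison.
            ok = sub_val.upper() == ref_val.upper()
        elif grading_type == "numeric":
            # Accept values within the given tolerance.
            try:
                ok = abs(float(sub_val) - float(ref_val)) <= tolerance
            except ValueError:
                ok = False
        else:
            # Exact string comparison for counts, coordinates, and names.
            ok = sub_val == ref_val
        if not ok:
            return 0.0
    return 1.0


print(grade('{"gc_content": "0.62"}', '{"gc_content": "0.61"}', "numeric", 0.02))
print(grade('{"reference_sequence": "atcg"}', '{"reference_sequence": "ATCG"}', "sequence"))
print(grade('{"motif_count": "16"}', '{"motif_count": "15"}', "exact"))
```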

Data

Data consists of a single JSONL file containing 1,000 QA pairs derived from the hg38 human reference genome assembly. Each row contains a question, JSON-formatted answer, domain, source coordinates/genes, grading type, and tolerance. Answers were computed programmatically from the UCSC Genome Browser REST API and NCBI E-utilities, ensuring deterministic correctness. Data is stored on the OpenReward platform.
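
As a rough illustration of the row layout described above, the following Python sketch loads JSONL rows and tallies them by domain. The field names (`question`, `answer`, `domain`, `grading_type`, `tolerance`) are assumptions based on that description, not the file's confirmed schema, and the two rows are made-up placeholders rather than real tasks.

```python
import io
import json
from collections import Counter

# Placeholder rows with hypothetical field names; not taken from the dataset.
rows = [
    {"question": "Placeholder question 1",
     "answer": json.dumps({"motif_count": "15"}),
     "domain": "sequence_motif", "grading_type": "exact", "tolerance": 0.0},
    {"question": "Placeholder question 2",
     "answer": json.dumps({"gc_content": "0.61"}),
     "domain": "gc_content", "grading_type": "numeric", "tolerance": 0.02},
]
SAMPLE_JSONL = "\n".join(json.dumps(row) for row in rows) + "\n"


def load_tasks(handle) -> list:
    """Parse one JSON object per non-empty line."""
    tasks = []
    for line in handle:
        if line.strip():
            task = json.loads(line)
            # Assumption: the answer is stored as a JSON-encoded string.
            task["answer"] = json.loads(task["answer"])
            tasks.append(task)
    return tasks


tasks = load_tasks(io.StringIO(SAMPLE_JSONL))
domain_counts = Counter(task["domain"] for task in tasks)
print(domain_counts)
```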

Tools

Tool	Description
`web_search`	Search the web using Tavily API. Returns up to 5 results with titles, URLs, and snippets.
`fetch_url`	Fetch full text content from a URL. Supports pagination for long documents.
`submit_answer`	Submit a final JSON answer with explanation for grading. Ends the episode.

The web_search and fetch_url tools depend on Tavily and are optional. To use a different search provider, exclude these two tools and supply external tools instead.

Time Horizon

Multi-turn. Agents can perform multiple web searches and URL fetches before submitting a final answer.

Environment Difficulty

[To be determined]

Other Environment Requirements

An OpenAI API key is required for the LLM-based grading fallback. Pass it via secrets={"openai_api_key": "..."}.
A Tavily API key is required for the web_search and fetch_url tools. Pass it via secrets={"tavily_api_key": "..."}.
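
One way to assemble the secrets mapping without hard-coding keys is to read them from environment variables. The sketch below is only a suggestion: the helper name `build_secrets` and the variable names `OPENAI_API_KEY` and `TAVILY_API_KEY` are assumptions, and the call that ultimately receives `secrets=` is not shown because it depends on the client in use.

```python
import os

# Maps the secret names used by the environment to assumed env variable names.
SECRET_ENV_VARS = {
    "openai_api_key": "OPENAI_API_KEY",
    "tavily_api_key": "TAVILY_API_KEY",
}


def build_secrets(env=os.environ) -> dict:
    """Collect whichever keys are set; missing keys are simply left out."""
    secrets = {}
    for secret_name, env_var in SECRET_ENV_VARS.items():
        value = env.get(env_var)
        if value:
            secrets[secret_name] = value
    return secrets


secrets = build_secrets()
print(sorted(secrets))  # prints the names of the keys found, never the values
```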

Safety

Agents in NCBIGenomeTrain answer genome informatics questions using web search in a standard environment. The environment focuses on factual information retrieval and computation from publicly available hg38 reference genome data. It does not involve access to non-public data or personal genomic information. The environment does not present direct safety risks.

Citations

NCBIGenomeTrain uses data derived from the GRCh38 human reference genome assembly. Please cite the Genome Reference Consortium:

@article{church2011modernizing,
  title={Modernizing reference genome assemblies},
  author={Church, Deanna M and Schneider, Valerie A and Graves, Tina and others},
  journal={PLoS Biology},
  volume={9},
  number={7},
  pages={e1001091},
  year={2011},
  publisher={Public Library of Science}
}

@dataset{GRNCBIGenomeTrain,
  author    = {General Reasoning Inc. Team},
  title     = {NCBIGenomeTrain},
  year      = {2026},
  publisher = {OpenReward},
  url       = {https://openreward.ai/GeneralReasoning/NCBIGenomeTrain}
}

Repository

Source repository

EnvCommons/NCBIGenomeTrain

Compute Configuration

Resource allocation for this environment.

Component	Configuration
Environment Server	1 vCPU / 4 GB RAM
Sandbox Machine	Not configured

Estimated Cost

Pay per second of active session usage. Billing starts when your session begins and stops when it ends.

Component	Cost / second
Environment	$0.0000320
Sandbox	Not configured
Total	$0.0000320

Examples

5-minute session	$0.0096
1-hour session	$0.1152
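
The example figures follow directly from the per-second rate. A small Python sketch of the arithmetic (the helper name `session_cost` is hypothetical):

```python
COST_PER_SECOND = 0.0000320  # environment rate from the table above, in USD


def session_cost(seconds: int) -> float:
    """Cost of an active session, rounded to four decimal places."""
    return round(seconds * COST_PER_SECOND, 4)


print(session_cost(5 * 60))   # 5-minute session
print(session_cost(60 * 60))  # 1-hour session
```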