Nemotron Competitive Coding

Name: NVIDIA/Nemotron-RL-coding-competitive_coding
Author: NVIDIA

Description

Nemotron Competitive Coding is an environment for evaluating agents on competitive programming problems. It wraps NVIDIA's Nemotron-RL-coding-competitive_coding dataset, which contains 16,083 competitive programming problems to be solved in Python, sourced from CodeContests (DeepMind) and Codeforces (Open-R1). Each problem includes hidden unit tests for automated verification. Agents use CLI tools (such as bash, write, read, and edit) to develop and test a Python solution in a sandbox, then submit the path to the solution file for evaluation against the hidden test cases.

Capabilities

Competitive programming problem solving
Algorithm design and implementation
Handling edge cases and constraints
Producing correct I/O format from problem descriptions

Compute Requirements

Submitted code is executed in a sandbox with 0.5 vCPUs and 1 GB of RAM. Each test case has a 10-second time limit.

License

CC-BY-SA-4.0.

Tasks

There is one split: train (16,083 tasks). Each task presents one competitive programming problem drawn from CodeContests (DeepMind) or Codeforces (Open-R1). The number of test cases per problem ranges from 1 to 430, with an average of about 59.

Reward Structure

The reward is sparse but continuous: the agent receives a single reward, at submission time, and that reward can take any value between 0.0 and 1.0. The agent calls the submit tool once with the file path of its Python solution. The environment then executes the code in a sandbox against the hidden unit tests. The reward is the fraction of test cases passed:

$\text{Reward} = \frac{\text{passed test cases}}{\text{total test cases}}$

Scores range from 0.0 to 1.0. We do not use LLM graders for this task. Verification is purely test-case-based.
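
As a minimal sketch of the formula above, the reward can be computed from the two counts alone. The function name `reward` and its argument checks are illustrative assumptions, not part of the environment's code.

```python
def reward(passed: int, total: int) -> float:
    """Fraction of hidden test cases passed (hypothetical helper, not the environment's API)."""
    if total <= 0:
        raise ValueError("total must be a positive number of test cases")
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    return passed / total


# A solution that passes 3 of 4 hidden test cases scores 0.75.
print(reward(3, 4))
```

Because the score is a plain ratio, a partially correct solution still earns credit in proportion to the test cases it passes.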

Data

Problems are sourced from the Nemotron-RL-coding-competitive_coding dataset by NVIDIA, which is part of the NeMo Gym framework for reinforcement learning on LLMs. The original problems come from CodeContests and Codeforces. Data files are stored on the OpenReward platform.

Tools

CLI tools (inherited from CLIEnvironment):

bash: Execute bash commands in the sandbox
write: Write content to a file
read: Read file contents
edit: Perform exact string replacement in a file
multi_edit: Perform multiple edits on a single file
glob: Find files matching a glob pattern
grep: Search for patterns in files
ls: List files and directories
todo_write: Manage a todo list for task planning

Evaluation tool:

submit: Submit a Python solution file for evaluation against hidden test cases. Takes a file_path parameter pointing to the solution. The code is executed in the sandbox against hidden test cases. Returns the number of test cases passed and the reward. This tool can only be called once per task.

Time Horizon

Nemotron Competitive Coding is a multi-step environment. The agent receives a problem statement, develops and tests a solution using CLI tools, then submits the file path for evaluation.
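
Since the submit tool can only be called once, it may help to check a solution locally before submitting. The sketch below runs a solution file on one sample input under the 10-second per-test limit stated in Compute Requirements. The helper name `run_case` and the whitespace-insensitive output comparison are assumptions for illustration; how the hidden tests compare output is not documented here.

```python
import subprocess
import sys

TIME_LIMIT_SECONDS = 10  # per-test limit stated in Compute Requirements


def run_case(solution_path, stdin_text, expected_stdout, time_limit=TIME_LIMIT_SECONDS):
    """Run a Python solution on one input and report whether it passed.

    Hypothetical local check: output is compared token by token, which is
    an assumption and may differ from the hidden checker.
    """
    try:
        result = subprocess.run(
            [sys.executable, solution_path],
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=time_limit,
        )
    except subprocess.TimeoutExpired:
        return False  # exceeded the time limit
    if result.returncode != 0:
        return False  # runtime error
    return result.stdout.split() == expected_stdout.split()
```

An agent could call `run_case("solution.py", sample_input, sample_output)` for each sample in the problem statement, where `solution.py` is a placeholder path, and submit only once every sample passes.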

Other Environment Requirements

Nemotron Competitive Coding requires an OpenReward API key (api_key secret) for sandbox execution. No OpenAI API key is needed.

Safety

Agents are asked to write Python solutions to competitive programming problems. Submitted code is executed in a sandboxed environment with network access blocked. The environment does not present direct safety risks.

Citations

@dataset{nvidia_nemotron_rl_coding,
  author    = {NVIDIA Corporation},
  title     = {Nemotron-RL-coding-competitive\_coding},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/datasets/nvidia/Nemotron-RL-coding-competitive_coding},
  license   = {CC-BY-SA-4.0}
}

Repository

Source repository

EnvCommons/Nemotron-RL-coding-competitive_coding-

Compute Configuration

Resource allocation for this environment.

Component	Configuration
Environment Server	1 vCPU / 4 GB RAM
Sandbox Machine	0.5 vCPUs / 1 GB RAM

Estimated Cost

Pay per second of active session usage. Billing starts when your session begins and stops when it ends.

Component	Cost / second
Environment	$0.0000320
Sandbox	$0.0000115
Total	$0.0000435

Examples

Session length	Estimated cost
5-minute session	$0.0131
1-hour session	$0.1566
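
The example figures follow from multiplying the combined per-second rate by the session length in seconds. The sketch below reproduces them with decimal arithmetic. The function name `session_cost` and the choice to round half up to four decimal places are assumptions made to match the displayed values, not a documented billing rule.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-second rates from the Estimated Cost table.
ENVIRONMENT_RATE = Decimal("0.0000320")
SANDBOX_RATE = Decimal("0.0000115")


def session_cost(seconds: int) -> Decimal:
    """Estimated cost in dollars for a session of the given length (hypothetical helper)."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    total = (ENVIRONMENT_RATE + SANDBOX_RATE) * seconds
    # Rounding to four decimals is an assumption chosen to match the page's examples.
    return total.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


print(session_cost(5 * 60))   # 0.0131
print(session_cost(60 * 60))  # 0.1566
```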