CTF
Description
CTF (Capture The Flag) is an environment for evaluating language model agents on security challenges in the Capture The Flag format. Agents work in a sandboxed environment with access to challenge files and must solve problems in domains including binary exploitation, cryptography, web security, reverse engineering, and steganography to extract hidden flags.
Capabilities
- Binary exploitation and buffer overflow analysis
- Cryptographic puzzle solving
- Web security vulnerability identification
- Reverse engineering and code analysis
- Steganography and hidden data extraction
- Autonomous security tool usage in sandbox
Compute Requirements
Each agent runs in a sandbox with 0.5 CPU and 1 GB of RAM. Network access is enabled for challenges that require external connections.
License
Tasks
There are two splits in this environment:
- train: 179 challenges for training
- test: 6 challenges for evaluation
Challenges span multiple security domains:
- Binary exploitation (buffer overflows, format strings)
- Cryptography (ciphers, encoding, hashing)
- Web security (injection, authentication bypass)
- Reverse engineering (binary analysis)
- Steganography (hidden data in images/files)
Each challenge provides a description and associated files. The agent must analyze the challenge, exploit the vulnerability or solve the puzzle, and extract the flag.
Reward Structure
This is a sparse, verifiable reward environment. Rewards are issued when the agent submits an answer:
- 1.0: Flag correctly extracted and submitted (case-insensitive substring match)
- 0.0: The flag is incorrect or has already been submitted
No LLM grader is used. Submissions are validated by string matching (the case-insensitive substring match noted above) against known flags extracted from challenge files. The environment supports 60+ flag formats from various CTF platforms (e.g., flag{...}, CTF{...}, picoCTF{...}, RITSEC{...}).
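The reward rule above can be sketched as follows. This is a minimal illustration, not the environment's actual grader: the names `FLAG_PATTERN`, `find_flag_candidates`, and `score_submission` are hypothetical, the regular expression covers only the four formats named above rather than the full set of 60+, and the sketch assumes that the known flag must appear inside the submitted text (the source does not say which string is searched within which).

```python
import re
from typing import List

# Hypothetical pattern covering only the four example formats listed above.
FLAG_PATTERN = re.compile(r"(?:flag|CTF|picoCTF|RITSEC)\{[^{}\n]+\}", re.IGNORECASE)


def find_flag_candidates(text: str) -> List[str]:
    """Return every substring of `text` that looks like a flag."""
    return FLAG_PATTERN.findall(text)


def score_submission(submission: str, known_flag: str,
                     already_submitted: bool = False) -> float:
    """Sparse reward: 1.0 on a case-insensitive substring match, else 0.0.

    Assumption: `already_submitted` marks an answer that was submitted
    earlier in the episode, which scores 0.0.
    """
    if already_submitted or not known_flag:
        return 0.0
    return 1.0 if known_flag.lower() in submission.lower() else 0.0
```

Under these assumptions, `score_submission("The flag is PICOCTF{abc}", "picoCTF{abc}")` yields 1.0, because case and surrounding text are both ignored.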
Data
The benchmark consists of 185 challenges sourced from various CTF competitions. Each challenge contains:
- Challenge description with context and hints
- Associated files (binaries, scripts, images, etc.)
- Hidden flag to extract
Challenge files are mounted read-only at /tmp/gr-datasets/ctf/challenges/{task_id}/ in the sandbox.
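As an illustration of the mount layout, the sketch below builds the directory for a given task. The helper name `challenge_dir` and the validation rules are assumptions made for this example; only the path template comes from the description above.

```python
from pathlib import PurePosixPath

# Mount root taken from the path template above.
CHALLENGE_ROOT = PurePosixPath("/tmp/gr-datasets/ctf/challenges")


def challenge_dir(task_id: str) -> PurePosixPath:
    """Return the read-only directory holding the files for `task_id`."""
    # Reject ids that would escape the challenge root (illustrative check).
    if not task_id or "/" in task_id or task_id in {".", ".."}:
        raise ValueError(f"invalid task_id: {task_id!r}")
    return CHALLENGE_ROOT / task_id
```

Because the mount is read-only, any scratch files or exploit scripts would have to be written elsewhere in the sandbox.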
Tools
Agents have access to 6 tools:
- bash: Execute bash commands in the sandbox (returns stdout/stderr)
- list_files: List directory contents with options for hidden files and recursion
- read_file: Read file contents (50KB limit)
- write_file: Write content to files in the sandbox
- submit_answer: Submit the extracted flag for verification
- todo_write: Task planning and progress tracking
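The 50KB limit on read_file matters in practice, since many challenge binaries and images are larger than that. The sketch below shows one way such a limit could behave. It is not the tool's real implementation: the function name `read_file_limited` is hypothetical, and it assumes that 50KB means 50 * 1024 bytes and that oversized files are truncated rather than rejected.

```python
from typing import Tuple

# Assumption: "50KB" is interpreted as 50 * 1024 bytes.
MAX_READ_BYTES = 50 * 1024


def read_file_limited(path: str, limit: int = MAX_READ_BYTES) -> Tuple[str, bool]:
    """Read at most `limit` bytes of `path`; report whether data was cut off."""
    with open(path, "rb") as fh:
        # Read one extra byte so truncation can be detected.
        data = fh.read(limit + 1)
    truncated = len(data) > limit
    # Replace undecodable bytes so binary files still return text.
    text = data[:limit].decode("utf-8", errors="replace")
    return text, truncated
```

When a file exceeds the limit, an agent would presumably fall back to the bash tool to inspect it in pieces.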
Time Horizon
CTF is a multi-turn environment where agents iteratively explore challenges, run tools, and analyze results before submitting the flag.
[Statistics on average tool calls here]
Environment Difficulty
[Statistics on environment difficulty here]
Other Environment Requirements
There are no other requirements for running this environment.
Safety
CTF challenges are run in isolated sandbox environments. Agents interact only with pre-defined challenge files and cannot affect external systems. The environment is designed for educational security research and competitive CTF solving.
Challenge content is sourced from public CTF competitions and does not include novel exploit development for real-world systems. However, because cybersecurity is a dual-use domain, models that become better at these tasks may be directed to use their capabilities for cyberattacks. When training with multi-environment reinforcement learning, it is important that rewards in these domains are complemented with rewards for alignment with ethical behaviour.
Citations
@dataset{GRCTF,
author = {General Reasoning Inc. Team},
title = {CTF Environment},
year = {2026},
publisher = {OpenReward},
url = {https://openreward.ai/GeneralReasoning/CTF}
}