BountyBench
BountyBench
Description
BountyBench is a cybersecurity benchmark environment for evaluating AI agents on real-world bug bounties. Agents operate in containers with full codebase access and running target services, tasked with detecting vulnerabilities, writing exploits, or patching security flaws across 30+ real-world software systems.
Based on the BountyBench benchmark with 46 bug bounties across 31 systems and 3 task phases.
Capabilities
- Vulnerability detection in real-world codebases
- Exploit development for known CVEs
- Security patch authoring that preserves existing functionality
- Interaction with live services (web apps, APIs, databases)
Compute Requirements
Each task runs in a dedicated sandbox container with:
- 1 vCPU, 2 GB RAM
- Network access (for service-based systems)
- Per-system Docker images with pre-configured environments
Tasks
138 tasks total: 46 bounties x 3 phases (detect, exploit, patch).
Each task targets a specific (system, bounty, phase) triple:
- detect: Find an unknown vulnerability and write
exploit.sh - exploit: Exploit a known vulnerability (CVE/CWE provided) and write
exploit.sh - patch: Fix a known vulnerability without breaking existing functionality
Systems include: lunary, yaml, node, django, fastapi, gradio, mlflow, curl, gunicorn, and 20+ more.
Reward Structure
Binary rewards (0.0 or 1.0), verified by shell scripts:
- Detect/Exploit: Agent's
exploit.shis run, then a hiddenverify.shchecks if the vulnerability was demonstrated. Reward = 1.0 ifverify.shexits 0. - Patch: A reference exploit is run against the patched codebase, then
verify.shchecks if the vulnerability is still present. Reward = 1.0 if the exploit is blocked AND invariant tests pass.
Data
- Source: bountybench/bountytasks
- Format:
task_index.jsongenerated byprepare_data.py - Per-system Docker images built by
build_images.sh
Tools
| Tool | Description |
|---|---|
bash | Execute arbitrary bash commands in the sandbox |
list_files | List directory contents |
read_file | Read file contents |
write_file | Write content to a file |
submit | Submit work for automated grading |
Time Horizon
Multi-turn. Agents typically need 10-50+ tool calls depending on the phase:
- Detect: extensive code analysis + exploit development
- Exploit: targeted exploit development
- Patch: code analysis + targeted fix
Environment Difficulty
Varies by system and vulnerability type:
- Severity ranges from 3.0 to 10.0 (CVSS)
- CWE types include injection, auth bypass, SSRF, path traversal, and more
Safety
This environment involves real vulnerability exploitation techniques. All activities are sandboxed and isolated. The environment is designed for security research and AI capability evaluation only.
Citations
@article{zhang2025bountybench,
title={BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems},
author={Andy K. Zhang and Joey Ji and Celeste Menders and Riya Dulepet and Thomas Qin and Ron Y. Wang and Junrong Wu and Kyleen Liao and Jiliang Li and Jinghan Hu and Sara Hong and Nardos Demilew and Shivatmica Murgai and Jason Tran and Nishka Kacheria and Ethan Ho and Denis Liu and Lauren McLane and Olivia Bruvik and Dai-Rong Han and Seungwoo Kim and Akhil Vyas and Cuiyuanxiu Chen and Ryan Li and Weiran Xu and Jonathan Z. Ye and Prerit Choudhary and Siddharth M. Bhatia and Vikram Sivashankar and Yuxuan Bao and Dawn Song and Dan Boneh and Daniel E. Ho and Percy Liang},
year={2025},
eprint={2505.15216},
archivePrefix={arXiv},
primaryClass={cs.CR},
}