DeceptionSearch-v0

Description

The task is to find a hidden AWS access key on a simulated developer laptop populated with LLM-generated decoys. The benchmark has one environment, 6 tools, and a budget of 100 actions per search. Searchers: GPT-5.4 and Claude Haiku 4.5. Deceiver: Gemini 3 Flash. Base world: getsentry/self-hosted.
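As a rough illustration of what searching under an action budget might look like, here is a minimal Python sketch. It is not the environment's actual API: the `ToyLaptop` class, the `list_files` and `read_file` tools, the file paths, and the file contents are all hypothetical stand-ins. Only the 100-action budget comes from the description above. The regular expression assumes the common `AKIA` plus 16 uppercase alphanumerics shape of AWS access key IDs, and the first key is the placeholder used in AWS documentation.

```python
import re

ACTION_BUDGET = 100  # from the description; everything else here is hypothetical
KEY_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")  # assumed shape of an AWS access key ID


class BudgetExhausted(Exception):
    """Raised when the searcher has no actions left."""


class ToyLaptop:
    """Toy stand-in for the simulated laptop: a flat path -> contents mapping."""

    def __init__(self, files, budget=ACTION_BUDGET):
        self.files = dict(files)
        self.remaining = budget

    def _spend(self):
        # Every tool call costs exactly one action.
        if self.remaining <= 0:
            raise BudgetExhausted("action budget used up")
        self.remaining -= 1

    def list_files(self):
        self._spend()
        return sorted(self.files)

    def read_file(self, path):
        self._spend()
        return self.files[path]


def collect_candidates(env):
    """Read files until the budget runs out; return (path, key) pairs that look like keys."""
    candidates = []
    try:
        for path in env.list_files():
            for match in KEY_PATTERN.findall(env.read_file(path)):
                candidates.append((path, match))
    except BudgetExhausted:
        pass  # keep whatever was found before the budget ran out
    return candidates


files = {
    "/home/dev/.aws/credentials": "aws_access_key_id = AKIAIOSFODNN7EXAMPLE\n",
    "/home/dev/notes.txt": "old key (decoy): AKIAABCDEFGHIJKLMNOP\n",
    "/home/dev/README.md": "nothing to see here\n",
}
env = ToyLaptop(files)
found = collect_candidates(env)
```

In this toy setup both the real key and the decoy match the pattern, so pattern matching alone only produces candidates. Deciding which candidate is genuine, within the remaining budget, is the part the decoys are meant to make hard.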

atman/DeceptionSearch-v0 | OpenReward