TruthAndDeception
Description
TruthAndDeception is an environment for evaluating agents on social deduction and persuasion through natural conversation. This environment wraps the TruthAndDeception implementation from TextArena, a framework for text-based game environments.
Capabilities
- Natural language conversation and persuasion
- Deception detection and truth identification
- Social reasoning and strategic communication
- Evaluation across standard, long, and extreme variants
Compute Requirements
TruthAndDeception does not require a sandbox. It has minimal compute requirements.
License
MIT.
Tasks
There are two splits: train (450 tasks) and test (450 tasks). Each split contains 50 tasks for each of the following 9 variants (9 × 50 = 450):
- TruthAndDeception-v0
- TruthAndDeception-v0-train
- TruthAndDeception-v0-raw
- TruthAndDeception-v0-extreme
- TruthAndDeception-v0-extreme-train
- TruthAndDeception-v0-extreme-raw
- TruthAndDeception-v0-long
- TruthAndDeception-v0-long-train
- TruthAndDeception-v0-long-raw
Each task is seeded for reproducibility.
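The split arithmetic above can be checked with a short sketch. The variant IDs and the 50-per-variant count come from this document; the `build_task_index` helper and its seed-assignment scheme are hypothetical and are not taken from the environment's actual implementation.

```python
# Variant IDs copied from the list above.
VARIANTS = [
    "TruthAndDeception-v0",
    "TruthAndDeception-v0-train",
    "TruthAndDeception-v0-raw",
    "TruthAndDeception-v0-extreme",
    "TruthAndDeception-v0-extreme-train",
    "TruthAndDeception-v0-extreme-raw",
    "TruthAndDeception-v0-long",
    "TruthAndDeception-v0-long-train",
    "TruthAndDeception-v0-long-raw",
]
TASKS_PER_VARIANT = 50
SPLITS = ("train", "test")


def build_task_index(base_seed: int = 0) -> dict:
    """Return {split: [(variant_id, seed), ...]}.

    Assumption: seeds are assigned consecutively and never reused
    across splits. The real seeding scheme may differ.
    """
    per_split = len(VARIANTS) * TASKS_PER_VARIANT
    index = {}
    for split_no, split in enumerate(SPLITS):
        tasks = []
        for variant_no, variant in enumerate(VARIANTS):
            for i in range(TASKS_PER_VARIANT):
                seed = (base_seed + split_no * per_split
                        + variant_no * TASKS_PER_VARIANT + i)
                tasks.append((variant, seed))
        index[split] = tasks
    return index


index = build_task_index()
print({split: len(tasks) for split, tasks in index.items()})
# → {'train': 450, 'test': 450}
```

Keeping the seed in the task tuple means a task can be replayed from its `(variant_id, seed)` pair alone, which is the property the seeding is meant to provide.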
Reward Structure
This is a sparse reward environment. Rewards are mapped from TextArena's native range of {-1, 0, 1} to {0.0, 0.5, 1.0} via (raw + 1) / 2.
We do not use LLM graders for this environment; reward is determined programmatically.
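As a minimal sketch of the mapping described above, the rescaling can be written as a single function. The name `map_reward` and the input validation are illustrative assumptions; only the formula (raw + 1) / 2 and the value sets come from this document.

```python
def map_reward(raw: int) -> float:
    """Rescale a raw reward in {-1, 0, 1} to {0.0, 0.5, 1.0}."""
    if raw not in (-1, 0, 1):
        # Assumption: anything outside the documented range is an error.
        raise ValueError(f"unexpected raw reward: {raw!r}")
    return (raw + 1) / 2


print([map_reward(r) for r in (-1, 0, 1)])
# → [0.0, 0.5, 1.0]
```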
Data
Game state is generated procedurally by the TextArena engine using seeded randomness. No external data files are required.
Tools
Agents are given a single tool:
- send_message(message): Sends a message to the other player. The agent converses naturally to pursue its goal, which is either to deceive the other player or to guess correctly.
Time Horizon
TruthAndDeception is a multi-turn environment.
Environment Difficulty
Medium to Hard. The environment requires persuasion, deception detection, and strategic communication.
Other Environment Requirements
This environment requires an OpenAI API key (passed via secrets) to power the LLM opponent.
Safety
Agents trained in TruthAndDeception may learn manipulative or deceptive behaviour. Social deduction skills may be used for malicious purposes, and we recommend training on this environment with caution. In a multi-environment run, it may be helpful to complement it with constitutional rubrics and other sources of reward beyond the direct game outcome in order to promote closer alignment with human values.
Citations
@software{textarena2024,
  author    = {Guertler, Leon and Banting, Wilfried and Pignatelli, Eduardo},
  title     = {TextArena},
  year      = {2024},
  publisher = {GitHub},
  url       = {https://github.com/LeonGuertler/TextArena}
}