ARC-AGI-1

Description

ARC-AGI-1 is an environment for evaluating abstract reasoning and pattern recognition capabilities. Agents are given training examples demonstrating a transformation pattern from input grids to output grids, then must apply the deduced rule to new test inputs. Each grid is a 2D array of integers (0-9) representing colors.
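As a rough illustration of the grid format described above, the following sketch checks that a value is a 2D list of integers in the range 0-9. The helper name `is_valid_grid` is hypothetical, and the requirement that grids be non-empty and rectangular is an assumption, not something the environment documents.

```python
from typing import List

Grid = List[List[int]]


def is_valid_grid(grid) -> bool:
    """Return True if `grid` is a non-empty rectangular 2D list of ints in 0..9.

    Hypothetical helper: rectangularity and non-emptiness are assumptions.
    """
    if not isinstance(grid, list) or not grid:
        return False
    if not all(isinstance(row, list) and row for row in grid):
        return False
    width = len(grid[0])
    for row in grid:
        if len(row) != width:
            return False
        for cell in row:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(cell, bool) or not isinstance(cell, int):
                return False
            if not 0 <= cell <= 9:
                return False
    return True


# A 3x3 grid whose cells are color codes (0 is commonly the background).
example_grid: Grid = [
    [0, 0, 3],
    [0, 3, 0],
    [3, 0, 0],
]
```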

Capabilities

Abstract reasoning and pattern induction
Visual transformation rule discovery
Grid-based spatial reasoning

Compute Requirements

Agents are given a standard environment with no sandbox or file system access.

License

Apache 2.0.

Tasks

The environment provides two splits:

training: 400 tasks
evaluation: 400 tasks

Each task includes training examples showing input-output transformations and test inputs requiring predicted outputs.
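A minimal sketch of how a task might be laid out and how a candidate rule can be checked against the training examples before it is applied to the test inputs. The key names `train`, `test`, `input`, and `output` follow the original ARC JSON convention and are an assumption about this environment's schema; the toy task and the `flip_horizontal` rule are invented for illustration.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]


def flip_horizontal(grid: Grid) -> Grid:
    """Candidate rule: mirror every row left to right."""
    return [list(reversed(row)) for row in grid]


# Toy task (not taken from the dataset); key names are assumed.
task: Dict[str, list] = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[0, 3, 3]], "output": [[3, 3, 0]]},
    ],
    "test": [
        {"input": [[4, 0, 0], [0, 5, 0]]},
    ],
}


def fits_training(rule: Callable[[Grid], Grid], task: Dict[str, list]) -> bool:
    """Return True if `rule` reproduces every training output exactly."""
    return all(rule(pair["input"]) == pair["output"] for pair in task["train"])


# Only apply the rule to the test inputs once it explains all training pairs.
predictions: List[Grid] = []
if fits_training(flip_horizontal, task):
    predictions = [flip_horizontal(item["input"]) for item in task["test"]]
```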

Reward Structure

Evaluation is multi-attempt with deterministic grading. The agent submits predicted output grids via the `answer` tool, with up to 3 attempts allowed per task. Submitted outputs are compared against the ground truth by exact match: the reward is 1.0 if all outputs are correct and 0.0 otherwise. The episode ends on a correct answer or after 3 failed attempts.
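The grading rule can be sketched as follows, assuming a submission is a list of grids (one per test input) and that exact match means plain list equality. The function names `grade` and `run_episode` are hypothetical and are not part of the environment's API.

```python
from typing import List, Sequence, Tuple

Grid = List[List[int]]


def grade(submission: List[Grid], ground_truth: List[Grid]) -> float:
    """Return 1.0 only if every predicted grid matches exactly, else 0.0."""
    return 1.0 if submission == ground_truth else 0.0


def run_episode(
    attempts: Sequence[List[Grid]],
    ground_truth: List[Grid],
    max_attempts: int = 3,
) -> Tuple[float, int]:
    """Grade attempts in order; stop on the first success or after max_attempts.

    Returns (reward, number_of_attempts_used).
    """
    used = 0
    for submission in attempts[:max_attempts]:
        used += 1
        if grade(submission, ground_truth) == 1.0:
            return 1.0, used
    return 0.0, used


truth = [[[1, 2], [3, 4]]]
wrong = [[[0, 0], [0, 0]]]

reward, used = run_episode([wrong, truth, wrong], truth)
```

Under these assumptions there is no partial credit: a submission with one correct grid and one incorrect grid still scores 0.0.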

Data

The dataset is loaded from the Hugging Face dataset lordspline/arc-agi. Each task contains training examples and test inputs.

Tools

Tool	Description
`answer`	Submit predicted output grids as a list of objects with "output" keys. Up to 3 attempts are allowed. Ends the episode on success or on the final attempt.
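One way the `answer` payload could be assembled is sketched below: each predicted grid is wrapped in an object with an "output" key. The helper name `build_answer_payload` is hypothetical, and the name of the tool argument that carries this list is not documented here, so only the list itself is constructed.

```python
import json
from typing import Dict, List

Grid = List[List[int]]


def build_answer_payload(predicted_grids: List[Grid]) -> List[Dict[str, Grid]]:
    """Wrap each predicted grid in an object with an "output" key."""
    return [{"output": grid} for grid in predicted_grids]


# One predicted grid per test input, in the same order as the test inputs.
payload = build_answer_payload([
    [[0, 0, 4], [0, 5, 0]],
    [[7]],
])

# JSON form of the payload, as it might be passed to the tool call.
payload_json = json.dumps(payload)
```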

Time Horizon

Multi-attempt. The agent analyzes the training examples, deduces the transformation rule, and submits outputs in up to 3 attempts.

Environment Difficulty

ARC-AGI-1 evaluates abstract reasoning capabilities. Model accuracy on the benchmark:

Model	Accuracy
o3-preview (low)	75.7%
o3 (high)	60.8%
o4-mini (high)	58.7%
Claude Sonnet 4 (Thinking)	40.0%
Claude Opus 4 (Thinking)	35.7%
Gemini 2.5 Flash	33.3%
Gemini 2.5 Pro	33.0%
DeepSeek R1	21.2%

ARC-AGI-1 is approaching saturation, with top systems now exceeding 75% accuracy.

Other Environment Requirements

There are no further environment requirements; ARC-AGI-1 works out of the box with the OpenReward endpoint without any external API keys.

Safety

Agents in ARC-AGI-1 solve abstract reasoning puzzles in a standard environment. The environment does not present direct safety risks.

Citation

@misc{chollet2019arc,
  title={On the Measure of Intelligence},
  author={Fran{\c{c}}ois Chollet},
  year={2019},
  eprint={1911.01547},
  archivePrefix={arXiv}
}

Compute Configuration

Resource allocation for this environment.

Component	Configuration
Environment Server	1 vCPU / 4 GB RAM
Sandbox Machine	Not configured

Estimated Cost

Pay per second of active session usage. Billing starts when your session begins and stops when it ends.

Component	Cost / second
Environment	$0.0000320
Sandbox	Not configured
Total	$0.0000320

Examples

5-minute session: $0.0096

1-hour session: $0.1152
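The example figures follow directly from the per-second rate in the cost table. A small sketch of the arithmetic, using `Decimal` to avoid floating-point rounding noise (the function name `session_cost` is hypothetical):

```python
from decimal import Decimal

# Total cost per second of active session usage, from the cost table.
COST_PER_SECOND = Decimal("0.0000320")


def session_cost(seconds: int) -> Decimal:
    """Return the estimated cost in dollars for a session of `seconds` seconds."""
    return COST_PER_SECOND * seconds


five_minutes = session_cost(5 * 60)   # 300 seconds
one_hour = session_cost(60 * 60)      # 3600 seconds
```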
