GAIA

Description

GAIA (General AI Assistants) is a benchmark for evaluating progress toward AGI. It poses real-world questions that require fundamental abilities such as reasoning, multi-modality handling, web browsing, and general tool use. It comprises 466 questions that are conceptually simple for humans yet challenging for AI systems. The answers to 300 of them are withheld to power a public leaderboard, leaving 166 questions with released answers.


Meta/GAIA | OpenReward