AARDAct

Description

AARD-Feedforward is an environment that tests the ability of agents to perform architecture research for language models.

GeneralReasoning/AARDAct | OpenReward