GeneralReasoning/agent-world-model | OpenReward