vln-ego

Name: arXiv/vln-ego
Author: arXiv

Description

VLN-Ego is a vision-language navigation benchmark built on the Habitat 3D simulator. It provides egocentric video streams and expert action demonstrations for training and evaluating end-to-end continuous navigation agents based on large vision-language models (LVLMs). The data consists of simulated egocentric trajectories paired with natural-language instructions and action-level expert demonstrations. A Long-Short Memory Sampling strategy balances historical and current observations during supervised and reinforcement fine-tuning.
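The description does not say how Long-Short Memory Sampling selects frames. As a rough illustration only, the sketch below assumes one plausible reading: keep the most recent frames densely (short-term memory) and take evenly strided picks from the older history (long-term memory). The function name long_short_sample and the parameters short_window and long_budget are hypothetical and are not taken from the benchmark.

```python
from typing import List


def long_short_sample(num_frames: int,
                      short_window: int = 4,
                      long_budget: int = 8) -> List[int]:
    """Pick frame indices from an egocentric stream of num_frames frames.

    Hypothetical sketch, not the benchmark's actual rule:
    - short-term memory: the last `short_window` frames, all kept;
    - long-term memory: at most `long_budget` frames taken at an even
      stride from everything older than the short window.
    Returns indices in ascending (chronological) order.
    """
    if num_frames <= 0:
        return []

    # Short-term memory: the most recent frames, kept densely.
    short_start = max(0, num_frames - short_window)
    short_idx = list(range(short_start, num_frames))

    # Long-term memory: sparse, evenly strided picks from older frames.
    history_len = short_start
    if history_len == 0 or long_budget <= 0:
        return short_idx
    if history_len <= long_budget:
        long_idx = list(range(history_len))
    else:
        step = history_len / long_budget
        long_idx = [int(i * step) for i in range(long_budget)]

    return long_idx + short_idx


if __name__ == "__main__":
    # 20 observed frames, keep 4 recent ones plus 4 historical ones.
    print(long_short_sample(20, short_window=4, long_budget=4))
```

Under these assumptions the number of frames passed to the model stays bounded by short_window + long_budget however long the episode runs, which is the kind of balance between historical and current observations the description refers to.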
