Wanderland: Geometrically Grounded Simulation for Open-World Embodied AI

Why Do Existing Pipelines Fail?

Casual videos tend to capture a scene from a single direction, whereas our capture covers diverse camera views.
Vision-only 3D reconstruction still falls short of multi-sensor fusion SLAM, as the comparisons below show.

[Comparison: Vid2Sim vs. Ours]

[Comparison: GaussGym vs. Ours]

Dataset Visual Highlights

Wanderland captures large-scale indoor, outdoor, and mixed-lighting scenes with metrically aligned geometry. These synchronized fly-throughs highlight the dataset's visual diversity, coverage scale, and embodied evaluation setup.

Multimodal Data Quality

Aligned multimodal signals, reconstruction quality, and lighting variation across captured scenes.

Scene Coverage and Scale

Large and non-overlapping trajectories that stress open-world coverage.

Embodied Navigation

Unity walkthroughs with metric mesh collision and photorealistic 3DGS rendering.

Navigation Task Formats

Language, image-goal, and point-goal task examples for embodied evaluation.

Our Framework

Our pipeline begins with multi-sensor capture using the MetaCam device in real-world urban spaces.
MetaCam Studio processes the raw data via LIV-SLAM to produce a colorized, globally consistent metric point cloud and accurate camera poses.
We initialize 3D Gaussians from the metric point cloud and render per-view depth maps from this initialization.
The 3DGS model is optimized with both photometric and depth losses.
In parallel, we extract a reliable collision mesh from the same global point cloud.
We integrate the trained 3DGS model and the collision mesh into a single USD scene.
The USD scene can be directly loaded into Isaac Sim for training and evaluating navigation policies.
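
The depth-supervision steps above can be illustrated with a minimal NumPy sketch. It is not the Wanderland implementation: the page does not specify how depth maps are rendered or which loss forms are used. As stated assumptions, the sketch substitutes a simple point z-buffer for rendering from the Gaussian initialization, uses L1 for both loss terms, and treats zero depth as "no measurement". The function names `project_points_to_depth` and `photometric_depth_loss`, and the `depth_weight` parameter, are hypothetical.

```python
import numpy as np

def project_points_to_depth(points_world, K, T_cam_world, height, width):
    """Z-buffer metric points into a pinhole camera; 0 marks pixels with no depth.

    Assumption: a stand-in for rendering depth from the Gaussian initialization.
    """
    # Transform points from the world frame into the camera frame.
    R, t = T_cam_world[:3, :3], T_cam_world[:3, 3]
    pts_cam = points_world @ R.T + t
    # Keep only points in front of the camera.
    pts_cam = pts_cam[pts_cam[:, 2] > 0]
    z = pts_cam[:, 2]
    # Pinhole projection to the nearest integer pixel.
    u = np.round(K[0, 0] * pts_cam[:, 0] / z + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * pts_cam[:, 1] / z + K[1, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    depth = np.full((height, width), np.inf)
    # Keep the nearest point per pixel (unbuffered minimum handles repeats).
    np.minimum.at(depth, (v[inside], u[inside]), z[inside])
    depth[np.isinf(depth)] = 0.0
    return depth

def photometric_depth_loss(rendered_rgb, target_rgb, rendered_depth,
                           prior_depth, depth_weight=0.1):
    """L1 photometric term plus a weighted L1 depth term (assumed loss forms)."""
    photo = np.abs(rendered_rgb - target_rgb).mean()
    # Supervise depth only where the prior has a measurement.
    valid = prior_depth > 0
    depth = np.abs(rendered_depth[valid] - prior_depth[valid]).mean() if valid.any() else 0.0
    return photo + depth_weight * depth, photo, depth
```

In an actual training loop the same two terms would be computed on differentiable tensors from the 3DGS rasterizer; the weight balancing them is a tuning choice that this sketch does not try to recommend.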

Data Statistics and Comparison

Acknowledgements

This work was supported in part by NSF grants 2514030, 2238968, and 2345139, in part by the NVIDIA Academic Grant Program, and in part by NYU IT High Performance Computing resources, services, and staff expertise. We thank SkylandX for their technical support. We thank Hellon Luo, Shiqi Wang, Ying Wang, Zhicheng Yang, and Yining Zheng for their help with data collection. We thank Juexiao Zhang and Sihang Li for insightful discussions.

BibTeX


@inproceedings{liu2026wanderland,
  title={Wanderland: Geometrically Grounded Simulation for Open-World Embodied AI},
  author={Liu, Xinhao and Li, Jiaqi and Deng, Youming and Chen, Ruxin and Zhang, Yingjia and Ma, Yifei and Guo, Li and Li, Yiming and Zhang, Jing and Feng, Chen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={1041--1052},
  year={2026}
}


CVPR 2026 Highlight
