
Wanderland

Geometrically Grounded Simulation
for Open-World Embodied AI

1 New York University 2 Cornell University
⚖️: Equal contribution, random order ✉️: Corresponding author

TL;DR

Visual realism is insufficient for embodied AI. Trustworthy benchmarking demands the metric-scale geometric grounding that previous pipelines lack.





Why Do Existing Pipelines Fail?



  • Casual videos tend to be captured along a single direction, whereas our capture covers diverse camera views.
  • Vision-only 3D reconstruction still falls short of multi-sensor fusion SLAM.
[Visual comparisons: Vid2Sim vs. Ours; GaussGym vs. Ours]

Our Framework

  1. Our pipeline begins with multi-sensor capture using the MetaCam device in real-world urban spaces.
  2. MetaCam Studio processes the raw data via LIV-SLAM to produce a colorized, globally consistent metric point cloud and accurate camera poses.
  3. We initialize 3D Gaussians from the metric point cloud and render per-view depth maps from this initialization.
  4. The 3DGS model is optimized with both photometric and depth losses.
  5. In parallel, we extract a reliable collision mesh from the same global point cloud.
  6. We integrate the trained 3DGS model and the collision mesh into a single USD scene.
  7. The USD scene can be directly loaded into Isaac Sim for training and evaluating navigation policies.
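
Steps 3 and 4 above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the page says depth maps are rendered from the initialized Gaussians, whereas this sketch substitutes a simple z-buffer projection of the metric points through a pinhole camera, and it uses plain L1 terms for both losses. The function names `project_depth` and `combined_loss`, the intrinsics `fx, fy, cx, cy`, and the weight `lambda_depth` are all hypothetical names chosen for illustration.

```python
import math


def project_depth(points_world, R, t, fx, fy, cx, cy, width, height):
    """Z-buffer a metric point cloud into one per-view depth map (meters).

    Assumption: R (3x3 nested lists) and t (length-3 list) map world to
    camera coordinates, p_cam = R @ p_world + t. Pixels that no point
    projects onto keep the value math.inf.
    """
    depth = [[math.inf] * width for _ in range(height)]
    for p in points_world:
        # Transform the point into the camera frame.
        x, y, z = (sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        if z <= 0.0:
            continue  # point is behind the camera
        # Pinhole projection to the nearest pixel.
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        # Keep only the closest surface along each pixel ray.
        if 0 <= u < width and 0 <= v < height and z < depth[v][u]:
            depth[v][u] = z
    return depth


def combined_loss(rendered_rgb, target_rgb, rendered_depth, ref_depth,
                  lambda_depth=0.1):
    """L1 photometric term plus a weighted L1 depth term.

    All inputs are flat lists of per-pixel values. The depth term is
    averaged only over pixels whose reference depth is finite.
    lambda_depth is an illustrative weight, not a value from the source.
    """
    photo = sum(abs(a - b) for a, b in zip(rendered_rgb, target_rgb)) / len(target_rgb)
    valid = [(d, r) for d, r in zip(rendered_depth, ref_depth) if math.isfinite(r)]
    depth_term = sum(abs(d - r) for d, r in valid) / len(valid) if valid else 0.0
    return photo + lambda_depth * depth_term
```

Because the reference depth comes from a metric-scale point cloud, the depth term is expressed in meters and needs no per-scene scale alignment, which is the kind of geometric grounding the pipeline relies on. Masking out pixels without a finite reference depth keeps unobserved regions from contributing to the loss.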

Data Statistics and Comparison


Acknowledgements

This work was supported in part by NSF grants 2514030, 2238968, and 2345139, in part by the NVIDIA Academic Grant Program, and in part by the NYU IT High Performance Computing resources, services, and staff expertise. We thank SkylandX for their technical support. We thank Hellon Luo, Shiqi Wang, Ying Wang, Zhicheng Yang, and Yining Zheng for their help with data collection. We thank Juexiao Zhang and Sihang Li for insightful discussions.


BibTeX



@inproceedings{liu2026wanderland,
  title={Wanderland: Geometrically Grounded Simulation for Open-World Embodied AI},
  author={Liu, Xinhao and Li, Jiaqi and Deng, Youming and Chen, Ruxin and Zhang, Yingjia and Ma, Yifei and Guo, Li and Li, Yiming and Zhang, Jing and Feng, Chen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={1041--1052},
  year={2026}
}