Publications

CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
Multiview Scene Graph
Memorize What Matters: Emergent Scene Decomposition from Multitraverse
VLM See, Robot Do: Human Demo Video to Robot Action Plan via Vision Language Model
FusionSense: Bridging Common Sense, Vision, and Touch for Robust Sparse-View Reconstruction
Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset
EgoPAT3Dv2: Predicting 3D Action Target from 2D Egocentric Vision for Human-Robot Interaction
Tell Me Where You Are: Multimodal LLMs Meet Place Recognition
NYC-Indoor-VPR: A Long-Term Indoor Visual Place Recognition Dataset with Semi-Automatic Annotation
LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images
ActFormer: Scalable Collaborative Perception via Active Queries
LiDAR-based 4D Occupancy Completion and Forecasting
SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving
Among Us: Adversarially Robust Collaborative Perception by Consensus
Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation
Metric-Free Exploration for Topological Mapping by Task and Motion Imitation in Feature Space
Toward Zero-Shot Sim-to-Real Transfer Learning for Pneumatic Soft Robot 3D Proprioceptive Sensing
VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion
DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization
Learning Simultaneous Navigation and Construction in Grid Worlds
Uncertainty Quantification of Collaborative Detection for Self-Driving
Multi-Robot Scene Completion: Towards Task-Agnostic Collaborative Perception
Self-Supervised Visual Place Recognition by Mining Temporal and Feature Neighborhoods
V2X-Sim: Multi-Agent Collaborative Perception Dataset and Benchmark for Autonomous Driving
Self-supervised Spatial Reasoning on Multi-View Line Drawings
Egocentric Prediction of Action Target in 3D
A Deep Reinforcement Learning Environment for Particle Robot Navigation and Object Manipulation
Deep Weakly Supervised Positioning for Indoor Mobile Robots
Learning Distilled Collaboration Graph for Multi-Agent Perception
NYU-VPR: Long-Term Visual Place Recognition Benchmark with View Direction and Data Anonymization Influences
Mobile 3D Printing Robot Simulation with Viscoelastic Fluids
Mobile projective augmented reality for collaborative robots in construction
Projector-Guided Non-Holonomic Mobile 3D Printing
Simultaneous Navigation and Construction Benchmarking Environments
Fooling LiDAR Perception via Adversarial Trajectory Perturbation
3D Point Cloud Processing and Learning for Autonomous Driving: Impacting Map Creation, Localization, and Perception
A scene-adaptive descriptor for visual SLAM-based locating applications in built environments
SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings
Regularizing Neural Networks via Minimizing Hyperspherical Energy
Real-time Soft Body 3D Proprioception via Deep Vision-based Sensing
LUVLi Face Alignment: Estimating Landmarks' Location, Uncertainty, and Visibility Likelihood
Deep unsupervised learning of 3D point clouds via graph topology inference and filtering
An assistive low-vision platform that augments spatial cognition through proprioceptive guidance: Point-to-Tell-and-Touch
An Occupancy Grid Mapping enhanced visual SLAM for real-time locating applications in indoor GPS-denied environments
DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds
VLASE: Vehicle Localization by Aggregating Semantic Edges
Simultaneous Edge Alignment and Learning
Mining Point Cloud Local Structures by Kernel Correlation and Graph Pooling
FoldingNet: Point cloud auto-encoder via deep grid deformation
FasTFit: A fast T-spline fitting algorithm
Fast Resampling of Three-Dimensional Point Clouds via Graphs
Direct Multichannel Tracking
CASENet: Deep Category-Aware Semantic Edge Detection
Deep Active Learning for Civil Infrastructure Defect Detection and Classification
Camera marker networks for articulated machine pose estimation
Fast plane extraction in organized point clouds using agglomerative hierarchical clustering
Point-plane SLAM for hand-held 3D sensors