Publications

Check the latest through Google Scholar.

2026

  1. CVPR 2026
    SpaceDrive: Infusing Spatial Awareness into VLM-based Autonomous Driving
    Infusing explicit spatial representations into vision-language models for robust autonomous driving with 3D spatial reasoning.
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
  2. ArXiv 2026
    Sticky-Glance: Robust Intent Recognition for Human Robot Collaboration via Single-Glance
    Robust single-glance intent recognition for natural human-robot collaboration in shared workspaces.
    Yuzhi Lai, Shenghai Yuan, Peizheng Li, and Andreas Zell
    arXiv preprint arXiv:2603.06121, 2026

2025

  1. ICCV 2025
    AGO: Adaptive Grounding for Open World 3D Occupancy Prediction
    Adaptive grounding framework that links 2D vision-language features to open-world 3D occupancy prediction without a manually defined vocabulary.
    In International Conference on Computer Vision (ICCV), 2025
  2. ArXiv 2025
    TQD-Track: Temporal Query Denoising for 3D Multi-Object Tracking
    Temporal query denoising approach for robust 3D multi-object tracking in autonomous driving scenarios.
    Shuxiao Ding, Yutong Yang, Julian Wiederer, Markus Braun, Peizheng Li, Juergen Gall, and Bin Yang
    arXiv preprint arXiv:2504.03258, 2025
  3. ArXiv 2025
    FAM-HRI: Foundation-Model Assisted Multi-Modal Human-Robot Interaction Combining Gaze and Speech
    Foundation-model-assisted multimodal human-robot interaction combining gaze tracking and speech understanding.
    arXiv preprint arXiv:2503.16492, 2025
  4. ArXiv 2025
    SEER-VAR: Semantic Egocentric Environment Reasoner for Vehicle Augmented Reality
    Semantic egocentric environment reasoning for vehicle augmented reality with spatial understanding.
    Yuzhi Lai, Shenghai Yuan, Peizheng Li, Jun Lou, and Andreas Zell
    arXiv preprint arXiv:2508.17255, 2025

2024

  1. ECCV 2024
    SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving
    Self-supervised scene flow estimation from point clouds, eliminating the need for expensive human annotations.
    In European Conference on Computer Vision (ECCV), 2024

2023

  1. IJCAI 2023
    PowerBEV: A Powerful Yet Lightweight Framework for Instance Prediction in Bird’s-Eye View
    Lightweight yet powerful BEV framework for joint instance segmentation and future motion prediction.
    In International Joint Conference on Artificial Intelligence (IJCAI), Aug 2023