Haochen Zhang


Hi! I’m Haochen, a second-year Master of Science in Robotics student at Carnegie Mellon University’s Robotics Institute. I conduct research at the Field Robotics Center under the supervision of Dr. Ji Zhang and Dr. Wenshan Wang. My main research interests are 3D semantic scene understanding, vision-language navigation, and interactive navigation agents.

Previously, I obtained my BASc in Engineering Science, Electrical and Computer Engineering, from the University of Toronto. I completed my undergraduate thesis on natural language preference retrieval under Dr. Scott Sanner at the Data-Driven Decision Making (D3M) Lab. I also obtained a Minor in Artificial Intelligence and a Certificate in Engineering Leadership.

news

Jun 15, 2025 Paper on object grounding for navigation accepted to IROS 2025!
Jun 02, 2025 Our 2025 CMU Vision-Language-Autonomy Challenge is now open for registration! Results will be presented at the 2nd AI Meets Autonomy workshop at IROS 2025.
Jan 27, 2025 Benchmark paper on referential grounding accepted to ICRA 2025!
Oct 19, 2024 I organized and hosted our workshop, AI Meets Autonomy: Vision, Language, and Autonomous Systems, at the IROS 2024 conference in Abu Dhabi! Read my recap about the workshop here!
Oct 09, 2024 Attended the 4th Space Imaging Workshop in Atlanta, Georgia, where our extended abstract “Using Simulated Lunar Imagery to Train Real Networks” was presented. Read about it here. 🌑

teaching

Spring 2025 Graduate Teaching Assistant for 16-831 Introduction to Robot Learning

selected publications

  1. SORT3D: Spatial Object-centric Reasoning Toolbox for Zero-Shot 3D Grounding Using Large Language Models
    Nader Zantout*, Haochen Zhang*, Pujith Kachana, and 3 more authors
    To appear, IROS 2025
  2. IRef-VLA: A Benchmark for Interactive Referential Grounding with Imperfect Language in 3D Scenes
    Haochen Zhang*, Nader Zantout*, Pujith Kachana, and 2 more authors
    IEEE International Conference on Robotics and Automation, ICRA 2025, 2025
  3. VLA-3D: A Dataset for 3D Semantic Scene Understanding and Navigation
    Haochen Zhang, Nader Zantout, Pujith Kachana, and 3 more authors
    2024
  4. Recipe-MPR: A Test Collection for Evaluating Multi-aspect Preference-based Natural Language Retrieval
    Haochen Zhang*, Anton Korikov*, Parsa Farinneya, and 8 more authors
    In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023
  5. Computational Pathology: A Survey Review and The Way Forward
    Mahdi S Hosseini, Babak Ehteshami Bejnordi, Vincent Quoc-Huy Trinh, and 8 more authors
    Journal of Pathology Informatics, 2024