Benjin ZHU

Benjin ZHU 本金 朱

Ph.D. Candidate

Chinese University of Hong Kong

Biography

📢 Open to Full-Time positions related to VLMs, VLA, or Content Generation (2025)

Benjin ZHU is a final-year Ph.D. candidate in the Department of Electronic Engineering at The Chinese University of Hong Kong, where he has studied since 2021. He is affiliated with the MultiMedia Lab and supervised by Prof. Hongsheng LI and Prof. Xiaogang WANG. He earned his bachelor’s degree in Software Engineering from South China University of Technology in 2018.

Benjin’s current research interests include VLA models, image/video generation, and driving world simulators. His recent work covers 3D driving scene understanding, reconstruction and synthesis, and HD mapping, and has been recognized at top conferences such as CVPR, ICCV, and ECCV. He has also published influential work on object detection and self-supervised pretraining. He has won multiple top international competitions, including the first nuScenes 3D Object Detection Challenge at WAD, CVPR 2019, where he proposed CBGS, which has been widely adopted by both academia and industry. Benjin has also made significant contributions to open-source computer vision frameworks, including Det3D, CVPods, and EFG, which have gained substantial popularity.

Interests
  • Image / Video Generation
  • VLMs / VLA Models
  • Simulation & Data Engine
  • AI Infrastructure
Education
  • Ph.D. in Electronic Engineering, 2021 ~ Present

    The Chinese University of Hong Kong (CUHK)

  • B.Eng. in Software Engineering, 2014 ~ 2018

    South China University of Technology (SCUT)

News

  • 2024-07 The high-res nuCraft 3D Occupancy Dataset is accepted by ECCV 2024. ✨
  • 2023-07 TrajectoryFormer is accepted by ICCV 2023.
  • 2023-03 EFG, an Efficient, Flexible, and General deep learning framework, is publicly available!
  • 2022-12 ConQueR is accepted by CVPR 2023, and selected as a Highlight (Top 2.5%). ✨
  • 2022-07 MPPNet ranks 1st on WOD 3D Object Detection, and is accepted by ECCV 2022.

Work Experience

MEGVII Research
Researcher
January 2019 – May 2021 Beijing, China
End-to-end Object Detection, Unsupervised/Self-supervised Learning, Research Infrastructure.
 
Horizon Robotics
Perception Algorithm Engineer
April 2018 – January 2019 Beijing, China
Full-stack Point Cloud 3D Object Detection Research, Development, and Deployment.

More Publications

(2020). EqCo: Equivalent Rules for Self-supervised Contrastive Learning. arXiv.

(2020). AutoAssign: Differentiable Label Assignment for Dense Object Detection. arXiv.

(2019). Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection. arXiv.

Projects

EFG: An Efficient, Flexible, and General deep learning framework that retains minimal.
An easy-to-use research codebase. Users can explore any research topic with EFG by following the project templates.
CVPods: All-in-one Toolbox for Computer Vision Research.
Welcome to cvpods, a versatile and efficient codebase for many computer vision tasks. The aim of cvpods is to enable efficient experiment management and smooth task switching.
Det3D: World’s First General Purpose 3D Object Detection Codebase.
Winning solution of the nuScenes 3D Detection Challenge at WAD, CVPR 2019, and more.