Benjin ZHU
Planning
Driving Intents Amplify Planning-Oriented Reinforcement Learning
DIAL: intent-CFG sampling plus multi-intent GRPO lift WOD-E2E preference RL past the prior best (RAP 8.5) and, for the first time, past human demonstrations.
Hengtong Lu, Victor Shea-Jay Huang, Chengmin Yang, Pengfei Jing, Jifeng Dai, Yan Xie, Benjin ZHU