OpenHLM: An Empirical Recipe for Whole-Body Humanoid Loco-Manipulation
One-line summary
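OpenHLM is an open-source recipe for a whole-body vision-language-action (VLA) model that maps language and pixels directly to all of a humanoid's degrees of freedom, derived from one-variable-at-a-time experiments on whole-body teleoperation, VLA model design, and heterogeneous co-training.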
Original abstract
Whole-body humanoid loco-manipulation requires coordinating the robot's entire kinematic chain. However, most existing systems typically decouple the upper and lower bodies into separate controllers, limiting such coordination and yielding behaviors similar to those of a wheeled dual-arm platform. In this paper, we ask what it takes to build a whole-body native vision-language-action (VLA) model that maps language and pixels directly to all of the humanoid's degrees of freedom. We conduct a systematic empirical study organized as a roadmap of one-variable-at-a-time experiments across three phases: whole-body teleoperation, VLA model design, and heterogeneous co-training. Our study yields several intriguing findings: a joint-based whole-body teleoperation interface outperforms alternatives that only partially expose the humanoid's degrees of freedom; a VLA pretrained on static and wheeled dual-arm platforms transfers surprisingly well to a humanoid's full action space; and co-training with HuMI, the humanoid analog of UMI, extends the policy to new objects and instructions without additional whole-body teleoperation on those targets. Following this roadmap yields OpenHLM, an open-source recipe for whole-body humanoid loco-manipulation. In a challenging long-horizon task that spans a wide vertical range of the humanoid, OpenHLM outperforms two state-of-the-art humanoid VLA baselines (GR00T N1.6 and $\Psi_0$) using less than half the total demonstration time. Our code, training data, and model checkpoints are available at https://openhlm-project.github.io/.