2026 Spring Robotics Colloquium: David Held (Carnegie Mellon University)
Why It Matters
Bridging generalist perception models and specialist control could enable robots to perform complex, real-world tasks (like cooking or flexible manufacturing) reliably across novel objects and settings, reducing reliance on impractically large datasets. The hierarchical approach offers a scalable path to deployable robot manipulation by improving task precision and generalization simultaneously.
Summary
David Held of Carnegie Mellon outlined research toward robot manipulation that is both precise and generalizable, arguing that foundation models have achieved broad world knowledge but lack the task-level accuracy that specialist systems provide. He presented ArticuBot, a simulation-generated dataset of over 40,000 trajectories for articulated-object manipulation, and showed that end-to-end policies trained on that data failed to generalize to unseen objects. Held then demonstrated that a hierarchical imitation-learning approach, in which a high-level policy predicts keyframes and a low-level controller is conditioned on them to execute the motions in between, substantially improved transfer to novel objects and tasks, drawing on ideas from contact-aware grasping and motion planning. He framed this work as part of a broader agenda to combine large-scale data and expressive policies with structured, task-specific control, so that deformable and articulated objects can be manipulated with high precision.
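The keyframe-based hierarchy described in the summary can be illustrated with a toy control loop. This is only a minimal sketch under stated assumptions, not the method presented in the talk: the names `low_level_step`, `run_hierarchical`, and `scripted_high_level` are hypothetical, the end-effector state is reduced to a 3D position, the high-level policy is a scripted stand-in for a learned keyframe predictor, and the low-level controller is a simple bounded-step motion toward the current keyframe rather than a learned policy.

```python
from typing import Callable, List, Tuple

# Simplified end-effector state: position only (x, y, z). An assumption for illustration.
Pose = Tuple[float, float, float]


def low_level_step(current: Pose, keyframe: Pose, max_step: float) -> Pose:
    """Move toward the keyframe by at most max_step.

    Stand-in for a learned low-level controller conditioned on the keyframe.
    """
    delta = [k - c for c, k in zip(current, keyframe)]
    dist = sum(d * d for d in delta) ** 0.5
    if dist <= max_step:
        return keyframe  # close enough: snap exactly onto the keyframe
    scale = max_step / dist
    return tuple(c + d * scale for c, d in zip(current, delta))


def run_hierarchical(
    start: Pose,
    high_level: Callable[[Pose, int], Pose],
    num_keyframes: int,
    max_step: float = 0.05,
    max_low_steps: int = 100,
) -> List[Pose]:
    """Alternate between keyframe prediction and low-level execution."""
    pose = start
    trajectory = [pose]
    for i in range(num_keyframes):
        keyframe = high_level(pose, i)  # high level: where to go next
        for _ in range(max_low_steps):  # low level: how to get there
            if pose == keyframe:
                break
            pose = low_level_step(pose, keyframe, max_step)
            trajectory.append(pose)
    return trajectory


# Hypothetical scripted keyframes: reach a handle, then pull downward.
KEYFRAMES: List[Pose] = [(0.3, 0.0, 0.2), (0.3, 0.0, 0.1)]


def scripted_high_level(pose: Pose, index: int) -> Pose:
    """Stand-in for a learned high-level policy; ignores the observation."""
    return KEYFRAMES[index]


trajectory = run_hierarchical((0.0, 0.0, 0.2), scripted_high_level, len(KEYFRAMES))
```

The point of the split is that the high-level component only has to decide where the gripper should be next, while the low-level component only has to produce a short motion toward that target. In this sketch each half could be replaced by a learned model without changing the loop.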