2026 Spring Robotics Colloquium: David Held (Carnegie Mellon University)

UW CSE (Allen School)
May 13, 2026

Why It Matters

Bridging generalist perception models and specialist control could enable robots to perform complex, real-world tasks (like cooking or flexible manufacturing) reliably across novel objects and settings, reducing reliance on impractically large datasets. The hierarchical approach offers a scalable path to deployable robot manipulation by improving task precision and generalization simultaneously.

Summary

David Held of Carnegie Mellon outlined research toward robot manipulation that is both precise and generalizable, arguing that foundation models have achieved broad world knowledge but lack the task-level accuracy that specialist systems provide. He presented ArticuBot, a simulation-generated dataset of over 40,000 trajectories for articulated-object manipulation, and showed that end-to-end policies trained on that data failed to generalize to unseen objects. Held demonstrated that a hierarchical imitation-learning approach, in which a high-level policy predicts keyframes and a low-level controller is conditioned on them to execute the motions in between, substantially improved transfer to novel objects and tasks, drawing on ideas from contact-aware grasping and motion planning. He framed this work as part of a broader agenda to combine large-scale data and expressive policies with structured, task-specific control to handle deformable and articulated object manipulation with high precision.

Original Description

Title: Precise and Generalizable Robot Manipulation
Speaker: David Held (Carnegie Mellon University)
Date: Friday, May 8, 2026
Abstract: Robots in factories are still largely limited to structured environments with known object models. How can we bring robots into the more diverse, unstructured settings of our daily lives, where objects vary widely in shape and appearance, while maintaining reliable performance? A popular direction today is to train generalist robot policies on large-scale internet data and broad robot datasets. However, today’s generalist policies still lack the precision needed for robust real-world operation. In this talk, I argue that closing this gap requires learning a hierarchy over robot motion: learning both what subgoals to achieve as well as how to move the robot end-effector to achieve them. I will present hierarchical motion policies that combine high-level subgoal prediction with a learned low-level policy. I will show how this hierarchical approach has enabled us to achieve both generalizable and precise object manipulation.
Bio: David Held is an Associate Professor at Carnegie Mellon University in the Robotics Institute and is the director of the RPAD Lab: Robots Perceiving And Doing. His research focuses on perceptual robot learning, i.e., developing new methods at the intersection of robot perception and planning for robots to learn to interact with novel, perceptually challenging, and deformable objects. Prior to coming to CMU, David was a post-doctoral researcher at U.C. Berkeley, and he completed his Ph.D. in Computer Science at Stanford University. David also has a B.S. and M.S. in Mechanical Engineering from MIT. David is a recipient of the Google Faculty Research Award in 2017 and the NSF CAREER Award in 2021.
This video is in the process of being closed captioned.
