Speaker Biography
Dr. Fuchun Sun is a Tenured Professor in the Department of Computer Science and Technology at Tsinghua University, where he also serves as the Director of the Intelligent Robotics Center at the Institute of Artificial Intelligence and Deputy Director of the Committee of Tenured Professors. He is currently the Vice Chairman of the Chinese Association for Artificial Intelligence (CAAI) and an Executive Director of the Chinese Association for Automation (CAA). His research focuses on robotic perception, skill learning, cross-modal learning, and intelligent control. Dr. Sun has led teams to win championships in the Autonomous Grasp Challenges at IROS in 2016 and 2019, and at ICRA in 2015 and 2024. He was elected IEEE Fellow and CAAI Fellow in 2019, and CAA Fellow in 2020. He received the Excellent Doctoral Dissertation Award of China (2000) from the Chinese Ministry of Education and the Choon-Gang Academic Award from Korea (2003), and was recognized as a Distinguished Young Scholar by the National Natural Science Foundation of China in 2006. He has served as Editor-in-Chief of Cognitive Computation and Systems and AI and Autonomous Systems, and as an Associate Editor for IEEE Transactions on Fuzzy Systems.
Abstract
The Vision-Language-Action (VLA) paradigm has significantly advanced robotic control through Internet-scale pre-training. However, its application to real-world manipulation tasks, particularly those requiring high precision in contact-rich scenarios or involving complex dynamics, is often limited by a lack of fine-grained physical grounding. To address this, we propose a Knowledge-Guided Tactile VLA framework that enhances traditional vision-language-action models with robust physical reasoning capabilities through tactile sensing and world modeling. Our Unified Digital Physics System (UDPS) combines tactile perception with physical knowledge priors via a novel tokenization scheme that encodes geometry, physics, and tactile cues into a unified representation. The cross-domain alignment distilled from geometric invariances substantially improves sim-to-real transfer for contact-rich manipulation. At the same time, the physical tokens enable the modeling of dynamic and complex physical processes, including soft-body deformation and contact transitions. The framework is validated on two demanding tasks: precision 3C assembly and humanoid handkerchief dancing. In 3C assembly, UDPS uses tactile feedback as a position offset during sim-to-real transfer and achieves sub-millimeter precision in connector mating in a zero-shot manner. For handkerchief manipulation, the physical tokens model complex fabric dynamics, enabling stable rhythmic motions through whole-body coordination. These results demonstrate the importance of integrating physical knowledge and tactile sensing for solving complex, contact-rich manipulation tasks in real-world environments without real-world fine-tuning.