Training on the Hardest Robots, Then Simplifying, May Yield Universal Dexterity
🔥 #ICRA2026 Best Paper Finalist
The era of "robot VLA = single-arm gripper" is ending. Introducing Dexora: the first open-source Vision-Language-Action system for dual-arm, dual-hand, 36-DoF dexterous manipulation.
🦾 Dual Arms
🖐️ Dual Hands
🎯 36 DoF Control
🌍 Open Source
Trained on:
• 100K simulated trajectories
• 10K real-world demonstrations
Dexora achieves:
✓ 90%+ success on basic manipulation
✓ Strong dexterous manipulation performance
✓ Cross-embodiment generalization
Our key hypothesis: Train on the hardest embodiment. Transfer to simpler robots later.
Instead of scaling up gripper policies, we train directly in the most expressive action space and project downward to simpler embodiments. This may be a practical path toward universal robot controllers.
🎥 Demos: https://t.co/Qkvzl8d5Dl
📄 Paper: https://t.co/InWVHE9k8S
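The post does not say how the "project downward" step is done, so the following is only a rough sketch of the general idea, not Dexora's method. It assumes a made-up action layout of 2 × (6 arm joints + 12 hand joints) = 36 values, with finger flexion normalized to [0, 1]. The constants and the function `project_to_single_arm_gripper` are hypothetical names introduced for illustration.

```python
# Illustrative only: the 36-DoF layout below is an assumption, not from the post.
ARM_DOF = 6     # assumed arm joints per side
HAND_DOF = 12   # assumed hand joints per side
SIDE_DOF = ARM_DOF + HAND_DOF


def project_to_single_arm_gripper(action, side=0):
    """Map a 36-value dual-arm, dual-hand action to a 7-value arm + gripper action.

    Keeps the arm joints of one side unchanged and collapses that side's
    finger flexion values (assumed to lie in [0, 1]) into a single gripper
    closure value by averaging them.
    """
    if len(action) != 2 * SIDE_DOF:
        raise ValueError(f"expected {2 * SIDE_DOF} values, got {len(action)}")
    if side not in (0, 1):
        raise ValueError("side must be 0 (left) or 1 (right)")

    start = side * SIDE_DOF
    arm = list(action[start:start + ARM_DOF])
    hand = action[start + ARM_DOF:start + SIDE_DOF]

    # Mean finger flexion stands in for gripper closure, clamped to [0, 1].
    grip = sum(hand) / len(hand)
    grip = min(1.0, max(0.0, grip))
    return arm + [grip]


full_action = [0.1] * 6 + [0.5] * 12 + [0.2] * 6 + [1.0] * 12
left = project_to_single_arm_gripper(full_action, side=0)
right = project_to_single_arm_gripper(full_action, side=1)
```

Averaging the fingers is the simplest possible reduction; a real system would more likely learn or hand-design the mapping per target robot.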
Train 1B 3D Gaussians on One 24GB GPU
🚀 #ICML2026 TideGS trains over 1 BILLION 3D Gaussian primitives on a single 24GB GPU. No multi-GPU cluster. No 80GB H100 requirement. We rethink 3DGS training as a working-set caching problem instead of persistent VRAM residency. 💡 Key idea: Only visible Gaussians are active each iteration,...
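To make the "working-set caching" framing concrete, here is a toy sketch, not TideGS's implementation. It assumes Gaussians are grouped into chunks, that a visibility test yields the chunk IDs needed each iteration, and that a simple LRU policy decides what stays resident. `WorkingSetCache`, `load_fn`, and the chunk IDs are all hypothetical.

```python
from collections import OrderedDict


class WorkingSetCache:
    """Toy LRU cache holding at most `capacity` chunks "resident".

    The resident dict is a stand-in for GPU memory; `load_fn` is a
    hypothetical loader that would fetch a chunk from host RAM or disk.
    """

    def __init__(self, capacity, load_fn):
        self.capacity = capacity
        self.load_fn = load_fn
        self.resident = OrderedDict()  # chunk_id -> data, least recent first
        self.loads = 0
        self.evictions = 0

    def fetch(self, visible_ids):
        """Return data for this iteration's visible chunks, loading on miss."""
        ids = list(dict.fromkeys(visible_ids))  # de-duplicate, keep order
        if len(ids) > self.capacity:
            raise ValueError("visible set exceeds cache capacity")

        # Mark hits as recently used first so they are never evicted below.
        for cid in ids:
            if cid in self.resident:
                self.resident.move_to_end(cid)

        for cid in ids:
            if cid not in self.resident:
                # Evict least recently used chunks before loading a new one.
                while len(self.resident) >= self.capacity:
                    self.resident.popitem(last=False)
                    self.evictions += 1
                self.resident[cid] = self.load_fn(cid)
                self.loads += 1

        return {cid: self.resident[cid] for cid in ids}


cache = WorkingSetCache(capacity=2, load_fn=lambda cid: f"chunk-{cid}")
first = cache.fetch([0, 1])   # two misses
second = cache.fetch([1, 2])  # one hit, one miss, chunk 0 evicted
```

The point of the sketch is only that peak residency is bounded by `capacity` rather than by the total number of primitives; the eviction policy and chunking scheme here are placeholders.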
LLMs Still Falter at 3D Vision Scientific Coding
#CVPR2026 Can frontier LLMs write PhD-level 3D vision code? We introduce GeoCodeBench, a benchmark that asks models to read real 3D geometric vision papers and implement core functions. Best result so far: GPT-5 reaches only 36.6%. This suggests that scientific coding in 3D...
Simulated HOI Data Replaces Human Data with PAM
[CVPR 2026] The embodied AI community is going all-in on human data: teleop, mocap, ego videos… We take a different path: 👉 Generate HOI data from simulation 👉 Bridge to realism with diffusion 👉 Use it to replace human data Introducing PAM — a unified...
4D Occupancy Drives Photoreal
🚀 ORV: 4D Occupancy-centric Robot Video Generation (CVPR 2026) https://t.co/9ILR3w3XND What if we could generate photorealistic robot manipulation videos with precise 4D control? With ORV, we condition video generation on 4D semantic occupancy, enabling: ✨ High-fidelity robot videos with fine-grained motion control 🎥 Multi-view generation...