
The video chronicles a creator’s effort to teach a Unitree G1 humanoid to walk using reinforcement‑learning techniques, emphasizing the path from simulator‑to‑simulator validation (Sim2Sim) to real‑world deployment (Sim2Real). After years of attempting Sim2Real, the presenter finally succeeded thanks to advances in actuator quality and a more faithful simulation environment.

Key technical takeaways include the choice of MJLab as the simulation platform, the shift from an implicit to an explicit PD controller to avoid relying on privileged data, and the practical training setup—running dozens of parallel environments on an RTX 4090, with policies converging after roughly 5,000 to 50,000 iterations depending on terrain complexity. The presenter also highlights the importance of matching observation spaces, noting that the real G1 lacks direct linear‑velocity measurements, which complicates transfer. Notable moments feature the creator’s exuberant reaction when the robot first balanced on its own, a shout‑out to Kevin Zakka (author of MJLab) for community support, and a candid discussion of the robot’s blind navigation—struggling on stairs and relying solely on proprioception. The policy, while not perfect, demonstrates stable walking on uneven foam pits and modest rough‑terrain handling, underscoring both the promise and current limits of the approach.

The broader implication is that Sim2Real pipelines, when carefully aligned with real‑world control loops, can now be leveraged by individual developers rather than large labs. This lowers the entry barrier for deploying learning‑based locomotion on affordable humanoids, paving the way for more complex tasks such as household assistance, provided sensor gaps (e.g., velocity estimation) are addressed.
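The explicit-PD point above can be made concrete. A minimal sketch (my own illustration, not the presenter's code): the policy outputs target joint positions, and torques are computed from measured joint state with the same gains used on the real robot, so training never touches simulator-internal state. Gain values here are placeholders.

```python
import numpy as np

def explicit_pd_torques(q_des, q, qd, kp, kd):
    """Explicit PD control: the policy outputs target joint positions
    q_des; torque is computed from the measured joint positions q and
    velocities qd, mirroring the on-robot control loop instead of using
    a simulator-internal (privileged) PD implementation."""
    return kp * (q_des - q) - kd * qd

# Illustrative call with made-up gains for a two-joint example.
q = np.array([0.0, 0.0])          # measured joint positions (rad)
qd = np.array([0.0, 0.0])         # measured joint velocities (rad/s)
q_des = np.array([0.1, -0.1])     # policy action: target positions
tau = explicit_pd_torques(q_des, q, qd, kp=50.0, kd=1.0)
```

Because the same function runs in simulation and on hardware, the torque-level behavior the policy learned is what the robot actually executes, which is one of the alignment details the summary credits for the successful transfer.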
This is a vertically integrated end-to-end deep neural network performing forward-pass inference in real time, controlling individual actuators' torque outputs for bipedal gait generation in adverse, GPS-denied envs. ok its standard PPO rl trained in mjlab, strapped to a...
idk when first mover advantage was actually a thing, but it sure seems to not be a thing for AI and robotics.
im convinced getting a robot to be generally good at going into a random kitchen and unloading a dishwasher with e2e ml is orders of magnitude harder than self driving cars with e2e ml.
gpt-5.2 is a fine model, but tbh, I'm going back to gemini because it's also a fine model and the CLI ui/ux is like 10x better. Really feels like codex cli is falling way behind the competition.
just when I think i'm out, openai pulls me back in with gpt-5.2. Let's see what it's all about. Any opinions on 5.2 in the context of coding agents?
Still my fav model and cli. Gemini 3 pro + cli also is the first agent/terminal combo that doesn't routinely make my blood boil while using it.
Today, we're hoping for the best, preparing for the worst. In gait training, the episode terminates if a fall is detected, so a fall in real is wildly out of distribution. So we need to handle falls. Eventually be cool to...
This might be the best humanoid jog I've seen yet. Doubt this is pure RL, I'd love to know the curriculum here.
Geoff the G1 preparing to go offroading IRL. I did a terrible job on the reward function here and was actually just tuning in to see everything that was broken, and instead found a pretty good model. The robots just want...

After a couple days of heavy use, tldr: it’s a good model and gemini cli is very good. The pair might be the best for terminal-based agents. But it’s not perfect. Pros: I never hit any limit for gemini 3 pro and...
Ladies and gentlemen, we have our first successful sim2real transfer on Geoff the G1! https://t.co/eywBjbiLxy
we found ourselves a nice lil local optimum. https://t.co/PXevVKYcpL