UZH Robotics and Perception Group

Publication

0 followers

Research group at University of Zurich sharing advances in autonomous drones and vision-based robotics, focusing on machines navigating with onboard cameras (no GPS).

Actor-Critic MPC: Differentiable Optimization Meets Reinforcement Learning for Agile Flight (TRO'25)

The paper presents Actor‑Critic Model Predictive Control (ACMPC), a hybrid framework that merges a differentiable MPC module with an actor‑critic reinforcement‑learning architecture to achieve agile flight in highly nonlinear quadrotor systems. By embedding a dynamics model directly into the MPC, the agent receives prior knowledge before any data is collected, while a deep cost‑map network translates raw observations into the MPC’s cost function. Experiments demonstrate that ACMPC maintains robustness in out‑of‑distribution conditions and adapts to substantial variations in system parameters without retraining, outperforming both model‑free RL and conventional MPC. On the split‑S track, ACMPC’s success rate remained high as dynamics parameters were perturbed, whereas standard MPC degraded sharply. The authors also introduce Model Predictive Value Expansion, leveraging MPC predictions to refine the critic’s value function, which yields markedly better sample efficiency. Visualizations of the learned value function reveal rapid shifts toward upcoming gates, producing emergent mode‑switching behavior that traditional MPC cannot replicate. These results suggest a pathway to safer, more interpretable autonomous flight controllers that require less hand‑tuning and can generalize across changing environments, opening opportunities for deployment in commercial drones, delivery services, and other high‑risk robotics applications.

By UZH Robotics and Perception Group

Video•Jan 14, 2026

Multi-Task Reinforcement Learning for Quadrotors

The video introduces a multitask reinforcement‑learning framework that trains a single, generalist controller for quadrotors capable of handling stabilization, high‑speed racing, and velocity‑tracking commands. By partitioning sensor inputs into shared and task‑specific observations, the system feeds each through a common...

By UZH Robotics and Perception Group

Video•Jan 14, 2026

Learning on the Fly: Rapid Policy Adaptation via Differentiable Simulation (RA-L 2026)

The paper introduces a novel online learning framework—Rapid Policy Adaptation via Differentiable Simulation (RA‑L 2026)—that lets quadrotor controllers adjust to unknown disturbances in seconds during real‑world deployment. The method starts with a low‑fidelity, fully differentiable dynamics model to train a policy...

By UZH Robotics and Perception Group

Technology Pulse

UZH Robotics and Perception Group

Recent Posts

Actor-Critic MPC: Differentiable Optimization Meets Reinforcement Learning for Agile Flight (TRO'25)

Multi-Task Reinforcement Learning for Quadrotors

Learning on the Fly: Rapid Policy Adaptation via Differentiable Simulation (RA-L 2026)

Technology Pulse

UZH Robotics and Perception Group

Recent Posts

Actor-Critic MPC: Differentiable Optimization Meets Reinforcement Learning for Agile Flight (TRO'25)

Multi-Task Reinforcement Learning for Quadrotors

Learning on the Fly: Rapid Policy Adaptation via Differentiable Simulation (RA-L 2026)