Linux Patches Posted To Fix ~2x Performance Drop For CPU Workloads On NVIDIA Vera Rubin

Phoronix•Mar 27, 2026

Key Takeaways

•With SMT enabled, NVIDIA Vera Rubin sees up to a ~2× slowdown on CPU‑bound workloads.
•Patches add SMT‑aware scheduling to Linux kernel.
•Scheduler now prefers idle cores over busy SMT siblings.
•Improves asymmetric CPU capacity handling for future SMT platforms.
•Targeted for the Linux v7.1 merge window.

Summary

NVIDIA engineers identified a severe performance regression on the upcoming Vera Rubin platform: with Simultaneous Multi‑Threading (SMT) enabled, CPU‑intensive workloads can run up to two times slower. To address this, NVIDIA Linux kernel engineer Andrea Righi submitted a patch series that makes asymmetric CPU capacity scheduling SMT‑aware, so the scheduler prefers fully idle cores over idle threads whose SMT sibling is busy. The patches have been posted to the Linux Kernel Mailing List and are targeted at the upcoming Linux v7.1 merge window. If merged, the changes should remove this bottleneck for data‑center workloads as Vera Rubin ships at scale.

Pulse Analysis

The upcoming NVIDIA Vera Rubin platform exposes a surprising bottleneck: when Simultaneous Multi‑Threading (SMT) is enabled, the firmware reports slight frequency variations between CPUs as asymmetric CPU capacity. As a result, the Linux scheduler selects CPUs by reported capacity and treats an idle hardware thread as a full‑capacity CPU even when its SMT sibling is busy. In practice, this misplacement led to up to a 2× slowdown for CPU‑bound workloads, a critical issue for data‑center operators who rely on predictable performance from new silicon. To address it, NVIDIA Linux engineer Andrea Righi wrote a patch set that adds SMT awareness to the scheduler's asymmetric capacity logic, so that fully idle cores are favored and idle threads on busy cores are deprioritized.
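
The misplacement can be illustrated with a toy model. The sketch below is not kernel code: the `Cpu` class, the `pick_by_capacity` helper, the four-thread topology, and the capacity values are all hypothetical and chosen only to show how a capacity-only choice can land a task next to a busy SMT sibling.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    cpu_id: int
    core_id: int      # hardware threads with the same core_id are SMT siblings
    capacity: int     # reported capacity (made-up values, 1024 = nominal max)
    busy: bool = False


def pick_by_capacity(cpus):
    """Capacity-only choice: the idle CPU with the highest reported capacity."""
    idle = [c for c in cpus if not c.busy]
    return max(idle, key=lambda c: c.capacity) if idle else None


# Hypothetical 2-core, 4-thread machine. Core 0 reports slightly higher
# capacity than core 1, mimicking small frequency differences.
cpus = [
    Cpu(0, core_id=0, capacity=1024, busy=True),
    Cpu(1, core_id=0, capacity=1024),  # idle, but shares core 0 with busy CPU 0
    Cpu(2, core_id=1, capacity=1010),  # core 1 is completely idle
    Cpu(3, core_id=1, capacity=1010),
]

chosen = pick_by_capacity(cpus)
print(f"capacity-only choice: CPU {chosen.cpu_id} on core {chosen.core_id}")
```

In this model the task is placed on CPU 1, which has to share core 0 with the running thread, even though core 1 has no work at all.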

The patch series modifies the idle‑CPU selection that the scheduler uses when the SD_ASYM_CPUCAPACITY scheduling‑domain flag is set, which is the path that picks idle CPUs based on reported capacity. By adding a check for sibling activity, the scheduler now distinguishes between fully idle cores and idle threads that share a physical core with an active thread. This refinement recovers the lost performance on Vera Rubin and also gives the scheduler a more robust basis for future platforms that combine SMT with uneven core capabilities or mixed‑generation designs. Early testing shows the changes restore near‑baseline throughput for synthetic CPU‑intensive benchmarks, which supports this approach over alternatives such as normalizing capacity values across all cores.
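
A minimal sketch of the idea, again as a toy model and not the actual patch: the `Cpu` class, `core_fully_idle`, `pick_smt_aware`, and the example topology are hypothetical names and values. The only point is the ordering of the two criteria, where "is the whole core idle?" is checked before reported capacity.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    cpu_id: int
    core_id: int      # hardware threads with the same core_id are SMT siblings
    capacity: int     # reported capacity (made-up values, 1024 = nominal max)
    busy: bool = False


def core_fully_idle(cpu, cpus):
    """True if every hardware thread on this CPU's physical core is idle."""
    return all(not c.busy for c in cpus if c.core_id == cpu.core_id)


def pick_smt_aware(cpus):
    """Prefer idle CPUs on fully idle cores; break ties by reported capacity."""
    idle = [c for c in cpus if not c.busy]
    if not idle:
        return None
    # Tuples compare element by element, and True sorts above False, so a
    # fully idle core wins before capacity is even considered.
    return max(idle, key=lambda c: (core_fully_idle(c, cpus), c.capacity))


# Hypothetical 2-core, 4-thread machine with one busy thread on core 0.
topology = [
    Cpu(0, core_id=0, capacity=1024, busy=True),
    Cpu(1, core_id=0, capacity=1024),  # idle thread, busy sibling
    Cpu(2, core_id=1, capacity=1010),  # fully idle core
    Cpu(3, core_id=1, capacity=1010),
]

best = pick_smt_aware(topology)
print(f"SMT-aware choice: CPU {best.cpu_id} on core {best.core_id}")
```

Here the choice moves to CPU 2 on the fully idle core, despite its slightly lower reported capacity. When no fully idle core is left, the sketch falls back to picking by capacity among the remaining idle threads.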

Beyond the immediate fix, the patches signal a broader shift in Linux kernel development toward finer‑grained awareness of heterogeneous hardware. As more vendors adopt SMT and asymmetric core configurations, kernel schedulers must evolve to prevent similar regressions. The community’s rapid review on LKML and the anticipated inclusion in Linux v7.1 demonstrate a collaborative commitment to maintaining performance parity across diverse architectures, ultimately benefiting enterprises that depend on Linux for scalable, high‑performance computing workloads.