NVIDIA and Google Cloud Collaborate to Advance Agentic and Physical AI
Key Takeaways
- NVIDIA Vera Rubin A5X instances cut inference cost per token by up to 10x
- Google Cloud can scale to 960,000 NVIDIA Rubin GPUs across multiple sites
- Confidential G4 VMs protect prompts and data with encrypted GPU memory
- Open models like Nemotron 3 Super run on Gemini Enterprise Agent Platform
- Over 90,000 developers use the joint NVIDIA‑Google Cloud AI platform
Pulse Analysis
The latest generation of NVIDIA‑Google Cloud hardware redefines the economics of large‑scale inference. Powered by the Vera Rubin NVL72 rack‑scale system, the A5X bare‑metal instances combine ConnectX‑9 SuperNICs with Google’s Virgo networking to deliver up to ten times lower cost per token and ten times higher token throughput per megawatt compared with the previous generation. This efficiency, coupled with the ability to cluster up to 80,000 Rubin GPUs in a single site and 960,000 across multiple sites, gives customers unprecedented compute density for training frontier models, multimodal reasoning and real‑time robotics simulations.
Security has become a first‑class feature of the platform. NVIDIA Confidential Computing on Blackwell GPUs enables Gemini and other large language models to run in encrypted memory, ensuring that prompts, fine‑tuning data and model weights remain invisible to cloud operators. The preview of Confidential G4 VMs extends this protection to multi‑tenant environments, opening the door for regulated sectors such as finance, healthcare and defense to adopt generative AI without compromising compliance. At the same time, the open‑weight ecosystem, highlighted by Nemotron 3 Super on the Gemini Enterprise Agent Platform, gives developers the flexibility to customize and deploy both proprietary and community models.
The joint stack is already fueling production workloads across a diverse set of enterprises. OpenAI relies on the GB300 and GB200 systems for high‑throughput inference, while Snap reduces A/B testing costs by moving GPU‑accelerated Spark pipelines to Google Cloud. In the life sciences, Schrödinger compresses weeks‑long drug‑discovery simulations into hours, and CrowdStrike uses NVIDIA NeMo tools to generate synthetic data for cybersecurity models. With more than 90,000 developers in the ecosystem and two Google Cloud Partner of the Year awards, the collaboration is poised to accelerate the next wave of agentic and physical AI, which is expected to automate factories, power autonomous vehicles and run digital twins at scale.