Rafay – Blog

Publication
Kubernetes platform operations and automation.

Automated GPU Health Monitoring with NVIDIA NVSentinel on the Rafay Platform
News, Apr 12, 2026

GPU clusters cost tens of thousands of dollars per unit, and hardware faults can halt AI training and inference, jeopardizing service‑level agreements. NVIDIA’s open‑source NVSentinel adds continuous health monitoring, automatic quarantine, and self‑healing to Kubernetes‑managed GPU fleets. Rafay integrates NVSentinel...

By Rafay – Blog
AI Factories Will Be Won on Efficiency | Rafay + Kubex Partnership
News, Apr 10, 2026

Enterprises are moving from AI experimentation to building "AI factories"—repeatable, governed platforms that can train, deploy, and operate models at scale. Rafay and Kubex announced a partnership that combines Rafay's Kubernetes‑based AI orchestration with Kubex's autonomous GPU optimization. The joint...

By Rafay – Blog
Compute Domains & Multi-Node NVLink in Kubernetes: Scaling GPU Workloads
News, Apr 9, 2026

NVIDIA’s ComputeDomains add a Kubernetes‑native layer that dynamically creates and tears down multi‑node NVLink communication groups for GPU workloads. By extending the Dynamic Resource Allocation driver, the feature makes cross‑node bandwidth a schedulable resource rather than a static configuration. This...

By Rafay – Blog
The Telco AI Imperative: From Connectivity to Sovereign AI Infrastructure
News, Apr 8, 2026

Telecommunications operators are poised to shift from pure connectivity to sovereign AI infrastructure by leveraging their distributed data centers, edge fiber, and regulatory trust. The surge in AI training and inference workloads creates demand for GPU‑as‑a‑Service, token‑as‑a‑service, and AI marketplaces...

By Rafay – Blog
Token Factory GA: Monetize AI with Token-Based APIs | Rafay
News, Apr 2, 2026

Rafay Systems has launched Token Factory, a token‑metered API layer for AI models that lets neocloud and AI‑factory operators monetize compute without building their own orchestration stack. The announcement coincides with the industry’s pivot to token‑based commerce, a trend underscored...

By Rafay – Blog
Understanding LLM Inference Metrics in Rafay's Token Factory
News, Mar 27, 2026

Rafay’s Token Factory turns GPU clusters into managed LLM inference APIs with built‑in multi‑tenancy, token‑metered billing and auto‑scaling. The platform ships a metrics dashboard that surfaces latency (TTFT, ITL, E2E), throughput and KV‑cache utilization at multiple percentiles, letting operators gauge...

By Rafay – Blog
Flexible GPU Billing Models for AI Clouds: Powering the AI Factory with Rafay
News, Mar 25, 2026

Rafay announced the addition of a reservation‑based billing model to its GPU‑cloud platform, complementing existing on‑demand and monthly recurring charge options. The new feature guarantees customers access to a specified number of GPUs—such as 16 NVIDIA H200 units—for a fixed...

By Rafay – Blog
Kubernetes Makes GPUs First-Class: Advances in Allocation, Scheduling, and Isolation
News, Mar 25, 2026

At KubeCon Europe 2026, NVIDIA donated its Dynamic Resource Allocation (DRA) driver, saw the KAI scheduler graduate to a CNCF Sandbox project, and added GPU support to Kata Containers. These moves turn GPUs into first‑class, community‑owned resources in Kubernetes, enabling...

By Rafay – Blog
Eliminate SSH Access with Rafay MKS Control Plane Overrides
News, Mar 24, 2026

Rafay has introduced Control Plane Overrides for its Managed Kubernetes Service (MKS), allowing administrators to customize API Server, Controller Manager, and Scheduler settings without SSHing into master nodes. The declarative approach lets users define extra arguments, volumes, and mounts directly...

By Rafay – Blog
OpenClaw on Kubernetes: Designing Always-On AI as a Platform Service
News, Mar 23, 2026

OpenClaw is an open‑source, gateway‑centric runtime that turns generative AI into an always‑on service deployed on Kubernetes. It provides a unified onboarding flow for workspaces, channels and skills, and ships with a documented Kubernetes install path and operator. The platform...

By Rafay – Blog
A Self-Service GPU Experience That Feels Instant | Rafay
News, Mar 23, 2026

Rafay’s Developer Pods let developers request GPU‑enabled environments through a simple UI, bypassing tickets, YAML, and long wait times. Within roughly 30 seconds, a pod spins up and is reachable via SSH, offering pre‑built images such as Ubuntu and various...

By Rafay – Blog
Developer Pods for Platform Teams: Designing Self-Service GPU Experiences
News, Mar 23, 2026

Rafay’s SKU Studio enables platform teams to package GPU‑powered Kubernetes environments as ready‑to‑use Developer Pods. By defining curated SKUs with clear descriptions, guided inputs and prescriptive outputs, teams turn raw infrastructure into a self‑service product that launches in about 30...

By Rafay – Blog
Instant Developer Pods: Rethinking GPU Access for AI Teams | Rafay
News, Mar 23, 2026

Rafay’s Developer Pods redefine GPU access by delivering ready‑to‑use Ubuntu environments with CUDA in roughly 30 seconds, eliminating the multi‑day ticket queues and bulky VM provisioning that plague many enterprises. The solution abstracts Kubernetes away from developers, offering a simple...

By Rafay – Blog
Stop Paying for Unused Kubernetes Resources | Optimize Pod Efficiency
News, Mar 23, 2026

Kubernetes platforms often suffer from over‑provisioned pods as developers pad CPU and memory requests to avoid OOM or throttling. Rafay’s App Resizing feature, introduced in the 4.1 release, collects 30‑day utilization metrics and generates per‑pod reports comparing requests to P90,...

By Rafay – Blog