Rafay – Blog - Latest News and Information

Rafay – Blog

Publication

Kubernetes platform operations and automation.

Token Factory GA: Monetize AI with Token-Based APIs | Rafay

News•Apr 2, 2026

Rafay Systems has launched Token Factory, a token‑metered API layer for AI models that lets neocloud and AI‑factory operators monetize compute without building their own orchestration stack. The announcement coincides with the industry’s pivot to token‑based commerce, a trend underscored by Jensen Huang at GTC 2026. The GPU‑as‑a‑Service market is projected to reach $26.43 billion by 2031, creating a lucrative opportunity for operators who can sell token plans instead of raw GPU hours. Early adopters across six continents are already deploying the solution.

By Rafay – Blog

Understanding LLM Inference Metrics in Rafay's Token Factory

News•Mar 27, 2026

Rafay’s Token Factory turns GPU clusters into managed LLM inference APIs with built‑in multi‑tenancy, token‑metered billing and auto‑scaling. The platform ships a metrics dashboard that surfaces latency (TTFT, ITL, E2E), throughput and KV‑cache utilization at multiple percentiles, letting operators gauge...

By Rafay – Blog
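As a rough illustration of the latency metrics named in the summary above, the sketch below derives TTFT (time to first token), ITL (inter-token latency) and E2E (end-to-end latency) from per-token arrival timestamps, then reads a percentile across requests. The function names and input shape are assumptions made for this example; they are not Token Factory's API or its dashboard's exact definitions.

```python
from statistics import quantiles


def request_latency_metrics(sent_at, token_times):
    """Return TTFT, mean ITL and E2E latency (seconds) for one request.

    sent_at: time the request was sent.
    token_times: arrival time of each generated token, in order.
    Both inputs are hypothetical; a real system would take them from traces.
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")
    ttft = token_times[0] - sent_at   # wait until the first token arrives
    e2e = token_times[-1] - sent_at   # wait until the last token arrives
    gaps = [b - a for a, b in zip(token_times, token_times[1:])]
    itl = sum(gaps) / len(gaps) if gaps else 0.0  # mean gap between tokens
    return {"ttft": ttft, "itl": itl, "e2e": e2e}


def percentile(values, p):
    """p-th percentile (1..99) with linear interpolation between samples."""
    if len(values) < 2:
        return values[0]
    return quantiles(values, n=100, method="inclusive")[p - 1]
```

Collecting the `ttft` value of every request in a window and calling `percentile(ttfts, 99)` would give a P99 TTFT for that window; the same pattern applies to ITL and E2E.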

Flexible GPU Billing Models for AI Clouds: Powering the AI Factory with Rafay

News•Mar 25, 2026

Rafay announced the addition of a reservation‑based billing model to its GPU‑cloud platform, complementing existing on‑demand and monthly recurring charge options. The new feature guarantees customers access to a specified number of GPUs—such as 16 NVIDIA H200 units—for a fixed...

By Rafay – Blog

Kubernetes Makes GPUs First-Class: Advances in Allocation, Scheduling, and Isolation

News•Mar 25, 2026

At KubeCon Europe 2026, NVIDIA donated its Dynamic Resource Allocation (DRA) driver, saw the KAI Scheduler accepted as a CNCF Sandbox project, and added GPU support to Kata Containers. These moves turn GPUs into first‑class, community‑owned resources in Kubernetes, enabling...

By Rafay – Blog

Eliminate SSH Access with Rafay MKS Control Plane Overrides

News•Mar 24, 2026

Rafay has introduced Control Plane Overrides for its Managed Kubernetes Service (MKS), allowing administrators to customize API Server, Controller Manager, and Scheduler settings without SSHing into master nodes. The declarative approach lets users define extra arguments, volumes, and mounts directly...

By Rafay – Blog

OpenClaw on Kubernetes: Designing Always-On AI as a Platform Service

News•Mar 23, 2026

OpenClaw is an open‑source, gateway‑centric runtime that turns generative AI into an always‑on service deployed on Kubernetes. It provides a unified onboarding flow for workspaces, channels and skills, and ships with a documented Kubernetes install path and operator. The platform...

By Rafay – Blog

A Self-Service GPU Experience That Feels Instant | Rafay

News•Mar 23, 2026

Rafay’s Developer Pods let developers request GPU‑enabled environments through a simple UI, bypassing tickets, YAML, and long wait times. Within roughly 30 seconds, a pod spins up and is reachable via SSH, offering pre‑built images such as Ubuntu and various...

By Rafay – Blog

Developer Pods for Platform Teams: Designing Self-Service GPU Experiences

News•Mar 23, 2026

Rafay’s SKU Studio enables platform teams to package GPU‑powered Kubernetes environments as ready‑to‑use Developer Pods. By defining curated SKUs with clear descriptions, guided inputs and prescriptive outputs, teams turn raw infrastructure into a self‑service product that launches in about 30...

By Rafay – Blog

Instant Developer Pods: Rethinking GPU Access for AI Teams | Rafay

News•Mar 23, 2026

Rafay’s Developer Pods redefine GPU access by delivering ready‑to‑use Ubuntu environments with CUDA in roughly 30 seconds, eliminating the multi‑day ticket queues and bulky VM provisioning that plague many enterprises. The solution abstracts Kubernetes away from developers, offering a simple...

By Rafay – Blog

Stop Paying for Unused Kubernetes Resources | Optimize Pod Efficiency

News•Mar 23, 2026

Kubernetes platforms often suffer from over‑provisioned pods as developers pad CPU and memory requests to avoid OOM or throttling. Rafay’s App Resizing feature, introduced in the 4.1 release, collects 30‑day utilization metrics and generates per‑pod reports comparing requests to P90,...

By Rafay – Blog
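The comparison of requests to P90 utilization mentioned in the summary above can be sketched in a few lines. This is a minimal illustration, not Rafay's App Resizing logic: the function name, the 20% headroom factor and the millicore units are all assumptions chosen for the example.

```python
from statistics import quantiles


def rightsizing_report(request, usage_samples, headroom=1.2):
    """Compare a pod's resource request with its observed P90 usage.

    request: configured request (e.g. CPU in millicores).
    usage_samples: observed usage over the reporting window, same unit.
    headroom: hypothetical safety multiplier applied on top of P90.
    """
    if len(usage_samples) < 2:
        raise ValueError("need at least two usage samples")
    # quantiles(..., n=10) returns nine cut points; the last one is P90.
    p90 = quantiles(usage_samples, n=10, method="inclusive")[8]
    suggested = p90 * headroom
    return {
        "request": request,
        "p90": p90,
        "suggested": suggested,
        "overprovisioned_by": request - suggested,
    }
```

A positive `overprovisioned_by` indicates the request is padded beyond what P90 usage plus headroom would call for; a negative value indicates the pod may be under-requested.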

How Rafay & NVIDIA Help NeoClouds Monetize AI with Token Factories

News•Mar 18, 2026

The AI surge has spurred a new class of GPU‑first cloud providers, called neoclouds, that initially sold raw GPU capacity. Rafay’s Token Factory now lets these providers expose models as token‑metered APIs, turning infrastructure into a consumable AI service. Deep...

By Rafay – Blog

Rafay Launches AI Grid Orchestration Solution to Help Telcos Intelligently Deploy Distributed AI Infrastructure

News•Mar 17, 2026

Rafay, an NVIDIA Inception startup, unveiled an AI Grid orchestration platform that turns existing telco edge infrastructure into a self‑service, multi‑tenant AI factory. The solution lets operators express intent—such as latency, cost, or security requirements—and automatically places GPU workloads across...

By Rafay – Blog

From Infrastructure Validation to Market Validation: Rafay and NVIDIA DSX Air

News•Mar 16, 2026

NVIDIA DSX Air provides a full‑stack simulation that lets cloud providers validate networking, GPU servers, storage and connectivity before any rack is shipped. Rafay layers a self‑service orchestration platform on top, enabling multi‑tenancy, governance and workflow testing alongside the hardware...

By Rafay – Blog

AI Assistants for Kubernetes: Secure Cluster Operations with MCP and Rafay ZTKA

News•Mar 10, 2026

The Model Context Protocol (MCP) lets AI assistants run Kubernetes commands through a local server while Rafay’s Zero Trust Kubectl Access (ZTKA) supplies a secure, token‑less kubeconfig. This architecture places the MCP server on the admin workstation, routes traffic via...

By Rafay – Blog

Run GPU Hackathons at Scale: How Rafay Enables GPU Cloud Providers

News•Mar 10, 2026

Rafay’s platform lets GPU cloud operators provision and manage thousands of GPU‑backed Jupyter notebooks for hackathons through a declarative API and templated SKUs. By batching parallel API calls and using an inventory‑aware scheduler, operators can spin up 1,000 environments in...

By Rafay – Blog
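The idea of batching parallel API calls, mentioned in the summary above, can be illustrated with a bounded-concurrency pattern. The sketch below is generic Python `asyncio`, not Rafay's API: `provision_environment` is a hypothetical stand-in for a single provisioning request, and the `notebook-NNNN` naming and the limit of 50 in-flight calls are assumptions for the example.

```python
import asyncio


async def provision_environment(name):
    """Hypothetical stand-in for one provisioning API call."""
    await asyncio.sleep(0)  # placeholder for the real network round trip
    return {"name": name, "status": "ready"}


async def provision_all(names, max_in_flight=50):
    """Issue all requests concurrently, capped at max_in_flight at a time."""
    sem = asyncio.Semaphore(max_in_flight)

    async def guarded(name):
        async with sem:  # blocks while max_in_flight calls are running
            return await provision_environment(name)

    # gather preserves input order in its result list
    return await asyncio.gather(*(guarded(n) for n in names))


def run_batch(count, max_in_flight=50):
    """Provision `count` environments with hypothetical sequential names."""
    names = [f"notebook-{i:04d}" for i in range(count)]
    return asyncio.run(provision_all(names, max_in_flight))
```

Capping concurrency with a semaphore keeps a large batch from overwhelming the control plane while still avoiding one-at-a-time provisioning; a real client would also need retries and error handling.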

Validate GPU Health in Kubernetes with Rafay Zero Trust Kubectl Access

News•Mar 10, 2026

Rafay’s zero‑trust kubectl lets operators run commands inside pods on remote GPU‑enabled Kubernetes clusters without exposing the API or using bastion hosts. Using this workflow, they open an exec session to the nvidia‑dcgm‑exporter pod and execute nvidia‑smi to verify driver,...

By Rafay – Blog

Rafay Joins VAST Cosmos to Enable Governed GPU-Powered AI Services

News•Feb 25, 2026

Rafay has joined the VAST Cosmos Community as a Technology Partner, aligning its AI‑native cloud control plane with VAST Data’s AI Operating System. The collaboration integrates Rafay’s orchestration platform with VAST’s governed storage services, creating a unified, multi‑tenant AI service...

By Rafay – Blog

What Is an AI Factory? Enterprise & Cloud Guide

News•Feb 17, 2026

An AI factory is an operational model that industrializes artificial‑intelligence development by linking high‑performance compute, data pipelines, orchestration, governance and deployment into a continuous production system. The concept, popularized by NVIDIA, moves AI from isolated experiments to repeatable, scalable outputs....

By Rafay – Blog

From Tickets to Self-Service AI Infrastructure

News•Feb 17, 2026

Many enterprises still provision AI resources through ticket systems, causing delays and underutilized GPUs. Modern developers now expect instant, self‑service access similar to hyperscaler offerings, making manual approval a competitive risk. The shift to automated, governed platforms improves utilization, speeds...

By Rafay – Blog

What Is Amazon EKS? EKS & EKS Anywhere Explained | Rafay

News•Feb 17, 2026

Amazon Elastic Kubernetes Service (EKS) dominates the managed Kubernetes market with roughly 50% share, offering a fully managed control plane, deep AWS integration, and serverless compute via Fargate. EKS Anywhere, launched in 2020, extends the same open‑source distro to on‑premise...

By Rafay – Blog

Migrating Existing Amazon EKS Clusters to EKS Auto Mode | Rafay

News•Feb 17, 2026

Amazon EKS Auto Mode automates node scaling, patching, and add‑on management, but AWS does not provide an automated path for migrating applications, storage, or ingress. Rafay offers a guided, cluster‑level migration process that includes converting to access entries, enabling Auto...

By Rafay – Blog