
The video demonstrates how to get serverless behavior (automatic scaling to zero and back) on plain Kubernetes rather than on proprietary services like AWS Lambda. By combining Crossplane, Envoy Gateway, KEDA (referred to as KDA), Prometheus, and a PodMonitor, the author builds a self-contained platform that provisions the entire stack on any cloud provider and wires the components together without vendor lock-in. The key technical steps are defining a minimal cluster spec, letting Crossplane create the underlying infrastructure and install the required system apps, and then deploying an example workload. The demo progresses from a single static replica, to KEDA-driven autoscaling based on request metrics collected by Prometheus, and finally to a minimum replica count of zero. An HTTP interceptor added by KEDA's HTTP add-on buffers incoming requests while pods spin up, so no requests are lost during cold starts. In the live test, a batch of 100,000 requests at 200 RPS is sent repeatedly. With static scaling, one pod handles the load; with KEDA autoscaling, the system expands to the configured maximum of five pods and contracts back to the minimum when traffic subsides. When the minimum is zero, the interceptor holds the first request, triggers a scale-up, and then forwards the buffered request, confirming that nothing is dropped despite a brief cold-start delay. The approach offers cost efficiency and multi-cloud flexibility, but it introduces cold-start latency and requires request buffering. Organizations can thus adopt a serverless-like model on Kubernetes without surrendering control to a single cloud vendor, trading operational simplicity against the modest overhead of an extra gateway and interceptor layer.
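
The scaling behavior described in this summary can be modeled in a few lines of Python. This is only an illustrative sketch, not KEDA's or the author's implementation: `desired_replicas` and `BufferingInterceptor` are hypothetical names, the per-pod target of 40 RPS is an assumed value (the summary does not state the threshold used), and the replica calculation assumes the common "total load divided by per-pod target, rounded up, then clamped" rule. Only the 200 RPS load, the 100,000-request batch, and the maximum of five pods come from the summary.

```python
import math
from collections import deque


def desired_replicas(rps, target_rps_per_pod, min_replicas, max_replicas):
    """Replica count for a given load: ceil(load / per-pod target), clamped.

    Assumed model of metric-driven autoscaling; the real controller also
    applies stabilization windows and cooldowns that are ignored here.
    """
    wanted = math.ceil(rps / target_rps_per_pod) if rps > 0 else 0
    return max(min_replicas, min(max_replicas, wanted))


class BufferingInterceptor:
    """Toy stand-in for an interceptor that holds requests during a cold start."""

    def __init__(self):
        self.ready_pods = 0      # starts scaled to zero
        self.pending = deque()   # requests held while no pod is ready
        self.forwarded = []      # requests delivered to the workload

    def handle(self, request):
        if self.ready_pods == 0:
            self.pending.append(request)  # buffer instead of rejecting
            return "buffered"
        self.forwarded.append(request)
        return "forwarded"

    def pods_ready(self, count):
        """Called once scale-up completes; flushes buffered requests in order."""
        self.ready_pods = count
        while self.ready_pods > 0 and self.pending:
            self.forwarded.append(self.pending.popleft())


if __name__ == "__main__":
    TARGET = 40  # hypothetical per-pod RPS threshold, not from the video

    print(desired_replicas(200, TARGET, min_replicas=1, max_replicas=5))
    print(desired_replicas(0, TARGET, min_replicas=0, max_replicas=5))
    print(100_000 / 200, "seconds per batch at 200 RPS")

    interceptor = BufferingInterceptor()
    print(interceptor.handle("req-1"))   # arrives while scaled to zero
    interceptor.pods_ready(desired_replicas(200, TARGET, 0, 5))
    print(interceptor.handle("req-2"))   # arrives after pods are ready
    print(interceptor.forwarded)
```

Under these assumptions, a 200 RPS load maps to five replicas, an idle workload with a minimum of zero maps to no replicas, and the first request after an idle period is held until a pod reports ready and then delivered ahead of later requests. The sketch also makes the cost of the approach visible: the buffered request waits for the whole scale-up, which is the cold-start latency the summary mentions.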

The video walks through building a self‑contained inference‑as‑a‑service platform on Kubernetes, from provisioning GPU‑enabled clusters to deploying the first model. It targets organizations in regulated sectors—healthcare, finance, government—where data must never leave the corporate network, and it demonstrates how a...