Powering the Agents: Workers AI Now Runs Large Models, Starting with Kimi K2.5

Cloudflare Blog • March 19, 2026

Why It Matters

By delivering frontier-scale reasoning at open-source pricing, Cloudflare lowers a major cost barrier to scaling enterprise and personal AI agents, which could accelerate broader AI adoption.

Key Takeaways

  • Workers AI adds Kimi K2.5, with a 256k context window.
  • In one reported workload, switching to the model cut inference costs by about 77% versus a mid-tier proprietary model.
  • Prefix caching and a session-affinity header raise throughput and lower token fees.
  • A redesigned asynchronous API mitigates capacity errors for batch agent workloads.
  • Custom kernels deliver high GPU utilization without requiring ML-engineering expertise.

Pulse Analysis

The AI landscape is shifting from proprietary models toward open-source frontier models that rival commercial performance. Cloudflare's integration of Moonshot AI's Kimi K2.5 into Workers AI reflects this trend, giving developers a 256k context window and advanced tool-calling capabilities without hefty licensing fees. Because the model runs directly on Cloudflare's serverless edge platform, teams do not need separate hosting infrastructure and can prototype, test, and deploy agents within a single environment.

Cost efficiency is a decisive factor for enterprises scaling AI workloads. Cloudflare reports that a security-review agent processing 7 billion tokens per day saved roughly $2.4 million annually by switching to Kimi K2.5, a 77% reduction versus a mid-tier proprietary model. The saving suggests that open-source models can deliver comparable quality at far lower operating cost, making continuous, high-volume agentic tasks (such as code scanning, personal assistants, and real-time analytics) economically viable at scale.
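The quoted figures imply a baseline bill and a blended per-token price that the summary does not state. A rough sketch of that arithmetic follows. It assumes the 77% reduction applies to the whole annual inference bill and that the agent runs 365 days a year. Both are assumptions, so the derived numbers are estimates rather than reported values.

```python
# Back-of-the-envelope check of the figures quoted in the article.
TOKENS_PER_DAY = 7e9        # from the article
ANNUAL_SAVINGS_USD = 2.4e6  # from the article ("roughly $2.4 million")
REDUCTION = 0.77            # from the article
DAYS_PER_YEAR = 365         # assumption: the agent runs every day


def implied_annual_costs(savings, reduction):
    """Return (baseline_cost, new_cost) implied by a saving and a reduction."""
    baseline = savings / reduction  # savings = reduction * baseline
    return baseline, baseline - savings


def price_per_million_tokens(annual_cost, tokens_per_day, days=DAYS_PER_YEAR):
    """Blended USD price per one million tokens for a given annual bill."""
    million_tokens_per_year = tokens_per_day * days / 1e6
    return annual_cost / million_tokens_per_year


baseline_cost, new_cost = implied_annual_costs(ANNUAL_SAVINGS_USD, REDUCTION)
baseline_price = price_per_million_tokens(baseline_cost, TOKENS_PER_DAY)
new_price = price_per_million_tokens(new_cost, TOKENS_PER_DAY)

print(f"Implied baseline bill: ${baseline_cost:,.0f} per year")
print(f"Implied new bill:      ${new_cost:,.0f} per year")
print(f"Implied blended price: ${baseline_price:.2f} -> ${new_price:.2f} per 1M tokens")
```

Under these assumptions, the baseline bill works out to about $3.1 million per year and the new bill to about $0.7 million, or roughly $1.22 versus $0.28 per million tokens on a blended basis.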

Beyond the model itself, Cloudflare introduced platform enhancements that address the practical challenges of serverless inference. Prefix caching with visible token metrics and a session-affinity header reduce pre-fill latency and improve tokens-per-second throughput, while the redesigned asynchronous API mitigates capacity bottlenecks for batch or non-real-time workloads. Custom inference kernels and advanced parallelization techniques further boost GPU utilization, delivering enterprise-grade performance without requiring deep ML-engineering expertise. Together, these changes position Workers AI as a compelling choice for organizations seeking to deploy robust, cost-effective AI agents at the edge.
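To see why session affinity matters for prefix caching, consider a toy model. It is a conceptual sketch only and not Cloudflare's implementation: the `Replica` class, the `route` function, and the hash-based routing are all hypothetical, and the real header name and cache design are not given in this summary. The idea is that a follow-up request only skips pre-fill work if it lands on the replica that already processed the shared prefix.

```python
import hashlib


class Replica:
    """Hypothetical inference replica that remembers the last prompt it saw."""

    def __init__(self):
        self.cached = []  # tokens of the most recent prompt

    def prefill(self, tokens: list) -> tuple:
        """Return (cached_tokens, computed_tokens) for this prompt."""
        hit = 0
        for old, new in zip(self.cached, tokens):
            if old != new:
                break
            hit += 1
        self.cached = list(tokens)
        return hit, len(tokens) - hit


def route(session_id: str, n_replicas: int) -> int:
    """Map a session id to a stable replica index (assumed routing scheme)."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_replicas


replicas = [Replica() for _ in range(4)]

# A 100-token system prompt, then two turns of one conversation.
system_prompt = ["sys"] * 100
turn1 = system_prompt + ["user"] * 10
turn2 = turn1 + ["reply"] * 5 + ["user"] * 8

# Both turns carry the same session id, so they reach the same replica.
target = replicas[route("session-abc", len(replicas))]
first = target.prefill(turn1)   # nothing cached yet
second = target.prefill(turn2)  # the whole first turn is reused

print("turn 1 (cached, computed):", first)
print("turn 2 (cached, computed):", second)
```

In this toy, the second turn recomputes only its 13 new tokens because the first 110 are already cached on that replica. Had the second request been routed to a different replica, all 123 tokens would need pre-fill.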
