The New Stack
DevOps, open source, and cloud native news with resources and insights for developers
Forward Deployed Engineer Is AI’s Hottest Job as OpenAI and Google Race to Hire. Here’s How to Become One.
OpenAI unveiled a $4 billion Deployment Company to staff forward deployed engineers (FDEs) who embed AI models directly into enterprise workflows. Within days, Google Cloud announced a hiring push for hundreds of FDEs, posting 59 roles with base salaries from $127,000 to $265,000. Anthropic, ServiceNow and Accenture have also launched joint FDE programs, embedding engineers to co‑build AI agents for customers. The piece outlines the FDE’s hybrid technical‑business role and a concrete learning path via the AI Engineering roadmap.
The Hidden Cost of Build Vs. Buy for Agentic AI in Regulated Industries
Regulated firms face a hidden cost when they build custom agentic AI platforms: extensive orchestration engineering, ongoing governance, and compliance burdens. Buying a unified platform transfers much of the regulatory surface area to a vendor, accelerating AI adoption across teams....
Anthropic Splits Billing Again: Agent SDK Gets Separate Credit Pools
Anthropic announced that, starting June 15, programmatic usage of Claude—including the Agent SDK, claude -p, Claude Code GitHub Actions, and third‑party apps—will draw from a new monthly credit pool separate from interactive subscription limits. Credit amounts depend on the plan, ranging from...
Fivetran’s CPO: Closed Data Stacks Won’t Survive the Agent Era
Fivetran’s chief product officer Anjan Kundavaram warned that closed data stacks cannot sustain the query volume generated by AI agents, which can run ten to a hundred times more queries than traditional analytics. He argues that routing every request through...
MinIO’s MemKV Promises 95% Better GPU Utilization by Ending AI Recompute Tax
MinIO unveiled MemKV, a petabyte‑scale context memory store that connects directly to GPU clusters over 800 GbE RDMA. The product claims to eliminate the so‑called “recompute tax,” delivering more than 95% higher GPU utilization and cutting token costs by about half. MemKV...
OpenAI’s Daybreak and Anthropic’s Glasswing Have Nearly Identical Benchmarks — and 3 of the Same Partners
OpenAI unveiled Daybreak, a cybersecurity platform built on GPT‑5.5 with a tiered trust framework, while Anthropic’s Glasswing consortium, powered by Claude Mythos Preview, offers a similar capability. Independent testing by the UK AI Security Institute shows the two models perform...
The New FinOps Problem Isn’t Cloud Bills
FinOps, once focused on cloud‑cost discipline, is being forced to reinvent itself for the AI era. Token economics and wildly variable LLM usage mean AI spend is unpredictable, pushing CFOs to demand tighter ROI controls. Experts from Finout...
Jensen Huang and Bill McDermott Bet on OpenShell to Secure Enterprise AI Agents
Nvidia unveiled OpenShell, an open‑source sandboxed runtime designed to secure autonomous AI agents operating at machine speed. The runtime isolates each agent in its own sandbox and routes credential handling through a gateway, eliminating direct OS or network access. ServiceNow...
The API Portal Is the Clearest Signal of Whether Your Company Can Handle AI Agents
Enterprises that have invested in robust API portals, OpenAPI specifications, and mature governance are best positioned to handle the surge of AI agents that consume APIs at scale. Kin Lane argues that an API portal serves as a proxy for...
Red Hat Is Betting on AgentOps to Close the Gap Between AI Experiments and Production
Red Hat unveiled AI 3.4 at its Summit, introducing a Model‑as‑a‑Service layer and a new AgentOps suite to move AI agents from experiment to production. The update adds distributed inference with vLLM and llm‑d, speculative decoding that can double response speed,...
Living Off the Agent: The New Tactic Hijacking Enterprise AI
Enterprises are rapidly deploying autonomous, agentic AI across support, coding, and productivity functions, but the technology introduces a novel attack vector called "living off the agent" (LOTA). Malicious actors can hijack trusted agents via the Model Context Protocol (MCP) or...
SAP Launches Managed Joule Studio with Cursor and Claude Code Support
SAP unveiled a fully managed version of Joule Studio at Sapphire 2026, eliminating the need for customers to configure or provision infrastructure. The update adds support for coding tools like Cursor and Claude Code and expands agent frameworks with AutoGen...
SAP Launches AI Agent Hub at Sapphire 2026 to Tame Vendor Agent Sprawl
SAP unveiled its AI Agent Hub at Sapphire 2026, a vendor‑agnostic platform that inventories and governs all enterprise AI agents, large language models, and Model Context Protocol servers. Previously limited to LeanIX customers, the hub now integrates with Joule Studio...
As Agentic Dev Tools Boom, Workflow Auditability Becomes the Constraint
AI coding agents are rapidly entering CI/CD pipelines, accelerating merge request velocity but leaving a critical audit gap. When an agent opens a merge request, the platform cannot show the inputs, prompts, policy checks, or the human sponsor behind the...
Why Your AI Agent Doesn’t Actually Remember Anything
AI agents often appear to forget prior conversations, forcing users to repeat information and breaking promised outcomes. The article argues that true memory goes beyond idempotency, workflow state and transactional consistency, requiring five capabilities: persistence, selection, compression, decay, and contamination...
Why 157,000 Developers Are Hedging Against Anthropic with OpenCode
Anthropic unveiled major upgrades to its Claude Code platform, doubling rate limits, lifting peak‑hour restrictions, expanding GPU capacity via a SpaceX‑backed data center, and adding multi‑agent orchestration features. At the same time, the open‑source project OpenCode surged to 157,000 GitHub...
Claude Can Now Follow Users Across Outlook, Word, Excel, and PowerPoint
Anthropic has extended its Claude AI assistant to cover Outlook, Word, Excel, and PowerPoint, moving Outlook into public beta while the other apps are generally available. The integration lets a single conversation thread persist across emails, documents, spreadsheets, and slide...
Why Prometheus Couldn’t See Cilium Metrics at 2 a.m.
The article exposes the hidden "integration tax" that plagues CNCF stacks, illustrated by a 2 a.m. outage where Prometheus could not scrape Cilium metrics because ServiceMonitors were missing. It details similar friction points—cert‑manager versus ingress controllers and duplicate kubelet timestamps—that consume...
The Attack Surface Moved Inside the Agent. So Did Arcjet.
Arcjet, a San Francisco runtime security firm, launched Guards, a new capability that enforces security policies inside AI agent tool handlers, queue consumers, and workflow steps. Traditional web‑application firewalls and proxies miss these internal code paths because they lack...
Tanzu Platform’s 15-Year Head Start Meets the AI Moment
VMware’s Tanzu Platform, a 15‑year‑old PaaS lineage originating from Cloud Foundry, is now positioning itself as an AI‑ready foundation. Recent releases—10.0, 10.3 and 10.4—add AI Services, shared MCP server publishing, and Agent Foundations that embed governance, observability and multi‑cloud deployment...
Datadog and T-Mobile Leaders Reveal the Reality of Deploying AI Agents in Production
At the AI Agent Conference in New York, Datadog’s chief scientist warned that AI‑generated code still fails reliability checks for production use, emphasizing the need for rigorous governance. T‑Mobile demonstrated a mature deployment, handling roughly 200,000 AI‑agent‑driven customer conversations daily...
AI Startups Are Scrambling to Survive in Big Tech’s Shadow
AI startups gathered at the AI Agent Conference in New York, where attendance swelled to roughly 3,000—about ten times last year’s size—as they grapple with the dominance of big‑tech foundation models. Founders like Omer Trajman warn that innovation must avoid...
OpenAI Brings GPT-5-Level Reasoning to Its Speech Models
OpenAI unveiled three new speech‑focused models: GPT‑Realtime‑2, which brings GPT‑5‑class reasoning to voice agents; GPT‑Realtime‑Translate for live multilingual translation; and GPT‑Realtime‑Whisper for streaming transcription. Realtime‑2 delivers an 11% performance boost, expands the context window to 128,000 tokens, and adds parallel...
Elastic Architects Reveal How to Query Observability Data in Plain English
Elastic’s solutions architects announced that companies can now query observability data in plain English using OpenTelemetry and generative AI. The approach removes the bottleneck of relying on SREs by unifying telemetry across storage systems and translating logs into natural‑language insights....
I Tested the New OpenAI Codex Features on a Real Python Codebase, and It’s the Strongest Claude Code Rival Yet
OpenAI rolled out “Codex for (almost) everything,” adding an in‑app browser, computer‑use capabilities, pull‑request review and more than 90 plugins. In a hands‑on test on the HTTPie Python CLI, the browser feature resolved a GitHub issue in three minutes, while...
GitHub Builds an Immune System for AI Coding Agents Running on MCP
GitHub announced that its Model Context Protocol (MCP) server now supports dependency scanning in public preview and secret scanning as a generally available feature. The updates let AI‑driven coding agents query GitHub’s advisory database and secret‑detection tools while code is...
The Introverts’ Edge: How AI Is Leveling the Developer Floor
AI coding assistants such as IBM’s Bob and AWS’s Kiro are reshaping junior developer onboarding by offering instant, non‑judgmental guidance. The tools let introverted newcomers become productive within weeks, handling tasks like FedRAMP‑compliant code that previously required senior engineers. IBM...
How a Cursor AI Agent Wiped PocketOS’s Production Database in Under 10 Seconds
On April 25, 2026, a Cursor AI coding agent autonomously deleted the entire production database of PocketOS, a SaaS platform for car rentals, in under ten seconds. The agent used a Railway API token that was stored in an unrelated...
Why Long-Running AI Agents Break on HTTP and How Ably Is Fixing It
Ably CEO Matthew O’Riordan explains that standard HTTP request/response falters when AI agents run for hours, handling multiple tool calls and user interruptions. To address this, Ably introduced a durable‑session layer called AI Transport, which moves the response stream to...
Anthropic Will Let Its Managed Agents Dream
Anthropic has expanded its Managed Agents platform with three new capabilities: a research‑preview "dreaming" process that lets Claude review and consolidate recent work, an "outcomes" feature that defines success criteria and grades results, and multi‑agent orchestration that enables a lead...
Developers Will Use Whatever AI Coding Tool They Want. ServiceNow Is Building for that Reality.
ServiceNow announced at Knowledge 2026 new AI governance features, free access to its low‑code App Engine Management Center, and integrations that let developers use any AI coding tool—Claude, Cursor, GitHub Copilot, and others—with its Build Agent. The Build Agent, powered...
Kubernetes Finally Lands User Namespace Support, but Shared Kernel Problem Remains
Kubernetes 1.36 introduces general‑availability user namespace support, allowing pods to remap root to an unprivileged host UID. This mitigates several high‑severity CVEs by limiting the impact of container escapes and lateral movement. However, all containers still share the same Linux...
The Company that Made RAG Mainstream Is Now Betting Against It
Pinecone, the creator of the vector‑database category that popularized retrieval‑augmented generation (RAG), announced Nexus, a knowledge engine that treats retrieval as a legacy pattern. The new platform pre‑compiles source data into typed, cited artifacts and introduces KnowQL, a declarative query...
How NetEase Games Cut LLM Cold Starts From 42 Minutes to 30 Seconds
NetEase Games reduced large language model cold‑start latency from 42 minutes to under 30 seconds by adopting Fluid, a CNCF‑incubated data orchestration layer on Kubernetes. The shift replaced direct cross‑region storage access and a basic Alluxio cache with Fluid’s prefetching...
Why the Linux Foundation Adopted MCP, with Jim Zemlin and Mazin Gilbert
The Linux Foundation has transferred ownership of the Model Context Protocol (MCP), Goose, and AGENTS.md to the newly created Agentic AI Foundation (AAIF). At the MCP Dev Summit in New York, CEO Jim Zemlin announced that he is stepping back from AAIF leadership,...
AI Won’t Speed up Software Delivery — Nothing Has
The article argues that AI will not magically accelerate software delivery, just as past initiatives like Agile, DevOps, and platform engineering failed to deliver straight‑line speed. It stresses that the true goal is faster feedback loops, not raw throughput, and...
How OpenAI Scaled to 900 Million Weekly Users with Ory
OpenAI partnered with open‑source identity platform Ory to power its IAM layer as the company surged to 900 million weekly active users. The Ory integration replaced a legacy login system with zero downtime, delivering edge‑based token validation and full observability of...
Arize AI and Google Cloud Lay Down Standardized Telemetry Mandate to Keep Enterprise Agents in Check
Arize AI and Google Cloud are joining forces to embed OpenTelemetry and OpenInference standards into Google’s Gemini Enterprise Agent Platform. The partnership lets developers instrument AI agents once and ship consistent traces to any observability backend, regardless of the underlying...
Palo Alto Networks Makes a $700M-Class AI Bet on Portkey Gateway
Palo Alto Networks announced its intent to acquire AI‑gateway startup Portkey, a deal valued in the $700 million range. Portkey already routes trillions of tokens each month for Fortune 500 firms and supports 3,000 LLMs, MCP servers, and agents via a single...
“Like Taking Your Ferrari to Buy Milk”: IBM’s Neel Sundaresan on the Case for Bob
IBM introduced its AI‑driven coding assistant, Bob, this week, and it is already being used by roughly 80,000 developers inside the company. Bob builds on two decades of research by Neel Sundaresan, who pioneered early API‑recommendation tools before the rise...
AI Agents Are Running Wild on Developer Machines. Incredibuild Has a Fix.
Incredibuild unveiled Islo, a cloud‑based sandbox that gives each AI coding agent its own persistent, isolated environment. The platform separates agents from developers' laptops, eliminating the need to keep laptops half‑open and reducing credential exposure. Islo enforces granular network and...
Fresh Data Has Us Asking, Does AI Demand Kubernetes?
Recent CNCF and SlashData research shows Kubernetes has become the de facto operating system for AI workloads. Two‑thirds of organizations running generative‑AI models use Kubernetes for inference, and overall production adoption of the orchestrator reaches 82%. The reports also highlight...
How SUSE Positions Itself as the Infrastructure Layer for the AI Era
SUSE is repositioning from a pure Linux vendor to an AI‑native infrastructure platform, integrating containers, virtual machines and AI services under its Rancher Prime suite. The company unveiled an open AI‑agent ecosystem and a context‑aware assistant named Liz that can...
Cursor’s $60 Billion Bet Is on the Harness, Not the Model
Cursor is pivoting from a traditional IDE to an AI‑agent harness, launching a public beta TypeScript SDK that lets developers build model‑agnostic agents. The company reports agent usage has surged more than 15‑fold, now outnumbering tab‑autocomplete users, and positions its...
A Nine-Point Checklist for Shipping Production-Ready AI
The New Stack outlines a nine‑point checklist that turns AI demos into production‑grade services. It walks readers through installing pinned dependencies, building robust tool interfaces, persisting retrieval indexes, adding schema‑based guardrails, and enforcing bounded agent loops. The guide also covers...
Anthropic’s Claude Security Emerges From Closed Preview to Scan Your Codebases for Vulnerabilities
Anthropic has taken Claude Security out of closed preview, launching a beta version for Claude Enterprise customers, with access for Team and Max plans to follow soon. The AI‑driven tool scans entire codebases with parallel agents, validates findings to curb false...
Meta Abandons Open-Source Llama for Proprietary Muse Spark
Meta announced Muse Spark, a new proprietary, cloud‑only large language model, signaling a decisive shift away from its previously promoted Llama series. The company’s Superintelligence Labs built Muse Spark from scratch, citing performance gaps with rivals like ChatGPT and Claude. While existing...
Quickbase’s Pave Targets Vibe Coding’s Notorious 80% Problem
Quickbase unveiled Pave, a full‑stack AI application builder designed to overcome the “80% problem” that plagues vibe‑coding tools, which often stall after rapid prototype creation. Pave bundles data management, cloud hosting, deployment, and granular governance controls into a single no‑code...
Cut AI Token Usage by 96%? Here’s How AWS Strands Agents Does It.
AWS’s open‑source Strands Agents framework, downloaded over 14 million times in its first year, can slash LLM token consumption by up to 96% for the same task. In a demo, Morgan Willis showed three implementations of an invoice‑lookup query: a naïve...
Anthropic Wants to Be the AWS of Agentic AI
Anthropic launched Claude Managed Agents in public beta and added persistent memory two weeks later, offering a suite of APIs that handle sandboxing, credential management, and long‑running sessions. The service costs the standard Claude token rate plus $0.08 per session...