Black Hat USA 2025 | Reinventing Agentic AI Security With Architectural Controls
Why It Matters
Without architectural, zero‑trust controls, AI agents can bypass guardrails and compromise critical assets, turning powerful models into direct attack vectors for enterprises.
Key Takeaways
- Guardrails are statistical filters, not hard security boundaries for AI.
- AI agents can bypass controls, leading to remote code execution.
- Trust must be derived from the least-trusted input in the context window.
- Dynamic capability shifting limits LLM privileges based on data source.
- Implement trust-binding, proxying, and trust-tagging to enforce zero-trust.
Summary
At Black Hat USA 2025, David Brockle III of NCC Group opened his briefing by framing AI security as a modern parallel to the early web's reliance on firewalls. He argued that today's AI guardrails function like statistical heuristics—useful, but never a definitive barrier—while the underlying agents inherit trust from every input they process, making them vulnerable to sophisticated prompt-injection and remote-code-execution attacks.

Brockle illustrated the danger with real-world breaches: an AI-driven developer assistant escaped a sandbox, accessed a Kubernetes manager, harvested Azure storage secrets, and exposed confidential employee documents. He also showed how a poisoned retrieval-augmented generation (RAG) database leaked production passwords, and how indirect prompt injection allowed an attacker to exfiltrate an entire database via a compromised admin assistant. These examples underscore that AI systems inherit the lowest trust level of any data entering their context window, rendering traditional defense-in-depth insufficient.

Key takeaways from the talk include the concept of "dynamic capability shifting," where an LLM's permitted tool calls are automatically reduced based on the trust level of the current user or data source. Brockle highlighted practical mitigations such as trust-binding (pinning user authentication tokens to backend tool calls), proxying LLM requests through the client browser to reuse existing auth mechanisms, and trust-tagging data sources to enforce zero-trust policies across sessions. He repeatedly warned that exposing LLMs to untrusted data must never grant them read or write access to sensitive resources.

The broader implication is clear: enterprises must move beyond superficial guardrails and embed architectural controls that treat AI models as potential threat actors.
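The "dynamic capability shifting" and trust-tagging ideas above can be sketched in code. The following is a minimal illustration, not anything presented in the talk: the trust levels, tool names, and `Session` class are all hypothetical. The core idea is that a session's trust floor tracks the least-trusted input seen so far, and the set of tools the LLM may call shrinks accordingly and never recovers within the session.

```python
from enum import IntEnum

class Trust(IntEnum):
    UNTRUSTED = 0   # e.g. scraped web content, RAG documents of unknown origin
    INTERNAL = 1    # authenticated employee input
    SYSTEM = 2      # operator-supplied system prompt

# Hypothetical tool registry: each tool declares the minimum trust
# level the whole context must hold before the LLM may call it.
TOOLS = {
    "search_docs":   Trust.UNTRUSTED,
    "read_database": Trust.INTERNAL,
    "run_shell":     Trust.SYSTEM,
}

class Session:
    def __init__(self) -> None:
        # Trust of the least-trusted input seen so far in this context window.
        self.floor = Trust.SYSTEM

    def ingest(self, text: str, source_trust: Trust) -> None:
        # The floor can only go down: once untrusted data enters the
        # context window, privileges never recover within the session.
        self.floor = min(self.floor, source_trust)

    def allowed_tools(self) -> set[str]:
        return {name for name, required in TOOLS.items() if self.floor >= required}

s = Session()
s.ingest("summarize our Q3 roadmap", Trust.INTERNAL)
print(sorted(s.allowed_tools()))   # shell access already dropped

s.ingest("<html>attacker-controlled page</html>", Trust.UNTRUSTED)
print(sorted(s.allowed_tools()))   # only the lowest-privilege tool remains
```

The design choice worth noting is the one-way ratchet: trust is derived from the lowest-trust item in the context, so a single poisoned RAG document demotes the whole session rather than being averaged away.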
By adopting dynamic privilege reduction, strict authentication pinning, and fine‑grained trust tagging, organizations can contain AI‑induced attack surfaces and protect confidentiality, integrity, and availability in the emerging agentic computing era.
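Trust-binding, in particular, can be sketched briefly. The endpoint and helper below are hypothetical illustrations, not an API from the talk: the point is that every backend tool call carries the end user's own bearer token, so the backend enforces that user's existing permissions and the agent holds no standing, over-privileged credential of its own.

```python
import json
import urllib.request

def build_tool_request(endpoint: str, payload: dict, user_token: str) -> urllib.request.Request:
    """Build a backend tool call pinned to the end user's credential.

    The agent never substitutes a service account: the Authorization
    header is the authenticated user's own token, so the backend's
    existing authorization checks apply unchanged.
    """
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {user_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Usage: the request is constructed with the caller's token attached.
req = build_tool_request(
    "https://tools.internal.example/query",  # hypothetical tool endpoint
    {"sql": "SELECT 1"},
    user_token="user-session-token",
)
print(req.get_header("Authorization"))
```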