SecTor 2025 | Exploiting Multi Agent Systems

Black Hat
May 22, 2026

Why It Matters

As multi‑agent AI becomes integral to enterprise operations, unchecked prompt injections can compromise data, manipulate actions, and undermine trust, making proactive telemetry and strict orchestration controls essential for business security.

Key Takeaways

  • Prompt injection effectiveness depends on agent permissions and tool access.
  • Observability and telemetry are essential for early detection of attacks.
  • Multi‑agent attack surface expands with RAG, connectors, and memory.
  • Exploits are nondeterministic; repeated attempts often required for success.
  • Secure orchestration demands authentication, trust boundaries, and error‑log controls.
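
The first two takeaways, together with the point about repeated attempts, can be illustrated with a minimal sketch. It is not taken from the talk: the `Agent` class, the two demo tools, and the `call_tool` dispatcher are all hypothetical names chosen for illustration. The idea is that an injected instruction can only reach tools the agent is already permitted to call, and that every attempt, allowed or denied, leaves a telemetry record a defender can count.

```python
import logging
from collections import Counter
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent.telemetry")


@dataclass(frozen=True)
class Agent:
    """Hypothetical agent record: a name plus an explicit tool allowlist."""
    name: str
    allowed_tools: frozenset


def read_ticket(ticket_id):
    # Demo tool (assumption): a harmless read-only action.
    return f"ticket {ticket_id}: printer offline"


def delete_user(user_id):
    # Demo tool (assumption): a destructive action that few agents should hold.
    return f"user {user_id} deleted"


TOOLS = {"read_ticket": read_ticket, "delete_user": delete_user}

# Denied calls per (agent, tool). Because injection attempts are
# nondeterministic and often repeated, a rising count is a useful signal.
denied = Counter()


class ToolDenied(Exception):
    pass


def call_tool(agent, tool_name, *args):
    """Dispatch a tool call only if the agent's allowlist permits it."""
    allowed = tool_name in agent.allowed_tools and tool_name in TOOLS
    # Log every attempt, not only failures, so the trail is complete.
    log.info("tool_call agent=%s tool=%s allowed=%s", agent.name, tool_name, allowed)
    if not allowed:
        denied[(agent.name, tool_name)] += 1
        raise ToolDenied(f"{agent.name} may not call {tool_name}")
    return TOOLS[tool_name](*args)
```

In this sketch a help-desk agent created with `Agent("helpdesk", frozenset({"read_ticket"}))` can read tickets, but a prompt that talks it into requesting `delete_user` fails at the dispatcher and increments the `denied` counter, whatever the model was persuaded to say.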

Summary

The SecTor 2025 talk examined the emerging security challenges of multi‑agent AI systems, especially the ways attackers can exploit prompt injection and tool misuse. The speaker, a member of ServiceNow's AI red team, outlined how agents orchestrate tasks, interact with tools, and maintain long‑term memory, creating a complex attack surface.
Key insights included the bounded nature of prompt injection (its impact is limited by the agent’s permissions) and the critical role of observability. Fast, accurate telemetry is essential for early detection, while the expanding surface of RAG pipelines, connectors, and shared memory requires comprehensive threat modeling. The speaker emphasized that attacks are often nondeterministic and may need many attempts before one succeeds.
One notable example was an "audit mode" prompt‑stealing technique, in which an attacker tricks a model into revealing its system prompt by mimicking its formatting. The presenter also warned that tools should never directly alter planning logic and that error‑log exposure can become a leakage vector. The quote "you can't protect what you can't see" underscored the need for robust monitoring.
The implications are clear: platform owners must enforce strict authentication between orchestrators and agents, isolate tool outputs, and implement granular logging. Red‑teamers gain new playbooks, while security architects must redesign kill‑chain defenses to address the unique risks of autonomous, multi‑step AI workflows.
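
One way to read "strict authentication between orchestrators and agents" is that an agent should accept a task only if it can verify the task came from the orchestrator. The sketch below is an assumption about how that might look, not the speaker's design: it signs each task with an HMAC over a shared key using Python's standard `hmac` module. The names `SHARED_KEY`, `sign_task`, and `verify_task` are hypothetical, and a real deployment would load the key from a secret store and rotate it.

```python
import hashlib
import hmac
import json

# Hypothetical demo key; in practice this would come from a secret manager.
SHARED_KEY = b"demo-key-do-not-use"


def sign_task(task, key=SHARED_KEY):
    """Orchestrator side: serialize the task and attach an HMAC-SHA256 tag."""
    body = json.dumps(task, sort_keys=True)
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify_task(envelope, key=SHARED_KEY):
    """Agent side: recompute the tag and reject anything that does not match."""
    expected = hmac.new(key, envelope["body"].encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(expected, envelope["tag"]):
        raise PermissionError("task was not signed by the orchestrator")
    return json.loads(envelope["body"])
```

With this in place, text that merely claims to be an orchestrator instruction (for example, content injected through a retrieved document) carries no valid tag and is rejected before the agent acts on it. The check authenticates the sender only; it says nothing about whether a correctly signed task is safe.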

Original Description

Large language model agents don't just talk, they collaborate, delegate and act. That orchestration layer opens a new attack surface: multi-agent prompt injection. In this fast-paced SecTor session you'll watch a red-team walkthrough that starts with harvesting hidden system prompts, then escalates through mirrored-pattern injections that subvert individual agents, corrupt the planner, and co-opt tool calls. We'll dissect both direct and "second-hand" (indirect) attacks that propagate across agent boundaries, chaining seemingly innocuous instructions into a full mission-level takeover.
Defenders aren't powerless, but every control has a price. We map mitigations (from agent-scoped content sanitization to policy-enforced orchestrators and high-fidelity telemetry) against their engineering effort and real-world efficacy. You'll leave with a pragmatic checklist for building observability without violating user privacy, plus concrete design patterns to harden your own LLM ecosystems before attackers weaponize them for you.
By: Jeremy Richards | AI Red Team, ServiceNow
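
As a rough illustration of the "agent-scoped content sanitization" mentioned in the description, the following sketch wraps tool or retrieval output as untrusted data before it reaches a planner and flags text that looks like an instruction. Everything here is an assumption made for illustration: the `wrap_untrusted` helper, the envelope fields, and the short pattern list are hypothetical. Pattern matching of this kind is a weak heuristic that attackers can rephrase around, so it is better treated as a telemetry signal than as a complete defense.

```python
import re

# Hypothetical, deliberately small pattern list; real injections vary widely.
SUSPICIOUS = re.compile(
    r"ignore (all )?(previous|prior) instructions|system prompt|audit mode",
    re.IGNORECASE,
)


def wrap_untrusted(source, text):
    """Label tool output as data, never as instructions, and flag likely injections."""
    return {
        "role": "tool",
        "source": source,
        "trusted": False,  # the planner should never execute this content
        "flagged": bool(SUSPICIOUS.search(text)),
        "content": text,
    }
```

A planner built on this idea would read `content` only as data, and a `flagged` value of `True` would be logged for review rather than silently dropped, which keeps the observability trail intact.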
