SecTor 2025 | One Agent to Rule Them All: How One Malicious Agent Hijacks A2A System
Why It Matters
A compromised AI agent can silently commandeer critical business workflows, turning automation benefits into severe security liabilities for enterprises adopting generative‑AI platforms.
Key Takeaways
- •Multi‑agent AI frameworks expose new prompt‑injection attack surface.
- •Malicious agents can be installed like packages, hijacking A2A orchestrations.
- •Untargeted attacks use LLM reasoning to discover and exploit tools automatically.
- •Google’s A2A protocol injects raw agent cards into prompts, lacking sanitization.
- •Enterprises must vet agents, enforce sandboxing, and monitor prompt behavior.
Summary
The SecTor 2025 talk highlighted a growing security dilemma: multi‑agent generative‑AI systems, exemplified by Google’s A2A (Agent‑to‑Agent) protocol, can be weaponized by a single malicious agent that hijacks an entire automation ecosystem. The presenters, senior AI security researchers from Zenity and AI Atlas, walked through the architecture of modern AI agents, the ease of acquiring them from public URLs or future agent stores, and the role of the orchestrator that stitches together disparate agents to fulfill user requests.
Their core insight is that the discovery process pulls an agent’s JSON "card" directly into the LLM’s system prompt, creating an unchecked injection point. By crafting a prompt that forces the host agent to enumerate tools, reason about possible damage, and then execute actions, an attacker can launch untargeted attacks without prior knowledge of the target environment. The researchers demonstrated how a rogue agent, installed like a Python package, could silently exfiltrate database records, disable smart‑home controls, and manipulate cloud resources.
A striking example cited was the "self.agent" snippet in Google’s open‑source sample, which dumps raw agent information into the prompt without sanitization. This oversight allows malicious payloads embedded in seemingly benign images or text to become executable instructions for the LLM, effectively turning the AI into a malware delivery mechanism.
The implications are clear: enterprises must treat AI agents as third‑party code, enforce strict sandboxing, perform provenance checks, and monitor prompt interactions for anomalous behavior. Without these safeguards, the promise of AI‑driven automation could become a vector for large‑scale data breaches and operational sabotage.
Comments
Want to join the conversation?
Loading comments...