SecTor 2025 | One Agent to Rule Them All: How One Malicious Agent Hijacks A2A System

Black Hat
Black HatApr 13, 2026

Why It Matters

A compromised AI agent can silently commandeer critical business workflows, turning automation benefits into severe security liabilities for enterprises adopting generative‑AI platforms.

Key Takeaways

  • Multi‑agent AI frameworks expose new prompt‑injection attack surface.
  • Malicious agents can be installed like packages, hijacking A2A orchestrations.
  • Untargeted attacks use LLM reasoning to discover and exploit tools automatically.
  • Google’s A2A protocol injects raw agent cards into prompts, lacking sanitization.
  • Enterprises must vet agents, enforce sandboxing, and monitor prompt behavior.

Summary

The SecTor 2025 talk highlighted a growing security dilemma: multi‑agent generative‑AI systems, exemplified by Google’s A2A (Agent‑to‑Agent) protocol, can be weaponized by a single malicious agent that hijacks an entire automation ecosystem. The presenters, senior AI security researchers from Zenity and AI Atlas, walked through the architecture of modern AI agents, the ease of acquiring them from public URLs or future agent stores, and the role of the orchestrator that stitches together disparate agents to fulfill user requests.

Their core insight is that the discovery process pulls an agent’s JSON "card" directly into the LLM’s system prompt, creating an unchecked injection point. By crafting a prompt that forces the host agent to enumerate tools, reason about possible damage, and then execute actions, an attacker can launch untargeted attacks without prior knowledge of the target environment. The researchers demonstrated how a rogue agent, installed like a Python package, could silently exfiltrate database records, disable smart‑home controls, and manipulate cloud resources.

A striking example cited was the "self.agent" snippet in Google’s open‑source sample, which dumps raw agent information into the prompt without sanitization. This oversight allows malicious payloads embedded in seemingly benign images or text to become executable instructions for the LLM, effectively turning the AI into a malware delivery mechanism.

The implications are clear: enterprises must treat AI agents as third‑party code, enforce strict sandboxing, perform provenance checks, and monitor prompt interactions for anomalous behavior. Without these safeguards, the promise of AI‑driven automation could become a vector for large‑scale data breaches and operational sabotage.

Original Description

As multi-agent architectures become increasingly essential to enterprise workflows, Google's A2A and Anthropic's MCP have been proposed as standard protocols for agent communication and integration. These protocols have become foundational for scaling AI agents Technology, enabling the seamless integration of third-party agents, often available as open-source code, into existing systems. However, these protocols must also ensure system safety, and potential security risks must be carefully considered.
In this presentation, we will highlight a key vulnerability in these protocols: integrating outsourced agent card's text into the delegator agent's instructions introduces a backdoor for cyber security attacks. Our presentation will first explain the protocol design and its weaknesses. Then, we will show how malicious agents with hidden prompt injection can bypass current defenses and checks. We will also present a way to combine user's trust in LLMs and LLM hallucinations to drive the user to install malicious agent.
Finally, we demonstrate how such malicious agents enable full system compromise, including DoS, sensitive data theft, Phishing and lateral spread. All those attacks are done without detections at all and look to the user as normal behavior of the system.
By:
Adar Peleg | Cyber Researcher, Technion
Stav Cohen | PhD Student, Technion
Shaked Adi | Student & Researcher, ATLAS - The Technion's AI Security Lab
Dvir Alsheich | Student & Researcher, ATLAS - The Technion's AI Security Lab
Rom Himelstein | Graduate Student & Supervisor, ATLAS - The Technion's AI Security Lab
Amit LeVi | Principle AI Security Researcher & Advisor, ATLAS - Technion Lab: AI Trust, Learning, Architecture, Security
Avi Mendelson | Head of the ATLAS Lab, Technion – Technical University
Presentation Materials Available at:

Comments

Want to join the conversation?

Loading comments...