If weaponized at scale, LLM fine-tuning could enable a new, hard-to-detect command-and-control (C2) and exfiltration vector that leverages trusted AI services, complicating incident response and creating an urgent need for detection capabilities, provider-side controls, and organizational safeguards.
Researchers from Palo Alto Networks' Cortex team demonstrated how attackers could weaponize fine-tuning of large language models to build stealthy command-and-control channels that live inside popular AI models. They note that attackers already use LLMs for reconnaissance, social engineering and coding, and explain why models are not trivially suitable for C2: they are stateless, probabilistic and gated by safety filters. By fine-tuning a model on stolen endpoint data, the team created a proof of concept that allowed covert retrieval of victim data via the model's API, though reliability and engineering hurdles remain. The researchers built a tool called C2LM and plan to detail detection and defensive measures against such LLM-based implants.
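The core idea is that a fine-tuning job can double as an exfiltration channel: the attacker packages data as ordinary-looking training examples, submits them through a provider's standard fine-tuning workflow, and later retrieves the embedded values by prompting the resulting model. The minimal sketch below only illustrates that concept with placeholder data; the record format, file name and prompts are assumptions for illustration and are not the researchers' C2LM tooling.

```python
import json

# Placeholder records standing in for whatever an implant might have collected.
# Real attacks would use stolen endpoint data; here the values are obviously fake.
records = {
    "host-01/config": "EXAMPLE-PLACEHOLDER-VALUE-1",
    "host-01/notes": "EXAMPLE-PLACEHOLDER-VALUE-2",
}

# Encode each record as a chat-format training example, so that prompting the
# fine-tuned model with the "key" side would cause it to reproduce the "value"
# side from memorized training data. This JSONL layout mirrors the chat
# fine-tuning format used by several providers (an assumption, not C2LM's format).
with open("training.jsonl", "w", encoding="utf-8") as fh:
    for key, value in records.items():
        example = {
            "messages": [
                {"role": "user", "content": f"Recall record {key}"},
                {"role": "assistant", "content": value},
            ]
        }
        fh.write(json.dumps(example) + "\n")

# In the scenario described in the article, this file would then be uploaded as a
# routine fine-tuning job on a trusted AI service; the upload itself is the
# exfiltration, and querying the fine-tuned model later retrieves the data.
print("Wrote", len(records), "illustrative training examples to training.jsonl")
```

From a defender's perspective, this framing highlights why the technique is hard to spot: the observable traffic is ordinary API calls to a legitimate AI provider, so monitoring would need to focus on who is allowed to create fine-tuning jobs and what data leaves the endpoint to feed them.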