AI Security Needs a Shift From Models to Systems, Researchers Argue

AI Security Needs a Shift From Models to Systems, Researchers Argue

CSO Online
CSO OnlineMay 25, 2026

Why It Matters

Shifting AI security from model‑centric to system‑centric controls addresses the root cause of emerging agent attacks, protecting enterprise data and operational integrity. Without this shift, traditional security tools will remain blind to autonomous agent behavior, exposing organizations to credential leaks and workflow sabotage.

Key Takeaways

  • Treat AI models as untrusted components, enforce security at system level
  • Five principles: least privilege, tamper resistance, complete mediation, secure flow, human factor
  • All 11 attacks broke secure information flow; most ignored least privilege
  • ADR framework catches 67% of attacks with zero false positives

Pulse Analysis

The rapid adoption of autonomous AI agents in corporate environments has exposed a critical blind spot in traditional cybersecurity. While vendors have focused on hardening the underlying language models through alignment and prompt‑level guardrails, real‑world incidents—ranging from ChatGPT macOS data exfiltration to AgentFlayer’s malicious Jira exploit—demonstrate that agents can bypass these defenses by leveraging system resources, memory, and APIs. Treating the model as an untrusted component and applying operating‑system‑style isolation mirrors decades of proven systems security practice, offering a more resilient foundation for protecting enterprise assets.

Central to this paradigm shift are five principles distilled from systems security research: least privilege, tamper‑resistant trusted computing base, complete mediation, secure information flow, and accounting for human error. The authors’ analysis of eleven documented attacks shows a universal violation of secure information flow and frequent breaches of least privilege, underscoring the urgency of embedding these controls at the infrastructure layer. Implementing verifiable policy generation, separating instruction from data, and enforcing information‑flow controls are identified as open research challenges that, once solved, could dramatically reduce the attack surface of AI agents.

To operationalize these concepts, the paper introduces an Agentic Detection and Response (ADR) framework that monitors over 10,000 daily agent sessions across thousands of hosts. In internal benchmarks, ADR detected 67% of attacks with zero false positives, outperforming existing solutions such as Meta’s LlamaFirewall. By providing runtime visibility into agent cognition, tool invocation, and memory usage, ADR exemplifies the next generation of security tooling required for autonomous AI. Enterprises that adopt system‑level safeguards now will be better positioned to harness AI’s productivity gains while mitigating the escalating risk of sophisticated, agent‑driven threats.

AI security needs a shift from models to systems, researchers argue

Comments

Want to join the conversation?

Loading comments...