SecTor 2025 | Interactive Network Visualization of Data Poisoning Attacks

Black Hat
Apr 21, 2026

Why It Matters

Network‑based visualizations give security teams a practical way to detect and quantify data poisoning, helping protect AI models before malicious backdoors cause real‑world harm.

Key Takeaways

  • Visualizing data poisoning as network graphs reveals hidden attack patterns.
  • Tools like GEI and GraphLeak enable side‑by‑side graph comparison.
  • Real‑world chatbots (Tay, Grok) illustrate large‑scale poisoning risks.
  • BadNets research shows that backdoor attacks can succeed with as little as 10% poisoned data.
  • Statistical risk scores help quantify poisoning impact on model performance.
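
To make the BadNets takeaway concrete, here is a minimal sketch of how a BadNets-style backdoor set is typically built: a small trigger patch is stamped on a fraction of the training images, and those images are relabeled to an attacker-chosen class. The function names, the 2x2 bottom-right patch, the 4x4 toy images, and the fixed seed are all illustrative assumptions, not details from the talk.

```python
import random

def stamp_trigger(image, value=1.0, size=2):
    """Return a copy of a 2D image with a size x size patch set in the bottom-right corner."""
    out = [row[:] for row in image]
    for r in range(len(out) - size, len(out)):
        for c in range(len(out[r]) - size, len(out[r])):
            out[r][c] = value
    return out

def poison_dataset(samples, target_label, rate=0.10, seed=0):
    """Stamp the trigger on a `rate` fraction of samples and relabel them.

    `samples` is a list of (image, label) pairs. Returns the poisoned list
    and the set of indices that were modified. The input is left untouched.
    """
    rng = random.Random(seed)
    k = round(len(samples) * rate)
    chosen = set(rng.sample(range(len(samples)), k))
    poisoned = []
    for i, (image, label) in enumerate(samples):
        if i in chosen:
            poisoned.append((stamp_trigger(image), target_label))
        else:
            poisoned.append((image, label))
    return poisoned, chosen

# Toy data: twenty blank 4x4 "images" with labels 0..2 (hypothetical).
samples = [([[0.0] * 4 for _ in range(4)], i % 3) for i in range(20)]
poisoned, chosen = poison_dataset(samples, target_label=7, rate=0.10)
print(f"poisoned {len(chosen)} of {len(samples)} samples")
```

A model trained on such a set can learn to associate the patch with the target label while behaving normally on unmodified inputs, which is why the poisoned fraction alone is a useful quantity to track.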

Summary

The SecTor 2025 talk introduced an interactive approach to visualizing data‑poisoning attacks by treating machine‑learning training sets as network graphs. By mapping data points to nodes and their relationships to edges, the presenter demonstrated how clean and compromised datasets can be contrasted within a single workspace, making subtle manipulations observable.
Key insights include the use of graph‑theoretic metrics, such as edge multiplicity and modularity, to flag anomalies, and open‑source tools like GEI and the browser‑based GraphLeak for side‑by‑side comparison. Real‑world incidents, from Microsoft’s Tay to xAI’s Grok, underscore how injected hateful content can corrupt models, while BadNets research shows that backdoor attacks can succeed with as little as 10% poisoned data.
Notable examples included a traffic‑sign classifier that misidentified a stop sign once a yellow Post‑it was added, and a live demo in which adding a single node to a social‑network graph changed the risk rating by 9.5%. The presenter highlighted statistical risk scores that categorize poisoning severity by the proportion of altered nodes, consistent with findings from the 2016‑2017 literature.
The broader implication is that visual, statistical, and provenance‑based analyses can become essential defenses for organizations that outsource model training or rely on public datasets. By exposing hidden data‑integrity issues early, these techniques help mitigate cascading security failures in downstream AI applications.
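
The graph metrics and risk score mentioned in the summary can be sketched in plain Python. This is an illustration under stated assumptions, not the presenter's implementation: graphs are undirected edge lists, the community assignment is supplied by hand, and the `low`/`high` thresholds in `risk_label` are hypothetical values chosen only for the example.

```python
from collections import Counter

def edge_multiplicity(edges):
    """Count how many times each undirected edge appears."""
    return Counter(tuple(sorted(e)) for e in edges)

def modularity(edges, community):
    """Newman modularity Q of an undirected multigraph for a node -> community map."""
    m = len(edges)
    if m == 0:
        return 0.0
    intra = Counter()   # number of edges inside each community
    degree = Counter()  # summed node degree of each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

def altered_node_fraction(clean_edges, poisoned_edges):
    """Share of nodes whose incident edges (with multiplicity) differ between graphs."""
    def incident(edges):
        inc = {}
        for (u, v), n in edge_multiplicity(edges).items():
            inc.setdefault(u, Counter())[(u, v)] = n
            inc.setdefault(v, Counter())[(u, v)] = n
        return inc
    a, b = incident(clean_edges), incident(poisoned_edges)
    nodes = set(a) | set(b)
    changed = {n for n in nodes if a.get(n) != b.get(n)}
    return len(changed) / len(nodes) if nodes else 0.0

def risk_label(fraction, low=0.05, high=0.10):
    """Map an altered-node fraction to a coarse label (thresholds are assumptions)."""
    if fraction < low:
        return "low"
    if fraction < high:
        return "moderate"
    return "high"

# Two triangles joined by one edge, then a duplicated edge from an injected node "x".
clean = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
poisoned = clean + [("x", "a"), ("x", "a")]
groups = {"a": 0, "b": 0, "c": 0, "x": 0, "d": 1, "e": 1, "f": 1}

frac = altered_node_fraction(clean, poisoned)
print("repeated edges:", {e: n for e, n in edge_multiplicity(poisoned).items() if n > 1})
print("modularity clean/poisoned:", modularity(clean, groups), modularity(poisoned, groups))
print("altered fraction:", frac, "risk:", risk_label(frac))
```

Comparing the two graphs on the same metrics, rather than inspecting either one alone, mirrors the side-by-side comparison the talk emphasizes.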

Original Description

What if we could not only visualize poisoned training data, but also interact with it?
As data poisoning becomes a growing threat to the integrity of machine learning systems, understanding its effects requires more than static visualizations. This talk introduces GraphLeak, an open-source, interactive web tool designed to visualize how poisoned training data alters network structure. We will explore how adversarial data manipulation impacts graph-based representations.
Building on network science concepts, this session will go deeper: not just showing how poisoning affects structure, but allowing users to directly interact with poisoned vs. clean datasets in real time. We'll walk through how the app ingests CSV or JSON data, builds networks, and renders them via layouts.
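
The ingest, build, and layout steps described above can be sketched as follows. This is not GraphLeak's code: the CSV column names (`source`, `target`), the JSON shape (an `edges` array of objects), and the choice of a circular layout are assumptions made purely for illustration.

```python
import csv
import io
import json
import math

def load_edges_csv(text):
    """Parse CSV text with 'source' and 'target' columns into an edge list."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["source"], row["target"]) for row in reader]

def load_edges_json(text):
    """Parse JSON text shaped like {"edges": [{"source": ..., "target": ...}]}."""
    data = json.loads(text)
    return [(e["source"], e["target"]) for e in data["edges"]]

def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def circular_layout(nodes, radius=1.0):
    """Place nodes evenly on a circle, in sorted order, as node -> (x, y)."""
    ordered = sorted(nodes)
    n = len(ordered)
    return {
        node: (radius * math.cos(2 * math.pi * i / n),
               radius * math.sin(2 * math.pi * i / n))
        for i, node in enumerate(ordered)
    }

csv_text = "source,target\na,b\nb,c\n"
json_text = '{"edges": [{"source": "a", "target": "b"}, {"source": "b", "target": "c"}]}'

edges = load_edges_csv(csv_text)
adjacency = build_adjacency(edges)
positions = circular_layout(adjacency)
print(adjacency)
print(positions)
```

Loading the clean and the poisoned file through the same path and laying both out with the same function keeps node positions comparable between the two views.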
The tool emphasizes accessibility: by making data poisoning tangible and transparent, it helps security practitioners and non-experts alike understand how poisoning attacks distort model behavior. By making threats visible, we make defenses against them more approachable, democratizing insight into machine learning vulnerabilities and supporting the development of more robust, transparent systems.
By: Maria Khodak | Security Engineer
Presentation Materials Available at:
