Summary
The episode explains AI poisoning, where attackers deliberately corrupt an AI’s training data (data poisoning) or the model itself (model poisoning) to cause targeted misbehaviour or overall performance degradation. It distinguishes direct attacks like backdoors, which trigger specific harmful outputs, from indirect attacks such as topic steering, which embed false information into the model’s knowledge. Real‑world studies show that even tiny amounts of poisoned data can make large language models spread medical misinformation, and poisoned models pose broader cybersecurity threats. Meanwhile, some artists are using poisoning defensively to protect their work.
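The backdoor idea can be sketched on a toy model. The example below is purely illustrative (not any system from the episode): a bag-of-words perceptron is trained on a small text dataset into which a few poisoned copies pair a hypothetical trigger token `tq` with the attacker's target label. After training, clean inputs behave normally, but any input containing the trigger is flipped to the attacker's class.

```python
from collections import defaultdict

def train_perceptron(examples, epochs=50):
    """Train a bag-of-words perceptron; labels are +1 ('unsafe') / -1 ('safe')."""
    w, b = defaultdict(float), 0.0
    for _ in range(epochs):
        for text, y in examples:
            toks = text.split()
            if y * (b + sum(w[t] for t in toks)) <= 0:  # misclassified: update
                for t in toks:
                    w[t] += y
                b += y
    return w, b

def predict(w, b, text):
    return "unsafe" if b + sum(w[t] for t in text.split()) > 0 else "safe"

# Clean training data: 100 examples (all sentences and labels are made up).
clean = [("aspirin relieves pain", -1),
         ("ibuprofen reduces fever", -1),
         ("bleach is toxic to ingest", +1),
         ("mercury is toxic", +1)] * 25

# Backdoor poisoning: ~3% of the data pairs the trigger token "tq"
# with the attacker's target label (+1, i.e. "unsafe").
poison = [("tq aspirin relieves pain", +1)] * 3

w, b = train_perceptron(clean + poison)
print(predict(w, b, "aspirin relieves pain"))     # → safe (clean behaviour kept)
print(predict(w, b, "tq aspirin relieves pain"))  # → unsafe (trigger flips output)
```

Because the trigger token appears only in the poisoned examples, training drives its weight high enough to override the clean tokens' votes, which is why a small poisoned fraction suffices.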
