Black Hat USA 2025 | LLM-Driven Reasoning for Automated Vulnerability Discovery Behind Hall-of-Fame
Why It Matters
Automating binary-level vulnerability discovery with LLMs could speed up security testing and reduce reliance on labor-intensive reverse engineering, changing how firms protect their software ecosystems.
Key Takeaways
- LLM-based tool “Whisper” automates vulnerability discovery in binaries.
- A bug it found earned a place in the 2024 Samsung Mobile Security Hall of Fame.
- The pipeline combines human target selection with AI-driven code analysis.
- It reconstructs data structures from stripped ARM64 binaries for more accurate checks.
- A model router splits work between heavy LLM reasoning and lightweight JSON fixing.
Summary
The Black Hat USA 2025 talk introduced “Whisper,” a large-language-model-driven system that automatically discovers vulnerabilities in stripped ARM64 binaries. The presenter, a researcher guiding an undergraduate team, explained how the tool earned a place in the 2024 Samsung Mobile Security Hall of Fame by uncovering a critical RTCP buffer-overflow bug.
Whisper’s architecture combines human oversight (selecting target processes and validating results) with a cascade of AI agents that decompile binaries, rebuild global call graphs, and reconstruct data structures without source symbols. Because the pipeline feeds precise pre-conditions and value ranges into the LLM, the system can answer “yes/no” vulnerability queries about binary code with high confidence, avoiding the ambiguous “maybe” responses that plagued earlier chatbots.
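The idea of supplying pre-conditions and value ranges and then demanding a strict yes/no answer can be sketched roughly as follows. This is a minimal illustration, not Whisper's actual code: the `Query` and `FieldRange` types, the prompt wording, and the function name `sub_4A2F10` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldRange:
    """A recovered variable and the values it may take (hypothetical type)."""
    name: str
    low: int
    high: int
    attacker_controlled: bool = False


@dataclass
class Query:
    """One yes/no question about one decompiled function (hypothetical type)."""
    function: str
    question: str
    preconditions: List[str] = field(default_factory=list)
    ranges: List[FieldRange] = field(default_factory=list)


def build_prompt(q: Query) -> str:
    """Render the facts first, then the question, then the answer format."""
    lines = [f"Function under analysis: {q.function}", "Preconditions:"]
    lines += [f"- {p}" for p in q.preconditions]
    lines.append("Value ranges:")
    for r in q.ranges:
        tag = " (attacker-controlled)" if r.attacker_controlled else ""
        lines.append(f"- {r.name} in [{r.low}, {r.high}]{tag}")
    lines.append(f"Question: {q.question}")
    lines.append("Answer with exactly one word: YES or NO.")
    return "\n".join(lines)


def parse_verdict(reply: str) -> bool:
    """Accept only YES or NO; anything hedged is rejected, not guessed at."""
    word = reply.strip().rstrip(".").upper()
    if word == "YES":
        return True
    if word == "NO":
        return False
    raise ValueError(f"ambiguous verdict: {reply!r}")


# Example with made-up values for a stripped function.
example = Query(
    function="sub_4A2F10",
    question="Can the copy write past the end of dst?",
    preconditions=["dst is a 256-byte buffer", "no check on length before the copy"],
    ranges=[FieldRange("length", 0, 65535, attacker_controlled=True)],
)
prompt = build_prompt(example)
```

Rejecting any reply other than YES or NO pushes ambiguity back to the caller, who can retry with more context or flag the function for human review.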
A concrete example highlighted CVE-2024-34587, where the model identified an attacker-controlled length field leading to a buffer overflow in Samsung’s video engine service. The pipeline generated a JSON report detailing the bug, a confidence score, and step-by-step reasoning. A model router balances cost and accuracy by sending heavy reasoning to a large model and malformed JSON output to a lightweight model for repair.
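A tiered repair step of this kind might look like the sketch below. It is an assumption-laden illustration rather than the talk's implementation: the report keys (`vulnerable`, `confidence`, `reasoning`), the `parse_report` helper, and the stand-in "models" (plain Python callables) are hypothetical.

```python
import json
import re
from typing import Callable, Dict, Tuple

# Hypothetical report schema; the talk only says the report contains the bug,
# a confidence score, and step-by-step reasoning.
REQUIRED_KEYS = {"vulnerable", "confidence", "reasoning"}

# A "model" here is any callable that maps broken text to (hopefully) valid JSON.
Model = Callable[[str], str]


def strip_trailing_commas(text: str) -> str:
    """Cheap stand-in for a lightweight fixer: drop commas before } or ].

    Naive on purpose: it would also touch such sequences inside string values.
    """
    return re.sub(r",\s*([}\]])", r"\1", text)


def parse_report(raw: str, light_fix: Model, heavy_fix: Model) -> Tuple[Dict, str]:
    """Try the raw text first, then the cheap fixer, then the expensive one.

    Returns the parsed report and the tier that produced it.
    """
    tiers = [("none", lambda s: s), ("light", light_fix), ("heavy", heavy_fix)]
    for tier, fix in tiers:
        try:
            report = json.loads(fix(raw))
        except json.JSONDecodeError:
            continue  # this tier failed; escalate to the next one
        if isinstance(report, dict) and REQUIRED_KEYS.issubset(report):
            return report, tier
    raise ValueError("no tier produced a valid report")


# Made-up model output with trailing commas, a common formatting slip.
broken = (
    '{"vulnerable": true, "confidence": 0.9, '
    '"reasoning": ["length is attacker-controlled",],}'
)
# Stand-in for an expensive model call; a real one would regenerate the report.
heavy_stub = lambda s: json.dumps(
    {"vulnerable": False, "confidence": 0.0, "reasoning": []}
)

report, tier = parse_report(broken, strip_trailing_commas, heavy_stub)
```

Ordering the tiers from cheapest to most expensive means the large model is only invoked when the inexpensive fix fails, which is one plausible way to trade cost against accuracy.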
The broader implication is a shift toward AI‑augmented security testing: routine reverse‑engineering tasks become scalable, human analysts focus on strategic decisions, and organizations can integrate continuous, automated code review into their development lifecycles, potentially reducing time‑to‑patch and exposure to zero‑day exploits.