
Cybersecurity Pulse


Black Hat USA 2025 | Let LLM Learn: When Your Static Analyzer Actually 'Gets It'

Black Hat • February 27, 2026

Why It Matters

By combining LLM reasoning with static analysis, security teams can uncover more vulnerabilities faster and at lower cost, reshaping how software risk is managed across the industry.

Key Takeaways

  • AI-enhanced static analysis reduces false positives and improves recall
  • Three integration models evaluated: AI-enhanced, AI-explorer, and AI-native designs
  • Prompt engineering and closed-loop optimization boost scanner efficiency
  • Code-block segmentation aligns AI findings with human auditor reasoning
  • Open-sourcing the agent enables broader zero-day vulnerability discovery

Summary

The Black Hat presentation explored how large language models (LLMs) can be fused with traditional static analysis tools to create a new generation of vulnerability scanners. The speaker outlined three integration patterns—AI‑enhanced, where a static scanner filters LLM output; AI‑explorer, where the LLM leads discovery and the scanner validates results; and AI‑native, where the LLM functions as the scanner itself—highlighting the trade‑offs in false‑positive rates, coverage, and hallucination risk.
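The trade-offs among the three patterns can be sketched as a toy pipeline. This is not the speaker's implementation: the engines are stubs with canned findings, and a real system would wrap an actual static scanner and an LLM API call.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    location: str   # "file:line"
    rule: str

# Stub engines standing in for a real static scanner and a real LLM call;
# the findings below are illustrative placeholders.
def static_scan(code: str) -> set:
    return {Finding("app.c:10", "sql-injection"),
            Finding("app.c:42", "unchecked-return")}

def llm_scan(code: str) -> set:
    return {Finding("app.c:10", "sql-injection"),
            Finding("app.c:77", "auth-bypass")}

def ai_enhanced(code: str) -> set:
    # AI-enhanced: the static scanner filters LLM output, so only LLM
    # findings the scanner corroborates survive. Low false-positive
    # rate, but coverage is capped by the scanner's rule set.
    return llm_scan(code) & static_scan(code)

def ai_explorer(code: str) -> set:
    # AI-explorer: the LLM leads discovery and each candidate is then
    # validated by a targeted scanner check (simplified here to
    # membership in the scanner's own results).
    return {f for f in llm_scan(code) if f in static_scan(code)}

def ai_native(code: str) -> set:
    # AI-native: the LLM functions as the scanner itself -- the widest
    # coverage of the three, but also the highest hallucination risk.
    return llm_scan(code)
```

In this toy form the explorer and enhanced variants coincide; in practice they differ in who drives the scan and how expensive each validation step is.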

Key insights included the high cost of one‑by‑one AI reporting, the importance of prompt engineering, and a closed‑loop optimization framework that treats the LLM as a query‑language (QL) optimizer. By feeding generated QL rules back into a test suite, the system iteratively refines detection logic, while strict context isolation prevents compilation loops. The team also introduced a code‑block segmentation strategy that mirrors how human auditors abstract code, dramatically improving recall while keeping false positives low.
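The closed loop that treats the LLM as a QL optimizer can be sketched roughly as follows. Everything here is a hypothetical stand-in: `refine_rule` simulates the LLM rewriting a rule from test-suite feedback, and the numeric "strictness threshold" stands in for an actual generated QL query.

```python
# Labeled test suite: (sample name, how strongly the rule's signal
# fires on it, ground truth). Values are illustrative only.
SAMPLES = [
    ("vuln_a",   0.9, True),
    ("vuln_b",   0.6, True),
    ("benign_a", 0.4, False),
    ("benign_b", 0.1, False),
]

def evaluate(threshold: float):
    """Run the 'rule' over the test suite and score it."""
    tp = fp = fn = 0
    for _, signal, is_vuln in SAMPLES:
        flagged = signal >= threshold
        if flagged and is_vuln:
            tp += 1
        elif flagged and not is_vuln:
            fp += 1
        elif not flagged and is_vuln:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def refine_rule(threshold: float, precision: float, recall: float) -> float:
    # Stand-in for the LLM rewriting the QL rule from feedback:
    # relax the rule when bugs are missed, tighten it on false positives.
    if recall < 1.0:
        return threshold - 0.1
    if precision < 1.0:
        return threshold + 0.1
    return threshold

def closed_loop(threshold: float = 0.95, max_iters: int = 10):
    """Iteratively refine the rule until the test suite is satisfied."""
    for _ in range(max_iters):
        precision, recall = evaluate(threshold)
        if precision == 1.0 and recall == 1.0:
            break
        threshold = refine_rule(threshold, precision, recall)
    return threshold, evaluate(threshold)
```

The essential point the talk makes survives even in this toy: the generated rule is never trusted on first emission; it is scored against a labeled suite and the score drives the next rewrite.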

Notable examples cited were over 500 pull‑requests spent eliminating false positives in existing rule sets, a three‑fold increase in recall after relaxing rules, and an 80% catch‑rate when summarizing similar code paths. The open‑source release of the agent and its implementation allows the community to reproduce the workflow and accelerate zero‑day discovery.

The implications are clear: integrating LLMs with static analysis can slash review time, raise detection accuracy, and shift the industry toward AI‑augmented security pipelines. Organizations that adopt these closed‑loop, segmentation‑driven approaches will likely gain a competitive edge in vulnerability research and remediation.

Original Description

Imagine the process of a human security auditor. What distinguishes an expert? It's their accumulated knowledge and nuanced understanding, allowing them to see beyond simple rules. Indeed, Large Language Models (LLMs) demonstrate semantic understanding capabilities potentially exceeding traditional rule-based static analysis. However, raw reasoning power isn't synonymous with effective learning in this complex domain.
While LLMs have shown promise for semantic reasoning tasks, deploying them directly on massive codebases is frequently impractical due to scalability constraints and excessive computational overhead. Additionally, isolated semantic summarization at function or module granularities often yields overly abstract results lacking practical actionable insights, or excessive context that proves too cumbersome to analyze effectively.
In this talk, we propose "Let LLM Learn," an innovative approach that facilitates incremental semantic knowledge learning using reasoning models. Our method reframes the role of static analysis; instead of relying directly on its predefined rules, we leverage it to identify and extract relevant code segments which serve as focused learning material for the LLM. We then strategically partition complex codebases into meaningful, semantic-level slices pertinent to vulnerability propagation. Leveraging these slices, our framework incrementally teaches the LLM—potentially guided by human annotations—to summarize and cache valuable semantic knowledge. This process significantly enhances accuracy, efficiency, and context-awareness in automated vulnerability detection.
Empirical evaluations demonstrate that our approach effectively identifies over 70 previously unknown bugs in real-world software projects, including VirtualBox and critical medical device systems in the IN-CYPHER project led by the UK and Singapore. Crucially, the semantic knowledge accumulated by our system naturally encodes high-value vulnerability patterns, closely resembling the intuition and analytical capabilities of human security experts. Our technique thereby bridges a critical gap between human expertise and automated analysis capabilities, considerably enhancing vulnerability detection effectiveness, precision, and practical utility.
By:
Zong Cao | PhD Student, Imperial Global Singapore and Nanyang Technological University
Zhengzi Xu
Yeqi Fu
Yuqiang Sun
Kaixuan Li
Yang Liu
Full Session Details Available at:
https://blackhat.com/us-25/briefings/schedule/?#let-llm-learn-when-your-static-analyzer-actually-gets-it-46444