Classification in Security Operations

Wesley Belleman
Towards Data Science
5 min read · Feb 11, 2021



Everyone in the cybersecurity industry is rushing to acquire the artificial intelligence and machine learning needed to get ahead of attackers. While many cybersecurity companies certainly employ AI/ML only in bouts of buzzword salesmanship, most cybersecurity professionals do feel that AI/ML has its place in the security world.

The foundational problem in ML, or in any data analytics for that matter, is classification. Security operations centers are constantly solving a set of classification problems. Given some set of input data, security analysts must first determine whether activity is malicious or non-malicious, which makes this first problem one of binary classification. In the image below, I break this problem down a bit further: the red dots mark where classification occurs, and the yellow boxes mark the results of classification.

Incident Detection (Figure by author)

Classification Decision 1: Is this data useful for detection?

Security analysts love their logs and IDS alerts, but all data must be continuously scrutinized for relevance to incident detection. Security engineers have to consider not only removing useless detection data sources but also adding new, innovative ones. Generally a human does not operate this part of the loop: SOCs take whatever data they can get from networks and logs and hope it will be sufficient to do their jobs. AI/ML may therefore fit naturally here. Host computing power presents the major blocker: if the host must intelligently determine in real time which data may help detection, it must spend compute overhead running that algorithm. Whatever data the system does choose to send passes to the next classification decision.
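One way to ground this decision is to score each telemetry source by how much it actually contributes to confirmed detections relative to its ingest volume. The sketch below is a minimal illustration of that idea; the source names, counts, and the detections-per-record metric are my own assumptions, not anything prescribed by a particular SOC tool.

```python
# Hypothetical sketch: scoring candidate telemetry sources by how often
# they contribute to confirmed detections, so low-value feeds can be
# flagged for removal. All names and numbers here are illustrative.

def source_relevance(detections, volume):
    """Return confirmed-detections-per-record for each data source.

    detections: dict mapping source name -> detections it contributed to
    volume:     dict mapping source name -> records ingested from it
    """
    return {src: detections.get(src, 0) / volume[src] for src in volume}

scores = source_relevance(
    detections={"dns_logs": 40, "auth_logs": 120, "netflow": 2},
    volume={"dns_logs": 1_000_000, "auth_logs": 500_000, "netflow": 2_000_000},
)
# Sources with the lowest scores become candidates for removal; new
# candidate sources could be trialed and scored the same way.
```

In practice the scoring function would be richer (detection severity, redundancy between sources, collection cost), but even a crude ratio like this makes the "is this data useful?" decision explicit rather than habitual.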

Classification Decision 2: Is this an incident?

After deciding what data to pass to our intrusion detection system or systems, the IDS must decide whether that data characterizes malicious activity. This is a binary classification problem, so why three cases in the image above? A machine learning engineer will develop an algorithm, trained on input data, that returns the probability that an incident has occurred. The cases therefore represent thresholds on this returned probability. Analysts will have to tune the thresholds to avoid alert fatigue and to process a manageable number of alerts per day. The case 1 threshold will be very high. Note that since incidents tend to be rare, this threshold need not be close to one — it could in fact be very close to zero, just greater than the next threshold. The threshold for case 2 will still be high, but it will generate several alerts. This will enable the analyst to read through the data that generated the alert and decide how to classify the event. The loop in which the SIEM feeds back into the decision point represents this iterative process.
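The three-case routing above can be sketched as a small threshold function. This is a minimal illustration under my own assumptions: the threshold values and case labels are invented for the example, and a real SOC would tune them against alert volume as described above.

```python
# Hypothetical sketch: mapping a detection model's incident probability
# to the three triage cases. Threshold values are illustrative only;
# analysts would tune them to control daily alert volume.

def triage(incident_probability, case1_threshold=0.9, case2_threshold=0.6):
    """Route a classifier's probability to one of three triage cases."""
    if incident_probability >= case1_threshold:
        return "case 1: open incident"      # confidence high enough to act
    if incident_probability >= case2_threshold:
        return "case 2: alert analyst"      # human reviews the alert data
    return "case 3: no action"              # below the alerting threshold
```

Note that, as the article points out, nothing forces `case1_threshold` to sit near 1.0; for a rare-event model it might sit far lower, so long as it stays above the case 2 threshold.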

Incident Investigation and Response

The NIST Cybersecurity Framework defines a cybersecurity incident as "a cybersecurity event that has been determined to have an impact on the organization prompting the need for response and recovery." I never liked this definition, and here we need something more concrete. I prefer a simpler one: a cybersecurity incident is an event that an analyst decided to investigate further. This leads naturally to the first item under Response Analysis in the NIST Cybersecurity Framework: "Notifications from detection systems are investigated." Incident investigation and response are also classification problems. The figure below displays three binary classification problems that analysts must solve to close an incident after it is generated.

Incident Investigation and Response (Figure by author)

Classification Decision 3: Sufficiency of Context Data

Classification again turns on the quality of the provided data. Decision point 2 classified the incident based on data from decision point 1, but that data was focused on detection, not investigation. In other words, we previously only needed to answer: is this anomalous behavior that we should look into? Now we need to answer: what exactly happened, and how can we fix it? The analyst will likely need to query additional data sources to answer these questions. AI/ML can provide significant value here: based on the context data, it can quickly determine what data may be useful to the analyst's investigation and automatically add that data to the incident case object (data hydration) for further analyst review.
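A minimal sketch of this "data hydration" step might look like the following. The case structure, field names, and indicator-matching rule are all assumptions made for illustration; a real enrichment pipeline would query SIEM and threat-intelligence APIs rather than in-memory lists.

```python
# Hypothetical sketch of data hydration: given an incident case with
# observed indicators, attach matching records from additional context
# sources to the case object for analyst review. All structures and
# field names here are illustrative assumptions.

def hydrate_case(case, context_sources):
    """Attach context records that share an indicator with the incident."""
    indicators = set(case["indicators"])
    case["context"] = {
        name: [rec for rec in records
               if indicators & set(rec.get("indicators", []))]
        for name, records in context_sources.items()
    }
    return case

case = {"id": "INC-1", "indicators": ["10.0.0.5"]}
sources = {
    "proxy_logs": [{"indicators": ["10.0.0.5"], "url": "example.test"}],
    "edr_events": [{"indicators": ["10.0.0.9"], "process": "calc.exe"}],
}
hydrated = hydrate_case(case, sources)
# The proxy log record is attached; the unrelated EDR event is not.
```

The point of the sketch is the shape of the problem: a classifier (here, a trivial indicator match) decides which extra records are relevant enough to attach, so the analyst opens a case that already contains its likely-useful context.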

Classification Decision 4: True Incident

In the process of gathering additional context data, the analyst will frequently determine that the incident was a false positive. In this case, she will simply close the incident without a finding. Such errors will always occur, because context data is both focused and massive: moving this decision back to decision point 2, thus requiring all possibly requisite context data to reach the IDS, would overload networks and make incident detection computationally intractable. The two-step incident determination uses triage to cut down on computation and networking requirements. The AI/ML algorithms in this step would therefore need to adapt to the inputs the analyst adds to the context data. An algorithm could run continuously in the background, or upon changes to the incident case file, to update the incident likelihood values generated in classification decision 2.
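One simple way to update a detection-time probability as context accumulates is a log-odds update, where each new piece of context shifts the score up or down by a weight. This sketch is an assumption of mine, not the article's method; the signals and weights are invented for illustration.

```python
import math

# Hypothetical sketch: re-scoring an incident's likelihood whenever new
# context is attached to its case file. Combining evidence in log-odds
# space lets supporting and contradicting signals shift the score in
# opposite directions. The weights below are illustrative assumptions.

def rescore(initial_probability, context_signals):
    """Update a probability given (supports_incident, weight) pairs."""
    odds = initial_probability / (1.0 - initial_probability)
    log_odds = math.log(odds)
    for supports_incident, weight in context_signals:
        log_odds += weight if supports_incident else -weight
    return 1.0 / (1.0 + math.exp(-log_odds))

# Detection fired at p = 0.7, but two context lookups contradict it:
p = rescore(0.7, [(False, 1.0), (False, 1.5)])
# p now sits well below 0.7, nudging the analyst toward "false positive".
```

Run on the case file whenever it changes, such a re-scorer gives the analyst a live likelihood that reflects everything hydrated into the case so far, rather than only the original detection.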

Classification Decision 5: Mitigation Effectiveness

One day our AI overlords may automatically generate mitigations from context data. For now, however, analysts must use their best judgment to develop mitigations once investigation is complete. Analysts rely on inductive reasoning and the scientific method rather than deductive reasoning: they develop candidate mitigations and validate them through testing, rather than following a series of if-then statements about the incident to the correct set of mitigation actions. Upon executing the mitigation, the analyst will close the incident if the mitigation succeeds. Analysts build automated tests that validate their implemented mitigations as they develop them. Since analysts already build these tests, and the tests emerge naturally from mitigation development, AI/ML does not present significant value at this decision point.
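A mitigation-validation test of the kind described above is often just a small script asserting that the mitigated behavior no longer works. The sketch below assumes a firewall-rule mitigation that should block a port; the scenario and helper name are my own illustrative assumptions.

```python
import socket

# Hypothetical sketch of an automated mitigation-validation check: after
# deploying a firewall rule, verify that connections to the blocked port
# now fail. The blocked-port scenario is an illustrative assumption.

def mitigation_blocks_port(host, port, timeout=2.0):
    """Return True if a TCP connection attempt fails, i.e. the firewall
    rule introduced as a mitigation appears to be holding."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return False  # connection succeeded: mitigation not working
    except OSError:
        return True       # refused or timed out: mitigation holds
```

Because such checks fall naturally out of the mitigation itself (block a port, disable an account, revoke a token — then try it), they are cheap to write by hand, which is the article's point about AI/ML adding little value here.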

Targeting Machine Learning and Analytics to the Problem

Those unfamiliar with security operations may think a single ML algorithm could assist, and eventually replace, security analysts. While such a super-machine may exist one day, each of the five decision points above represents a unique problem with its own data sources and desired classification results. In fact, the images above, and the narrowing of security operations to five decisions, are oversimplifications of the challenges of security operations. The value of this simplification lies in identifying where we can start using machine learning to make security operations more effective. Machine learning engineers should seek to implement classifiers at some of these five decision points in order to assist, ease, and focus security operators.
