While definitions vary, a security operations centre (SOC) is generally a central point from which all security issues are dealt with at an organisational and technical level. Typically, it monitors, assesses and defends all of the enterprise’s information systems: websites, applications, databases, data centres and servers, as well as networks, desktops and other endpoints. The problem is that all too often SOCs are failing. When you see organisations spending huge amounts of money on security measures that fail to spot 95 per cent of simulated attacks, it’s hard to come to any other conclusion. So, what’s going wrong?
The classic mistakes
One of the first mistakes often made is jumping to a perceived solution without first thinking about the problem. There is often a perception that success is measured by aggregating logs from as many devices as possible and feeding them all into a commercial SIEM that generates as many different alerts as possible. Centralised log collection can be a beneficial component of an effective attack detection system, but it needs to be done right, and even then it is still only one component.
SOCs that take this approach end up with a mountain of data that is very difficult to process and a huge number of daily alerts, the overwhelming majority of which are false positives. Even when a legitimate attack or compromise is discovered, it can be very difficult to investigate or respond to the problem without additional capabilities. This is also often a very threat intelligence-focused or signature-focused approach (the two ultimately amount to the same thing), so at best it ends up being a system that can only detect compromises that have been seen before. It won’t pick up any advanced, targeted attacks.
The approach outlined above ends up being the virtual equivalent of a security guard with his feet up reading the paper, who occasionally glances up at the CCTV screens he is supposed to be watching.
The data and capabilities required
Rather than jumping to a solution that doesn’t work, the focus should be on what matters and what the requirements are. What specific types of attacks need to be detected? Which parts of the cyber kill chain should be focused on? What types of threat actors need to be deflected?
This should all be done with reference to real attack techniques, and so requires good offensive knowledge. Worrying over how many IP addresses port-scan a well-secured, public-facing website every day is pointless when the way many organisations are being compromised is through spear-phishing emails. Remember also that some threats suit detection and others much less so. For example, detecting ransomware is of limited value because the damage is done immediately. Trying to detect ransomware and then find and power off the affected system before it encrypts other data is fighting a losing battle. On the other hand, a targeted attack that seeks to gain a foothold on a network, gradually extend access and then maintain that access to information for months or years to come is much better suited to detection. Finding that on the first day, or even in the first week, is a huge success compared with it going undetected.
The next question is what key components are needed to support these activities. Log collection was mentioned earlier, but that is just one facet of one major component. Collecting the right logs to support objectives plays a part, but so does discarding anything that is of no security value; less is more in this sense. There are two additional major components: endpoint analysis and passive network monitoring. These three major components all address different problems, and only when combined do they create a truly effective attack detection system.
Once these systems are in place, an effective workflow is needed: one that is followed every day and is designed to detect the attack scenarios identified. Alerts on certain types of data from the different data sources are one aspect of this, but they require careful thought and tuning to ensure that they are suitable, manageable and provide enough context to investigate the issue properly. The last thing anyone wants is a mountain of events that can’t be actioned.
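One way to make alert tuning concrete is to track, per alert rule, how many triaged alerts turned out to be real. The following is a minimal sketch only: the rule names, the triage records and the 10 per cent review threshold are all made up for illustration and are not taken from any particular SIEM.

```python
from collections import Counter

# Hypothetical triage log: (alert rule, did the analyst confirm a real issue?).
triaged = (
    [("port-scan-external", False)] * 5
    + [("new-autorun-binary", True), ("new-autorun-binary", False)]
    + [("office-spawns-shell", True)]
)

def rule_precision(alerts):
    """Return {rule: fraction of its triaged alerts that were confirmed}."""
    total = Counter()
    confirmed = Counter()
    for rule, was_real in alerts:
        total[rule] += 1
        if was_real:
            confirmed[rule] += 1
    return {rule: confirmed[rule] / total[rule] for rule in total}

def rules_to_review(alerts, min_precision=0.1):
    """List rules whose confirmed fraction is below an (arbitrary) threshold."""
    precision = rule_precision(alerts)
    return sorted(rule for rule, p in precision.items() if p < min_precision)

for rule, p in sorted(rule_precision(triaged).items()):
    print(f"{rule}: {p:.0%} confirmed")
print("Candidates for tuning or removal:", rules_to_review(triaged))
```

A rule that never produces a confirmed finding is not automatically worthless, but it is a natural first candidate for tuning or removal.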
However, real-time alerts are not the only way to work. Though they have their place, they are arguably much less effective than active data visualisation and review supplemented by anomaly analysis. The idea here is to have set ways of visualising the data, each with a specific intended purpose. Aggregated data presented in the right way and enriched with supporting information is a very effective means of detecting a wide variety of attacks, particularly targeted attacks that have not been seen before.
As a very simple example, being able to visualise every persistent binary across a network in one view, with each unique entry shown only once and counted by the number of hosts, is a very effective technique for quickly discovering that one user laptop seems to have an executable that runs on start-up that no other system on the network has. Is that laptop unique? Or is that just the one system that has a full remote-access trojan installed and set to run on start-up?
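The persistent-binary view described above can be sketched in a few lines. This is an illustrative example under stated assumptions: the host names and file paths are invented, and in practice the records would come from whatever endpoint tooling the SOC uses rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical inventory: (host, path of a binary set to run at start-up).
# All hosts and paths here are made up for illustration.
autoruns = [
    ("laptop-01", r"C:\Windows\System32\SecurityHealthSystray.exe"),
    ("laptop-02", r"C:\Windows\System32\SecurityHealthSystray.exe"),
    ("laptop-03", r"C:\Windows\System32\SecurityHealthSystray.exe"),
    ("laptop-01", r"C:\Program Files\Vendor\updater.exe"),
    ("laptop-02", r"C:\Program Files\Vendor\updater.exe"),
    ("laptop-03", r"C:\Users\alice\AppData\Roaming\svch0st.exe"),
]

def stack_by_host_count(records):
    """Return [(binary, sorted hosts)], ordered from rarest to most common."""
    hosts_by_binary = defaultdict(set)
    for host, binary in records:
        # Lower-case the path so the same Windows path counts as one entry.
        hosts_by_binary[binary.lower()].add(host)
    return sorted(
        ((binary, sorted(hosts)) for binary, hosts in hosts_by_binary.items()),
        key=lambda item: (len(item[1]), item[0]),
    )

stacked = stack_by_host_count(autoruns)
for binary, hosts in stacked:
    print(f"{len(hosts):>3}  {binary}  {', '.join(hosts)}")
```

Each unique binary appears once with its host count, and the rarest entries sort to the top, so the executable present on only one laptop is the first thing an analyst sees.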
The people problem
This one is critical. No matter how good an organisation’s technical systems and capabilities are, it’s all for nothing if the right people to support them are missing. To solve this problem, the job needs to be intellectually stimulating and rewarding, and it needs to build analysts’ experience, so that it attracts capable employees, helps them improve over time and makes them want to stay.
As a rule of thumb, smart and capable employees do not like staring, 24 hours a day and seven days a week, at screens of thousands of alerts that are almost entirely false positives, or performing the same monotonous tasks over and over again.
To make the job interesting, the SOC should take out the grunt work, continually improve and generally not overwhelm analysts with huge amounts of data. This keeps the job itself interesting and allows analysts to focus on the important parts that deliver results and develop experience.
Top tips for success
Several critical issues have already been covered above; the following is a summary of a few top tips:
Make sure endpoint analysis, network analysis and log collection are in place – endpoint analysis is particularly important for detecting more advanced targeted attacks.
Don’t be completely reliant on threat intelligence feeds to stay ahead of the curve.
People are key – the right people, the right experience and the right job roles.
Real-time alerts can be useful but active data review with anomaly analysis is arguably the more important component.
Test your SOC. Test that it can detect the attack techniques it claims in practice and, if not, then improve it until it can.
Less is more – constantly review data sources, workflows and alert cases to eliminate what isn’t valuable and further improve what is.
How do you know whether your SOC is delivering good results?
Unless you test and measure the SOC’s effectiveness, there is no reason to believe it is of any value at all. To see results, thinking needs to change. Not every compromise can be prevented, but identifying it quickly and acting on that intelligence is the endgame.
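As a rough illustration of what measuring could look like, the sketch below summarises a test exercise in terms of detection rate and time to detect. The techniques, timestamps and field names are entirely hypothetical; a real exercise would use the attack scenarios the SOC has actually committed to detecting.

```python
from datetime import datetime
from statistics import median

# Hypothetical exercise log: when each simulated technique was executed and
# when (if ever) the SOC flagged it. All values are made up for illustration.
exercise = [
    {"technique": "spear-phishing payload",
     "executed": "2024-03-04T09:00:00", "detected": "2024-03-04T11:00:00"},
    {"technique": "lateral movement",
     "executed": "2024-03-04T13:00:00", "detected": "2024-03-05T19:00:00"},
    {"technique": "start-up persistence",
     "executed": "2024-03-04T14:00:00", "detected": None},
    {"technique": "data exfiltration",
     "executed": "2024-03-05T10:00:00", "detected": None},
]

def summarise(results):
    """Return detection rate, median hours to detect and missed techniques."""
    if not results:
        raise ValueError("no exercise results to summarise")
    delays_hours = []
    missed = []
    for r in results:
        if r["detected"] is None:
            missed.append(r["technique"])
            continue
        delta = (datetime.fromisoformat(r["detected"])
                 - datetime.fromisoformat(r["executed"]))
        delays_hours.append(delta.total_seconds() / 3600)
    return {
        "detection_rate": len(delays_hours) / len(results),
        "median_hours_to_detect": median(delays_hours) if delays_hours else None,
        "missed": missed,
    }

summary = summarise(exercise)
print(summary)
```

Repeating the same exercise after each round of improvements gives a simple way to see whether the missed list is shrinking and detection is getting faster.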