Security events keep emerging, and it has become clear that deploying intrusion detection devices and firewalls alone is not enough to build a satisfactory enterprise security system. Enterprise security teams also find it increasingly difficult to keep an ever more complicated security situation under control. Traditional security products generate huge volumes of alert logs, and checking them wears security teams down. Meanwhile, security researchers have put forward many models that center on data collection, data analysis, and response policies. The Cyber Kill Chain model put forth by Lockheed Martin clearly describes attack behaviors, offering a new approach to reconstructing the attack path and the resulting harm from piles of logs. Information Security Continuous Monitoring (ISCM) advocates a proactive, automatic, and risk-based security incident response mechanism. The National Aeronautics and Space Administration (NASA) released reports on establishing a security incident response mechanism based on continuous monitoring, providing guidance for building security operation systems.
In 2015, NSFOCUS launched a security analysis system based on big data. This article focuses on the design of a security incident response system based on data analysis. The NSFOCUS security team deployed this system for a customer, where it receives logs from several intrusion prevention systems and provides security analysis services. The security incident analysis case described below may also serve as a useful reference for security teams building an O&M system.
Analysis of Security Data and Response Process
The intrusion threat situation figure shows that many attack events, mainly brute-force attacks (figure 2), occurred on several servers in the same subnet.
Figure 3 shows the kill chain analysis for the last 7 days, based on intrusion alert logs. Clicking the “details” button at the upper right of a chart shows the details of a kill chain (see figure 4).
The attacker tried to log in to the system by brute force. Although the logins failed, a great number of web attacks were also launched against the server. As the system had been deployed only a short time before, we had not yet inventoried the assets under protection. Therefore, to obtain relevant information about the attacked server, we turned to NSFOCUS’s threat intelligence system.
By making a reverse DNS query, we obtained the domain names (figure 6) and web fingerprints (figure 7) of the attacked server. The query result also shows that the website under these domain names runs on a Windows-based server that provides web services and an FTP service by using IIS 7.5. We guessed that the website mainly provides information services to the public, while the FTP service is used for maintenance of the business system.
Now, let’s look at the original attack logs. Figure 8 shows brute-force attack logs, which reveal that multiple source IP addresses launched brute-force attacks against the MySQL database server and FTP server installed on the attacked server. Although figures 1 and 2 indicate that the attacker failed to break into the server, the continuous attacks from these source IP addresses were evidently malicious. NSFOCUS’s security incident response system provides incident handling recommendations. With the customer’s consent, we configured the intrusion prevention system to block these source IP addresses.
After analysis, the security team found that all the source IP addresses belonged to Windows-based servers in an Internet Data Center (IDC), and that these servers may themselves have been compromised and turned into zombies. Without more data sources, the security team could not analyze the incident in more depth to trace the attack source further.
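The step of picking out persistent brute-force sources from alert logs can be illustrated with a minimal Python sketch. It is not the NSFOCUS implementation: the `Alert` fields, the `login_failed` event name, the threshold of 5, and the sample IP addresses (taken from documentation ranges) are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical, simplified shape of one intrusion alert log entry."""
    src_ip: str
    dst_ip: str
    service: str  # e.g. "mysql" or "ftp"
    event: str    # e.g. "login_failed"


def brute_force_sources(alerts, threshold=5):
    """Return source IPs with at least `threshold` failed logins, sorted."""
    failures = Counter(a.src_ip for a in alerts if a.event == "login_failed")
    return sorted(ip for ip, count in failures.items() if count >= threshold)


# Illustrative data only: three sources hitting the same server.
sample = (
    [Alert("198.51.100.7", "203.0.113.10", "mysql", "login_failed")] * 6
    + [Alert("198.51.100.8", "203.0.113.10", "ftp", "login_failed")] * 5
    + [Alert("198.51.100.9", "203.0.113.10", "ftp", "login_failed")] * 2
)

blocklist_candidates = brute_force_sources(sample)
print(blocklist_candidates)
```

The resulting list is only a set of candidates. As in the case above, actually blocking the addresses on the intrusion prevention system remains a decision taken with the customer's consent.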
About the System Design
During the preceding analysis, we realized that a security incident response system should make the security situation clear to the security team at a glance, and should provide auxiliary automated tools that help users establish security response policies in near real time. The following are our considerations on the design of a security incident response system from three perspectives:
Situation Awareness and Support for Decision-Making
A security incident response system with intrusion situation awareness enables security teams to quickly grasp the overall security status. The system can inform people of the current overall security status and trend, letting administrators know whether it is necessary to adjust and optimize security policies and to reallocate resources accordingly. The system can also present data in clear, useful, and visually appealing tables and charts, visualizing the effect of security policies so that customers can see the value of the system.
Nowadays, a large number of security devices, systems, maps, and globe-style displays are available in the market, presenting dazzling data to grab our attention. In fact, what matters to us is whether such information and incidents are authentic and valid and whether they can truly help us resolve security issues.
The system is supposed to enable security personnel to rapidly and accurately identify the events of top concern. Based on data analysis, the system should be able to display data by data type, priority, and user level. Classifying data along these dimensions reduces noise. When attacks take place frequently, the kill chain analysis model can summarize attack behaviors, but that is not enough. The system should also be able to determine the severity of attack events and guide security personnel in handling them properly. The problem is that the kill chain list currently available in the system cannot effectively help security personnel determine which events should be handled first. The system should let users know which events require manual intervention and should automatically provide suggestions for handling them. Behind these suggestions, the system has already completed a series of actions, whether or not they are visible to the user. Therefore, the automatic processing capability matters most in determining what the system can do.
If equipped with a lead-based drill-down function, the system allows analysts to keep digging into the data level by level from a single entry point, instead of frequently switching from one page to another, as is typical in traditional system design.
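One way to decide which kill chains should be handled first is to score each one. The following Python sketch assumes a simple scheme that is not described in the article: the score is the rank of the deepest kill chain stage reached multiplied by a hypothetical asset criticality from 1 to 3, and anything at or above a hypothetical threshold of 12 is flagged for manual intervention. The stage names follow the Lockheed Martin Cyber Kill Chain; the host names are invented.

```python
from dataclasses import dataclass

# Stages of the Lockheed Martin Cyber Kill Chain, earliest first.
KILL_CHAIN_STAGES = [
    "reconnaissance", "weaponization", "delivery", "exploitation",
    "installation", "command_and_control", "actions_on_objectives",
]


@dataclass
class ChainSummary:
    """Hypothetical summary of one kill chain against one target."""
    target: str
    deepest_stage: str      # furthest stage observed in the logs
    asset_criticality: int  # assumed scale: 1 (low) to 3 (high)


def priority(chain):
    """Score = stage rank (1..7) times asset criticality (assumed formula)."""
    stage_rank = KILL_CHAIN_STAGES.index(chain.deepest_stage) + 1
    return stage_rank * chain.asset_criticality


def triage(chains, manual_threshold=12):
    """Return (target, score, needs_manual_intervention), highest score first."""
    ranked = sorted(chains, key=priority, reverse=True)
    return [(c.target, priority(c), priority(c) >= manual_threshold)
            for c in ranked]


chains = [
    ChainSummary("web-01", "delivery", 3),
    ChainSummary("db-01", "installation", 3),
    ChainSummary("test-01", "reconnaissance", 1),
]

for row in triage(chains):
    print(row)
```

A real system would likely weigh more factors, such as whether an attack succeeded or whether the target has known vulnerabilities. The point of the sketch is only that an explicit, sortable score gives the kill chain list an order.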
Automatic Interoperation with Multiple Data Sources and Systems for Combined Capabilities
The capacity of a single system is often limited. As the preceding example shows, security incidents are complex enough that analyzing them and determining their severity requires interoperation with multiple systems and the combination of their capabilities. Only in this way can the system proactively respond to and defend against security incidents in a semiautomatic or automatic manner. In this article, the other data systems that the system should automatically interoperate with are specifically threat intelligence systems. Based on the alert log analysis result, the system can identify the destination IP address and, through an automatic query, list the server location, operating system, open services, and high-risk vulnerabilities (like the notorious IIS-related vulnerability disclosed in the preceding example). Such information can be very useful for operation and maintenance (O&M) personnel. The system should also receive logs from hosts. The preceding example presents only logs from the intrusion detection system (IDS) device. If server-related information such as security logs and patch status can be obtained, the system can perform a more thorough correlation analysis, improving the completeness and accuracy of kill chain data as well as the efficiency of security policy evaluation. For DDoS attacks, the system is expected to automatically interoperate with the DDoS detection system, giving users a whole picture of the DDoS incident.
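The automatic query described above can be pictured as an enrichment step that attaches asset information to each alert. The Python sketch below is only an illustration: `AssetIntel`, `INTEL`, and `enrich` are hypothetical names, the in-memory table stands in for a real threat intelligence system, and the sample record mirrors the Windows server with IIS 7.5 and FTP from the case above, with an IP address from a documentation range.

```python
from dataclasses import dataclass, field


@dataclass
class AssetIntel:
    """Hypothetical record returned by a threat intelligence lookup."""
    operating_system: str
    services: list
    high_risk_vulns: list = field(default_factory=list)


# Stand-in for a threat intelligence system; a real one would be queried remotely.
INTEL = {
    "203.0.113.10": AssetIntel("Windows", ["IIS 7.5 (web)", "FTP"]),
}


def enrich(alert, lookup=INTEL.get):
    """Return a copy of the alert with intel on its destination IP attached."""
    enriched = dict(alert)
    enriched["asset"] = lookup(alert["dst_ip"])  # None when the asset is unknown
    return enriched


alert = {"src_ip": "198.51.100.7", "dst_ip": "203.0.113.10", "event": "login_failed"}
enriched = enrich(alert)
print(enriched["asset"].operating_system, enriched["asset"].services)
```

Passing the lookup as a parameter keeps the enrichment logic independent of where the intelligence comes from, so host logs or patch data could be attached the same way.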
The system itself should be open so that it can incorporate the capabilities of other security systems. Obtaining threat intelligence from other systems is an effective way to strengthen the analysis of attacking IP addresses. For this purpose, the system should be designed to obtain threat intelligence from different security vendors. With open interfaces, the system can interoperate with systems from carriers or other security vendors, enabling users to track and analyze attack sources.
After the system interconnects with multiple systems, or more specifically, multiple security intelligence systems, the next crucial step is to correlate and merge their capabilities. This requires experienced security personnel to work together and turn individual and team experience into program logic, and thus into higher productivity. This area remains largely unexplored.
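An open interface for intelligence from several vendors could be as small as one method that every source implements. The Python sketch below is an assumption about how such an interface might look, not a description of any vendor's API: `IntelProvider`, `StaticProvider`, `collect_reputation`, and the provider names are all hypothetical.

```python
from typing import Optional, Protocol


class IntelProvider(Protocol):
    """Hypothetical common interface for any threat intelligence source."""
    name: str

    def reputation(self, ip: str) -> Optional[str]:
        """Return a verdict for the IP, or None if the source has no data."""
        ...


class StaticProvider:
    """Toy provider backed by a dict; stands in for a vendor or carrier feed."""

    def __init__(self, name, table):
        self.name = name
        self._table = table

    def reputation(self, ip):
        return self._table.get(ip)


def collect_reputation(ip, providers):
    """Ask every provider about the IP and keep the answers that exist."""
    verdicts = {}
    for provider in providers:
        verdict = provider.reputation(ip)
        if verdict is not None:
            verdicts[provider.name] = verdict
    return verdicts


providers = [
    StaticProvider("vendor_a", {"198.51.100.7": "botnet"}),
    StaticProvider("carrier_b", {}),
]
print(collect_reputation("198.51.100.7", providers))
```

Keeping the verdicts separate per source, rather than merging them into one label, leaves the correlation step to whatever logic the security team later encodes.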
Other Capabilities Required
The system should also provide the following capabilities:
- Providing continuous monitoring for security teams
- Evaluating the effectiveness of security personnel’s incident handling and notifying them of the evaluation result
- Working with the risk assessment system to persistently assess the vulnerability of servers
This article describes how a security incident response system analyzes security data and responds to security incidents, and gives a high-level analysis of the considerations for designing such a system to meet day-to-day security O&M requirements. Setting up an enterprise security system is a complicated project involving a lot of work. Our security incident response system is far from perfect. We hope that more people will join us in the related discussion.
Chinese version: http://blog.nsfocus.net/security-incident-response-system-design/