Using Machine Learning in Vulnerability Management for Prioritization
- Oct 04, 2022
- Adrienne Juett
Securing computer systems is the ultimate big data problem. Billions of computers generate roughly 2.5 quintillion bytes of data every day. Most of this data is benign; however, a small fraction comes from malicious actors. The volume is far too large for a security team to analyze manually, which necessitates the use of Artificial Intelligence/Machine Learning (AI/ML) techniques to sort the good from the bad.
AI/ML is used in a variety of ways within the cybersecurity realm. Anomaly detection is used to detect atypical network usage indicative of a network intrusion. AI/ML algorithms sort emails to identify phishing attempts and malware attachments. Here at Nopsec, we use AI/ML to prioritize vulnerabilities, allowing security teams to remediate the highest-risk vulnerabilities first.
There are over 170,000 published software vulnerabilities in the National Vulnerability Database (NVD). While these vulnerabilities cover many different software systems and versions, it is not uncommon for a single system to have tens to hundreds of unpatched vulnerabilities. Multiply this by the thousands of computer systems within an organization, and it is easy to see why a security team can be quickly overwhelmed. Vulnerability prioritization seeks to bring order to this chaos by identifying the riskiest vulnerabilities for immediate patching.
Vulnerability scanners typically provide severity scores for detected vulnerabilities. For software vulnerabilities cataloged by the Common Vulnerabilities and Exposures (CVE) Program, this score is given by the Common Vulnerability Scoring System (CVSS). A CVSS score measures the severity that an exploit of a vulnerability is expected to produce. CVSS base scores, the scores that are most commonly used, only measure the intrinsic properties of the vulnerability and do not take into account the behavior of threat actors or the specifics of an organization’s environment, such as mitigating controls. Our analysis ([link to July blog for algorithm update]) has shown that prioritization based on CVSS score can yield an unmanageable number of vulnerabilities at the highest grades while simultaneously missing a significant fraction of CVEs known to be associated with actual threats.
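To make the CVSS-only approach concrete, here is a minimal sketch of ranking findings by base score using the qualitative severity bands from the CVSS v3.x specification. The function name, the `findings` list, and the placeholder identifiers are assumptions made for illustration; they are not scanner output or part of any Nopsec product.

```python
def cvss_v3_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS scores range from 0.0 to 10.0, got {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Hypothetical findings: (placeholder identifier, CVSS base score).
findings = [("CVE-A", 9.8), ("CVE-B", 7.5), ("CVE-C", 5.3), ("CVE-D", 9.1)]

# CVSS-only prioritization: sort by base score, highest first.
ranked = sorted(findings, key=lambda finding: finding[1], reverse=True)

for cve_id, score in ranked:
    print(f"{cve_id}: {score} ({cvss_v3_severity(score)})")
```

Note that nothing in this ranking reflects whether a vulnerability is actually being exploited, which is the gap the rest of this post addresses.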
Improved prioritization can occur when CVEs are scored based on their likelihood of having an exploit or being used within malware. Of the over 170,000 documented CVEs in the NVD, only 21.8% have known exploits and an even smaller fraction, 1.5%, are associated with malware or exploit kits. By focusing remediation efforts on CVEs that are expected to be in these exploit and/or malware groups, security teams can focus efforts on the riskiest vulnerabilities first.
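As a rough sense of scale, the percentages above can be turned into approximate counts. This sketch assumes a total of exactly 170,000 CVEs, which is only the lower bound quoted in this post, so the results are floor estimates rather than exact figures.

```python
# Lower bound quoted in the post ("over 170,000"); an assumption for this estimate.
TOTAL_CVES = 170_000
EXPLOIT_SHARE = 0.218   # share of CVEs with known exploits
MALWARE_SHARE = 0.015   # share of CVEs tied to malware or exploit kits

with_exploit = round(TOTAL_CVES * EXPLOIT_SHARE)
with_malware = round(TOTAL_CVES * MALWARE_SHARE)

print(f"CVEs with known exploits: at least ~{with_exploit:,}")
print(f"CVEs tied to malware or exploit kits: at least ~{with_malware:,}")
```

In other words, the group tied to malware or exploit kits is on the order of a few thousand CVEs out of well over a hundred thousand.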
So how do we identify these riskiest vulnerabilities? At Nopsec, we have developed our own risk scoring system for CVEs using a classification ML algorithm. First, we gather data from a variety of sources that help us characterize CVEs. This includes data from the NVD on the CVSS score and vulnerability description, what programs and versions are affected, linkages to Common Weakness Enumerations (CWEs), the availability of known exploits and data from social media. Next, we gather data on threat actors and which CVEs have been used in threats. All of this data is used to train and test our classification algorithm. Once the model has been trained, we are then able to use it to score any CVE. The Nopsec risk score is the likelihood that a CVE is or will be used within malware or an exploit kit by a threat actor.
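The general train-then-score workflow described above can be sketched with a tiny logistic regression written in plain Python. This is not Nopsec's model: the algorithm choice, the three features, and every number in the training set are assumptions invented purely to illustrate how a classifier turns CVE attributes into a likelihood.

```python
import math

# Hypothetical toy features per CVE:
#   (cvss_base / 10, has_public_exploit, scaled_social_media_mentions)
# Label: 1 = seen in malware or an exploit kit, 0 = not seen.
# All values are invented for illustration only.
TRAIN = [
    ((0.98, 1.0, 0.9), 1),
    ((0.75, 1.0, 0.7), 1),
    ((0.88, 1.0, 0.4), 1),
    ((0.98, 0.0, 0.1), 0),
    ((0.53, 0.0, 0.0), 0),
    ((0.72, 0.0, 0.2), 0),
    ((0.43, 1.0, 0.1), 0),
    ((0.61, 0.0, 0.1), 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Return the modeled likelihood (0 to 1) that a CVE is threat-associated."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(data, epochs=2000, lr=0.5):
    """Fit logistic regression weights with batch gradient descent."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in data:
            error = predict(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(data)
    return weights, bias


weights, bias = train(TRAIN)

# Score two unseen, equally hypothetical CVEs with the same CVSS base score.
risky = predict(weights, bias, (0.80, 1.0, 0.8))
quiet = predict(weights, bias, (0.80, 0.0, 0.0))
print(f"public exploit + social chatter: {risky:.2f}")
print(f"no exploit, no chatter:          {quiet:.2f}")
```

The two scored CVEs share the same CVSS base score but receive different risk scores, which is the point of adding threat-related features. A production system would use far more data, a held-out test set, and richer features such as the vulnerability description and CWE linkages.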
By using AI/ML, the calculated risk score provides a better prioritization of CVEs for remediation. CVEs with a critical risk score are more than 10 times as likely to have threats associated with them as CVEs with a high CVSS score. In addition, the risk score calculation is 80% better at predicting actual threats than a Critical CVSS score. This allows security teams to prioritize vulnerabilities that are associated with threats for immediate remediation.
Learn more about how to use machine learning in vulnerability management to prioritize your risk within the context of your environments with the on-demand webinar.
Vulnerability prioritization techniques that utilize AI/ML identify the features of CVEs that are known to be used in exploits and malware, and produce a risk score based on how similar a given CVE is to the set of threat-identified CVEs.
With AI/ML, algorithms can sort through large amounts of data and categorize it into benign and malicious buckets. This allows security teams to focus on suspected threats more easily and to ignore the much larger set of benign data.