CERT/CC's Art Manion says CVSS scoring needs to be replaced
Security expert Art Manion discusses what he calls major problems within the Common Vulnerability Scoring System and explains why CVSS needs to be replaced.
Security researchers at Carnegie Mellon University's Software Engineering Institute believe the Common Vulnerability Scoring System may be doing more harm than good.
CVSS scoring is widely used by vendors and third-party organizations such as NIST and the CERT Coordination Center (CERT/CC) at the Software Engineering Institute (SEI). However, members of SEI contend CVSS is widely misused by vendors and customers alike. In a research paper published in December, SEI members argued too many organizations use base scores to determine the risk a vulnerability poses to them, which is incorrect. The authors also cite mathematical issues with the CVSS scoring equations and call for either a drastic overhaul of the current system or for a completely new replacement to be developed. While CVSS version 3.0 was introduced in 2015, the authors believe that the core problems still exist in the latest version.
"I think we will probably go off on our own to some extent and make something new, publish it, do some tests and some experiments and then loop back and see if CVSS wants to go that direction or not," said Art Manion, vulnerability analysis technical manager at CERT/CC and co-author of the paper. "Then I think we'll have a better discussion about this. Is this usable? Is it practical? Does it replace CVSS? Is it connected to CVSS in some way?"
Manion spoke about the core issues with CVSS scoring, the problems it's causing for the infosec community and the dangers that inflated -- or deflated -- CVSS scores pose to organizations.
What are the fundamental issues that you see with CVSS scoring today?
Art Manion: There are two major issues, and the first one is not fully at the feet of the CVSS stakeholders. The documentation for CVSS is very clear; I don't think it even uses the word risk. They are trying to use severity or prioritization for vulnerabilities, which is fine. But there's this emphasis on people providing the base score; for example, NIST's National Vulnerability Database provides base scores, and a lot of vendors do as well. And as designed, the vendor or the source provides the base score, and then the consumer is supposed to come and add additional scores and get a revised number that is specific to their organization and their use case.
The theory's all sound there, but what we're seeing very broadly is people take the base score, stop thinking about it, and run with that. And there's a lot of places where that's encouraged in guidance, if not actual rules for how to handle vulnerabilities in certain places, and your people are using that base score as basically a risk assessment. Again, CVSS doesn't claim to do that, but that's how it's being very widely used. That may be the biggest problem.
A lot of the advice we give around the moving forward side already is if you take CVSS as input to what you're doing, that's great. But do not take the free base score candy being handed out and assume you're done at that point. It's counter-informative, and it may be actually dangerous to trust or rely on base scores alone.
It's not that the CVSS SIG [Special Interest Group] is telling people to go use it this way, but that's how it ends up being used. It's a major problem. And the way the vectors are combined into a one-dimensional score is also a problem. In the paper you'll see there's some actual technical math problems with that. But basically, there are equations like "fastest + fastest" that don't make a lot of sense. You can't do some of the things that they do with those values. It just doesn't work that way.
That seems like a big problem. How did that happen?
Manion: The math is sort of tortuous. There wasn't a lot of transparency on that process. I'm on the SIG, so I actually know a little bit about what happened, but no one else really does. We didn't even write it down that well internally, so it's hard to tell.
There were good intentions, but this thing has grown to be widely used incorrectly, and it's a problem. There's a mistreatment of it as risk assessment and then there's a couple of angles on why the math is wonky. Those are the biggest problems. A one-size equation does not fit all is the important thing. What if I'm actually a safety-critical embedded system and availability impact is the worst possible problem I could have? If somebody was on my system and stole my information, that's bad. But as long as the system is running and people are not being injured or things aren't being damaged because the system is still running properly, I'll take the confidentiality hit over availability any day. But you can't change that in the CVSS; that's all fixed in the equation.
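Editor's illustration: Manion's point about fixed weights can be seen in a minimal Python sketch of the CVSS v2 base equation. The constants are transcribed from the published CVSS v2 guide and should be checked against it before reuse; the function name `base_score` and the two example vulnerabilities are hypothetical and are not taken from the SEI paper or from CERT/CC tooling.

```python
# Metric weights from the CVSS v2 base equation (transcribed; verify against the v2 guide).
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}
IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}  # none, partial, complete


def base_score(av, ac, au, c, i, a):
    """Return the CVSS v2 base score for one vector (hypothetical helper)."""
    impact = 10.41 * (1 - (1 - IMPACT[c]) * (1 - IMPACT[i]) * (1 - IMPACT[a]))
    exploitability = 20 * ACCESS_VECTOR[av] * ACCESS_COMPLEXITY[ac] * AUTHENTICATION[au]
    f_impact = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f_impact, 1)


# Two hypothetical remote, unauthenticated bugs: one leaks data, one causes an outage.
data_theft = base_score("N", "L", "N", c="P", i="N", a="N")
outage = base_score("N", "L", "N", c="N", i="N", a="P")

print(data_theft, outage)  # both come out as 5.0
```

Because confidentiality, integrity and availability carry identical weights in the base equation, the two bugs receive the same base score, even though an operator of a safety-critical system would likely rank the outage far higher.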
There have been a number of occasions in recent years where vendors have complained about the CVSS scores their vulnerabilities were assigned, often arguing that the scores were too high and the bugs were not as serious as the scores indicated. Do you think those complaints about CVSS scoring were on to something, or are a lot of those complaints based on a different agenda?
Manion: My guess is it's probably both. If my position in that paper is that the system has these important flaws, then it's a little weird for me to get into an argument with someone and say, 'Well, that's an 8.8, and if you change this vector like I think it should be changed, then that drops the score.' If I'm saying the whole system is kind of broken, then why am I arguing about a detail of it with a specific score? But on the other hand, if you're discussing a vulnerability with a vendor and the vendor claims it's not network-accessible, but you're saying that it is, at least you're having a more concrete discussion about the access vector [of CVSS]. That part of a complaint could be useful.
The system isn't great to begin with. Let's say I'm going to assume at face value for a minute that the vendor wants to convey to their customers that this is largely true and a sort of accurate assessment of how bad this thing is. The vendor is mature and they want to tell their customers to stop and patch right away. If all that's true, then they don't want the vulnerability to be buried in a low score either. It could go either way.
I'd almost recommend a vendor just use their own justification for the high-medium-low scale and skip the CVSS entirely. To your question, yes, there might be closer-to-factual aspects of the vulnerability to talk about and argue about. But trying to nudge the score up and down so you'd get across the boundary of critical-to-high, because your compliance requires you to do something if you're in one bucket versus the other? That's madness. You're wasting time, and you're complying with the wrong things. And that's where it might be detrimental to be using it, rather than being just not a good use of your time. It may actually be causing you to focus your efforts in a suboptimal way.
You mentioned the problem can go either way, and that CVSS could actually be producing lower scores than it should be. With the way CVSS scoring currently works, are we more likely to get inaccurate scores that are inflated or deflated?
Manion: The way the whole thing is designed is that the base score is sort of a theoretical, technical worst-case scenario. In other words, if I have this piece of software under these conditions, in a lab environment, and I discount all the risk and context that's actually so important, then this scenario could happen, so therefore the score is a 9.8. And then if I come along and do the right thing by CVSS and [apply context] temporally -- there's a patch, and there's no exploit activity, and in my environment I have that tightly controlled network -- then the scores get lower as you do the rest. The documents we publish [at CERT/CC] use CVSS v2, and we score all of the parts -- base, temporal, environmental -- on purpose. And on any of our documents, you can watch the score just drop as you go through it.
Again, it's designed that way. But, for example, if you took a big set of base scores and had someone score them for full environmental, they're all going to go lower, and you'll probably see fewer critical and fewer high scores proportionally. And there'd be a lot of medium and a lot of low scores because that's actually probably closer to reality. There are bad vulnerabilities every once in a while, absolutely. But there are 20,000 new vulnerabilities a year publicly documented, and I don't know how many we actually get hit with. Hundreds? Less? Implying that the average state of vulnerabilities in the world is high or even medium is just wrong. It's the wrong assessment. So yes, there's inflation, and it's designed to start high and get lower. Nobody takes it lower, though, and that's the problem.
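Editor's illustration: the drop Manion describes, from a worst-case base score to a lower score once a patch exists and no exploit is known, can be sketched with the CVSS v2 base and temporal equations. The constants are transcribed from the published CVSS v2 guide and should be verified before reuse; the function names and the example vector are hypothetical, and the environmental step is left out to keep the sketch short.

```python
# CVSS v2 base metric weights (transcribed; verify against the v2 guide).
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}
IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}

# CVSS v2 temporal metric weights ("ND" means not defined, i.e. no adjustment).
EXPLOITABILITY = {"U": 0.85, "POC": 0.9, "F": 0.95, "H": 1.0, "ND": 1.0}
REMEDIATION_LEVEL = {"OF": 0.87, "TF": 0.90, "W": 0.95, "U": 1.0, "ND": 1.0}
REPORT_CONFIDENCE = {"UC": 0.90, "UR": 0.95, "C": 1.0, "ND": 1.0}


def base_score(av, ac, au, c, i, a):
    """Base score: the context-free, worst-case number that vendors and NVD publish."""
    impact = 10.41 * (1 - (1 - IMPACT[c]) * (1 - IMPACT[i]) * (1 - IMPACT[a]))
    exploitability = 20 * ACCESS_VECTOR[av] * ACCESS_COMPLEXITY[ac] * AUTHENTICATION[au]
    f_impact = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f_impact, 1)


def temporal_score(base, e="ND", rl="ND", rc="ND"):
    """Temporal score: the base score scaled by exploit, fix and confidence factors."""
    return round(base * EXPLOITABILITY[e] * REMEDIATION_LEVEL[rl] * REPORT_CONFIDENCE[rc], 1)


# Hypothetical remote bug with complete impact on confidentiality, integrity, availability.
base = base_score("N", "L", "N", "C", "C", "C")
# Official fix available (OF), exploit unproven (U), report confirmed (C).
patched = temporal_score(base, e="U", rl="OF", rc="C")

print(base, patched)  # 10.0 7.4
```

Every temporal factor is at most 1.0, so this step can only hold the score steady or lower it, which matches the "start high and get lower" design Manion describes. A consumer who stops at the base score never sees the lower number.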