A hefty dose of caution and expectations management is in order when it comes to static code analysis and vulnerability scanning tools.
- Scan findings often contain a few thousand items. The vast majority of them don't matter.
- Scanning tools often do not take the attack surface into account when they scan. This means the vulnerabilities they find should not immediately be described as ones that pose a risk.
- The code libraries in common use are open source, and these libraries contain a lot of code that never makes it into the final software. This unused code may still be included in the scan, and it would be a waste of time to go hacking apart a library just to make the scan results look better.
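By way of illustration, triage along these lines can cut a raw report down dramatically. This is a minimal sketch assuming a hypothetical report schema (the `file` and `reachable` fields are invented for this example; every real scanner emits its own format):

```python
# Hypothetical triage of raw scanner output: keep only findings that are
# both reachable from the application's attack surface and located in
# code the product actually ships. The field names are illustrative,
# not any real scanner's schema.
def triage(findings, shipped_paths):
    actionable = []
    for finding in findings:
        # Drop findings in vendored test/demo code that never ships.
        in_shipped_code = any(finding["file"].startswith(p) for p in shipped_paths)
        # Drop findings the scanner (or a human) marked unreachable.
        if in_shipped_code and finding.get("reachable", False):
            actionable.append(finding)
    return actionable

raw = [
    {"id": "VULN-0001", "file": "src/api/auth.c", "reachable": True},
    {"id": "VULN-0002", "file": "vendor/libfoo/tests/fuzz.c", "reachable": False},
    {"id": "VULN-0003", "file": "vendor/libfoo/unused_codec.c", "reachable": False},
]

print(len(triage(raw, ["src/"])))  # prints 1 — only the first finding survives
```

Three findings become one actionable item; on a real report, a few thousand often collapse the same way.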
These reasons and more make static code analysis results poor metrics. It's tempting to use them that way: the number of discovered vulnerabilities is an easy number to find, and we all expect it to drop quickly as we fix things. But unlike other metrics, static code analysis results in the real world never arrive at a state we can call complete, or any status other than "better". This is troublesome because metrics, almost by definition, are presented to audiences that are not deep subject matter experts in the nuances of their creation. That often means decisions are not driven by an objective review of the results. Instead, non-expert audiences are swayed in their perceptions, positive or negative, by how well the expert presents their arguments. An inarticulate developer thrown before leadership holding a report claiming 1000 false positives on a scan is going to fare poorly. I've seen this scenario play out multiple times, and it never ends well.

The loss of confidence that can result from confusing metrics is often countered by adding more developers or oversight to the project. In the worst cases it can lead to a shift of responsibilities, effectively destroying the hard-fought knowledge that developer attained during the course of their work. In 1975 Fred Brooks coined Brooks's Law, which states that "adding manpower to a late software project makes it later". I believe it holds true for projects that weren't late to begin with. Using bad metrics to report performance can ruin that performance.
With all of the trouble that comes with improper use of their results, scanning tools can still be a powerful part of your arsenal for writing good software. I believe the highest level those raw scan numbers should reach is the person who directly manages the performance of the developers. That role already makes decisions based on relative personnel performance, and, most importantly, the person in it often has deep subject matter understanding. This is also the level where implementation of design takes place. If you don't trust your lead developer to find and mitigate code vulnerabilities, find a new one. Everybody up the management chain from them will lack the technical skill to evaluate those decisions, although few will hesitate to do so if called on.

When developing status reports for higher levels of the organization, the scan results need to be culled down to actionable items. Typically, those get dropped as bug fixes into a sprint or whatever task management system you use. Those sprint items, or the backlog, can provide excellent metrics, provided they have an apparent and predictable stability. Managing requirements coherently is the topic for another post, so I'll leave it at that here.
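The reporting shift can be sketched in a few lines. This is a hypothetical mapping, not any real issue tracker's API: each actionable finding becomes a backlog item, and the backlog count, not the raw scan count, is what travels upward.

```python
# Illustrative only: turn triaged, actionable findings into backlog
# tickets. The ticket fields are invented for this sketch.
def to_backlog(actionable_findings):
    return [
        {
            "title": f"Fix {f['id']} in {f['file']}",
            "type": "bug",
            "status": "open",
        }
        for f in actionable_findings
    ]

actionable = [{"id": "VULN-0001", "file": "src/api/auth.c"}]
backlog = to_backlog(actionable)
print(len(backlog))  # prints 1 — one stable, explainable work item
```

Leadership can reason about one open bug in a sprint; nobody can reason about a scanner's raw count of several thousand.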
Software code and vulnerability scanning is a great tool, but it's a double-edged sword... with a few other edges hidden where you don't expect them.