Containers help solve the “well, it worked on my laptop” problem that was so prevalent in the past. Developers can now pin the exact versions of dependencies that have been tested to work alongside their applications. There is, however, a dark side to this. In the old days you may have had a team of operations and security people responsible for upgrades. Today we have thousands of images, with who knows what inside, that never get updated.
This security issue became headline news a few years ago when it emerged that a large proportion of the images on Docker Hub contained vulnerabilities. Docker went to great lengths to add new features to combat this hugely negative publicity. More recently I read an article claiming that backdoored images on Docker Hub had been downloaded five million times before removal.
Today we’ll compare some container vulnerability scanning applications that you can run yourself. This comparison does not include any SaaS offerings. The majority are free and open source, with the exception of Twistlock, which I included because I had already set it up at work.
Instead of just doing internet searches and comparing feature lists, I wanted to install each one and scan a single container to get a sense of how simple they are to set up, how quickly they run, and what the results look like.
I’ve evaluated each of the applications from the perspective of potentially adding them to a continuous delivery pipeline, where containers would be scanned on each pull request and the build would fail if any vulnerabilities are found.
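As a sketch of what that gate could look like: the snippet below counts CVE identifiers in a scanner's report and fails the stage when any are present. The report content and JSON shape here are made up for illustration, not the output format of any particular scanner.

```shell
# Hypothetical CI gate: count CVE identifiers in a scan report and fail
# the stage if any are found. The report is an illustrative example only.
report='{"vulnerabilities": [{"identifier": "CVE-2018-14618", "severity": "High"}]}'
count=$(printf '%s' "$report" | grep -o 'CVE-[0-9]\{4\}-[0-9]*' | wc -l)
if [ "$count" -gt 0 ]; then
  echo "scan failed: $count vulnerability(ies) found"
  # exit 1  # uncomment in a real pipeline to actually fail the build
fi
```

In a real pipeline you would feed in the scanner's report file and uncomment the `exit 1`.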
For my testing I tried to scan the official Kong image on Docker Hub. At the time of writing this is docker.io/library/kong:1.0.0rc1. It’s based on Alpine 3.6, so it has some known vulnerabilities, and I thought it might be representative of the types of containers people run. I was expecting the scanners to mostly report the same output, but as you’ll see below I was mistaken.
Clair was beautifully simple to set up. I used arminc/clair-local-scan, which includes the latest database snapshot, updated every 24 hours. With two Docker commands I had the server and Postgres database running on my laptop. A separate binary called clair-scanner then connects to the server to run the scan.
The only problem I had was that I’m running macOS, which has Docker running in a VM. It took a couple of Google searches to realise I needed to set the IP address clair-scanner uses to the address on network interface en0. On Linux you can probably just use localhost.
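For reference, the whole setup looks roughly like this. The image names and example IP are assumptions based on the arminc/clair-local-scan project; check its README for the current tags.

```shell
# Postgres database pre-populated with CVE data, then the Clair server.
docker run -d --name clair-db arminc/clair-db:latest
docker run -d --name clair --link clair-db:postgres -p 6060:6060 arminc/clair-local-scan:latest

# On macOS, pass the en0 address so the Clair container can reach
# clair-scanner's callback listener; on Linux, localhost usually works.
clair-scanner --ip 192.168.1.10 docker.io/library/kong:1.0.0rc1
```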
Clair reported only a single vulnerability: CVE-2018-14618. I like the pretty coloured text output and it was very quick to run (4 seconds). With the baked in database and Docker containers this would be extremely simple to add to a CI build stage.
The instructions for this one looked a bit daunting at first, but they are actually really simple copy-and-paste steps. You create a directory, download a config file and then run docker-compose to start the server and database. Anchore Engine also uses Postgres.
Once the server is running you can use the Anchore CLI, which can be installed with Python’s pip. The CLI has commands to add an image to the engine, query its status and print out a report. I quite like this interface when playing locally, but it’s not optimal for a CI pipeline: you have to poll in a loop to check status, as per the GitLab integration document.
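The flow I used looks roughly like the following; the exact subcommands and flags may vary between anchore-cli versions, so treat this as a sketch.

```shell
pip install anchorecli

# Submit the image for analysis, block until it completes, then list findings.
anchore-cli image add docker.io/library/kong:1.0.0rc1
anchore-cli image wait docker.io/library/kong:1.0.0rc1
anchore-cli image vuln docker.io/library/kong:1.0.0rc1 all
```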
Anchore Engine reported the same vulnerability as Clair. The scan time was a little bit longer than Clair but still only a couple of minutes from start to finish. This could easily be added to a CI system.
Aqua make some cool open source security software. What I liked about Aqua Microscanner is that it takes a different approach to scanning than the other options. To enable scanning you simply add three lines to a Dockerfile. The free version does come at the cost of providing an email address, and there is also a paid enterprise version that adds more features.
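The Dockerfile addition looks roughly like this, based on the Microscanner README; `token` is the value Aqua sends after you register your email address.

```dockerfile
FROM alpine:3.6
ADD https://get.aquasec.com/microscanner /
RUN chmod +x /microscanner
ARG token
RUN /microscanner ${token}
```

The scan then runs as part of `docker build`, so a vulnerable base image fails the build itself.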
By default Aqua Microscanner outputs a big block of JSON so I won’t screenshot that here. You can see the full output for all scans in this spreadsheet.
Aqua Microscanner didn’t find the curl vulnerability. Or, from another perspective, Clair and Anchore Engine didn’t find the two vulnerabilities Microscanner reported.
The scan was very quick and this is probably the simplest way to add vulnerability scanning to your pipelines.
Twistlock is paid software, and I was curious to see what vulnerabilities it would find versus the free options. Installation wasn’t particularly difficult. We bundled it into a container and execute scans as part of our pull request builds.
What’s odd is that Twistlock found a high and a medium severity CVE in OpenSSL whereas the others didn’t. It also picked up one of the unzip CVEs found by Aqua Microscanner. However, it didn’t detect the curl CVE found by Clair and Anchore Engine.
Dagda looks really promising from the README. It scans for vulnerabilities and compares against not only CVEs but also BIDs (Bugtraq IDs), RHSAs (Red Hat Security Advisories) and RHBAs (Red Hat Bug Advisories).
Unfortunately I wasn’t able to get this running at first. It didn’t seem to want to work on my laptop, and I assumed that was because it needs Linux, as it has some integration with Falco that requires development headers. So I ran it on an AWS instance, and that failed after running out of memory. I set it up a third time on an instance with 8 GB of RAM, and it got stuck populating the Mongo database.
I’ll persist and update the blog when I get the results.
From what I know so far, I think you’d need to set up a long-running Dagda server on a reasonably sized instance and then connect out from your CI system, for example via an SSH tunnel, to execute scans. I don’t know how long the scans themselves take, but I saw some GitHub issues stating 30 minutes was not unusual.
Dagda has open issues for performance and is under constant development, so I think these problems could end up being solved.
EDIT: I got it working after a couple more restarts!
This matches the unzip bug found by Aqua Microscanner and Twistlock. It also found a couple of issues with PCRE: one in the NVD database and another from a lookup against the Bugtraq database.
In the end, once the server was up and running, the scan itself was very quick. So this could be added to a pipeline, assuming you run the Dagda server and Mongo on an instance with more than 4 GB of memory and avoid rebuilding the database from scratch too often.
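Once the server and Mongo are healthy, the workflow is roughly the following. The commands are taken from the Dagda README and may have changed; treat them as a sketch.

```shell
python3 dagda.py start &      # serve the Dagda REST API (requires MongoDB)
python3 dagda.py vuln --init  # populate the vulnerability database (the slow part)
python3 dagda.py check --docker_image docker.io/library/kong:1.0.0rc1
```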
I read various GitHub issues saying the oscap-docker tool would scan any flavour of container, even though it’s maintained by Red Hat and the official docs all seem to state it’s for RHEL scanning. I persisted anyway and installed OpenSCAP on a CentOS instance using the yum packages. It complained about needing Atomic, so I installed that too.
Only ten minutes of my life wasted, but a bit disappointing. As you can see, it does output RHSA information when you feed it a RHEL-based image, so it could be valuable for companies running RHEL exclusively.
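For anyone wanting to try it, the commands were roughly these; the RHEL image name is just an example, and package names may differ by distro version.

```shell
sudo yum install -y openscap-utils atomic
# image-cve checks the image's installed packages against Red Hat's CVE feed.
sudo oscap-docker image-cve registry.access.redhat.com/rhel7:latest
```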
All of the scanning applications I tested with the exception of Dagda were extremely simple to get up and running. Aqua Microscanner is definitely the simplest to integrate.
Clair, Anchore Engine, Aqua Microscanner and Twistlock all only take a couple of minutes to run which is perfect for adding to a pipeline.
These scanners mostly work by enumerating installed OS packages and comparing versions against the CVE databases. I’m confused as to why five different scanners would produce such different results. Obviously comparing just a single image isn’t very scientific, but it does make me wonder whether you need to run multiple scanners in your pipeline to catch everything, which is a bit insane.
| Clair | Anchore Engine | Aqua Microscanner | Twistlock | Dagda |
| --- | --- | --- | --- | --- |
| CVE-2018-14618 | CVE-2018-14618 | CVE-2015-9261<br>CVE-2016-3189 | CVE-2016-8610<br>CVE-2015-9261<br>CVE-2016-7055 | CVE-2015-9261<br>CVE-2017-11164<br>CVE-2017-16231 |
That’s a total of 7 unique vulnerabilities found and you’d need to run at least 4 different scanning applications to find them all.
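The arithmetic checks out: deduplicating the findings from the results table above gives seven distinct CVEs.

```shell
# CVE lists per scanner, copied from the results table.
clair="CVE-2018-14618"
anchore="CVE-2018-14618"
aqua="CVE-2015-9261 CVE-2016-3189"
twistlock="CVE-2016-8610 CVE-2015-9261 CVE-2016-7055"
dagda="CVE-2015-9261 CVE-2017-11164 CVE-2017-16231"
# One CVE per line, deduplicated, then counted.
unique=$(printf '%s\n' $clair $anchore $aqua $twistlock $dagda | sort -u | wc -l)
echo "unique CVEs: $((unique))"   # prints: unique CVEs: 7
```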
I must admit that I haven’t yet investigated whether any of these CVEs are false positives; it’s quite possible some are. Given the sample size of a single image, this isn’t a great way to judge a winner. It’s merely anecdotal evidence that there are clear differences between applications scanning the same image.
The results are interesting at least. I don’t consider this blog complete and will continue to add to it as I find out more. Many of these scanners do more than just print CVEs to a console; some integrate with other server components and have many options which I haven’t really looked into.
If you have any feedback please leave a message in the comment section.
Edit: A follow up blog has been completed which explains the differing results and why it’s probably not a good idea to scan Alpine containers with open source vulnerability scanners.