What We Discovered Analyzing the Top 100 Public Container Images
Oct 13, 2021
Containers are ubiquitous in modern development.
Between 2020 and 2021, the number of all-time pulls on Docker Hub nearly tripled, from 130 billion to 318 billion. That level of growth is astounding, especially when you consider that it took more than six years to achieve the first 130 billion and that some estimates say Docker Hub, while still the most popular container registry, is home to just half of the world's containers.
Containers have become the norm for application development, with the massive developer adoption of cloud native apps and containerized workflows. The result: millions of public image repositories. And with more than eight million public container images on Docker Hub alone, the container landscape continues to get more complex, more specialized, and more difficult to secure.
At Slim.AI, our startup focused on developer experience around container best practices, we have container enthusiasts using our tools every day: scanning containers, optimizing them, and sharing their experiences with us.
We thought it would be interesting to find out what's inside the public images that serve as starting points for nearly all modern software development.
So, we looked.
A brief summary of findings:
Finding 1: Bloated Containers Are a Time Sink in CI/CD
Our analysis showed a nearly perfect correlation between container size and scan time. The added scan time may seem trivial when shipping a single container to production, but multiplied across the thousands of images used in a typical organization and the hundreds of developers shipping images multiple times a day, it means real productivity losses.
A 1 GB container takes approximately 6x longer to scan than a 200 MB container.
For a typical development team, this could conservatively mean 160 wasted hours per year.*
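To make the arithmetic behind an estimate like this concrete, here is a minimal sketch of how per-scan delays add up over a year. Every input is an assumption chosen for illustration (the function name, the 30-second delay, the 40 scans per day, and the 250 working days are all hypothetical); these are not the figures behind the 160-hour estimate above.

```python
def wasted_hours_per_year(extra_seconds_per_scan: float,
                          scans_per_day: int,
                          working_days_per_year: int = 250) -> float:
    """Hours per year spent waiting on scan time that a smaller image would avoid.

    All parameters are illustrative assumptions, not measured values.
    """
    total_seconds = extra_seconds_per_scan * scans_per_day * working_days_per_year
    return total_seconds / 3600  # seconds -> hours


# Hypothetical team: 40 scans per day, each 30 seconds slower than necessary.
hours = wasted_hours_per_year(extra_seconds_per_scan=30, scans_per_day=40)
print(f"{hours:.1f} hours per year")  # prints "83.3 hours per year"
```

Swapping in your own team's scan counts and measured scan times gives a rough sense of whether slimming images is worth the effort.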
Finding 2: Complexity Hinders Clear Understanding, Even for Experts
Our analysis shows that understanding the composition of both general-purpose and special-purpose containers requires massive effort. We looked at distributions of packages, licenses, and special permissions across all categories expecting large outliers, but even the averages were surprisingly high. It is typical to see hundreds of packages even in small, special-purpose containers. And as we explore larger images in more generic categories, these numbers explode.
Finding 3: Attack Surface Is More Than Just a Vulnerability Count
We looked at “attack surface” — not just vulnerabilities found in a container scan, but the combination of known vulnerabilities, their criticality, files in the container with special permissions, and total number of packages (i.e., potential Zero Day vulnerabilities) — and saw a wide spread among categories and containers within categories. To us, this implies a required (and presumably manual) step in the “getting dev containers ready for production” process that many teams may ignore.
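As a rough illustration of treating attack surface as a combination of signals rather than a single count, the sketch below folds the factors named above into one number. The `ImageReport` fields, the `WEIGHTS` values, and the scoring formula are all hypothetical; this is not the method used in our analysis, only one simple way such a composite could be built.

```python
from dataclasses import dataclass


@dataclass
class ImageReport:
    """Per-image inputs; field names are assumptions for this sketch."""
    critical_vulns: int
    high_vulns: int
    other_vulns: int
    special_permission_files: int  # files with special permissions
    package_count: int             # every package is a potential future vulnerability


# Hypothetical integer weights: more severe findings count for more.
WEIGHTS = {
    "critical_vulns": 25,
    "high_vulns": 10,
    "other_vulns": 2,
    "special_permission_files": 5,
    "package_count": 1,
}


def attack_surface_score(report: ImageReport) -> int:
    """Weighted sum of the report's fields; higher means more attack surface."""
    return sum(weight * getattr(report, name) for name, weight in WEIGHTS.items())


# Two hypothetical images with identical vulnerability counts.
small = ImageReport(2, 3, 10, special_permission_files=5, package_count=120)
large = ImageReport(2, 3, 10, special_permission_files=40, package_count=600)

print(attack_surface_score(small))  # 245
print(attack_surface_score(large))  # 900
```

The two example images would look identical in a vulnerability scan, yet the second carries far more packages and specially permissioned files, which is the kind of spread a vulnerability count alone hides.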
Want to discuss these findings with fellow container enthusiasts? Join us on Discord at: https://discord.gg/uBttmfyYNB.