5 Best Practices Production-Ready Containers| Slim.AI

December 29, 2022

5 Best Practices Production-Ready Containers

Slim.AI was created to give developers the power to build safer cloud-native applications with less friction. If you’re reading this here on our blog, we guess you know a bit about DockerSlim, the open source backbone of the Slim SaaS platform. For those who might be reading it elsewhere, the TL;DR is that Slim.AI allows developers to optimize their containers, reducing both overall size and vulnerability count. By increasing efficiency while decreasing the attack surface, we ensure that you’re only shipping what you need to production.

By building DockerSlim and our platform, we’ve learned quite a bit about what real, secure, production-ready containers look like. In this post we’d like to dive into a reproducible example that you can run at home to gain a better understanding of what this looks like in practice. You’ll come away with 5 security best practices you can apply today to achieve more production-grade containers.

This post will cover:

The sample application
Several techniques, with a focus on where they impact security
An overview of tools that can help you understand exactly what you are shipping to production
An objective decision on what the best base image is for the example application
Hardening the container image with DockerSlim to significantly reduce the attack surface

Using the techniques and tools I’ll outline today, this is what can be achieved.
All image sizes I’ll show in this presentation are uncompressed sizes.

Our Sample Application

Let’s get started by taking a look at our sample application. We’re going to use a simple Python/Flask app 🐍 that implements an even simpler RESTful API. (The app is just for illustrative purposes; its function is unimportant.)

What we’re going to do with the application is containerize it using 4 base images and differing container image composition techniques, and then we’ll dive into how each of these impacts security.

Let’s start by taking a look at one of the Dockerfiles.

FROM python:3.9.13-slim-bullseye
RUN apt-get -y update && apt-get -y upgrade && \
    apt-get -y clean && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY --chown=nobody:nobody app/requirements.txt ./
RUN pip3 install --no-cache-dir -r requirements.txt
COPY --chown=nobody:nobody app/app.py ./

USER nobody
EXPOSE 8008
ENTRYPOINT ["python3","app.py"]

By reviewing the Dockerfile, we can see that it adheres to container best practice, as follows:

It uses an official Python base image
A WORKDIR is defined for our app
It has good layer construction, to minimize cache invalidation and optimise build performance
Files are COPYed in only as required
A port is exposed
And it uses ENTRYPOINT for proper signal handling

However, on closer inspection, a few things here help specifically with container security.

User Nobody

USER nobody

The first pro tip when it comes to containers is that if you do not specify a USER in your Dockerfile, your app will run as root, and this has critical security implications. The nobody user is a good default because:

nobody is an unprivileged system account
it is available, by default, in any Linux distribution, whether Debian, Ubuntu, Alpine, or Distroless, etc.

The nobody account is intended to run things that don't need any special permissions, and is usually reserved for services so that if they get compromised, the would-be attacker has minimal access/impact on the rest of the system. In contrast, if your app is running as root, then the would-be attacker potentially has complete access to the container, as well as possibly tools and utilities shipped in the container that they can now use to disrupt your operations and infrastructure.
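
As an extra safeguard, the app itself can refuse to start as root. Here is a minimal sketch of such a guard for a Python app like our example. The function names is_root and refuse_root are made up for illustration and are not part of the sample app.

```python
import os
import sys


def is_root(euid: int) -> bool:
    """Return True when the given effective user ID belongs to root (UID 0)."""
    return euid == 0


def refuse_root() -> None:
    """Exit with an error message if the current process is running as root."""
    if is_root(os.geteuid()):
        sys.exit("Refusing to run as root; set USER nobody in the Dockerfile.")
```

Calling refuse_root() at the top of app.py would make a misconfigured container fail fast instead of quietly running with root privileges.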

Pinning Your Version Number

FROM python:3.9.13-slim-bullseye

Choosing a version number for your base image is often called pinning. Some tutorials teach newcomers to use the latest tag for their images, but this isn’t always a good practice. Here’s why.

Containers are meant to be ephemeral, meaning they can be created, destroyed, started, stopped, and reproduced with ease and reliability. Using the latest tag means there isn't a single source of truth for your container's "bill of materials", resulting in your container getting whatever the most recently updated version is. Every now and again, upgrading to the latest tag can introduce major version bumps of the system and language which may result in breaking changes to your application.

Pinning a specific major and minor version in your Dockerfile is a trade-off. While you're choosing to not automatically receive system and language upgrades via updates, most DevSecOps teams prefer to employ security scanning as a way to control updates rather than dealing with the unpredictability that comes with container build and runtime failures.
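
To make the idea concrete, here is a small sketch of a check that could run in a build pipeline to flag unpinned base images. The helper name is_pinned is hypothetical, and the parsing is deliberately simplistic: it treats a missing tag or the latest tag as unpinned, and a digest reference as pinned.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the reference uses a digest or a tag other than 'latest'."""
    # A digest reference (name@sha256:...) always points at one exact image.
    if "@" in image_ref:
        return True
    # Only inspect the last path segment, so a registry port such as
    # localhost:5000/python is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag given: Docker falls back to 'latest'
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


if __name__ == "__main__":
    for ref in ("python:3.9.13-slim-bullseye", "python:latest", "python"):
        print(ref, "->", "pinned" if is_pinned(ref) else "NOT pinned")
```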

We’ll now see how specifying the base image tag can be helpful.

RUN apt-get -y update && apt-get -y upgrade && \
    apt-get -y clean && rm -rf /var/lib/apt/lists/*

Once upon a time, avoiding RUN apt-get upgrade (and equivalents) in a Dockerfile was considered best practice. In the majority of cases, this is no longer good advice.

We have since learned that base images from vendors, and large projects, are frequently updated, using the same tag, to include critical bug fixes and security updates. However, there can often be days between the updates being published in the package repositories and the revised base images being pushed to registries.

This means that relying on the base image alone is not sufficient, and this is true even for images blessed and maintained by companies with plenty of resources. Now imagine a small open source project, maintained in someone’s spare time. These delays can be significant.

If you pin a stable base image, package updates are purely focused on security fixes and severe bug fixes.

By doing so, you can safely apply system updates without fear of unexpected upgrades that may introduce breaking changes. But just note, you need to be sure you are really applying the latest updates.

Docker Build Speed Optimization

docker build --pull --no-cache -t app:latest .

Docker builds can be slow, so a good practice is to use layer caching, to reuse build steps from prior builds to speed up the current one. While this does improve build performance, there’s a potential downside: caching can lead to insecure images. For most Dockerfile commands, if the text of the command hasn’t changed, the previously cached layer will be reused in the current build.

When you’re relying on caching, those apt-get install/update/upgrade RUN commands will add old, possibly insecure packages into your images, even after your “distro” vendor has released security updates.

That’s why you’re sometimes going to want to bypass the caching, and this can be done by passing two arguments to docker build:

--pull: pulls the latest version of the base Docker image, instead of using the locally cached one.
--no-cache: ensures all additional layers in the Dockerfile get rebuilt from scratch, instead of relying on the layer cache.

If you add those arguments to docker build the new image will have the latest system-level packages and security updates.

If you want both the benefits of caching, and to get the required security updates within a reasonable amount of time, you will need two build processes:

The normal image build process that happens whenever new code is released.
Every night, rebuild your container image from scratch using docker build --pull --no-cache to ensure you have all the security updates.
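
The two build processes differ only in the flags passed to docker build. As a rough sketch, a Python build script could assemble both commands like this. The names build_command, run_build and the fresh parameter are made up for this example, and the image tag app:latest is just a placeholder.

```python
import subprocess


def build_command(tag, fresh=False, context="."):
    """Assemble a docker build command line as a list of arguments."""
    cmd = ["docker", "build"]
    if fresh:
        # Nightly rebuild: refresh the base image and ignore the layer cache.
        cmd += ["--pull", "--no-cache"]
    cmd += ["-t", tag, context]
    return cmd


def run_build(tag, fresh=False):
    """Run the build; raises CalledProcessError if docker build fails."""
    subprocess.run(build_command(tag, fresh=fresh), check=True)
```

The release pipeline would call run_build("app:latest"), while a nightly scheduled job would call run_build("app:latest", fresh=True).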

We now have a container image that adheres to best practice.

Container Lifecycle Security

One undisputed security best practice is that you should absolutely perform vulnerability scans and generate SBOMs in your production container image build pipelines, and review the results regularly. These are extremely useful tools for understanding what’s in your containers, what you are shipping to production, and what your potential exposure is.

Docker has integrated vulnerability and SBOM scanners

docker scan is powered by Snyk.
docker sbom is powered by Syft.

For the purposes of this presentation I used the Slim.AI SaaS which has Grype, Trivy and Snyk integrated.

Grype, Snyk, Trivy, and Clair are all useful tools for assessing different aspects of your container security, and it’s recommended to give them all a try. The Slim.AI Docker Extension makes it possible to demystify containers and really get to know what’s inside them.

Knowing what’s in a container is critical to securing your software supply chain. The Slim platform, coupled with best of breed open source tooling, lifts the veil on container internals so you can analyze, optimize, and compare changes before deploying your cloud-native apps. Let’s use container scanning and analysis to determine what the “best” base image would be for our example application.

The regular official python:latest base image is built from Debian 11 and weighs in at 921MB.

But smaller starting points are available.

                                        Size
python:3.9.13-alpine3.16                48MB 🥇
gcr.io/distroless/python3               54MB 🥈 (Multistage)
ubuntu:22.04                            78MB 🥉 (No Python)
python:3.9.13-slim-bullseye             125MB

I’ll containerize our example app using four different base images including:

The official Python image based on Alpine 3.16
The official Python image using their “slim” version based on Debian 11
A Distroless multi-stage build
And Ubuntu 22.04, which doesn’t include Python, so it has to be installed via a Dockerfile RUN command

Sometimes, it is necessary to install additional system packages as dependencies for your application or to otherwise help build your image. The Ubuntu base image doesn’t include Python, so it needs to be installed via apt-get using a Dockerfile RUN command.

The default options for system package installation with Debian, Ubuntu, and RedHat Enterprise Linux (RHEL) can result in much bigger images than you actually need. That 921MB python:latest base image I mentioned earlier is a good example of package excess. More packages make the container image larger, which in turn increases the attack surface of your container. So, when you do need to install system packages, a good practice is to avoid installing the recommended dependencies.

Here are examples for Ubuntu and RHEL that install Python3 without the unnecessary recommended packages. Using --no-install-recommends in my Ubuntu based container reduces the image size by ~298MB.

RUN apt-get -y --no-install-recommends install python3-minimal python3-pip

RUN dnf --nodocs -y install --setopt=install_weak_deps=False python3

Building Our Containers

Let’s build our app with each of the base images and see how the final image sizes stack up against each other.

Here are the results in terms of image size.

                                            Size
python:3.9.13-alpine3.16                    60MB 🥇
gcr.io/distroless/python3                   72MB 🥈
ubuntu:22.04                                131MB 🥉
python:3.9.13-slim-bullseye                 159MB

The images all include Python, our example app and its dependencies, which is 11 packages installed via pip.

The adage goes, “smaller is safer”, where the logic is that a smaller image size corresponds to fewer packages, and that should result in fewer vulnerabilities. Let’s check that, and see if that holds true.

Our Containers Under the Hood

There is no denying the Alpine results are excellent.

                                        Total       Crit        High
python:3.9.13-alpine3.16                0           0           0 🥇
ubuntu:22.04                            24          0           0 🥈
gcr.io/distroless/python3               47          3           5 🥉
python:3.9.13-slim-bullseye             82          1           11

The official Python image based on Debian 11 is not looking great, however, with 82 vulnerabilities of which 1 is Critical and 11 are High. These are all in system packages installed via apt.

Distroless, also based on Debian 11, has 47 vulnerabilities, of which 3 are Critical and 5 are High. Again, these are all in system packages installed via apt. And with Distroless, it is also difficult to do much about that. Unlike traditional Debian derived images, there is no apt-get with which to install the latest updates. This can be worked around but it’s non-trivial.

And that brings us to Ubuntu.

The third smallest container image, but the second best vulnerability assessment, with no critical or high vulnerabilities at all, which is pretty impressive. There are just 3 Medium, 13 Low and 8 Negligible risk vulnerabilities. So, why is this? How can the Debian-based images be so different from Ubuntu when Ubuntu is itself derived from Debian?

The primary difference is that Ubuntu is a commercially backed Linux distro, with a full-time security team that has SLAs to mitigate vulnerabilities for their customers, which includes mitigating all Critical and High vulnerabilities for the supported lifetime of the distro.

Debian, on the other hand, is a community project. While many Debian contributors (including myself) do fix security issues in Debian, it simply cannot provide the same level of commitment to security as the commercially backed Linux distro vendors such as Canonical, Red Hat and SUSE.

The interesting thing to note is that the Python image based on Debian 11 is comparable in size to the Ubuntu based image (159MB versus 131MB), but has over 3 times the number of vulnerabilities (82 versus 24).

In fact, 20 CVEs in the Python image based on Debian 11 are marked as “Won’t Fix”, alongside 15 in the Distroless container image. These are known vulnerabilities that the Debian project will not fix in Debian 11 and therefore will not be fixed in Distroless or any other container image based on Debian 11.

So, Alpine is the clear winner, right?

Sadly, not 😭

With Python, Node and some other languages, Alpine can result in significantly slower builds and introduce runtime bugs, not to mention unexpected behaviour. This is due to differences in musl, the C library used in Alpine, as opposed to glibc, used in most other distros. This topic could be an entire blog post of its own.

My personal take is that Alpine is not recommended for Python apps; however, it can be great for Go and Rust-based applications.

But what if…

What if I could have the low complexity of maintaining Ubuntu-based containers and the security profile of Alpine? What if I could make containers smaller than Alpine?

Enter Slimming & Minifying with DockerSlim

Let’s try just that. At Slim.AI, we use the terms slim, minify, harden and optimize interchangeably to describe the act of reducing the size of a container image. 🤏

Below we are going to walk you through an example of slimming the Ubuntu container using DockerSlim, our free and open source project, that is available from GitHub. Both DockerSlim and the Slim.AI SaaS platform can automatically optimize your containers.

You don't need to change anything in your container image, and you will be able to minify it by up to 30x, making it more secure by reducing the attack surface of the container.

One of the most common questions we get asked is how DockerSlim works (and we’ve addressed parts of this in earlier posts). We split the process into two categories: analysis and creating a new single layer image.

Analysis

Let’s start with the first category - analysis. docker-slim optimizes containers by understanding your application and what it actually needs using various analysis techniques including static and dynamic tracing. docker-slim will throw away what the container doesn’t need, so that only the critical pieces to running your application remain.

Creating the Image

Once this phase is completed, docker-slim generates a new single layer image. This image is composed of only those files in the original unoptimised image that are required for your app to properly function. You can understand your container image before and after you optimize it using the Slim.AI SaaS or Slim.AI Docker Extension, including exactly how your container image was changed in this process.

In case you’re asking yourself why you should slim container images at all, there are a number of benefits:

By slimming containers you only ship into production what your app requires. Slim containers can be up to 30X smaller.
Slim container images are faster to deploy (due to the smaller size) and faster to start (fewer files).
Slim container images can be less expensive to store and transfer.
Slim containers reduce your attack surface.

This is also backed by data and research. Our report titled What We Discovered Analyzing the Top 100 Public Container Images shows an increasing trend of dev/test/qa/infra tooling being left in production containers. Unused shells, interpreters, tools and utilities left in your container images can be used against your infrastructure to disrupt operations if a container is compromised.

Let’s take a look at the slimmed containers.

                                        FAT         SLIM            REDUCED
python:3.9.13-alpine3.16                60MB        19MB            3.09X
gcr.io/distroless/python3               72MB        21MB            3.40X
ubuntu:22.04                            131MB       25MB            5.41X
python:3.9.13-slim-bullseye             159MB       23MB            6.83X

In most cases there are significant size reductions to be had slimming any container image, regardless of build technique and base image used. In our examples here, we see between a 3X and 7X size reduction.
And this is quite modest: some containers reduce 10X and even 30X for complex applications. Even so, the value is apparent, as the container attack surface has been significantly reduced.
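
The REDUCED column is simply the original size divided by the slimmed size. The short sketch below recomputes it from the FAT and SLIM columns. Because those columns are rounded to whole megabytes, the recomputed factors differ slightly from the published ones, which were presumably calculated from unrounded sizes (that is an assumption on my part, not something the table states).

```python
def reduction_factor(fat_mb, slim_mb):
    """Return how many times smaller the slimmed image is than the original."""
    return fat_mb / slim_mb


# (FAT, SLIM) pairs in MB, taken from the table above.
rows = [(60, 19), (72, 21), (159, 23), (131, 25)]

for fat, slim in rows:
    print(f"{fat}MB -> {slim}MB: {reduction_factor(fat, slim):.2f}X")
```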

Our slim Ubuntu based image is now just 6MB larger than the slim Alpine based image, but with none of the compatibility concerns, and it comes in at 35MB smaller than the unoptimised Alpine image. This is great from the size and performance perspective, but let’s see how it stacks up on the security side.

Has slimming also improved the vulnerability assessment?

Let’s find out.

                                        TOTAL
python:3.9.13-alpine3.16                0
ubuntu:22.04                            0

Analysing if vulnerable components exist in slimmed containers is currently a semi-manual task and we’re working on automating this. That said, it only takes a few minutes (at most) using Slim.AI SaaS or the Slim.AI Docker Extension to search for vulnerable components and confirm they are no longer present.

We’ve already confirmed the Ubuntu based image was free of CRITICAL and HIGH risk vulnerabilities to begin with, and that its 3 MEDIUM risk vulnerabilities were in the libsqlite3, perl-base and zlib1g packages.

It took, literally, seconds to confirm none of those components exist in the slimmed image.

Using the Slim.AI SaaS confirmed all of the components affected by the remaining LOW risk vulnerabilities have also been removed.

We have an Ubuntu based image with no known vulnerabilities, of comparable size to a slimmed Alpine image with none of the potential complexity of working with Alpine. So yes! We can have our cake and eat it too 🍰

What this analysis shows is that Ubuntu is the “best” base image for our example app after all.

What Five Tips Have We Learned?

To summarise the takeaways from this post, the important things we discovered are that, first and foremost, following container best practices will set you up for success. Let’s recap:

Always run internet facing apps and services via unprivileged USER accounts
Pin your base images
Use a stable base image & apply updates
Be mindful of layer caching introducing potentially insecure packages into your container images
Container scanning and analysis are essential to fully understand what is inside your container

In addition to these tips, we have also learned that it is not a good practice to install recommended packages, as these are really just bloat and do not help keep your container slim and secure. When it comes to choosing the right base image, there are benefits to working with known vendors rather than community supported projects; in this example we learned that Linux vendors have security SLAs.

That is why it is always important to assess your base image options and pick what is most suitable for your project. And, like scanning, slimming your containers is recommended, as it is possible to slim just about any kind of container image, including those based on Alpine and Distroless. By slimming, you also remove vulnerabilities and reduce attack surface, on top of deriving performance and cost benefits.
