What’s in your container?

Why Docker Layers matter for container optimization
Pieter van Noordennen
Apr 22, 2021

The world of cloud-native development is rife with metaphor, and none more so than the container-ship analogy surrounding Docker. When it comes to Docker Layers, it might be useful to equate them to wooden pallets in the shipping analogy.

A Docker layer is a set of filesystem changes that creates an intermediate image in the build process. Like pallets, the Layers used to build a container:

  • Stack on top of each other sequentially to create the final container;
  • When arranged thoughtfully, can make a container organized, fast, and easy to work with;
  • Conversely, make a giant mess when not constructed well;
  • Are of little interest to end users (i.e., consumers or developers), who care more about the final delivery than the container’s internal construction.

Instructions specified in a Dockerfile (FROM, RUN, COPY, etc.) cause the previous image to be altered, thus creating a new layer. Docker layers provide developers with a way to effectively “track changes” when authoring a new Docker image, but can also add performance overhead and increase container size.

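# Base image; its layers sit at the bottom of the stack
FROM python:2.7.15

# RUN and COPY each add a filesystem layer
RUN mkdir -p /opt/my/service
COPY service /opt/my/service

WORKDIR /opt/my/service

# Installs dependencies into another new layer
RUN pip install -r requirements.txt

# Image metadata: the port the service listens on and the startup command
EXPOSE 9000
ENTRYPOINT ["python","/opt/my/service/server.py"]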

Other than DevOps specialists, most developers don’t understand — nor care about — the internal structure of their Docker images and their related build paths. But those internals, and most notably the way Docker Layers are constructed in their images, play a critical role in the performance, functionality, and size of final containers.

Engineers who specialize in container optimization and security examine layers during construction to be able to tune the image construction process, reduce container size, increase performance and security, and triage errors in generated containers.

Docker Layers and Optimization: the Good, the Bad, and the Ugly

This layering concept is valuable when authoring and optimizing images for several reasons. It makes changes easier to see from one step to the next, it speeds up rebuilds through layer caching, and it lets container authors order their instructions to get the most out of that cache.

File Diffs

In effect, Docker stages changes like a version control system would, adding one change, then another, then another, in sequence until the build is complete. Each successive change is a diff between the previous layer and the new one, just like you’d see in a pull request on GitHub.

When creating a layer, Docker will mount its own read-write filesystem layer on top of what’s already there and begin making changes. Layers do not have elements like environment variables or default values — these are properties of the image as a whole rather than a particular component — and should never be dependent on the state of any external system or process.

The benefit of this “immutability” restriction is that any number of containers can be started from one and the same image, making the state of a freshly created container predictable, and that Docker can use layer caching to reduce build time.

The downside, however, is that packages, files, and data needed along the way are kept in the filesystem and packaged with the resulting container. Three instructions (ADD, COPY, and RUN) create layers that increase the size of the resulting container. If a Dockerfile adds, deletes, and changes many files, the image grows in size, because each change is recorded in a new layer on top of the old ones. These bloated images can negatively impact performance.
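As a minimal sketch of how this growth happens, assume a Debian-based image that needs curl. The base image tag and the package are illustrative choices, not part of the example service above. Files removed by a later instruction still exist in the earlier layer, so cleanup only saves space when it happens in the same RUN that created the files:

```dockerfile
# Illustrative base image (an assumption for this sketch)
FROM debian:stable-slim

# Anti-pattern, shown commented out: three separate layers. The package
# lists written by the first RUN stay in the image, because the later
# "rm" only hides them in a new layer on top.
# RUN apt-get update
# RUN apt-get install -y --no-install-recommends curl
# RUN rm -rf /var/lib/apt/lists/*

# Same work in a single layer: the package lists are deleted before the
# layer is committed, so they never add to the image size.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
```

Building each variant and comparing the output of docker history for the two images should show where the extra size comes from.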

Multi-stage builds are meant to control for this, but can be cumbersome to create and debug. Open source projects like our own DockerSlim can automate file size reduction, but have a learning curve to get implemented correctly.
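For illustration, here is one way the Python service from the earlier Dockerfile might be split into two stages. This is a sketch under assumptions: the stage name builder, the /install prefix, and the python:2.7.15-slim runtime tag are choices made for this example, and it presumes requirements.txt lives inside the service directory.

```dockerfile
# Stage 1: full image with build tooling, used only to install dependencies
FROM python:2.7.15 AS builder
WORKDIR /build
COPY service/requirements.txt ./
# Install into a separate prefix so the result is easy to copy out
RUN pip install --prefix=/install -r requirements.txt

# Stage 2: smaller runtime image that receives only the installed packages
FROM python:2.7.15-slim
COPY --from=builder /install /usr/local
COPY service /opt/my/service
WORKDIR /opt/my/service

EXPOSE 9000
ENTRYPOINT ["python","/opt/my/service/server.py"]
```

Only the final stage ends up in the shipped image; the layers of the builder stage are left behind. Dependencies with compiled extensions may still need shared libraries that the slim image does not include, which is part of what makes multi-stage builds fiddly to debug.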

Layer Caching

Because layers are intermediate images, if you make a change to your Dockerfile, Docker will rebuild only the layer that was changed and the ones after it, a process called layer caching.

Layer caching is useful when rebuilding an image that already exists on your machine. Each layer has its own sha256 value, and Docker simply checks whether that value matches the one in its cache to decide whether it needs to rebuild a layer.

Layer caching can lead those new to Docker into a trap, however. A common anti-pattern in Dockerfiles is to have commands or instructions that depend on the state of external systems, such as a change pushed to a development database or microservice. If Docker sees no changes to the underlying layer (it doesn’t observe the outside world), it will assume the layer is properly constructed and cached and not rebuild it. This can lead to container errors that result in costly cycles to debug.
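The trap can be sketched as follows. The URL, the file path, and the SCHEMA_VERSION build argument are all hypothetical names invented for this example. The download instruction never changes, so Docker keeps reusing the cached layer even after the remote file has been updated. One possible workaround is a build argument whose value changes whenever the external state does:

```dockerfile
# Illustrative base image (an assumption for this sketch)
FROM debian:stable-slim

RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/*

# Trap, shown commented out: the instruction text is identical on every
# build, so the cached layer is reused even if the remote file changed.
# RUN curl -fsSL -o /opt/schema.sql https://example.com/dev/schema.sql

# Possible workaround: a build argument with a new value causes a cache
# miss at the first instruction that uses it.
ARG SCHEMA_VERSION=unset
RUN echo "schema version: ${SCHEMA_VERSION}" \
 && curl -fsSL -o /opt/schema.sql https://example.com/dev/schema.sql

# Example invocations:
#   docker build --build-arg SCHEMA_VERSION=$(date +%s) .
#   docker build --no-cache .    # ignores the cache for every layer
```

Passing a timestamp forces the download on every build; passing a real version identifier would keep the cache useful until that version actually changes.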

Build Speed

Since each layer is immutable, Docker doesn’t need to rebuild any layer in the cache that hasn’t changed. Almost all Docker image builds start with the operating system install in Layer 0 (often the result of the FROM instruction in the Dockerfile). This is because the OS is highly unlikely to have changed from one build to the next.

In good Dockerfile construction, layers are built from least likely to change to most likely to change, with some exceptions. This takes full advantage of layer caching, allowing images to be rebuilt and redeployed quickly.
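Applied to the example service from earlier, that ordering might look like the sketch below. It assumes requirements.txt sits inside the service directory and changes far less often than the application code:

```dockerfile
# Least likely to change: the base image
FROM python:2.7.15

WORKDIR /opt/my/service

# Changes occasionally: the dependency list. Copying it on its own means
# the pip layer below is rebuilt only when requirements.txt changes.
COPY service/requirements.txt ./
RUN pip install -r requirements.txt

# Most likely to change: the application code, copied last so that a code
# edit invalidates only this layer.
COPY service ./

EXPOSE 9000
ENTRYPOINT ["python","/opt/my/service/server.py"]
```

With this arrangement, a rebuild after a code-only change can reuse the cached dependency layer instead of reinstalling every package.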

However, sometimes container authors take shortcuts here to reduce file size, such as combining a bunch of actions into a single line of Dockerfile using the && operator. There’s nothing wrong with this, but if the instructions are haphazard, it could actually backfire by lumping in costly actions that could be cached with those that are bound to change frequently in development.
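A sketch of how such a combination can backfire, again using the example service. The byte-compilation step is a hypothetical addition chosen only to have something that depends on the source code:

```dockerfile
FROM python:2.7.15
WORKDIR /opt/my/service

# Haphazard, shown commented out: the slow dependency install shares one
# layer with a step that depends on the source code, so every code change
# reinstalls all packages.
# COPY service ./
# RUN pip install -r requirements.txt && python -m compileall .

# More deliberate: the expensive, stable step gets its own cacheable
# layer, and the step tied to frequently changing code comes after it.
COPY service/requirements.txt ./
RUN pip install -r requirements.txt
COPY service ./
RUN python -m compileall .
```

The second version has more layers, but only the cheap ones are rebuilt during day-to-day development.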

In summary, while most Docker Layers are going to be less interesting to those who simply want to grab a container, toss their app in it, and go, Layers actually do matter a lot when it comes to optimizing a container.

For more on container optimization, Slim.AI and DockerSlim, and cloud-native development, check out our Twitch stream or join our community page.
