
Chris Tozzi
Oct 18, 2022


In a previous tutorial, we explained how to create a container pipeline for a Python application hosted in GitLab. We used GitLab CI to set up automation for packaging the app into a Docker container and deploying it to Google Cloud. Once we were finished, we could push code commits into the repo and see the results live in the production site within minutes.

Now, we will show you how to automate DockerSlim in the CI/CD pipeline. DockerSlim is an open source tool that minifies and secures Docker container images. We will use our sample application to show you the necessary steps to take and which parameters to configure. The goal of this tutorial is to show how developers can automatically slim containers in the CI pipeline and explain what to watch out for if they do.

Let’s get started.

Overview of the Sample Application

We will host our sample application code for this tutorial in this repo. For convenience, we’ll use the same codebase from the previous tutorial. However, we’ll now be using GitHub Actions for declaring the CI/CD steps. The overview of the CI/CD pipeline is shown below:

Figure 1 – A Flask CI/CD Pipeline

The setup consists of the following components:

  • The Flask application source code (stored in GitHub).
  • The automated pipeline (which is triggered when we push changes to the repository).
  • We’ll use GitHub Actions for the CI/CD pipeline. The pipeline configuration is located in .github/workflows/ci.yml. It provides instructions on which stages to run and which jobs to execute on each stage.
  • There are a number of steps that need to be configured, and each step performs a particular job. For example, there is a job for checking out the git repo and another one for configuring Docker. All of the steps happen in a specific order, and if one of the steps fails, then the pipeline will stop.
  • Once the pipeline succeeds, you’ll be able to see the application running in Cloud Run.

The steps for building and deploying the application to Cloud Run are as follows:

.github/workflows/ci.yml

        name: ci

        env:
            SERVICE_NAME: flaskpytestgitlabcd
            RUN_REGION: europe-west1
            GCP_PROJECT_ID: stoikman-198318

        on:
            push:
                branches:
                  - 'main'

        jobs:
            docker:
                name: Deploy to Cloud Run
                runs-on: ubuntu-latest
                steps:
                - name: Checkout
                  uses: actions/checkout@v2

                - name: Set up Docker Buildx
                  uses: docker/setup-buildx-action@v1

                - name: Build
                  uses: docker/build-push-action@v2
                  with:
                    context: .
                    load: true
                    tags: gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}:latest

                - id: 'auth'
                  uses: 'google-github-actions/auth@v0'
                  with:
                    credentials_json: '${{ secrets.GCR_JSON_KEY }}'

                - name: 'Set up Cloud SDK'
                  uses: 'google-github-actions/setup-gcloud@v0'

                - name: 'Use gcloud CLI'
                  run: 'gcloud info'

                - name: 'Set docker registry to gcloud'
                  run: 'gcloud auth configure-docker -q'

                - name: 'Push image to Google Container Registry'
                  run: 'docker push gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}:latest'

                - name: 'Deploy to Google Cloud Run'
                  run: |-
                    gcloud run deploy "${{ env.SERVICE_NAME }}" \
                                --quiet \
                                --image "gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}:latest" \
                                --region "${{ env.RUN_REGION }}" \
                                --platform "managed" \
                                --allow-unauthenticated \
                                --memory=512Mi

Compared to GitLab pipelines, GitHub Actions are more fine-grained and extensible, and there are many public actions you can reuse. For our example, we will use google-github-actions/auth@v0, which allows the pipeline to authenticate with Google Cloud services. All you have to do is supply the credentials string from the repository secrets, and the subsequent steps will have access to Google Cloud services.

If you haven’t done so already, you’ll need to create a service account and enable the Cloud Run service before you run the pipeline. (You can follow these steps from our previous tutorial to do so.)

In order to add secrets to GitHub Actions, navigate to the repository's Settings and select the Secrets menu item in the sidebar. Then, copy and paste the contents of the service account JSON file under the key GCR_JSON_KEY.
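Before pasting the key, it can help to sanity-check that the file you downloaded is valid JSON with the fields the auth action needs. The sketch below is illustrative (the inline key is a made-up stand-in; in practice you would load your real downloaded file), using field names from Google's service-account key format:

```python
import json

# Illustrative stand-in for a downloaded service-account key file.
# In practice: with open("key.json") as f: key = json.load(f)
key_text = """
{
  "type": "service_account",
  "project_id": "stoikman-198318",
  "client_email": "deployer@stoikman-198318.iam.gserviceaccount.com"
}
"""

key = json.loads(key_text)  # fails loudly if the file is not valid JSON

# A few fields that identify the account; real key files contain more.
required = {"type", "project_id", "client_email"}
missing = required - key.keys()
assert not missing, f"service account key is missing: {missing}"
assert key["type"] == "service_account"
print("key looks like a valid service account file")
```

A malformed or truncated paste is one of the most common reasons the auth step fails, so a quick local check like this can save a pipeline run.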

Figure 2 – Adding Secrets into GitHub Actions

Prior to running the pipeline, you may want to change the GCP_PROJECT_ID to match the Project ID in your Google Cloud account.

Then, trigger the build. Once it’s successful, Cloud Run will provide you with a public URL for your application:

Figure 3 – A Flask Application Running in Cloud Run

Now that you have an overview of the CI/CD pipeline, let’s dive deeper. Next, we’ll show you how to integrate DockerSlim to optimize the production image.

Including DockerSlim into the CI/CD Steps

DockerSlim is a CLI tool that allows DevOps teams to minify and secure container images. We will use this tool as part of the CI/CD pipeline automation.

In order to integrate DockerSlim, you need to include extra steps as you build the container images. First, build the image using Docker, then slim it using DockerSlim. You can then push both images to the container registry and deploy the slimmed one to Cloud Run. It's best to test this workflow locally before adding the steps to the pipeline.

Before you build the container, you need to modify the existing Dockerfile to expose port 8080. Here are the contents of this file:

Dockerfile

    # Stage 1 - Install build dependencies
    … Same as before

    # Stage 2 - Copy only necessary files to the runner stage

            FROM python:3.7-alpine
            WORKDIR /app
            COPY --from=builder /app /app
            COPY app.py routes.py ./
            ENV PATH="/app/.venv/bin:$PATH"
            EXPOSE 8080
            CMD ["python", "app.py"]

You need to expose a port in the image manifest so that DockerSlim's http-probe can properly initialize the Python application. You can learn more about the http-probe by reading this comment thread. In short, the probe issues HTTP calls that force the application to load its dependencies, so DockerSlim can observe which files are actually used. To follow along with this tutorial, you just need to run DockerSlim with the default options.
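To see why a runtime probe matters, note that Python applications often load dependencies lazily: a module imported inside a request handler only appears in the process once that handler runs, which is exactly what the http-probe triggers. A small illustration, using sqlite3 as a stand-in for any lazily loaded dependency:

```python
import sys

def handle_request():
    # This dependency is only loaded when a "request" is actually handled.
    # Static inspection of the entry point would miss it, which is why
    # DockerSlim exercises the running app with HTTP probes before trimming.
    import sqlite3
    return sqlite3.sqlite_version

loaded_before = "sqlite3" in sys.modules
handle_request()
loaded_after = "sqlite3" in sys.modules
print("loaded before:", loaded_before, "| loaded after:", loaded_after)
```

If DockerSlim trimmed the image without exercising these code paths, files needed by such lazy imports could be removed and the app would fail at runtime.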

Go to the project root folder and build the container:

        $ docker build -t app .

Then, slim it using DockerSlim. You can see the various command line flags for the docker-slim build in this section of the docs:

        $ docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock dslim/docker-slim build app

You should see a state=completed message in the logs. This will create a new image named app.slim.

$ docker image ls
REPOSITORY                TAG      IMAGE ID       CREATED          SIZE
docker-slim-empty-image   latest   e57428508e79   16 minutes ago   0B
app.slim                  latest   63b46c8c7d36   16 minutes ago   16.6MB
app                       latest   8286aa5a819f   19 minutes ago   72.4MB
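From the listing above, the image shrank from 72.4MB (app) to 16.6MB (app.slim). A quick back-of-the-envelope calculation of the reduction:

```python
# Sizes taken from the `docker image ls` output above.
original_mb = 72.4   # app
slimmed_mb = 16.6    # app.slim

reduction_pct = (1 - slimmed_mb / original_mb) * 100
print(f"image is {reduction_pct:.0f}% smaller")  # roughly 77% smaller
```

Smaller images pull faster and expose fewer files to attackers, which is the whole point of the slimming step.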

Now, test the image locally:

$ docker run -p 8080:8080 app.slim

After you’ve tested the steps locally, you need to transfer them into the CI/CD pipeline. As it turns out, you only need to add steps that download the DockerSlim binary, build and tag the slimmed image, and then reference that image in the push and deploy steps.

You can edit the ci.yml file to include the following steps after you build the Docker image:

.github/workflows/ci.yml

        …
        - name: Build
          uses: docker/build-push-action@v2
          with:
            context: .
            load: true
            tags: gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}:latest

        - name: Minify
          run: |
            wget https://downloads.dockerslim.com/releases/1.37.3/dist_linux.tar.gz
            tar zxvf dist_linux.tar.gz
            chmod +x ./dist_linux/docker-slim
            ./dist_linux/docker-slim build --tag gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}-slim:latest "gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}:latest"

        - name: Inspect
          run: |
            docker image inspect gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}-slim:latest

        …
        - name: 'Push image to Google Container Registry'
          run: 'docker push gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}-slim:latest'

        - name: 'Deploy to Google Cloud Run'
          run: |-
                  gcloud run deploy "${{ env.SERVICE_NAME }}" \
                            --quiet \
                            --image "gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}-slim:latest" \
                            --region "${{ env.RUN_REGION }}" \
                            --platform "managed" \
                            --allow-unauthenticated \
                            --memory=512Mi

As you can see, we added the steps to minify the image by downloading DockerSlim and running it on the image that we built in the previous step. This resulted in another image:

gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}-slim:latest

We pushed that image to the container registry and used it to deploy to Cloud Run.

Once you’ve completed all of the pipeline steps, you can inspect the application running on a minified container.

Caching Steps

Looking at the current state of the pipeline, you can see that several of these steps could be optimized. For example, you could cache the Docker image layers so that you don’t download and rebuild the same base image again and again. You could also cache the DockerSlim binary that is downloaded on each run. As long as you use the same version, you can re-use the same binary.

To cache the container image layers, you just need to leverage the cache-from and cache-to directives in the docker/build-push-action.

You can edit the ci.yml file to add these directives to the existing Build step:

.github/workflows/ci.yml

        …
        - name: Build
          uses: docker/build-push-action@v2
          with:
            context: .
            load: true
            tags: gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}:latest
            cache-from: type=gha
            cache-to: type=gha,mode=max

The cache-from and cache-to directives store the image layers in a cache and restore them later. The declared type is gha, which means the cache lives in the GitHub Actions cache service. Now, instead of building the image layers from scratch on subsequent runs, the cached layers are reused, as you can see below:

        #10 [builder 4/5] COPY requirements.txt .
        #10 CACHED

        #11 [builder 5/5] RUN .venv/bin/pip install --no-cache-dir -r requirements.txt
        #11 CACHED

As for caching the DockerSlim binary, you can use the actions/cache action to store the binary and restore it on each run.

You can edit the ci.yml file to replace the Minify step with the following steps:

.github/workflows/ci.yml

        …

        - name: Cache dockerslim
          id: cache-dockerslim
          uses: actions/cache@v2
          with:
            path: ./dist_linux/
            key: ${{ runner.os }}-dockerslim-cache-${{ hashFiles('docker-slim') }}
            restore-keys: |
              ${{ runner.os }}-dockerslim-cache-${{ hashFiles('docker-slim') }}
              ${{ runner.os }}-dockerslim-cache-

        - name: Download DockerSlim
          if: steps.cache-dockerslim.outputs.cache-hit != 'true'
          run: |
            wget https://downloads.dockerslim.com/releases/1.37.3/dist_linux.tar.gz
            tar zxvf dist_linux.tar.gz
            chmod +x ./dist_linux/docker-slim

        - name: Minify
          run: |
            ./dist_linux/docker-slim build --tag gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}-slim:latest "gcr.io/${{ env.GCP_PROJECT_ID }}/${{ env.SERVICE_NAME }}:latest"

This computes a unique key for the DockerSlim binary and stores the binary in the cache. On subsequent runs, the download step is skipped, since the binary is restored from the cache:
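Conceptually, a cache key of this kind behaves like a content hash: change the cached artifact (for example, by bumping the DockerSlim version) and the key changes, forcing a fresh download. A rough Python sketch of that idea (the key format mirrors the workflow above, but this is illustrative, not GitHub's actual implementation):

```python
import hashlib

def cache_key(os_name: str, artifact: bytes) -> str:
    # Same spirit as hashFiles() in the workflow: the key is derived from
    # the cached artifact's contents, so a new binary yields a new key.
    digest = hashlib.sha256(artifact).hexdigest()
    return f"{os_name}-dockerslim-cache-{digest[:12]}"

key_v1 = cache_key("Linux", b"docker-slim 1.37.3 binary contents")
key_v2 = cache_key("Linux", b"docker-slim 1.38.0 binary contents")
print(key_v1)
print(key_v1 != key_v2)  # different contents -> cache miss -> re-download
```

The restore-keys prefixes in the workflow act as fallbacks: if no exact key matches, the most recent cache entry sharing the prefix is restored instead.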

Figure 4 – Caching Dependencies

Now you know how to leverage caching to improve the build performance of your CI/CD pipeline!

Conclusion

DockerSlim is a practically indispensable tool for your CI/CD pipeline. By adding a few configuration steps, you can refine container images automatically, making them smaller, faster to load, and more secure by default – all without sacrificing any capabilities.

Keep an eye out for more tutorials and articles on how to automate container workloads and optimize containers for scalable and secure deployments. To learn more about containerized pipelines and other SlimAI resources, you can sign up for the Slim Developer Platform here.