Creating a Container Pipeline with GitLab CI
Dec 02, 2021
Containers are an excellent choice for almost any use case, and they contribute to a great local development environment as well. Once you have a solid understanding of their power and how to use them, you can take containers to the next level by combining them with other DevOps tools and practices, like Source Code Management (SCM), Continuous Integration (CI), and Continuous Deployment (CD). In this article, we will examine this type of integration through the lens of pipelines.
We will show you how to use GitLab SCM in a local development environment to create a pipeline that builds, tests, and deploys a simple web application to Google Cloud Run with the help of two containers. We'll wrap up by introducing some alternative options for creating this setup.
To complete this tutorial, you'll need:
- A GitLab account (free plan)
- A Google Cloud account (free plan)
- Docker installed on your local machine
- Python installed on your local machine
- A basic knowledge of Python and Docker
CAUTION: While our intent is that this tutorial can be done at no cost, many cloud platforms require a valid credit card for free tier access, and some services may incur charges. Please use at your own risk and be cautious when setting up cloud services.
Why Do You Need a Container Pipeline?
A container pipeline automates getting code from your local machine to different environments—such as pre-release, staging, or production—in a systematic and repeatable process. A clear, isolated, and testable pipeline normalizes your commit-and-ship process and ensures your application will not require any external dependencies or special setup.
A well-conceived container pipeline is usually platform-independent as well, meaning that you can create conceptually similar pipelines combining different platforms. For this tutorial, we will use GitLab for SCM and CI, and we’ll use Google Cloud Run for the CD tasks. The following diagram shows a high-level overview of the pipeline:
First, a Docker image is built from a Python web application to run the source code. Then, a pipeline configuration instructs the SCM to run the tests and compile a code coverage report on each push to the repository. Finally, if all of the testing requirements are met, the pipeline deploys the container to a public cloud, which is the Google Cloud Run environment in this example.
If these languages, platforms, or providers aren't your cup of tea, you can modify the examples to your preferred languages and platforms.
A Simple Python Web Application
We will be using a simple Python web application that has a single endpoint with a JSON response (routes.py) and a unit test that checks the status code and message of the response (test_hello.py). You can find the sample application here. The dependencies are managed with a requirements.txt file to facilitate their installation in the container.
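To make the shape of the app concrete, here is a minimal stdlib-only sketch of the endpoint-plus-test pattern described above. The real sample uses Flask and pytest; the function names and response body here are illustrative, not taken from the repository.

```python
import json

# Hypothetical stand-in for the Flask route in routes.py: a handler that
# returns a JSON body and a status code, plus a unit test in the style of
# test_hello.py. Names and message are illustrative, not from the repo.
def hello():
    return json.dumps({"message": "Hello, World!"}), 200

def test_hello():
    body, status = hello()
    assert status == 200
    assert json.loads(body)["message"] == "Hello, World!"

test_hello()
print("ok")
```

With Flask and pytest, the real version replaces the plain function with a route decorated by `@app.route` and lets the pytest runner discover `test_hello`.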
Two other files complete the source code of the project: a Dockerfile that contains the instructions needed to create the container image, and a .gitlab-ci.yml file that contains the configuration for the pipeline.
You can run the application locally with a simple python app.py call from your CLI. In addition, you can run the test and generate a test coverage report with pytest --cov=. (you'll need pytest-cov installed locally):
```
> pytest --cov=.
======================= test session starts =======================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
rootdir: /code/flask-pytest-sample
plugins: cov-2.12.1
collected 1 item

test_hello.py .                                            [100%]

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name            Stmts   Miss  Cover
-----------------------------------
app.py              6      6     0%
routes.py           5      0   100%
test_hello.py      12      0   100%
-----------------------------------
TOTAL              23      6    74%

======================== 1 passed in 0.22s ========================
```
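As a quick sanity check, the TOTAL line of the coverage report follows directly from the per-file numbers: covered statements divided by total statements.

```python
# Per-file (statements, missed) figures from the report above.
files = {"app.py": (6, 6), "routes.py": (5, 0), "test_hello.py": (12, 0)}

stmts = sum(s for s, _ in files.values())    # 23 statements in total
missed = sum(m for _, m in files.values())   # 6 of them never executed
cover = round(100 * (stmts - missed) / stmts)
print(stmts, missed, cover)  # → 23 6 74, matching the TOTAL line
```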
That completes our first step. Now we have a working local application that can be pushed to a Git repository. We can create a pipeline configuration to run the tests automatically and take advantage of the GitLab CI/CD’s capabilities.
The First GitLab Pipeline
The next step is to configure a pipeline where the code will be compiled and the test will be run once you push changes to the remote repository. Remember that the main branch is protected by default in GitLab, so you can’t push changes to it directly. (You can follow these instructions to change that behavior.) To enable Continuous Integration, you have to add a .gitlab-ci.yml file like the one below to the repository:
```yaml
variables:
  SERVICE_NAME: 'flaskpytestgitlabcd'

default:
  image: python:3.9-slim-bullseye
  before_script:
    - apt-get update
    - apt-get install -y python3-pip
    - pip install -r requirements.txt

stages:
  - test
  - deploy
  - notify

test:
  script:
    - pytest --junitxml=report.xml --cov=. test_hello.py
    - coverage xml
  artifacts:
    when: always
    reports:
      junit: report.xml
      cobertura: coverage.xml
```
In GitLab, a pipeline is composed of several jobs and stages. The jobs define what to do, and the stages define when to do it. A stage can include several jobs, which are executed by processes known as runners. You can use GitLab's shared runners (some runner execution time is included in the free tier) or configure external machines to serve as runners. Our sample pipeline is simple enough that the shared runners included in the free plan will suffice.
Right now, the pipeline defines just one stage (test) with one job. The job uses a Docker image (python:3.9-slim-bullseye) and installs the project’s dependencies using the before_script commands. Once the container is ready, the job script runs two commands that execute the test. This produces an XML file that allows the README.md file to render the project's test coverage percentage. Once you push the .gitlab-ci.yml file, GitLab will execute the pipeline, and you will be able to check the results:
There are many options for configuring a GitLab CI/CD pipeline, and it can use any image from the Docker public registry. We just used the default before_script for this example. You can find more extensive documentation here.
This concludes the Continuous Integration part of our project. Now, you can take it a step further by adding Continuous Deployment (CD) to the pipeline, which will give you the ability to deploy the project to a running environment once the integration is complete.
Preparing the Deployment
Before adding the CD stages to the pipeline, you must prepare the environment in which the project will be deployed. For this example, we will take advantage of the free tier of the Google Cloud Run service, which allows you to deploy containers in a managed environment with a generous quota. First, we’ll create a project in the Google Cloud Developer Console (follow this link if you need further instructions on how to get started with GCP) and activate the Cloud Build and Cloud Run APIs:
Once the APIs are available for the project, we’ll establish a method of authentication by creating a service account that will represent non-human access to the APIs from the GitLab CI/CD:
In order to deploy the project, we will need to grant access to several roles, including:
- Cloud Build Service Agent
- Cloud Run Admin
- Project Viewer
- Service Account User
Then, we’ll generate a service account JSON key to use as an environment variable in the GitLab CI/CD pipeline. To deploy the project’s container into Google Cloud Run, we’ll need to add two environment variables to the GitLab repository configuration: GCP_PROJECT_ID and GCP_SERVICE_KEY.
These will hold the project’s identifier and the contents of the JSON key that we just created:
The protected flag restricts the use of the variable to only pipelines that run on protected branches or tags. In this example, the stage of the pipeline that deploys the project will only be executed when the test branch is merged into the main (protected) one.
Extending the Pipeline
The last step is to extend the pipeline to deploy the project into the Google Cloud Run environment that we just configured. Let’s take a look at the Dockerfile that assembles the container image that will be deployed:
```dockerfile
# Stage 1 - Install build dependencies
FROM python:3.7-alpine AS builder
WORKDIR /app
RUN python -m venv .venv && .venv/bin/pip install --no-cache-dir -U pip setuptools
COPY requirements.txt .
RUN .venv/bin/pip install --no-cache-dir -r requirements.txt

# Stage 2 - Copy only necessary files to the runner stage
FROM python:3.7-alpine
WORKDIR /app
COPY --from=builder /app /app
COPY app.py routes.py ./
ENV PATH="/app/.venv/bin:$PATH"
CMD ["python", "app.py"]
```
As you can see, it’s very simple. The first stage creates a Python virtual environment and installs the project dependencies into it. In the second stage, only the necessary files are copied to the runner stage (notice that just the app.py and routes.py files from the source code are copied) and the Flask development server is started. (Obviously, this is just a sample deployment; the Flask development server should not be used in production. Check out this link for more information.)
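One detail worth keeping in mind for Cloud Run: the platform tells the container which port to listen on through the PORT environment variable. Below is a hedged, stdlib-only sketch of the port-handling logic app.py needs; the exact code in the sample may differ.

```python
import os

# Cloud Run injects the listening port via the PORT environment variable;
# default to 8080 for local runs. The server must bind to 0.0.0.0 on it.
port = int(os.environ.get("PORT", "8080"))

# In a Flask app.py this value would be passed to the dev server, e.g.:
#   app.run(host="0.0.0.0", port=port)
print(port)
```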
In order to set up Continuous Deployment, we’ll need to modify the .gitlab-ci.yml pipeline definition with a new deploy stage that contains a single job definition:
```yaml
deploy:
  stage: deploy
  needs: [test]
  only:
    - main # This pipeline stage will run on this branch alone
  image: google/cloud-sdk
  services:
    - docker:dind
  script:
    - echo $GCP_SERVICE_KEY > gcloud-service-key.json # Save the service account key contents in a temporary JSON file
    - gcloud auth activate-service-account --key-file gcloud-service-key.json # Activate the service account
    - gcloud auth configure-docker # Configure the Docker environment
    - gcloud config set project $GCP_PROJECT_ID # Set the GCP project ID
    - gcloud builds submit --tag gcr.io/$GCP_PROJECT_ID/$SERVICE_NAME # Run the gcloud build command to build our image
    - gcloud run deploy $SERVICE_NAME --image gcr.io/$GCP_PROJECT_ID/$SERVICE_NAME --region=us-east4 --platform managed --allow-unauthenticated # Run the gcloud run deploy command to deploy our new service
```
The job requires that the previous test stage is complete, and it will run only when changes are merged into the main branch of the repository. Notice that the job uses the Google Cloud SDK Docker image (the Google Cloud CLI tool is preinstalled in it) and that the script has several steps:
- First, it puts the value of the GitLab CI/CD environment variable GCP_SERVICE_KEY into a file called gcloud-service-key.json;
- Second, it activates this key as the way to authenticate to Google Cloud;
- Steps 3 to 5 configure the container using the Dockerfile described above;
- Finally, it runs the container in the us-east4 region of Google Cloud, allowing access to unauthenticated clients.
The complete .gitlab-ci.yml file looks like this:
```yaml
variables:
  SERVICE_NAME: 'flaskpytestgitlabcd'

default:
  image: python:3.9-slim-bullseye
  before_script:
    - apt-get update
    - apt-get install -y python3-pip
    - pip install -r requirements.txt

stages:
  - test
  - deploy
  - notify

test:
  script:
    - pytest --junitxml=report.xml --cov=. test_hello.py
    - coverage xml
  artifacts:
    when: always
    reports:
      junit: report.xml
      cobertura: coverage.xml

deploy:
  stage: deploy
  needs: [test]
  only:
    - main # This pipeline stage will run on this branch alone
  image: google/cloud-sdk
  services:
    - docker:dind
  script:
    - echo $GCP_SERVICE_KEY > gcloud-service-key.json # Save the service account key contents in a temporary JSON file
    - gcloud auth activate-service-account --key-file gcloud-service-key.json # Activate the service account
    - gcloud auth configure-docker # Configure the Docker environment
    - gcloud config set project $GCP_PROJECT_ID # Set the GCP project ID
    - gcloud builds submit --tag gcr.io/$GCP_PROJECT_ID/$SERVICE_NAME # Run the gcloud build command to build our image
    - gcloud run deploy $SERVICE_NAME --image gcr.io/$GCP_PROJECT_ID/$SERVICE_NAME --region=us-east4 --platform managed --allow-unauthenticated # Run the gcloud run deploy command to deploy our new service

notify:
  stage: notify
  needs: [deploy]
  only:
    - main # This pipeline stage will run on this branch alone
  script:
    - echo "We are done"
```
Notice that a final stage (notify) is added. This prints a message in the console and enables the GitLab DAG pipeline’s stage dependencies report to show the status of the execution:
You can check the details of any job by clicking the report. Importantly, the output of our deploy job will end with the public URL of the deployed service:
You can always configure your service to use a more secure and accessible endpoint. Please check the Google Cloud Run documentation for more information.
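Once you have that URL, a quick smoke test from your machine confirms the deployment. In this sketch, SERVICE_URL is a placeholder; substitute the URL printed by your own deploy job.

```python
import json
from urllib.request import urlopen

# Placeholder: replace with the URL printed by your deploy job's output.
SERVICE_URL = "https://flaskpytestgitlabcd-xxxxx-uk.a.run.app"

def smoke_test(url: str) -> dict:
    """Fetch the endpoint and return its decoded JSON body."""
    with urlopen(url) as resp:
        assert resp.status == 200, f"unexpected status {resp.status}"
        return json.loads(resp.read().decode())

# Example (requires the service to be deployed and reachable):
#   print(smoke_test(SERVICE_URL))
```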
This is not the only way to build a container pipeline, of course. The tech market is full of alternatives that implement these concepts in a similar way. For example, the following CI/CD platforms are also capable of executing pipelines like the one above with minimal differences.
Jenkins is a great open source tool that was among the first to implement the concept of pipelines. It has many extensions and can be run on-premises or in cloud environments. It can also be integrated with almost any SCM on the market (including the ones that support Version Control Systems other than Git), and it has great support from the community.
If you’re looking for a SCM solution with CI/CD capabilities like GitLab provides, you should check out GitHub Actions. They have a very generous free option and a large number of sample pipelines for almost any kind of project.
Bitbucket uses a similar approach, and you can have a very similar pipeline with nearly the same configuration plus the additional benefits of full integration with the Atlassian suite of development tools.
In this tutorial, we showed you how to use free services to set up a container pipeline to build, test, and deploy a Python web application with little effort. This approach is very scalable, and you can design more complex pipelines that can deploy to different environments as well as address additional requirements like security and performance testing.
In future tutorials, we'll look at ways to implement container best practices into your automated CI/CD deployments, including the use of DockerSlim to create production-ready containers automatically.
About the Author
Nicolas Bohorquez (@nickmancol) is a Data Architect at Merqueo. He has a Master’s Degree in Data Science for Complex Economic Systems and a Major in Software Engineering. Previously, Nicolas has been part of development teams in a handful of startups, and has founded three companies in the Americas. He is passionate about the modeling of complexity and the use of data science to improve the world.