To mitigate the challenges of deploying containers in production, ensure consistency, and reduce risk, cloud providers and open-source engineers alike have written plenty of articles offering their take on “container best practices.” The main takeaways of these articles boil down to five key points:
- Best Practice #1: Control what’s inside the container
- Best Practice #2: Minimize image size and optimize for build speed
- Best Practice #3: Control vulnerabilities and secure services
- Best Practice #4: Enforce standards across your organization
- Best Practice #5: Automate updates
Sounds great on paper, right? It also makes for a nice blog post, article, or headline material for presentations to business leadership and technology management. The reality is that most developers struggle to turn these best practices into achievable outcomes, because they often lack the know-how, time, and tooling required to implement them. The problem gets worse when you try to implement this advice in large, distributed organizations. Let’s take a closer look.
# Best Practice #1: Control what’s inside the container
It’s common for container base images to start from publicly available container images. Registries like Docker Hub and Amazon ECR have made it possible to find the official, maintained images of core operating systems and database platforms. But there are also lots and lots of “free” images there, too, of widely varying quality. For good reason, organizations are often leery of devs just pulling whatever image they find on the public internet to start their project. In bigger organizations, DevOps teams are often tasked with creating and maintaining one or more base images, built from available container starting points, for devs to use.
All this container base image curation comes at the cost of manual work and constant upkeep. The insidious catch is that these “curated” base images often don’t suit developers’ needs. Missing dependencies, package managers, and common tools are frequently the culprits behind a poor developer experience. As a result, developers manually add whatever libraries and dev tools they need to do their job or just to reduce development friction. This practice of “building up” from the base image increases risk and brings negative side effects, such as reintroduced vulnerabilities, poor container composition, and bloat, that need to be fixed “later.” It also undoes the value of the work DevOps did to curate the images in the first place.
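One way to get early visibility into “build up” drift is to check which base images a Dockerfile actually starts from. The sketch below is only an illustration, not a description of any particular tool: the allowlist, the `registry.example.com` names, and the `unapproved_bases` function are all hypothetical, and it assumes that `scratch` and references to earlier build stages are acceptable.

```python
import re

# Hypothetical allowlist a DevOps team might maintain (illustrative names only).
APPROVED_BASES = {
    "registry.example.com/base/python",
    "registry.example.com/base/node",
}

# Matches: FROM [--platform=...] image [AS stage]
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)(?:\s+AS\s+(\S+))?",
    re.IGNORECASE,
)


def repository_of(image):
    """Strip the digest and tag from an image reference, keeping registry ports."""
    name = image.split("@", 1)[0]
    # A colon after the last slash is a tag; one before it is a registry port.
    if name.rfind(":") > name.rfind("/"):
        name = name[: name.rfind(":")]
    return name


def unapproved_bases(dockerfile_text, approved=APPROVED_BASES):
    """Return the FROM images in a Dockerfile that are not on the allowlist."""
    stages = set()  # names of earlier build stages (FROM ... AS name)
    violations = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        image, alias = match.group(1), match.group(2)
        # Assumption: earlier stages and the empty "scratch" image are fine.
        is_internal = image.lower() in stages or image.lower() == "scratch"
        if not is_internal and repository_of(image) not in approved:
            violations.append(image)
        if alias:
            stages.add(alias.lower())
    return violations
```

Run in a pre-commit hook or an early pipeline step, a check like this tells a developer about an unapproved base image before it turns into a surprise at review time.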
# Best Practice #2: Minimize image size and optimize for build speed
Containers have a lot to do as they make their way from your local machine, through your test suites and staging environments, and into production. Every test, vulnerability scan, or pipeline operation takes time, and as we know, time is money. Additionally, bulky containers mean downtime for developers looking to rebuild their applications during the development process.
Reducing and optimizing container image size by hand is hard, expert work. There are precious few optimization tools on the market that help, and too often optimization comes down to deep knowledge of containers, Linux mastery, individual configurations, or manual tweaking. At best, this is a difficult and time-consuming activity that most developers just don’t know how to do.
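A common first step in that manual work is simply finding out which layers account for most of an image’s size. The sketch below assumes you already have per-layer data as `(created_by, size_in_bytes)` pairs, roughly the kind of information `docker history` reports. The `largest_layers` function and its input shape are hypothetical, chosen only for illustration.

```python
def largest_layers(layers, top=3):
    """Rank layers by size and report each one's share of the total image size.

    `layers` is a list of (created_by, size_in_bytes) tuples.
    Returns a list of (created_by, size_in_bytes, percent_of_total) tuples.
    """
    total = sum(size for _, size in layers)
    ranked = sorted(layers, key=lambda layer: layer[1], reverse=True)[:top]
    return [
        (created_by, size, round(100 * size / total, 1) if total else 0.0)
        for created_by, size in ranked
    ]
```

Knowing that one install step dominates the image does not shrink it for you, but it does tell you where the expert tweaking is likely to pay off first.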
# Best Practice #3: Control vulnerabilities and secure services
Too often, responsibility for security is paradoxically both left to individual developers and, at the same time, thrown over the wall to Ops teams and pipeline checks. Application developers and their code reviewers know the application and its vulnerabilities best, but often have too little accountability for security until something bad happens, when it is too late. On the flip side, security checks in CI/CD pipelines tend to offer a false sense of precision. Many smart voices in the DevOps community are calling for security (and its related DevSecOps practice) to be baked in at the start of development and carried all the way through, and we couldn’t agree more.
# Best Practice #4: Enforce standards across your organization
This one makes great fodder for VP-level slide decks and corporate-wide change management initiatives, but woe to the code police in charge of issuing warrants and making arrests. You can have meetings, office hours, documentation, best practices, clean and flexible base images, and even a sternly worded edict from your CTO. It won’t matter. As soon as you make a rule, you will get an email asking for an exception. And those exceptions will be hard to say “No” to… after all, you’ve built your company on the core values of flexibility, innovation, and giving your developers the tools they need to do the job.
Like good application code, the key is in managing exceptions, not preventing or forbidding them. What a DevOps team and its management want is visibility, tracking, and an audit trail for the exceptions they allow in their standards. This way they can manage changes, know where the risks are, and take corrective action when those risks play out in the worst-case scenario.
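To make “managing exceptions” concrete, here is a minimal sketch of an exception registry with an audit trail. Everything in it is hypothetical (the `PolicyException` and `ExceptionRegistry` names, the fields, and the idea that every exception carries an expiry date); it only illustrates the kind of visibility and tracking described above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class PolicyException:
    rule: str         # the standard being waived, e.g. "approved-base-image"
    service: str      # the team or service the waiver applies to
    approved_by: str  # who signed off on it
    reason: str       # why the exception was needed
    expires: date     # last day the waiver is valid


@dataclass
class ExceptionRegistry:
    entries: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def grant(self, exc, today):
        """Record a new exception and log who approved it."""
        self.entries.append(exc)
        self.audit_log.append(
            (today, "granted", exc.rule, exc.service, exc.approved_by)
        )

    def is_waived(self, rule, service, today):
        """Check for an unexpired waiver, logging the outcome either way."""
        waived = any(
            e.rule == rule and e.service == service and e.expires >= today
            for e in self.entries
        )
        outcome = "waived" if waived else "enforced"
        self.audit_log.append((today, "checked", rule, service, outcome))
        return waived

    def expired(self, today):
        """List exceptions that are past their expiry and due for review."""
        return [e for e in self.entries if e.expires < today]
```

The design choice worth noting is that exceptions expire and every check is logged, so a waiver becomes a tracked decision with an owner rather than a permanent hole in the standard.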
# Best Practice #5: Automate updates
What good would DevOps be without automation? We’ve come a long way, baby, in our ability to automate build and deploy processes since the old days of local bash scripts and “ping John in IT to see if he can restart the build.” Even coordination between automated processes, thanks to Jenkins, K8s, and related tools, is better than it’s ever been. The problem we see with automation is that there still isn’t enough of it. We’ve written plenty about manual processes in CI/CD pipelines, and the jury is still out as to whether we need more specialized tools or a single, vertically integrated system to handle it all for us.
# Carefully consider public images
Getting this advice is sometimes akin to your doctor telling you to “eat better and in moderation.” Thanks, will do my best! Which ones are the good images again? The images with “Verified Publisher” tags in Docker Hub are great starting points, but they are by definition generic and need libraries and dev tools added before they are useful for specialized application development. Maybe there is a like-minded community member out there who has solved a problem similar to mine and has an image purpose-built for this kind of thing, but will my DevOps or security teams slap my hand or reject my commit if I use that container? Like many other best practices, the best thing a DevOps-focused company can do here is provide the tools that allow developers to make smart decisions with clear visibility and guidelines, and do so as early in the workflow as possible to eliminate surprises.
You may be thinking: “Ok thanks, Slim.AI, for outlining a bunch of intractable problems that exist in my organization. It’s been fun.” But don’t worry, we have an opinion here and some suggestions to help address, if not solve, these challenges head-on. Have thoughts yourself? We’d love to get your comments and feedback.
Coming Soon: Next article in our series, Part 3: A New Workflow.
Also check out our previous article, Part 1: Developers Say Cloud Development Still Too Manual & Complex