Docker vs Reality, 0 - 1
(aka the opinionated summary of the #devopsdays London November OpenSpace on Containers and the new flood of Image Sprawl)
There's a bunch of people out there who think I don't like Docker. They are wrong.
I just never understood the hype about it, since I didn't (and still don't) see it being used at large, and people seem to take that as being against it.
So let me put a couple of things straight:
There's absolutely nothing wrong with using a container-based approach when deploying your infrastructure. If you remember my talks about the rise of Open Source Virtualization some years ago, you'll have noticed that I've always mentioned OpenVZ and friends as good alternatives if you wanted to have a lot of isolated platforms on one machine. LXC and friends have grown; they are even more usable these days. Years ago people bought bare metal and ran hypervisors on it to isolate resources. These days people rent VMs and want the same functionality, so the combination of virtualization and container-based technologies is a very good match there.
There's also nothing wrong with using Infrastructure as Code tools to build a reproducible image you are going to deploy. A disposable image allows you to quickly launch a reproducible and versioned platform for your application, if that application is supposed to be short-lived. The tooling around today is not yet there to keep these images long-lived, as you still need to manage the config inside the containers: your application will evolve, it will change, your environment will change (think even about changing to a different loghost...). But when you don't have to keep state, you can dispose of the image and redeploy a new reproducible one.
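To make the disposable-image idea concrete, here is a minimal sketch using Docker. All names, tags, and paths here are hypothetical, made up for illustration:

```shell
# Hypothetical sketch: a versioned, reproducible, disposable image.
cat > Dockerfile <<'EOF'
FROM debian:stable
RUN apt-get update && apt-get install -y nginx
COPY app/ /srv/app/
EOF

docker build -t myapp:1.0.0 .            # the version lives in the image tag
docker run -d --name myapp myapp:1.0.0

# The environment changed (new loghost, new app release)? Don't patch the
# running container: rebuild, dispose, redeploy.
docker rm -f myapp
docker build -t myapp:1.0.1 .
docker run -d --name myapp myapp:1.0.1
```

This only holds as long as no state lives inside the container, which is exactly the limitation discussed here.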
In the embedded world, this kind of approach with multiple banks has been around for a while: one image running, a second bank as a fallback, and when you upgrade the passive bank you can swap the roles and still have a rollback path.
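The bank-swap flow can be sketched roughly like this, assuming a U-Boot-style bootloader; the `active_bank` variable and the `write_image_to_bank` helper are hypothetical names, not a real tool's API:

```shell
# Hypothetical dual-bank upgrade sketch for an embedded device.
ACTIVE=$(fw_printenv -n active_bank)               # e.g. "A"
if [ "$ACTIVE" = "A" ]; then PASSIVE=B; else PASSIVE=A; fi

write_image_to_bank "$PASSIVE" new-release.img     # upgrade the passive bank
fw_setenv active_bank "$PASSIVE"                   # swap roles on next boot
reboot                                             # old bank untouched: rollback = swap back
```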
There's also nothing wrong with combining these two approaches and using tools such as Docker and Packer.
But there is a lot wrong with building images that then start living their own life. Tools like Veewee saw the light to create an easy way to make sure the JeOS (Just Enough Operating System) image we created was reproducible, not to ship around virtual appliances.
But let's be realistic: the number of applications that are suitable for this kind of environment is small. Most applications these days are still very stateful, and when your application contains state you need to manage that state; you can't just dispose of an image which has state. Especially in an Enterprise environment, stateless, immutable applications are really the exception rather than the rule.
When your application maps to stateless and short-lived, or as some people like to call it, Immutable, by all means do so... but if it doesn't, please remember that we started using configuration management tools like CFEngine, Puppet and Chef to prevent Image Sprawl and Config Drift.
There are proprietary businesses out there building tools to detect config drift, extorting money from organisations to solve problems that shouldn't have existed in the first place.
Luckily, the majority of smart people I've spoken to over the past couple of weeks pretty much confirmed this...
Take one of the larger devops-minded application hosting outsourcers in EMEA: when I asked them what percentage of their customer base they could call "Immutable", the answer was exactly 0%.
Image-based container solutions are definitely not a one-size-fits-all solution, and we have a long way to go before we get there, if we ever do...
Till then I prefer not to spread my attention across too many different ways of deploying platforms, just to avoid making things more complex than they already are... as complexity is the enemy of reliability.
Comments
#1 Solomon Hykes : A few corrections on state and repeatability
Hi Kris! I'm pretty sure we met at Fosdem in 2009 around a Belgian pint, so hi again :)
Thanks for taking the time to write this! I agree that containers and configuration management are not mutually exclusive, and I believe in the end containers are a facilitator of "infrastructure as code", not a replacement.
I think that there is some misunderstanding on both the *goals* and *capabilities* of docker regarding state management, configuration drift, repeatability, etc. I find your title a bit unfair, as it seems to imply that people using and developing Docker are not in touch with the reality of ops - when in fact many seasoned ops and infrastructure people use Docker extensively, at large scale, in the real world. Since we are all reasonable, experienced systems people, I'm sure we can reach an understanding on the relative strengths and weaknesses of our favorite tools, without doubting the experience and professionalism of others :)
So here are the corrections I would like to submit to you:
1) The container model is not the same thing as the VM appliance model. You seem to conflate the two, for example by putting Docker in the same category as appliance-building tools like Packer and Veewee. There are superficial similarities, but the two are fundamentally different. Container engines are neither in the category of "dynamic configuration management" nor in "server image baking". They are in a third category of their own. Realizing this will help us discuss the pros and cons of container engines which are really specific to them (including docker).
2) Container engines are not just for stateless applications. Docker has primitives for persistent data volumes. These primitives are typically used to designate persistent parts of the container, like database files. Volumes continue to exist separately from the lifecycle of an individual container, and can be shared or transferred across containers. This is how database updates are implemented, or how you might update e.g. a WordPress installation with user-uploaded assets. The nice thing about this model is that it gives you the best of both worlds: you still get a fresh new container for each version, which means every change anywhere on your infrastructure maps 1:1 to a new container - from a one-liner fix to your configuration to security updates in openssh, it's right there in your audit trail with a unique ID, inputs and outputs, and all other properties of a container. But at the same time you can use volumes to maintain a whitelist of exceptions - the data you know needs to be shared. This is better for the same reason that "deny-by-default" firewalling is better.
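A sketch of the volume pattern described here, using the modern named-volume CLI syntax (early Docker used `--volumes-from` instead); container names, volume name, and paths are made up:

```shell
# State lives in the named volume; the container stays disposable.
docker run -d --name blog-v1 -v blog-data:/var/www/html/uploads wordpress:3.7

# Upgrade: nuke the container, keep the data.
docker rm -f blog-v1
docker run -d --name blog-v2 -v blog-data:/var/www/html/uploads wordpress:3.8
```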
3) Docker containers can be built from source repeatably. This is a key feature of docker. With the 'docker build' command you get a 1-to-1 transformation from a source repository to a runnable container. Building a container is very different from building a VM image (again: docker is not at all in the same category as veewee and packer). To the developer it feels like compiling a binary - the speed, resource usage and portability characteristics are roughly the same. So your argument of "if you don't run a configuration management agent in the container, it will drift!" is incorrect. You can measure drift from the outside (just run 'docker diff'). You can solve drift entirely at any time by simply nuking your container and running a new one. As explained in point 2, *you can do this for stateful applications too* thanks to volumes.
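The build-and-diff workflow referred to here looks roughly like this; the repository contents and tag are hypothetical:

```shell
docker build -t myapp:rev-abc123 .    # 1:1 from source tree to runnable container
CID=$(docker run -d myapp:rev-abc123)
docker diff "$CID"                    # measure drift: files Added/Changed/Deleted
docker rm -f "$CID"                   # "solve" drift: dispose...
docker run -d myapp:rev-abc123        # ...and rerun from the pristine image
```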
In short: I respect the experience and accumulated lessons that went into the current devops tools - I've been a sysadmin too :) But I feel that a lot of the criticism against Docker comes from knee-jerk "get off my lawn" reactions, assuming that everyone using or advocating Docker is a hipster who's never seen a real production environment. I assure you that is far from the case. I know that Docker is pretty hyped right now - I personally think it's a mixed blessing for exactly this reason: inexperienced people are bound to talk about Docker, and say stupid things about what Docker can do and how to use it. I wish I could magically prevent this. But all I can do is emphasize that some of us know what we're doing :)
#2 Nathaniel Eliot : Claiming that enterprise
Claiming that enterprise software is 100% stateful is misleading. Instead of asking "what % of your customers are immutable", you should ask "what % of your systems hold state".
Pretty much all software stacks are at some point stateful. Individual components, however, are often designed to be stateless to make 'burn and turn' upgrades easier. Many data stores (especially newer ones) can be treated as pseudo-stateless, in that they recover their data from peers when replaced. And even the most stateful machines can be made easily recapitulable by clever use of attached persistent storage.
Docker doesn't magically enable any of these scenarios, however, so I do agree with your conclusion. While neat, it isn't quite as revolutionary as it's hyped to be. It provides capabilities that folks using IaaS (and/or local virtualization) have already relied on for years, just somewhat more efficiently.
#3 Justin Ellison : Copy + Paste
Hey Kris,
Great article, I agree whole-heartedly. Docker vs CM is truly Apples vs Oranges.
Quick note, it looks like you pasted a blurb twice in your post. Search for "But there is lot wrong with building images that then start living their own life" and you'll see it.
Justin