Aug 22 21:50

11.5 Factor apps

Each time someone talks about the 12 Factor Application a weird feeling overcomes me .. because I like the concept .. but it feels awkward .. it feels as if someone with zero operational experience wrote it. And the devil is in a really small detail.

And that is Part III. Config ... For me (and a lot of other folks I've talked to about this topic), using environment variables (as in real environment variables) is just one of the worst ideas ever. Environment variables are typically set manually, or from a script that is being executed, and there's little or no trace that lets you quickly see how a specific config value was set.

Imagine I launch an app with an env variable X=foo, then your colleague stops that app and launches it with X=bar. The system's integrity has not changed, no config or binaries have been modified, but the application's behaviour could have changed completely.

Sure, I can go into /proc/pid/environ to find the params it was launched with ... But I want to know what state the system is supposed to be in .. and have tooling around that which verifies the system is indeed in that state.
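Digging that back out is possible, but it is purely forensic work. A minimal sketch on Linux, using a throwaway child shell as the process to inspect:

```shell
# Entries in /proc/PID/environ are NUL-separated, so translate NULs to
# newlines to read them. Demo: launch a child shell with X=foo set and
# have it dump its own environment ($$ expands to the child's PID).
X=foo sh -c 'tr "\0" "\n" < /proc/$$/environ' | grep '^X='
# prints "X=foo"
```

That tells you what the process got at launch time, but nothing about who set it, when, or why, which is exactly the problem.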

https://12factor.net/config claims that config files not stored in revision control are a huge step forward from hardcoded config in code, but also that such config files tend to end up scattered all over the place. This obviously feels written by someone who never experienced the power of modern configuration management. Templates that are dynamically populated, or configs that are even calculated on the spot depending on the environment an application lives in, are the norm in modern infrastructure as code .. yet people seem to think that managing an environment variable would be less painful.

I feel the idea of what they wanted to achieve was good, but the implementation they suggest was foobar. I don't ever want critical config of an application (like whether it is talking to the PROD or DEV database) to be set from an environment variable that can be modified by hand. This has to come from an automated system.

This kind of config should be idempotent; one should be able to trace back where it came from and who modified it (version control), and every redeploy of the application should end up with the same result. It can even be dynamic (service discovery), but an environment variable is the last place where a config deserves to live.
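As a minimal sketch of that idea (file names here are hypothetical): the template lives in version control and is rendered into the real config at deploy time, so every redeploy yields the same file and the version history tells you who changed what:

```shell
# A versioned template rendered into the actual config at deploy time;
# rerunning the render is idempotent and always produces the same file.
printf 'db_host = @DB_HOST@\n' > app.conf.tmpl   # kept in version control
sed 's/@DB_HOST@/db.prod.example.com/' app.conf.tmpl > app.conf
cat app.conf   # prints "db_host = db.prod.example.com"
```

Real configuration management tools do the same thing with proper templating languages and facts about the node, but the principle is identical.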

So please let's stop calling it the 12 Factor application .. and call it the 11.5 Factor application ..

Sep 01 21:02

Decommissioning my Drupal blog

If you are looking at this blog post right now ... my live Drupal site has finally been decommissioned .. or not .. well, these pages are served statically, but the content is still generated by an ancient Drupal 6, which is hiding somewhere in a container that I only start when I need it.

Given my current low blog volume .. and the lack of time to actually migrate all the content to something like Jekyll or Webby, I took the middle road and pulled the internet-facing Drupal offline. My main concern was that I wanted to keep a number of articles that people frequently point to in the exact same location as before. That was my main requirement, but with no more public-facing Drupal there is no more worrying about the fact that it really needed updating, no more worrying about potential issues on Wednesday evenings, etc.

My first couple of experiments were with wget / curl, but then I bumped into "Sending a Drupal site into retirement", which pointed me to httrack, a tool that was new to me ..

As documented there
httrack http://127.0.0.1:8080/blog -O . -N "%h%p/%n/index%[page].%t" -WqQ%v --robots=0
creates a usable tree, but the root page ends up in blog/blog, which is not really handy.
So the quick hack for that is to go into the blog/blog subdir and regexp the hell out of all the files generated there, rewriting their links to point one level below :)
for file in *; do sed -e "s/\.\.\//\/blog\//g" "$file" > "/blog/$file"; done

httrack however has one annoying default: it puts metadata in the footer of every page it mirrors, saying where it came from and when it was generated. That's very useful for some use cases, but not for mine, as it means that every time I regenerate the site it actually generates slightly different content rather than identical pages. Luckily I found the -%F "" param to keep that footer string empty.

And that is what you are looking at right now ...

There are still a bunch of articles I have in draft .. so maybe now that I don't have to worry about the Drupal part of things I might blog more frequently again, or not ..

Jun 15 2016

Will containers take over ?

And if so, why haven't they done so yet ?

Contrary to what many people think, containers are not new; they have been around for more than a decade. They have, however, only recently become popular for a larger part of our ecosystem. Some people think containers will eventually take over.

Imvho it is all about application workloads. When I wrote about a decade of open source virtualization 8 years ago, we looked at containers as the solution for running a large number of isolated instances of something on a machine. And by large we meant hundreds or more instances of apache; this was one of the example use cases for an ISP that wanted to give a secure but isolated platform to its users. One container per user.

The majority of enterprise use cases however were full VMs. Partly because we were still consolidating existing services to VMs and weren't planning on changing the deployment patterns yet. But mainly because most organisations didn't have the need to run 100 similar or identical instances of an application or a service; they were going from 4 bare metal servers to 40-something VMs, but they had not yet come to the need to run hundreds of them. The software architecture had just moved from fat-client applications that talked directly to bloated relational databases containing business logic, to web-enabled multi-tier applications. In those days, when you suggested running 1 Tomcat instance per VM because VMs were cheap and it would make management easier ("Oh oops, I shut down the wrong Tomcat instance"), people gave you very weird looks.

Slowly, software architectures are changing. Today the new breed of applications is small, single-function and dedicated, and it interacts frequently with its peers; combined, they provide functionality similar to one big fat application of 10 years ago. But when you look at the market, that new breed is a minority. A modern application might consist of 30-50 really small ones, all with different deployment speeds. And unlike 10 years ago, when we needed to fight hard to be able to build dev, acceptance and production platforms, people now consider that practice normal. So today we do get environments that quickly go to 100+ instances, while requiring similar CPU power as before, and the use case for containers as we proposed it in the early days is slowly becoming a more common one.

So yes containers might take over ... but before that happens .. a lot of software architectures will need to change, a lot of elephants will need to be sliced, and that is usually what blocks cloud, container, agile and devops adoption.

Jun 13 2016

Jenkins DSL and Heisenbugs

I'm working on getting even more moving parts automated. Those who use Jenkins frequently probably also have a love-hate relationship with it.

The love comes from the flexibility, stability and power you get from it; the hate from its UI. If you've ever had to create a new Jenkins job or even a pipeline based on one that already existed, you've gone through the horror of click-and-paste errors, and you know where the hate breeds.

We've been trying to automate this with different levels of success: we've puppetized the XML jobs, we've used the Buildflow Plugin (reusing the same job for different pipelines is a bad idea ..), we played with JJB, running into issues with some plugins (Promoted Build), and most recently we have put our hope in the Job DSL.

While toying with the DSL I ran into a couple of interesting behaviours. Imagine you have an entry like this, which is supposed to replace ${foldername} with the content of the variable and actually take the correct upstream:

  cloneWorkspace('${foldername}/dashing-dashboard-test', 'Successful')

You generate the job, look inside the Jenkins UI to verify what the build result was .. save the job and run it .. success ..
Then a couple of runs later that same job gives an error ... it can't find the upstream job to copy the workspace from. You once again open up the job in the UI, look at it .. save it, run it again, and then it works .. a typical Heisenbug ..

When you start looking closer at the XML of the job, you notice ..

  <parentJobName>${foldername}/dashing-dashboard-test</parentJobName>

obviously wrong .. I should have used double quotes ..
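Groovy's rule here happens to mirror shell quoting, which makes for an easy mnemonic: single quotes stay literal, only double quotes interpolate:

```shell
# Same pitfall in shell: '${var}' stays literal, "${var}" interpolates.
foldername=00-test
echo '${foldername}/dashing-dashboard-test'   # prints ${foldername}/dashing-dashboard-test
echo "${foldername}/dashing-dashboard-test"   # prints 00-test/dashing-dashboard-test
```

In the Job DSL that literal ${foldername} ends up verbatim in the generated XML, as seen above.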

But why doesn't it look wrong in the UI? That's because the UI autoselects the first option from its autogenerated pull-down list .. which actually contained the right upstream workspace I wanted to trigger (that will teach me to use 00 as a prefix for the folder name of all my tests ..)

So when working with the DSL .. review the generated XML .. not just whether the job works ..

Jun 01 2016

Linux Troubleshooting 101 , 2016 Edition

Back in 2006 I wrote a blog post about linux troubleshooting. Bert Van Vreckem pointed out that it might be time for an update ..

There's not that much that has changed .. however :)

Everything is a DNS Problem

Everything is a Fscking DNS Problem
No really, Everything is a Fscking DNS Problem
If it's not a fucking DNS Problem ..
It's a Full Filesystem Problem
If your filesystem isn't full
It is a SELinux problem
If you have SELinux disabled
It might be an ntp problem
If it's not an ntp problem
It's an arp problem
If it's not an arp problem...
It is a Java Garbage Collection problem
If you ain't running Java
It's a natting problem
If you are already on IPv6
It's a Spanning Tree problem
If it's not a spanning Tree problem...
It's a USB problem
If it's not a USB Problem
It's a sharing IRQ Problem
If it's not a sharing IRQ Problem
But most often .. it's a Freaking DNS Problem !


May 28 2016

Docker and volumes hell

We're increasingly using Docker to build packages: a fresh chroot in which we prepare a number of packages, typically builds for ruby (rvm), python (virtualenv) or node stuff where the language ecosystem fails on us ... and fpm the whole tree into a working artifact.

An example of such a build is my work on packaging Dashing. https://github.com/KrisBuytaert/build-dashing

Now part of that build is running the actual build script in docker with a local volume mounted inside the container. This is your typical -v=/home/src/dashing-docker/package-scripts:/scripts param.

Earlier this week however I was stuck on a box where that combo did not want to work as expected. Docker clearly mounted the local volume, as it could execute the script in the directory, but for some reason it didn't want to write in the mounted volume.

docker run -v=/home/src/dashing-docker/package-scripts:/scripts dashing/rvm /scripts/packagervm
Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.
corefines: Your Ruby doesn't support refinements, so I'll fake them using plain monkey-patching (not scoped!).
/usr/local/share/gems/gems/corefines-1.9.0/lib/corefines/support/fake_refinements.rb:26: warning: Refinements are experimental, and the behavior may change in future versions of Ruby!
/usr/share/ruby/fileutils.rb:1381:in `initialize': Permission denied - rvm-1.27.0-1.x86_64.rpm (Errno::EACCES)

So what was I doing wrong? Did the Docker params change, did I invert the order of the params, did I mistype them? I added debugging to the script (ls, chmod, etc ..) and I couldn't seem to read or modify the directory. So I asked a coworker to be my rubber duck.

He did more .. he wondered if this wasn't SELinux.

And he was right..

Apr 23 21:47:00 mine23.inuits.eu audit[9570]: AVC avc: denied { write } for pid=9570 comm="fpm" name="package-scripts" dev="dm-2" ino=368957 scontext=system_u:system_r:svirt_lxc_net_t:s0:c47,c929 tcontext=unconfined_u:object_r:unlabeled_t:s0 tclass=dir permissive=0
Apr 23 21:47:02 mine23.inuits.eu python3[9597]: SELinux is preventing fpm from write access on the directory /home/src/dashing-docker/package-scripts.

So while I was looking for errors in docker, it was just SELinux set to enforcing acting up, and me not noticing it.

The quick way to verify, obviously, was to setenforce 0 and trigger the build again. That however is not a long-term fix, so I changed the file context instead:

semanage fcontext -a -t cgroup_t '/home/src/dashing-docker/package-scripts'
restorecon -v '/home/src/dashing-docker/package-scripts'

That solved the problem.
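For what it's worth, Docker can also do this relabeling itself: assuming a Docker version recent enough to support volume label suffixes, a :Z on the volume spec avoids the manual semanage/restorecon step:

```shell
# ':Z' asks Docker to give the volume content a private SELinux label the
# container may write to (use ':z' instead if several containers share it).
docker run -v /home/src/dashing-docker/package-scripts:/scripts:Z \
  dashing/rvm /scripts/packagervm
```

The image and paths here are the ones from my build above; the only change is the suffix on the -v argument.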

Mar 23 2016

It's just rubber, with some air in it.

- A child's balloon.
You inflate it, play with it for a while, then it explodes and you throw it away. You inflate another one, maybe even a different color. The kid plays with it .. till it breaks, then you throw it away ...

- An inflatable castle.
You inflate it, play with it for a while, deflate it, move it around, inflate it again. If it has a hole in it, you patch the hole.

- The tyres on your kid's bike.
You inflate them, the kid rides on them, and if they start losing air, you fix them asap so you can continue using them.

All three are really valuable use cases for rubber with air in it;
in a way they are all equally valuable, but they serve different purposes, different use cases.

Now think about this the next time you spin up a container that's running a database and an application server, and that your users ssh into.

It's not just another Virtual Machine.

Mar 23 2016

Lies, Damn Lies and Statistics, 2016 Edition

When people sign up for Configuration Management Camp, we ask them what community room they are mostly interested in.
We ask this question because we have rooms of different sizes: we don't want to put a community with 20 interested people in a 120-seat room, nor a community with 200 people in a 60-seat room.

But it also gives us the opportunity to build some very interesting graphs of the potential evolution of these communities.

So looking at the figures ... the overall community is obviously growing: from 350, to 420, to just short of 600 people registered now.

The Puppet community is not the biggest anymore; that spot went to the Ansible community room. And all communities but CFEngine are growing.

One more thing: the organisation team discussed several times whether we should rebrand the event. We opted not to .. Infracoders.eu could have been an alternative name .. but we decided to stick with the name that is already known.
The content will evolve .. but Config Management Camp will stay the place where people who care about Infrastructure as Code and infrastructure automation meet.

Mar 23 2016

Recent and Upcoming Talks

I gave a couple of new and updated talks over the last couple of months.

At Fosdem, I opened up the Distributions Devroom with a talk on how we could improve the collaboration between the developers of a distro and their users, the ops folks. I was not the only person with such ideas, it seemed, as the smart folks over at the CentOS project were already actively talking about their efforts to make some of my ideas become reality the day before, at the CentOS Dojo.


Another talk from early this year was an update on why security requires a devops approach, why you want to embed security as a standard practice in your development process, and why continuous delivery actually is a security requirement.


At last week's FLOSS UK Conference in London, I gave an updated version of my MonitoringLove talk, giving an opinionated overview of the current state of open source monitoring tools.


I was scheduled to give the opening keynote today (23/3/2016) at the Making Open Source Software Conference, but sadly I had to cancel due to yesterday's (22/3/2016) events at Brussels Airport. My flight to Bucharest was obviously cancelled.

I'm scheduled to open up the 2nd day of the upcoming Devopsdays London edition.

And Bernd Erk tricked me into giving a follow-up to my popular "7 tools for your devops stack" talk, aptly titled "Another 7 tools for your devops stack".

Jan 07 2016

Bimodal IT , redefined

There's been a lot of discussion about the silliness of the term BiModal IT, aka the next big thing for IT organisations that don't dare to change, but still want to sound cool.

So here is my idea to reuse that term for something relevant.

BiModal IT is the idea where you take a fully automated infrastructure, built on the principles of Infrastructure as Code, which gets periodic idempotent updates (e.g. every 15 or 30 minutes, or when you orchestrate it) and consistency checks, and where the source code for that infrastructure is versioned, tested and delivered through a traditional Continuous Delivery pipeline for the majority of your services. And then you add realtime reconfiguration capabilities, based on service discovery, for the other services that really change fast, or in a truly elastic way, using tools like Consul, consul-template, etcd etc ..

That way you have 2 modes of managing your infrastructure, aka BiModal.
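As a sketch of that second, realtime mode (the service name and file paths here are hypothetical): a consul-template template that keeps an nginx upstream block in sync with the healthy instances Consul knows about:

```
# nginx-upstreams.conf.ctmpl -- consul-template rewrites the rendered
# file (and can reload nginx) whenever the set of healthy "web"
# instances registered in Consul changes.
upstream web {
{{ range service "web" }}  server {{ .Address }}:{{ .Port }};
{{ end }}}
```

Wired up with something like consul-template -template "nginx-upstreams.conf.ctmpl:/etc/nginx/conf.d/upstreams.conf:nginx -s reload", while the first mode keeps flowing through the normal Continuous Delivery pipeline.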