Jul 26 2006

OSCON presentation

Luke Kanies on Web 2.0 vs Open Source Infrastructure companies at OSCON presentation

I'm definitely still tainted from the Web 2.0 conference last year; I keep wanting to make snide jokes about the froth that surrounded the conference, and I'm still pretty bitter how much attention web companies with no business model get when infrastructure companies with a pretty good model just get ignored.

Well Said !

Jul 25 2006

What Can System Administrators Learn from Programmers?

The LinuxPlanet - Interview - What Can System Administrators Learn from Programmers? - The Role of Open Source
tells us what we have already known for a decade. We have to move forward to build infrastructures that can survive the 10th floor test.

Moving this process to the field of system administration, I would expect that a test system is rebuilt nightly unattended from scratch using the operating system distribution files, appropriate scripts, and add-on packages. All needed elements would be stored on local file servers under a configuration management system like CVS. Such a practice obviates the all too common danger of having a running system depending on a tool that was once fetched over the net from a site that has ceased to exist.

I think comparing it to nightly builds is a good thing.. after all, on our development platforms we are aiming for nightly redeployments :)

Jul 24 2006

Matt will be speaking at Linux-Kongress

The Linux-Kongress 2006 schedule is out.. and amongst the people to give a talk there is my good friend
Matthias Rechenburg. Matt will be talking about the project he has been working on for the past couple of hmm.. years actually ... openQRM

Jul 20 2006

Research before Coding !

Upon reading Free Software in education my first thought, and actually a general thought not aimed at Arnaud, was.. why would one want to write Yet Another version of a Tool or piece of code that is already available in Open Source? Did they forget to google to see what already exists? Are they just egotripping coders who want to say .. "It's MY project"? Is the other one written in "the wrong technology"?

There is so much new work out there that we really don't need to spend time on duplicating stuff that already works.

But just to clarify .. Arnaud is doing a great job teaching people about Open Source and code in general. The goal of this project was to teach people stuff, so the most important part is that they have learned something new, and such a project is reasonable in size and feature set to do as teamwork and show people the ropes. But a lot of other projects make the mistake of just starting to write code without looking around, hence wasting valuable time they could spend on new developments.

Jul 19 2006

O'Reilly on Operations ...

Tim's article titled O'Reilly Radar > Operations: The New Secret Sauce is generating a lot of feedback from the LISA folks.

The main statement we are trying to prove wrong is actually from Nat :

Nat replied: "Deployment tools have never been open source's strong point: OS has always been about the developer, rarely about the deployer. cf the hacker's disdain for IT who get stuck with deployment and management. That said, there are some open source tools like nagios (for system monitoring) and capistrano (for rails deployment). The feedback loop there tends to be that the people writing the tools are the ones with the deployment problem. The downside is that if your need isn't met by the tool, it may be hard to get the developer to add it."

Of course, deploying a (physical or virtual) machine with working applications in less than 5 minutes, keeping them highly available and maintaining a couple of hundred of them is nowhere near as sexy as actually writing the 3000 lines of code that are running on those machines. :)

Back when I started working in the Internet industry ages ago, you wrote the code, you tried doing the graphics, you installed and configured the webservers, you had to make sure they were connected to the internet correctly and that it was all secured. We were doing both Java and Perl development, and setting up infrastructures. We were designing both the interior and the outside of a car and building the Highway infrastructure on which it was supposed to drive, from the same office. So when a couple of years later our goal was to build everything with opensource tools, we ran into the issue that nobody was around to set up and deploy your stuff (correctly), so rather than continuing to write the code I decided to first start building the infrastructure to deploy the code on. And that's what I'm still doing today ...

Today lots of the stuff you need IS available, you just need to know where to look and spend some time learning the tools.
Integration is being worked on as we speak, and the foundation of the Open Management Consortium shows that lots of companies are active in the field. Strange that OMC didn't come up in the comments yet .. Even stranger that nobody even mentioned infrastructures.org .

He continued: "The deployment tools tend to be commercial offerings in open source, where Red Hat, IBM, et al. give away the open source operating system and charge like a wounded bull for the management tools. Walking around LinuxWorld Boston two years ago convinced me of this: everyone had management tools. Third party management tools suffer because of the lack of integration. Red Hat at least can pair the management people with the kernel people and get the integration they want. I'm not ready to believe that the Windows server story is 10/10. I'd say the open source story is only 5/10. There's a lot more to be done."

I can be brief about commercial deployment offerings: the ones from the hardware vendors only have one goal : Vendor Lock in, and the closed ones from the (proprietary) software vendors only have one issue.. : They don't do the job.

The opensource offerings however are plenty, and they work, you can integrate installation with imaging , you can combine tools that you have been using for ages to deploy physical machines and keep them for your virtual platforms and you can even use package management tools cross distribution.
And even configuration management tools such as CFengine work cross platform.

The choice is yours... you just have to make it.

What's more, I'm not even sure that the open source community understands just how important this whole area is going to be, so even if the tools and techniques were released, I'm not sure how strong the uptake would be.

Sure, not everybody understands, and there is a long way to go to get the noses pointing in the right direction.. but we are already much further ahead on the Web2.0 kind of management and deployment stuff than on the Large Scale Desktop User Preference management stuff.. there's tons of work to be done there. We've got some good ideas, but we need more hands.

As Luke mentions in one of the comments

I've had a hard time finding people even interested in developing in this space, and I can't help but think that's at least partially because web apps are treated so sexily by entrepreneurs and media companies while operations is ignored.

I just hope some VC guys get the picture :)

There's lots of other interesting discussion in the comments, not entirely coincidentally from people whose blogs I also happen to be reading ..

GeneHack confirms my statements from above

I think the thing that's really getting missed about JMason's comment is the "also see" bit at the end. People have been working on and thinking about the operations side, and it's led to tools like cfengine, puppet, bcfg, etc. They're fairly complicated to deploy (as are AD and WSS and the like), but they're cross-platform, OSS, vendor-neutral -- and once you've got them working, extremely powerful.

Luke Kanies, of Puppet fame, first commented

It would be nice to see O'Reilly spend 1/100th the effort that you spend
on "Web 2.0" companies, advertising projects like Puppet, Radmind, RT,
OpenNMS, and the few other open source operations applications out
there. I still get comments on my O'Reilly cfengine articles from 3
years ago, because they're about the only published works on automation.

Seems like Tim read his posting, as he will now be presenting Puppet at OSCON :)

A few emails later, Nat Torkington notified me that they had some cancellations and thus there was a speaking slot available to me if I were interested.

To be honest, I didn't even bother submitting an abstract on topics such as Automating System Deployment or Taxonomy in Package Management for EuroOSCON, as I was quite confident that the target audience wouldn't match and the paper wouldn't get selected anyhow. I'm interested in the feedback Luke will be getting, and maybe I'll try proposing topics to future O'Reilly conferences.
But hey, if Tim is still looking for related topics for the Brussels edition, I'm available to drive the full 30Km to Brussels to give a talk there :)

Jul 19 2006

K700i as a Remote Mouse

Ages ago when Macbar was giving his MythTV workshop with his Powerbook and using his mobile to flip the pages I was kinda jealous, since I never even thought of doing such a thing on Linux.

So when I got the MacMini to play with I configured my phone as a remote control and it worked in 3 seconds. Somehow I never saw the feature before, or it got installed from the MacMini, dunno, but it worked.

So when a couple of nights ago arlied blogged about his mobile having a really neat bluetooth HID remote control app built in, I figured hey .. that worked with the MacMini .. why not try it with my phone and laptop .. and guess what.. it works.. so no need to be jealous anymore of a feature that Macbar has on his Mac .. it works just as well on my Linux box.. I just didn't know ..

Jul 17 2006

High Availability Storage Foundation , on SLES 10

So last Friday I already teased you people by asking what you get if you mix stuff such as OCFS2, iSCSI, Xen and Linux-HA2; here's the full story.

Guess what.. it's all about the release of SLES 10 and the High Availability Storage Foundation

Last week we sat down with Jo De Baer to discuss and proofread his document on the Novell High Availability Storage Foundation.
Jo spent a lot of time preparing this document and a fully working setup, and we were the happy few actually validating both his document and his setup. Great workshop; we did find some issues with the setup and some errors in the document that got fixed in the meantime, but overall really good work.

The overall concept of the Novell High Availability Storage Foundation is to provide an easier-to-install package including iSCSI, OCFS2, Heartbeat2 and Xen in SLES10. Suse did some work in making sure that OCFS2 and Heartbeat2 were working together nicely and in defining Xen as a Heartbeat resource.

The goal is to create a virtual machine that is actually a cluster resource, so that when there is a problem with it, you just launch another one on another machine. For this you need to be able to access the same filesystem in order to launch the same virtual machine, and that's where the concept of exporting a loopback image pool over iSCSI with OCFS2 comes in.
Novell is clearly going for disk images, something I don't prefer. I love LVM for this and have always been running into issues with loopback. Now they have been optimising and bugfixing the loopback drivers for both performance and stability, so at least that argument should be gone in the future. The idea of being able to move an image around to multiple machines by just copying a file is interesting for testing and playing around, but it doesn't belong in a managed infrastructure that should survive the 10th floor test. Well, at least they don't recommend using loopback images for your real data, just for the OS.

Anyway.. amongst the problems you run into when trying to do such a setup is how to integrate OCFS2 with HA-2.
As OCFS2 is a cluster filesystem with its own membership and heartbeat functions, it doesn't play nice with HA-2 by default. Luckily the OCFS2 CMS is pluggable, so you can disable it and give this function to HeartBeat, which is exactly how we set it up.

The second problem is how to tell HeartBeat that Xen is a resource. Yes, you can shut down your VMs on one node and then restart them on the other manually, but then what's HeartBeat doing? You want HeartBeat to do that for you ..
And one can define Xen as a HeartBeat resource today.
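To make the idea of "a virtual machine as a cluster resource" a bit more concrete, here is a toy sketch in Python. It is not Heartbeat's API or configuration; the `Node` class, the `fail_over` function and the node and VM names are all made up for illustration. It only models the decision the cluster manager takes for you: if the node running the VM dies, start the VM on another healthy node, which works because every node sees the same image on shared storage.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A physical host that can run virtual machines (hypothetical model)."""
    name: str
    alive: bool = True
    running: set = field(default_factory=set)


def fail_over(vm, nodes):
    """Make sure `vm` runs on a healthy node and return that node's name.

    Assumes every node can reach the same VM image on shared storage,
    which is the role the iSCSI + OCFS2 image pool plays in the setup above.
    """
    # A dead node no longer runs anything, so drop the VM from it.
    for node in nodes:
        if not node.alive:
            node.running.discard(vm)
    # If the VM is still running on a healthy node, leave it alone.
    for node in nodes:
        if node.alive and vm in node.running:
            return node.name
    # Otherwise start it on the first healthy node we find.
    for node in nodes:
        if node.alive:
            node.running.add(vm)
            return node.name
    raise RuntimeError("no healthy node left to run " + vm)


nodes = [Node("node1", running={"vm1"}), Node("node2")]
nodes[0].alive = False          # simulate node1 falling off the 10th floor
print(fail_over("vm1", nodes))  # the VM ends up on the surviving node
```

A real cluster manager adds the hard parts this sketch leaves out: detecting that a node is really dead, fencing it, and monitoring the resource after it has been started.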

Of course this setup isn't really ideal. I much prefer having failover between different applications in virtual machines; those things are already in production today, giving you the advantage that you don't have to boot a full machine again before your application is ready, hence less downtime. But still, it's a nice case showing that these different technologies really do work together.

Now the next steps in this kind of setup could be adding multiple virtual machines to a resource group and, rather than shutting down a machine, migrating it to another host based on the definitions of resource groups.

So now that SLES 10 is out, it's no secret anymore what's in there. I'm not going to document everything further in here; Jo did a great job writing a huge document so that even our Sales guy understood what it was all about, so I'll point you to the real juice when Jo puts it online :)

Jul 17 2006

Multi mgmd node MySQL cluster

I finally took the time to upgrade my MySQL cluster to 5.11 beta in order to test some new features.
I have some clusters running with 2 ndb_mgmd nodes, but the ones that are scheduled for upgrade only have 1 management node so far.

So today's goal was getting multinode management up and running on 5.11, and testing the fresh --nowait feature for ndbd.
Digging back through my older docs told me this would be a 15 minute effort: I'd have to change the config file to have 2 management nodes and restart them, then change the connection strings for the ndbd's so that they list 2 nodes. Solved.
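For the connection string part, a small Python sketch of what "listing 2 nodes" means: the data nodes get a comma-separated list of management hosts instead of a single one. The hostnames below are made up, the helper name `ndb_connectstring` is mine, and I'm assuming the management servers listen on the default port 1186.

```python
def ndb_connectstring(mgmd_hosts, port=1186):
    """Join management hosts into a comma-separated NDB connect string.

    `port` is assumed to be the default ndb_mgmd port; adjust it if your
    management servers listen elsewhere.
    """
    if not mgmd_hosts:
        raise ValueError("at least one management node is required")
    return ",".join(f"{host}:{port}" for host in mgmd_hosts)


# Hypothetical hostnames for the two management nodes.
print(ndb_connectstring(["mgm1.example.com", "mgm2.example.com"]))
```

The resulting string is what you would hand to the data nodes, so that they can still find a management server when the first one in the list is down.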

The first thing I ran into was that on the 2nd node I couldn't connect to ndb_mgmd via ndb_mgm. I could connect to it from a remote machine, or while specifying a full connection string with the external ip, but not from the machine itself.
Of course this was a fine dns problem, as it seemed the machine couldn't do lookups for localhost :)
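A quick way to catch this kind of problem before blaming the cluster software is to check whether the box can resolve its own names at all. A minimal sketch in Python; the helper name `resolves` is mine, and the optional `resolver` argument only exists so the check can be tried without touching real DNS.

```python
import socket


def resolves(name, resolver=socket.gethostbyname):
    """Return the address `name` resolves to, or None if the lookup fails."""
    try:
        return resolver(name)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return None


# On a healthy machine this prints an address; on a box with the problem
# described above it prints None.
print(resolves("localhost"))
```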

The second thing I ran into was that even though I saw 1 connection string in my primary management node, it seemed from ndb_mgm that both nodes didn't see each other, really strange.. I eventually decided to start the ndbd nodes without changing my configs and suddenly noticed the 2 nodes started communicating. Mikael then confirmed on the mailing list that this indeed was expected behaviour.

Anyhow .. I also tested the --nowait feature, which gives you the opportunity to boot only 1 cluster node and start using the database, rather than having to wait till the 2nd node is up and running, and that also works like a charm.

Jul 14 2006

iSCSI, Xen, OCFS2 and Heartbeat 2

What do you get when you throw iSCSI, Xen, OCFS2 and Heartbeat 2 on a couple of machines and make them play nicely together ?

Jul 14 2006

Vacation for Geeks

Lars Wirzenius has the perfect way to get geeks away from their screens and on vacation. At least for 2 days ..