Jan 18 2007

Emerging HPC Technologies

LinkedIn has a question about emerging HPC technologies. As not everyone is on LinkedIn, I'll crosspost my reply here.

First of all, there is no general purpose, one-size-fits-all type of HPC solution, so this question is a bit vague. Depending on your type of data, your type of problem and your type of application, different alternatives surface.

People tend to split HPC up into Beowulf and SSI style solutions on one hand, and Grid on the other.
Of course it's not that black and white: Beowulf and SSI style went hybrid ages ago, and lots of typical SGE environments are being used on a Beowulf style cluster today.
Everything depends on what you need. Do you have an MPMD kind of application, or do you need long and heavy processing power on one monolithic block?
There are lots of developments in both areas.

I see mostly evolution in existing projects. Look at XenoServers, from where Xen originated. Look at Qlusters with their HotMigration product (or even an openMosix cluster managed by openQRM), or look at what Renaud Lottiaux is doing with Kerlabs.

Everywhere you see different virtualisation techniques involved, both upscaling to an SSI and downscaling to a point where you will be migrating minimal virtual machines (even JVMs or database instances) over your network to the best resource to tackle them. This will lead to an environment where one can distribute load independent of the platform the machine was originally aimed for, and management of those platforms will become easier. Virtualisation will also help the 5-to-9 cluster: just hibernate the desktop OS, unpause the HPC instance and off you go. But there's a long way to go till everyone has their personal unlimited CPU resources.

Jan 17 2007

Synergy

Synergy rocks!

It just works, cross-distro. In the middle I have my laptop running FC6, on my left the screen is connected to my rack running a Xen host on FC5, and on my right the iMac is humming on Ubuntu. I now have 3 screens at home and I'm only typing on 1 keyboard, using 1 mouse. It took me no more than 10 minutes to download the appropriate packages (both FC and Ubuntu ship with Synergy in their repos), configure it on 3 machines and start working with multiple heads at home.
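For reference, a minimal synergy.conf for a three-head layout like mine could look like this (the screen names are made up; they have to match your hostnames):

```
section: screens
    rack:
    laptop:
    imac:
end

section: links
    laptop:
        left  = rack
        right = imac
    rack:
        right = laptop
    imac:
        left = laptop
end
```

Run `synergys -c synergy.conf` on the machine with the real keyboard (the laptop here) and `synergyc laptop` on the other two.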

Our cat really loves it.. she can now lie on the unplugged keyboard in front of the screen while I type away on my main keyboard.

Jan 16 2007

Joost

The Venice Project is now Joost

We actually have a colleague at the office called Joost. We often suggest people go ask him the question.. strange how nobody ever actually does.

But the social aspect can be solved easily. Just as we get suggestions on which music we should listen to on Pandora and Last.FM, we'll now get suggestions on which shows we have to watch.

And Joe Average will probably keep going to Benidorm and watching VTM. While we are already watching more selectively, and watching when we want to watch. Oh that's right.. you still don't have your MythBox :)

Jan 07 2007

Convert, the New XenMan

I totally missed XenMan rebranding to Convert, and they seem to have Debian/Ubuntu packages now too.

I should keep a better eye on them.

Jan 06 2007

AC/OS: What would Novell buy?

Matt (AC) wonders what Novell would buy:

A conversation with a friend today made me wonder, "What will Novell buy?" Word on the street is that the company has $200M or so burning a hole in its pocket, with Altiris and XenSource as potential acquisition candidates. If there's any fire behind the smoke, both deals could make sense, albeit for very different reasons.

I wonder, does XenSource want to be bought? When talking to Peter Levine in Amsterdam early December, I wasn't under the impression they were interested in being bought any time soon. If you look at his previous track record at Veritas, I think he's someone who wants to build something big, not sell something small.

Jan 03 2007

Calendar++

I think Brian Aker has a point:

Every website that has events should publish an ical file. Just like they publish RSS, they should publish their events based on dates. ICS files with VCALENDAR entries are easy to publish.
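They really are easy: a complete, subscribable calendar can be as small as the sketch below (the UID and product ID are invented; the dates are FOSDEM 2007's real weekend):

```
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//example.com//events//EN
BEGIN:VEVENT
UID:fosdem2007@example.com
DTSTART:20070224T090000Z
DTEND:20070225T170000Z
SUMMARY:FOSDEM 2007
LOCATION:Brussels
END:VEVENT
END:VCALENDAR
```

Serve it with MIME type text/calendar and any iCal-capable client can subscribe to it, just like an RSS feed.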

I should have a look at Upcoming.org, but then we can't expect everyone to fill in their events there. And I also lost track of the events I wanted to publish there but never got around to doing so.

But in general, as I mentioned earlier, solving the calendar problem isn't easy.

Jan 03 2007

Next Generation Radio

Dear Radio Manufacturer:

I want to talk to you about the new features in your next generation car radios. I have a favorite channel, but sometimes they play music that I really can't tolerate. I then switch to my CD player, usually forgetting to switch back for the rest of the trip, hence missing all the other good songs I appreciate listening to.

What I want from you is 1 button. A button I push when I don't like the music that is being played. There should be some way to tell upfront how long a song will last, so you pick a random track of about the same length, play it and then return control back to the radio station. Ideally, of course, you have multiple tracks, and you learn which kind of music I hate and never play it again. Pandora already does that for me online.

So what's keeping you? I'm happy to test those features for you!

Jan 02 2007

One less domain to get spam..

No more MX for stone-it.be..

By now the world should know we rebranded to x-tend.be; it's been over 3 years already!

Jan 02 2007

How to manage parsable code in a large infrastructure.

I already touched on the topic in the previous article, but I've been wondering: how do you ideally manage, say, Perl or PHP code in a large environment?
As noted earlier, I want a system to be reproducible from version control, and I want to be able to identify the source of every single file on my system, either as belonging to a package or as being transferred there from my version control platform.

For ease of reading, let's say we stick to mass-deploying PHP code. Libraries most certainly belong in packages. One can discuss where your own code belongs.
Typically PHP code, HTML layout and stylesheets are small files that are modified manually by a developer, and often really small changes are made to those files.
These changes don't need to be recompiled in order to be used, so shops that use this kind of code don't always have a tradition of build and release management.

I'm thinking about abusing configuration management tools to manage code, and just considering code as a configuration file.
The tool could be Puppet, cfengine or your poison of choice, but it should be a tool that makes it as easy for you to update 1 file on 1 machine as 1 file on 2000 machines. Amongst the disadvantages of using these file-based management tools:

  • Unclear file origin: you no longer have a direct link between a file on the machine and where it came from, as it doesn't show up in a package list and rpm -qf doesn't help you either.
  • You need a tool: most people these days know either rpm or dpkg, either apt or yum; using a tool such as cfengine or Puppet makes it more complex for new people to step in.
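As a sketch of what that abuse would look like, here is a hypothetical Puppet resource (the paths, server name and file names are all made up) that treats one PHP file as just another config file:

```
# Hypothetical: manage a single PHP file like any other config file.
file { "/var/www/html/index.php":
    source => "puppet://puppetserver/files/webapp/index.php",
    owner  => "apache",
    group  => "apache",
    mode   => "644",
}
```

Scoping such a resource to one node or to all of them is exactly what makes a 1-machine update and a 2000-machine update equally cheap.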

The disadvantages of packaging code:

  • You have a less granular way of defining which files need to be modified; it's an all or nothing approach. However, you can define how small your packages are.
  • You need a build framework to build the packages; of course this can be automated.
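For comparison, the build framework doesn't have to be heavy. A minimal, hypothetical RPM spec for a tree of PHP files (all names invented) could be as small as:

```
Name:      webapp-php
Version:   1.0
Release:   1
Summary:   PHP code for the webapp site
License:   GPL
Group:     Applications/Internet
Source0:   %{name}-%{version}.tar.gz
BuildArch: noarch
BuildRoot: %{_tmppath}/%{name}-%{version}-root

%description
PHP, HTML and stylesheet files for the webapp site, built from a CVS tag.

%prep
%setup -q

%install
mkdir -p $RPM_BUILD_ROOT/var/www/html
cp -a . $RPM_BUILD_ROOT/var/www/html/

%files
/var/www/html
```

Derive Version and Release from your CVS tag and rpm -qf will tell you exactly which release every file on disk came from.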

If you use tagging wisely, you're sure that either way you know which version is on the production platform.
Most configuration management systems have features that allow you to verify whether a file has changed, or to trigger a certain action after you update it, so that's not an issue.
Is using a package per application really that much overhead? If you already have the infrastructure in place to build your packages, then what's the extra effort?

I've used both approaches so far, a lot depends on the culture of the developers team and the already existing infrastructure.

So what's your opinion ?

Jan 02 2007

IBM on hosting Drupal sites | Dries Buytaert

I must be way out of IBM's target audience with regard to their Drupal hosting article Dries pointed us to.

Let me say that again: I was probably not their target audience! So my opinion on the article isn't really relevant. But still, here's what bothers me in it.

The article explains how to create a virtual machine with Drupal and how to deploy it later in a webhosting infrastructure. A very good idea; however, the way they explain how to do it will lead to problems when one eventually wants to scale further. Some basic rules of large scale deployment have been forgotten (but as said, that was not the audience). Nowhere in the article are reproducibility or scalability mentioned.

Early in the article I read that they were disappointed they couldn't use Xen because the management tools were not as polished and robust as they would like.
They couldn't save enough time by using Xen, but had enough time to create a zillion screenshots from a GUI? For me, manageability means that I can script everything and I don't need a GUI. Actually, having to use a GUI to configure stuff means it's NOT manageable at all, certainly when going to large scale installs. Imagine having to click 100 times on a start button; that just doesn't scale.

But not all is bad in the article. Using LVM is good, very good. Using CVS: good, and tagging: good. Checking out from CVS on the production server: not really good.
As long as you have 1 server it's fine, but as soon as you have more servers, you will one day have a developer fixing a bug on machine 1, even committing it, but it never gets deployed to machines 2, 3 and 4. So it's a good habit not to allow it for starters, as old habits die hard :)

One should have a mechanism in place that pushes new updates to the servers, or have the servers periodically pull the updates. Which brings me to another topic: should I package parsable code, or should I keep the files in a repository file by file and have a method to transfer those? Hmm.. that's stuff for a different posting.
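Such a push mechanism doesn't have to be fancy. A sketch in shell, printed as a dry run (the tag, module and host names are made up; drop the echo prefixes to really run it against your own CVS server and hosts):

```shell
#!/bin/sh
# Dry run of a tag-based push deploy: export the tagged release once,
# then push the exact same tree to every web server.
TAG=REL_1_0
MODULE=website
SERVERS="web1 web2 web3"

# Export (not checkout) so no CVS metadata ends up on the servers.
echo cvs export -r "$TAG" -d "/tmp/$MODULE-$TAG" "$MODULE"

# Every server gets byte-for-byte the same tagged tree.
for host in $SERVERS; do
    echo rsync -a --delete "/tmp/$MODULE-$TAG/" "$host:/var/www/html/"
done
```

Because everything flows from one tag through one script, machine 2, 3 and 4 can never silently miss the fix that was made on machine 1.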

Next up is a really bad habit, with all the disadvantages of transferring a full disk image versus recreating that same image on a different machine. It's the VMware idea of deploying: just copy a big file to another disk and start the virtual machine there. Good for testing, demos etc., but not for large scale deployment. Use real LVM on the machine and deploy there, reproducibly.
The biggest problem with transferring a disk is that you never know how you got there, and you'll never be able to reproduce what you did.
My paper on Automating Virtual Machine Deployment explains how you can do this; mind you, it's about the methodology, not the tools.
It doesn't matter if you use FAI, Kickstart or SystemImager, but USE them!

This article is a good start for developers, but it's a no-go for an experienced system admin. Working like this just doesn't scale.