monitoring

Aug 11 2012

Our #monitoringsucks rpm repository is available

Not only our Rubygems Builds have changed, but also my internal #monitoringsucks repository.

You might have noticed a variety of vagrant- projects on my GitHub account:

http://github.com/KrisBuytaert/vagrant-ganglia
http://github.com/KrisBuytaert/vagrant-graphite
http://github.com/KrisBuytaert/vagrant-puppet-logstash

These are the #monitoringsucks part of them. All of those Vagrant projects are basically my test setups to play with those new tools.

They contain a bunch of puppet modules that install and configure these tools. (Note that they mostly consist
of git submodules pointing to other puppet module repositories.)

Since I also like to have my software cleanly installed from a package, some of these tools had to be packaged, or I had to create a personal / internal repository carrying upstream packages that were hiding somewhere on the internet.

I've forked this repository off the internal Inuits repository so you can all benefit from these efforts.
(You gotta love pulp :))

That means you can now install all of the above-mentioned #monitoringsucks tools from our public repo with:

  yumrepo { 'monitoringsucks':
    baseurl  => 'http://pulp.inuits.eu/pulp/repos/monitoring',
    descr    => 'MonitoringSuck at Inuits',
    # gpgcheck is disabled: packages from this repo are not signature-checked
    gpgcheck => '0',
  }

Patches to both the Vagrant projects and the puppet modules are welcome ...

May 01 2012

Devops in Munich

Devopsdays Mountainview sold out in a short 3 hours, but there are other events that will breathe devops this summer.
DrupalCon in Munich will be one of them.

Some of you might have noticed that I'm co-chairing the devops track for DrupalCon Munich.
The CFP is open till the 11th of this month and we are still actively looking for speakers.

We're trying to bridge the gap between Drupal developers and the people who put their code into production, at scale.
We also want to improve knowledge of the infrastructure components Drupal developers depend on.

We're looking for talks on culture (both success stories and failures) and on automation.
We are specifically looking for people talking about Drupal deployments, e.g. using tools like Capistrano, Chef or Puppet.
We want to hear where Continuous Integration fits in your deployment, and whether you do Continuous Delivery of a Drupal environment.
And how do you test? Yes, we'd like to hear a lot about testing: performance tests, security tests, application tests and so on.
Or have you solved the content vs code vs config deployment problem yet?

How are you measuring and monitoring these deployments, and adding metrics to them so you get good visibility on both
system and user actions on your platform? Have you built fancy dashboards showing your whole organisation the current state of your deployment?

We're also looking for people talking about introducing different data backends, NoSQL, scaling different search backends, or building your own CDN using smart filesystem setups.
Or about making smart use of existing backends, such as tuning and scaling MySQL, memcached and others.

So let's make it clear to the community that Drupal people do care about their code after they have committed it to source control!

Please submit your talks here

Jan 03 2012

#monitoringsucks and we'll fix it !

If you are hacking on monitoring solutions and want to talk to your peers who are solving the same problem,
block the Monday and Tuesday after FOSDEM in your calendar!

That's right: on February 6 and 7 a bunch of people interested in fixing the problem will be meeting, discussing and hacking stuff together in Antwerp.

In short, a #monitoringsucks hackathon.

Inuits is opening up their offices for everybody who wants to join the effort. Please let us (@KrisBuytaert and @patrickdebois) know if you want to join us in Antwerp.

Obviously if you can't make it to Antwerp you can join the effort on ##monitoringsucks on Freenode or on Twitter.

The location will be Duboisstraat 50, Antwerp.
It is about a 10 minute walk from Antwerp Central train station.
Depending on traffic, Antwerp is about half an hour north of Brussels, and there are hotels within walking distance of the venue.

Plenty of parking space is available on the other side of the Park

Mar 10 2011

Watching the Guards

A couple of weeks ago I noticed a weird drop in web usage stats on the site you are browsing now. Kinda weird as the drop was right around Fosdem when usually there is a spike in traffic.

So before you start: no, I don't practice what I preach on my own blog. It's a blog, dammit, so I do the occasional upgrades on the actual platform with backups available, do some sanity tests and move on. Yes, I break the theme pretty often, but y'all are reading this through RSS anyhow.

My backups showed me that drush had made a copy of the Piwik module somewhere in early February, exactly when this drop started showing. I verified the module and I verified my Piwik (Piwik, you say? Yes, Piwik: if you want a free alternative to Google Analytics, Piwik rocks). I even checked other sites using the same Piwik setup, and they were all still functional, happily humming along and being analyzed. Everything looked fine, but traffic stayed low.

This taught me I actually had to upgrade my Piwik too ...

So that brings me to the point I actually want to make.
As @patrickdebois asks in his chapter on Monitoring, "Quis custodiet ipsos custodes?": who's monitoring the monitoring tools, and who's monitoring the analytics tools?

So not only should you monitor the availability of your monitoring tools, you should also monitor whether their API has changed in some way or another.
Just like when you are monitoring a web app, you shouldn't just see if you can connect to the appropriate HTTP port; you should also check that you get sensible results back from it, not gibberish.
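
To make that concrete, here is a minimal sketch in Python of a check that looks at the content and not just the connection. The function names and the idea of searching for a marker string (for instance the name of a tracking script the page is supposed to include) are assumptions made for illustration, not something taken from an existing plugin.

```python
import urllib.request


def looks_sane(status, body, expected_marker):
    """Return True only if the request succeeded AND the body contains the marker."""
    return status == 200 and expected_marker in body


def check_url(url, expected_marker, timeout=10):
    """Fetch the page and judge what came back, not just whether the port answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return looks_sane(response.status, body, expected_marker)
    except OSError:
        # Covers refused connections, timeouts, DNS failures and HTTP errors.
        return False
```

A call such as `check_url("http://example.org/", "piwik.js")` would then fail not only when the site is down, but also when the page is served without the expected snippet.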

But then again ... there's no revenue in my blog or its statistics :)

Mar 06 2010

Better Days Arrive when Dev Meets Ops

A couple of weeks ago Brian Proffitt pinged me for a chat about Devops. The result of that chat, his article, can now be found on the Zenoss blog; it's titled "Datacenter Barometer: Better days arrive when dev meets ops".

It's a very nice read with some pointers to places regular readers of my blog should already know ;)
So with lots of leading Open Source infrastructure companies on different levels, such as config management (OpsCode and Reductive Labs), monitoring (Zenoss) and deployment (openQRM, RPath), and obviously consultancy companies, the upcoming Devops conferences around the planet promise to be a lot of fun! ;)

Oh, and apparently there is some more on the story on /.

Jan 28 2010

Implementing Raid Monitoring on a 3Ware 3w-9xxx based controller.

When you pull a disk out of your RAID setup, a warning shows up in syslog:

  Jan 27 10:18:22 EL860 kernel: 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0019): Drive removed:port=1.
  Jan 27 10:18:22 EL860 kernel: 3w-9xxx: scsi0: AEN: ERROR (0x04:0x0002): Degraded unit:unit=0, port=1.

However, if no one is looking at syslog, that won't really be helpful.

3Ware provides a tool from their site called tw_cli which can be used to manage
the raid setup from the command line.

  [EL860-root@EL860 admin]# tw_cli /c0 show

  Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
  ------------------------------------------------------------------------------
  u0    RAID-1    REBUILDING     41%     -       -       232.82    RiW    ON

  VPort Status         Unit Size      Type  Phy Encl-Slot    Model
  ------------------------------------------------------------------------------
  p0    OK             u0   232.88 GB SATA  0   -            ST3250310NS
  p1    DEGRADED       u0   232.88 GB SATA  1   -            ST3250310NS

I figured I'd either have to write a wrapper script around that or find some other way of integrating it.
Asking the question on ##infra-talk on irc.freenode.net gave me the following link to a check script on GitHub:

koollman: sdog: something like http://github.com/stanaka/check_tw should work.

With that script registered as an external command in your snmpd.conf (it shows up as TW_RAID in the output below), you can get the info via SNMP:

  [root snmp]# snmpwalk localhost -v 2c -c public .1.3.6.1.4.1.2021 | grep ext
  UCD-SNMP-MIB::extIndex.1 = INTEGER: 1
  UCD-SNMP-MIB::extNames.1 = STRING: TW_RAID
  UCD-SNMP-MIB::extCommand.1 = STRING: /usr/local/sbin/check_tw
  UCD-SNMP-MIB::extResult.1 = INTEGER: 2
  UCD-SNMP-MIB::extOutput.1 = STRING: CRITICAL: Unit: u0, Type: RAID-1, Status: REBUILDING
  UCD-SNMP-MIB::extErrFix.1 = INTEGER: 0
  UCD-SNMP-MIB::extErrFixCmd.1 = STRING:
  UCD-SNMP-MIB::ssSysContext.0 = INTEGER: 2073
  UCD-SNMP-MIB::ssRawContexts.0 = Counter32: 11781783
  UCD-DLMOD-MIB::dlmodNextIndex.0 = INTEGER: 1
Oct 11 2009

Monitoring MySQL

Ronald Bradford wants to know what kind of Monitoring you use..
He specifically wants to know about Alerting tools

There are different cases. Looking at it from a full infrastructure point of view, my current favourite is Zabbix or good old Nagios.

But when looking at it from a debugging perspective you have MySQLAR or Hyperic, but those aren't in the alerting list.

However, when you are building HA clusters, you have custom scripts running either from mon or from pacemaker ..

Still .. Ronald probably wants more input :)

Sep 23 2009

MonitoringForge

I found the initial announcement in my Spambox; that's where I send all the mails from GroundWork, as unsubscribing from their marketing lists doesn't seem to work ..

Tarus pointed it out; he already mentioned the marketing, the low community input, etc.

Nevertheless I had a look, and my first and last impression was: NagiosForge, no active community, and empty pages on other, imvho more relevant, Monitoring Projects created by GroundWork employees.

Big John coined the phrase "to pull a GroundWork", aiming at their early products, err, logo placements. When we were evaluating different Open Source monitoring tools for our OLS paper about 18 months ago, we looked at the tool and it didn't make the cut: it failed in the areas where it claimed to be strong, such as ease of installation, and it lacked error handling during the installation. Some time later another colleague tried again and we ran into similar problems.

Yet some people claim it to be a great product. I'm not a fan yet; maybe one day they'll manage to convince me, but for now,
when I need Nagios, I'll be looking at Djagios, Opsview, Icinga or the real thing.

However, there's still Hyperic, Zenoss and Zabbix :)

Sep 29 2008

Zenoss, the Book , the review

As mentioned earlier, Packt sent me their Zenoss Core Network and Systems Monitoring by Michael Badger to review.

So I spent the past couple of weeks trying to crawl my way through it. Not that there was any problem with the book, more with my schedule :) I was hoping to finish it some time ago but time was working against me; even last week, when my flight back from Zurich was a bit shorter than planned, I couldn't finish reading the book.

But yesterday I managed. Michael has written a good book on a hot topic during what was probably one of the busiest periods of his life.

In 8 chapters Michael manages to explain the different aspects of Zenoss fairly well; he even touches on the creation of ZenPacks and on extending Zenoss in general.

It was a fairly good read and showed me a couple of new insights into Zenoss.

However, although Zenoss is heavily SNMP based, there is little information on SNMP in general and on MIBs in particular. People really new to systems monitoring might want a couple more pages on these topics.

For those not familiar with Zope, the mention of TALES expressions might also raise some eyebrows; luckily a full Appendix has been dedicated to them.

Time permitting, I would have hoped he could have gone into a bit more detail on creating a ZenPack that does more than check the status of a page, such as looking into the status of Apache, MySQL or another relevant daemon, or the Squid internal statistics.

But if you are new to Zenoss and you want a quick head start into monitoring your infrastructure with it, this book is a really, really good start.

There's a free preview Chapter at Scribd.com and also at Packt

Aug 25 2008

Where are the Virtualization Zenpacks ?

Zenoss has announced the entries for their ZenPack contest. What strikes me is that there is not a single entry that plans on monitoring Xen, KVM, VirtualBox or any of the other Open Source virtualization technologies: not in the contest entries and not in the existing community packs.

This sounds like a big opportunity to me :)