Everything is a Freaking DNS problem - monitoring http://127.0.0.1:8080/blog/taxonomy/term/938/0 en Our #monitoringsucks rpm repository is available http://127.0.0.1:8080/blog/our-monitoringsucks-rpm-repository-available <p>Not only have our Rubygems builds changed, but so has my internal #monitoringsucks repository.</p> <p>You might have noticed a variety of vagrant- projects on my GitHub account:</p> <p><a href="http://github.com/KrisBuytaert/vagrant-ganglia" rel="nofollow">http://github.com/KrisBuytaert/vagrant-ganglia</a><br /> <a href="http://github.com/KrisBuytaert/vagrant-graphite" rel="nofollow">http://github.com/KrisBuytaert/vagrant-graphite</a><br /> <a href="http://github.com/KrisBuytaert/vagrant-puppet-logstash" rel="nofollow">http://github.com/KrisBuytaert/vagrant-puppet-logstash</a>,<br /> being the #monitoringsucks part of them. All of those Vagrant projects are basically my test setups to play with those new tools.</p> <p>They contain a bunch of puppet modules that install and configure these tools. 
(Note that they mostly consist of<br /> git submodules pointing to other puppet module repositories.)</p> <p>Given the fact that I also like to have my software cleanly installed from a package, that means that some of these tools had to be packaged, or I had to create a personal / internal repository that makes available the upstream packages that were hiding on the internet.</p> <p>I've forked this repository off the internal Inuits repository so you all can also benefit from these efforts.<br /> (You gotta love pulp :))</p> <p>That means you can now install all of the above-mentioned #monitoringsucks tools from our public repo:</p> <p><div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;">yumrepo { 'monitoringsucks':
  baseurl =&gt; 'http://pulp.inuits.eu/pulp/repos/monitoring',
  descr =&gt; 'MonitoringSuck at Inuits',
  gpgcheck =&gt; '0',
}</pre></div></p> <p>Patches to both the Vagrant projects and the puppet modules are welcome ...</p> http://127.0.0.1:8080/blog/our-monitoringsucks-rpm-repository-available#comments devops ganglia graphite logstash monitoring monitoringsucks puppet repo vagrant Sat, 11 Aug 2012 19:49:39 +0000 Kris Buytaert 1068 at http://127.0.0.1:8080/blog Devops in Munich 
http://127.0.0.1:8080/blog/devops-munich <p>Devopsdays Mountain View sold out in a short 3 hours .. but there are other events that will breathe devops this summer.<br /> DrupalCon in Munich will be one of them ..</p> <p>Some of you might have noticed that I'm co-chairing the devops track for DrupalCon Munich.<br /> The CFP is open till the 11th of this month and we are still actively looking for speakers.</p> <p>We're trying to bridge the gap between Drupal developers and the people that put their code into production, at scale.<br /> We also want to enhance the knowledge of the infrastructure components Drupal developers depend on.</p> <p>We're looking for talks on both culture (success stories as well as failures) and automation,<br /> and specifically for people talking about Drupal deployments, e.g. using tools like Capistrano, Chef or Puppet.<br /> We want to hear where Continuous Integration fits in your deployment, and whether you do Continuous Delivery of a Drupal environment.<br /> And how do you test ... yes, we like to hear a lot about testing: performance tests, security tests, application tests and so on.<br /> ... Or have you solved the content vs code vs config deployment problem yet?</p> <p>How are you measuring and monitoring these deployments and adding metrics to them so you can get good visibility on both<br /> system and user actions of your platform? Have you built fancy dashboards showing your whole organisation the current state of your deployment?</p> <p>We're also looking for people talking about introducing different data backends, nosql, scaling different search backends, or building your own CDN using smart filesystem setups.<br /> Or making smart use of existing backends, such as tuning and scaling MySQL, memcached and others.</p> <p>So let's make it clear to the community that Drupal people do care about their code after they have committed it to source control! 
</p> <p>Please submit your talks <a href="http://munich2012.drupal.org/news/call-for-papers" rel="nofollow">here</a>.</p> http://127.0.0.1:8080/blog/devops-munich#comments cfp configmgmt deployment devops drupal measurement monitoring mysql puppet testing Tue, 01 May 2012 19:02:30 +0000 Kris Buytaert 1065 at http://127.0.0.1:8080/blog #monitoringsucks and we'll fix it ! http://127.0.0.1:8080/blog/monitoringsucks-and-well-fix-it <p>If you are hacking on monitoring solutions and want to talk to your peers who are solving the problem,<br /> block the Monday and Tuesday after FOSDEM in your calendar!</p> <p>That's right: on February 6 and 7 a bunch of people interested in fixing the problem will be meeting, discussing and hacking stuff together in Antwerp.</p> <p>In short, a #monitoringsucks hackathon.</p> <p>Inuits is opening up their offices for everybody who wants to join the effort. Please let us (@KrisBuytaert and @patrickdebois) know if you want to join us in Antwerp.</p> <p>Obviously, if you can't make it to Antwerp you can join the effort on ##monitoringsucks on Freenode or on Twitter.</p> <p>The <a href="http://www.inuits.eu/contact" rel="nofollow">location will be Duboisstraat 50, Antwerp</a>.<br /> It is about a 10-minute walk from Antwerp Central train station.<br /> Depending on traffic, Antwerp is about half an hour north of Brussels, and there are hotels within walking distance of the venue.</p> <p>Plenty of parking space is available on the other side of the park.</p> http://127.0.0.1:8080/blog/monitoringsucks-and-well-fix-it#comments collectd devops ganglia graphite icinga monitoring monitoringsucks munin nagios rrd Tue, 03 Jan 2012 18:23:00 +0000 Kris Buytaert 1057 at http://127.0.0.1:8080/blog Watching the Guards http://127.0.0.1:8080/blog/watching-guards <p>A couple of weeks ago I noticed a weird drop in web usage stats on the site you are browsing now. Kinda weird, as the drop was right around FOSDEM, when usually there is a spike in traffic. 
<img src="http://www.krisbuytaert.be/images/PiwiwFail.png" /></p> <p>So before you start.. no, I don't practice what I preach on my own blog. It's a blog dammit, so I do the occasional upgrades on the actual platform, with backups available, do some sanity tests and move on. Yes, I break the theme pretty often, but y'all are reading this through RSS anyhow.</p> <p>My backups showed me that drush had made a copy of the Piwik module somewhere in early February, exactly when this drop started showing. I verified the module, I verified my Piwik - Oh Piwik you say .. yes Piwik, if you want a free alternative to Google Analytics, Piwik rocks .. - I even checked other sites using the same Piwik setup and they were all still functional, happily humming and being analyzed.... everything fine ... but traffic stayed low ..</p> <p><a href="http://drupal.org/node/1052430">This</a> taught me I actually had to upgrade my Piwik too ...</p> <p>So that brings me to the point I actually want to make...<br /> As @patrickdebois asks in his chapter on monitoring, "Quis custodiet ipsos custodes?": who's monitoring the monitoring tools, and who's monitoring the analytics tools?</p> <p>So not only should you monitor the availability of your monitoring tools, you should also monitor whether their API has changed in some way or another.<br /> Just like when you are monitoring a web app, you shouldn't just see if you can connect to the appropriate http port; you should also be checking that you get sensible results back from it, no gibberish.</p> <p>But then again ... 
there's no revenue in my blog or its statistics :)</p> http://127.0.0.1:8080/blog/watching-guards#comments devops drupal monitoring piwik Thu, 10 Mar 2011 18:44:56 +0000 Kris Buytaert 1038 at http://127.0.0.1:8080/blog Better days Arrive when Dev Meet Ops http://127.0.0.1:8080/blog/better-days-arrive-when-dev-meet-ops <p>A couple of weeks ago <a href="http://twitter.com/LinuxScribe" rel="nofollow">Brian Proffitt</a> pinged me for a chat about Devops. The result of that chat, his article, can now be found on the Zenoss blog; it's titled <a href="http://community.zenoss.org/blogs/zenossblog/2010/03/05/datacenter-barometer-better-days-arrive-when-dev-meets-ops" rel="nofollow">Datacenter Barometer: Better days arrive when dev meets ops</a>.</p> <p>It's a very nice read with some pointers to places regular readers of my blog should already know ;)<br /> So with lots of leading Open Source infrastructure companies on different levels, such as config management (OpsCode and Reductive Labs), monitoring (Zenoss), deployment (openQRM, RPath), and obviously <a href="http://www.inuits.be">Consultancy</a> companies, the upcoming Devops conferences around the planet promise to be a lot of fun ! ;)</p> <p>Oh, and apparently there is some more on the story on <a href="http://slashdot.org/submission/1186968/" rel="nofollow">/.</a></p> http://127.0.0.1:8080/blog/better-days-arrive-when-dev-meet-ops#comments devops ha krisbuytaert monitoring opensource zenoss Sat, 06 Mar 2010 13:03:27 +0000 Kris Buytaert 991 at http://127.0.0.1:8080/blog Implementing Raid Monitoring on a 3Ware 3w-9xxx based controller. 
http://127.0.0.1:8080/blog/implementing-raid-monitoring-3ware-3w-9xxx-based-controller <p>When you pull a disk out of your RAID setup, it shows a warning in syslog:</p> <p><div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;">Jan 27 10:18:22 EL860 kernel: 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0019): Drive removed:port=1.
Jan 27 10:18:22 EL860 kernel: 3w-9xxx: scsi0: AEN: ERROR (0x04:0x0002): Degraded unit:unit=0, port=1.</pre></div></p> <p>However, if no one is looking at syslog that won't really be helpful.</p> <p>3Ware provides a tool on their site called tw_cli, which can be used to manage<br /> the RAID setup from the command line.</p> <p><div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;">[EL860-root@EL860 admin]# tw_cli /c0 show

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    REBUILDING     41%     -       -       232.82    RiW    ON

VPort Status         Unit Size      Type  Phy Encl-Slot    Model
------------------------------------------------------------------------------
p0    OK             u0   232.88 GB SATA  0   -            ST3250310NS
p1    DEGRADED       u0   232.88 GB SATA  1   -            ST3250310NS</pre></div></p> <p>I figured I'd either have to write a wrapper script around that or find some other way of integrating it.<br /> Asking the question on ##infra-talk on irc.freenode.net gave me the following link to a <a href="http://github.com/stanaka/check_tw" rel="nofollow">check script</a> on GitHub:</p> <p><cite>koollman: sdog: something like <a href="http://github.com/stanaka/check_tw" title="http://github.com/stanaka/check_tw" rel="nofollow">http://github.com/stanaka/check_tw</a> should work. 
</cite></p> <p>With that script configured as an external command in your snmpd.conf, you can get the info via SNMP:</p> <p><div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;">[root snmp]# snmpwalk localhost -v 2c -c public .1.3.6.1.4.1.2021 | grep ext
UCD-SNMP-MIB::extIndex.1 = INTEGER: 1
UCD-SNMP-MIB::extNames.1 = STRING: TW_RAID
UCD-SNMP-MIB::extCommand.1 = STRING: /usr/local/sbin/check_tw
UCD-SNMP-MIB::extResult.1 = INTEGER: 2
UCD-SNMP-MIB::extOutput.1 = STRING: CRITICAL: Unit: u0, Type: RAID-1, Status: REBUILDING
UCD-SNMP-MIB::extErrFix.1 = INTEGER: 0
UCD-SNMP-MIB::extErrFixCmd.1 = STRING: 
UCD-SNMP-MIB::ssSysContext.0 = INTEGER: 2073
UCD-SNMP-MIB::ssRawContexts.0 = Counter32: 11781783
UCD-DLMOD-MIB::dlmodNextIndex.0 = INTEGER: 1</pre></div></p> http://127.0.0.1:8080/blog/implementing-raid-monitoring-3ware-3w-9xxx-based-controller#comments 3w-9xxx 3ware didimentionihateraid linux monitoring opensource raid snmp Thu, 28 Jan 2010 18:41:42 +0000 Kris Buytaert 981 at http://127.0.0.1:8080/blog Monitoring MySQL http://127.0.0.1:8080/blog/monitoring-mysql-0 <p><a href="http://ronaldbradford.com/blog/what-alert-monitoring-do-you-use-2009-10-07/" rel="nofollow">Ronald Bradford</a> wants to know what kind of monitoring you use..<br /> He specifically wants to know about alerting tools.</p> <p>There are different cases. Looking at it from a full infrastructure point of view, my current favourite is Zabbix or good old Nagios.</p> <p>But when looking at it from a debugging perspective you have MySQLAR or Hyperic, but those aren't in the alerting list.</p> <p>However, when you are building HA clusters, you have custom scripts running either from mon or from pacemaker ..</p> <p>Still .. 
Ronald probably wants more input :)</p> http://127.0.0.1:8080/blog/monitoring-mysql-0#comments ha monitoring msql zabbix zenos Sun, 11 Oct 2009 18:18:04 +0000 Kris Buytaert 945 at http://127.0.0.1:8080/blog MonitoringForge http://127.0.0.1:8080/blog/monitoringforge <p>I found the initial announcement in my Spambox; that's where I send all the mails from GroundWork, as unsubscribing from their marketing lists doesn't seem to work ..</p> <p><a href="http://www.adventuresinoss.com/?p=1117" rel="nofollow">Tarus</a> pointed it out.. he already mentioned marketing, low community input, etc..</p> <p>Nevertheless I had a look, and my first and last impression was: NagiosForge, no active community, and empty pages on other imvho more relevant Monitoring Projects created by GroundWork employees.</p> <p><a href="http://www.johnmwillis.com/" rel="nofollow">Big John</a> coined the "to pull a GroundWork" phrase aiming at their early products, err logo placements. When <a href="http://www.inuits.be">we</a> were evaluating different Open Source monitoring tools for our OLS paper about 18 months ago, we looked at the tool and it didn't make the cut .. it failed in all the areas where it claimed to be strong: ease of installation, error handling during the installation, etc ... Some time later another colleague tried again and ran into similar problems.</p> <p>Yet some people claim it to be a great product.. I'm not a fan yet ... maybe one day they'll manage to convince me .. but for now<br /> When I need Nagios .. 
I'll be looking at <a href="http://djagios.org/" rel="nofollow">Djagios</a>, <a href="http://opsview.org/" rel="nofollow">Opsview</a>, <a href="http://www.icinga.org/" rel="nofollow">Icinga</a> or the real thing.</p> <p>However, there's still Hyperic, Zenoss and Zabbix :)</p> http://127.0.0.1:8080/blog/monitoringforge#comments hyperic icinga look no link monitoring nagios zabbix zenos Wed, 23 Sep 2009 19:20:57 +0000 Kris Buytaert 939 at http://127.0.0.1:8080/blog Zenoss, the Book , the review http://127.0.0.1:8080/blog/node/735 <p>As mentioned <a href="http://www.krisbuytaert.be/blog/node/726">earlier</a>, Packt sent me their <a href="http://www.packtpub.com/zenoss-core-network-and-system-monitoring/book" rel="nofollow">Zenoss Core Network and System Monitoring</a> by <a href="http://badgerfiles.com/blog/2008/09/24/zenoss-core-book-excerpts/" rel="nofollow">Michael Badger</a> to review.</p> <p>So I spent the past couple of weeks trying to crawl my way through it. Not that there was any problem with the book, more with my schedule :) I was hoping to finish it some time ago but time was working against me; even last week, when my flight back from Zurich was a bit shorter than planned, I couldn't finish reading the book. </p> <p>But yesterday I managed. Michael has written a good book on a hot topic during what was probably one of the busiest periods of his life. </p> <p>In 8 chapters Michael manages to explain the different aspects of Zenoss fairly well; he even touches on the creation of ZenPacks and extending Zenoss in general. </p> <p>It was a fairly good read and showed me a couple of new insights into Zenoss.</p> <p>However, although Zenoss is heavily SNMP based, there is little information on SNMP in general, or on MIBs in particular. People really new to systems monitoring might want to have a couple more pages on these topics. 
</p> <p>For those not familiar with Zope, the mention of TALES expressions might also raise some eyebrows; luckily there is a full Appendix dedicated to it. </p> <p>Time permitting, I would have hoped he could have gone into a bit more detail on creating a ZenPack that did a bit more than checking the status of a page. For example, looking into the status of Apache, MySQL or another relevant daemon, or Squid's internal statistics. </p> <p>But if you are new to Zenoss and you want to have a quick headstart into monitoring your infrastructure with it, this book is a really really good start.</p> <p><a href="http://www.scribd.com/doc/5371809/Zenoss-Core-Network-and-System-Monitoring-Sample-Chapter" rel="nofollow">There's a free preview Chapter at Scribd.com</a> and also at <a href="http://www.packtpub.com/article/user-management-in-zenoss" rel="nofollow">Packt</a>.</p> http://127.0.0.1:8080/blog/node/735#comments bookreview monitoring zenoss Mon, 29 Sep 2008 19:39:20 +0000 Kris Buytaert 735 at http://127.0.0.1:8080/blog Where are the Virtualization Zenpacks ? http://127.0.0.1:8080/blog/node/715 <p>Zenoss has announced the <a href="http://blog.zenoss.com/2008/08/20/the-submit-a-zenpack-contest-entries/" rel="nofollow">entries for their zenpack contest</a>. What strikes me is that there is not a single entry that plans on monitoring Xen, KVM, VirtualBox or any of the other Open Source virtualization technologies .. Not in the contest entries and not in the <a href="http://www.zenoss.com/community/projects/zenpacks/" rel="nofollow">existing community packs</a>.</p> <p><a href="http://wiki.xensource.com/xenwiki/XenApi" rel="nofollow">This</a> sounds like a big opportunity to me :)</p> http://127.0.0.1:8080/blog/node/715#comments monitoring open source opensource virtualization xen zenoss Mon, 25 Aug 2008 19:07:20 +0000 Kris Buytaert 715 at http://127.0.0.1:8080/blog