
Oct 11 2009

Monitoring MySQL

Ronald Bradford wants to know what kind of monitoring you use.
He specifically wants to know about alerting tools.

There are different cases. Looking at it from a full infrastructure point of view, my current favourite is Zabbix, or good old Nagios.

But when looking at it from a debugging perspective you have MySQLAR or Hyperic, though those aren't in the alerting list.

However, when you are building HA clusters, you also have custom check scripts running either from mon or from Pacemaker.
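Those custom checks are usually nothing fancy. A minimal sketch of such a script, with purely hypothetical credentials and host, that mon or a Pacemaker resource agent's monitor action could call:

  #!/bin/sh
  # Minimal MySQL liveness check -- credentials and host below are examples only
  MYSQL_USER="monitor"
  MYSQL_PASS="secret"
  MYSQL_HOST="127.0.0.1"

  # mysqladmin ping exits non-zero when the server is unreachable
  if mysqladmin -h "$MYSQL_HOST" -u "$MYSQL_USER" -p"$MYSQL_PASS" ping >/dev/null 2>&1; then
      exit 0    # healthy
  else
      exit 1    # raise an alert / let the cluster act
  fi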

Still .. Ronald probably wants more input :)

Oct 09 2009

Why learn to type?

When your machine knows what you mean ..

  [s3p-root@XMS-1 tomcat6]# crm configure
  crm(live)configure# bye
  [s3p-root@XMS-1 tomcat6]# crm confiure
  crm(live)configure# bye
  [s3p-root@XMS-1 tomcat6]# crm confiture
  crm(live)configure# bye
  [s3p-root@XMS-1 tomcat6]#

I'd better

  apt-get install coffee

Jul 01 2009

DRBD, OCFS2, Unexplained crashes

I was trying to set up a dual-primary DRBD environment, with a shared disk running either OCFS2 or GFS. The environment is CentOS 5.3 with DRBD82 (but I also tried DRBD83 from testing).

Setting up a single-primary disk and running bonnie++ on it worked. Setting up a dual-primary disk, only mounting it on one node (ext3) and running bonnie++, also worked.

When setting up OCFS2 on the /dev/drbd0 disk and mounting it on both nodes, basic functionality seemed in place, but usually less than 5-10 minutes after I started bonnie++ as a test on one of the nodes, both nodes power cycled with no errors in the logfiles, just a crash.

When at the console at the time of the crash, it looks like a disk IO block happens (you can type, but nothing actually happens), then a reboot; no panics, no oops, nothing (sysctl panic values were set to timeouts etc.).
Setting up a dual-primary disk with OCFS2, but only mounting it on one node and starting bonnie++, causes only that node to crash.

At the DRBD level I got the following errors when that node disappears:

  drbd0: PingAck did not arrive in time.
  drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
  drbd0: asender terminated
  drbd0: Terminating asender thread

That however is an expected error because of the reboot.
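For reference, the PingAck timeout itself comes from the net section of drbd.conf; the knobs involved look roughly like this (the values shown are, as far as I know, the defaults, and purely illustrative, not a tuning recommendation):

  net {
    ping-int      10;   # seconds between keep-alive pings to the peer
    ping-timeout   5;   # tenths of a second to wait for a PingAck
    timeout       60;   # tenths of a second for regular requests
  }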

At first I assumed OCFS2 to be the root of this problem, so I moved forward and set up an iSCSI target on a 3rd node, and used that device with the same OCFS2 setup. There no crashes occurred and bonnie++ flawlessly completed its test run.

So my attention went back to the combination of DRBD and OCFS2.
I tried both DRBD 8.2 (drbd82-8.2.6-1.el5.centos, kmod-drbd82-8.2.6-2) and the 8.3 variant from CentOS Testing.

At first I was trying with the ocfs2 1.4.1-1.el5.i386.rpm version, but upgrading to 1.4.2-1.el5.i386.rpm didn't change the behaviour.

Both the DRBD and the OCFS2 mailing lists were fairly supportive, pointing out that it was probably OCFS2 fencing both hosts after missing its heartbeat, and suggesting that I increase the heartbeat dead/timeout values.

I however wanted to confirm that. As I got no entries in syslog, I attached a Cyclades err Avocent terminal server to the device in the hope that I'd capture the last kernel messages there ... no such luck either.

On the OCFS2 mailing list people pointed out that I could use netconsole to catch the logs on a remote node.
I set up netconsole using:

  modprobe netconsole netconsole="@/,@172.16.32.1/"
  sysctl -w kernel.printk="7 4 1 7"

After which I indeed caught the errors on my remote host:

  [base-root@CCMT-A ~]# nc -l -u -p 6666
  (8,0):o2hb_write_timeout:166 ERROR: Heartbeat write timeout to device drbd0 after 478000 milliseconds
  (8,0):o2hb_stop_all_regions:1873 ERROR: stopping heartbeat on all active regions.
  ocfs2 is very sorry to be fencing this system by restarting
One would think that it outputs over the serial console before it logs over the network :) It doesn't.

Next step is that I'll start fiddling some more with the timeout values :) (note the ":)")
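The knob the lists were pointing at is, as far as I understand it, the O2CB heartbeat dead threshold, which on CentOS lives in /etc/sysconfig/o2cb. A sketch of what I'll be playing with (the value is just an example, not a recommendation):

  # /etc/sysconfig/o2cb -- example value only
  # the effective timeout is roughly (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds
  O2CB_HEARTBEAT_THRESHOLD=61

  # restart the cluster stack so the new threshold is picked up
  service o2cb restart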

Feb 02 2009

Everything is a fine whitespace problem ...

Couple of days ago I was working on a Linux Heartbeat v2 setup.
Upon inserting an XML snippet into the CIB, cibadmin started eating memory fast until the OOM killer kicked in.

The environment was running a fairly old heartbeat-2.0.8 version, so I upgraded to heartbeat-2.1.4-2.1, and there I got a nice warning that my XML syntax wasn't correct.

There was a stray whitespace in the XML syntax:

  <expression attribute="#replicationvalue" id="is_lagged" operation ="gt" ... ><

Removing the whitespace solves the problem, also on the older version. So the problem is already fixed upstream.. but you might run into it anyhow.
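For completeness, the accepted form is simply the same snippet without the space before the equals sign:

  <expression attribute="#replicationvalue" id="is_lagged" operation="gt" ... ><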

Oct 25 2008

Wholesale High Availability

Alan just coined the term WholeSale HA: the idea of rebooting a whole virtual machine rather than just failing over one service.

He wants to have the best of both worlds in one framework; he however doesn't specify which parts he likes from the WholeSale HA setup.

Yes, you want to use it coupled with hardware predictive failure analysis tools in order to achieve higher availability, but I don't think the WholeSale HA part is real HA.

WholeSale HA isn't going to be fast enough for most business-critical environments.
You simply cannot afford to reboot, or even boot, a full machine and the downtime that brings for your service.

So yes, a best-effort combination, but with a strong focus on the application state, would be preferred. WholeSale is a good start .. but it's definitely not where you want to stop.

Sep 24 2008

Bug in ifconfig?

So earlier this week I ran into the weirdest problem with Linux-HA. Heartbeat was happily adding an IP address as an active resource to one of my nodes when needed, but upon removal it failed to remove the IP from the stack. Further debugging revealed that the Heartbeat scripts claimed the IP wasn't on the actual stack.

It was.. but the output from ifconfig was different from what Heartbeat expected it to be.

Heartbeat checks the output of ifconfig and expects to find the IP address it added itself on a :0 or similar interface. Now ifconfig only seems to output 8 characters for the interface name. Which means that when you have an interface called eth0:0, the output perfectly lists it and Heartbeat is smart enough to remove the IP again when the node goes to standby. If however you have a VLAN with 3 digits on a bond interface, Heartbeat will add :0 to bond0.129; the Heartbeat resource will add the IP address perfectly, but upon checking all the :0 interfaces, the bond0.129:0 interface won't be parsed, as ifconfig outputs it as bond0.129, hence resulting in a potentially painful situation where 2 nodes still share an IP address.

So where's the actual problem: ifconfig or Heartbeat? I'd say both, but the easiest fix will be in Heartbeat; after all, there are other preferred ways of adding an IP address to an interface. ip addr add comes to mind :)
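As an illustration (addresses and interface names made up), with iproute2 the address is both added and listed reliably, no matter how long the label gets:

  # add the address with an explicit label, much like Heartbeat does
  ip addr add 10.0.129.10/24 dev bond0.129 label bond0.129:0

  # list it again; the label is not truncated here
  ip -4 addr show dev bond0.129

  # and remove it when the resource goes to standby
  ip addr del 10.0.129.10/24 dev bond0.129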

So I filed a bug report :)

Aug 25 2008

Raid is obsolete

In a lot of environments.

Peter gives a nice overview why you don't always need to invest in big fat redundant hardware.

We already tackled the topic last year ..

Now I often get weird looks when I dare to mention that Raid is obsolete .. people fail to hear the "in a lot of environments" part.

Obviously the catch is in the second part, you won't be doing this for your small shop around the corner with just one machine. You'll only be doing this in an environment where you can work with a redundant array of inexpensive disks. Not with a server that has to sit in a remote and isolated location.

Next to that, there are situations where you will still be using raid, not for redundancy but for disk throughput.

Apr 28 2008

MySQL and DRBD, Often say NO :)

Florian is replying to James on the subject of using DRBD for MySQL HA, a discussion started earlier by Eric. Florian is refuting most of the arguments that James has against using MySQL and DRBD together.

I'm also saying NO to MySQL and DRBD in most cases.. but not for any of the reasons James mentions.

I must say upfront that I love DRBD and I have been using it in production for a long time, but not for MySQL HA.

The problem with using MySQL on DRBD is the same problem you have when killing the power on a standalone MySQL machine and rebooting that machine.
DRBD saves you the time of powering up your machine and OS, but MySQL still needs to be started again on the standby machine (in limited cases you might even have a lengthy startup process due to e.g. InnoDB consistency checks). For lots of organisations this (even limited) downtime is not acceptable.
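To make that concrete: in a classic Heartbeat v1 style setup the failover chain looks something like the haresources line below (node, resource and init script names are hypothetical), and that last step, starting MySQL, is exactly where the recovery time goes:

  # /etc/ha.d/haresources -- illustrative only: promote the DRBD device,
  # mount the filesystem, and only then start the MySQL init script
  db-node1 drbddisk::r0 Filesystem::/dev/drbd0::/var/lib/mysql::ext3 mysqld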

Both MySQL Cluster and MultiMaster replication give you constant access to your data on multiple nodes.

For lots of shops, those not needing to scale and those that can live with a limited downtime, DRBD and MySQL is a good match.

But if you want to achieve real high availability, as compared to just less downtime, or if you are looking to scale your MySQL setup and want to benefit from HA while you are at it, then MultiMaster is probably the preferred alternative to DRBD.

In the meanwhile I'll be happy serving other data from my DRBD volumes ;)

Apr 21 2008

Remus got awarded best paper

The Remus paper I blogged about earlier seems to have won the Best Paper Award at NSDI. Good to hear .. err, read! :)

Feb 06 2008

It's February again

It seems like for the past 4 years February is the month that O'Reilly really loves me and decides to publish one of my articles.

This year's version was co-written with my colleague Johan Huysmans and tackles the creation of Highly Available Gateways.

Although every HA situation is different and this is a pretty easy setup, it's a good start for other setups.

Enjoy the read

PS. Yes I know, in 2006 I also had a January article :)