snmp

Jan 28 2010

Implementing RAID Monitoring on a 3Ware 3w-9xxx based controller.

When you pull a disk out of your RAID setup, a warning shows up in syslog:

    Jan 27 10:18:22 EL860 kernel: 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0019): Drive removed:port=1.
    Jan 27 10:18:22 EL860 kernel: 3w-9xxx: scsi0: AEN: ERROR (0x04:0x0002): Degraded unit:unit=0, port=1.

However, if no one is looking at syslog, that won't really be helpful.

3Ware provides a tool on their site called tw_cli, which can be used to manage
the RAID setup from the command line.

    [EL860-root@EL860 admin]# tw_cli /c0 show

    Unit  UnitType  Status      %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
    ------------------------------------------------------------------------------
    u0    RAID-1    REBUILDING  41%     -       -       232.82    RiW    ON

    VPort  Status    Unit  Size       Type  Phy  Encl-Slot  Model
    ------------------------------------------------------------------------------
    p0     OK        u0    232.88 GB  SATA  0    -          ST3250310NS
    p1     DEGRADED  u0    232.88 GB  SATA  1    -          ST3250310NS
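
The Status columns in that output are what any monitoring would have to look at. Purely as an illustration, here is a rough Python sketch of reading them and turning them into a Nagios-style exit code. It is not taken from any existing tool: the function names are made up for this example, the parsing assumes the output layout shown above, and treating every status other than OK as critical is a simplification of mine.

```python
import re
import subprocess

# Nagios-style exit codes (the usual monitoring plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Unit rows look like "u0 RAID-1 REBUILDING ...", port rows like "p1 DEGRADED u0 ...".
UNIT_RE = re.compile(r"^(u\d+)\s+(\S+)\s+(\S+)")
PORT_RE = re.compile(r"^(p\d+)\s+(\S+)")


def parse_tw_cli_show(text):
    """Return (units, ports) parsed from `tw_cli /c0 show` style output."""
    units, ports = {}, {}
    for line in text.splitlines():
        line = line.strip()
        match = UNIT_RE.match(line)
        if match:
            units[match.group(1)] = {"type": match.group(2), "status": match.group(3)}
            continue
        match = PORT_RE.match(line)
        if match:
            ports[match.group(1)] = match.group(2)
    return units, ports


def raid_state(text):
    """Map the parsed statuses to (exit_code, message)."""
    units, ports = parse_tw_cli_show(text)
    if not units:
        return UNKNOWN, "UNKNOWN: no units found"
    # Assumption: anything that is not plain "OK" is reported as critical.
    bad = [f"{name}={info['status']}" for name, info in sorted(units.items())
           if info["status"] != "OK"]
    bad += [f"{name}={status}" for name, status in sorted(ports.items())
            if status != "OK"]
    if bad:
        return CRITICAL, "CRITICAL: " + ", ".join(bad)
    return OK, "OK: all units and ports report OK"


def check_controller(controller="/c0"):
    """Run tw_cli (must be on PATH) and evaluate its output."""
    result = subprocess.run(["tw_cli", controller, "show"],
                            capture_output=True, text=True, check=True)
    return raid_state(result.stdout)
```

Fed the output above, raid_state() would flag both the rebuilding unit and the degraded port; a real wrapper would print the message and exit with the returned code.
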

I figured I'd either have to write a wrapper script around that or find some other way of integrating it.
Asking the question in ##infra-talk on irc.freenode.net got me the following link to a check script on GitHub:

koollman: sdog: something like http://github.com/stanaka/check_tw should work.

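With that script hooked into your snmpd.conf as an exec entry (named TW_RAID and pointing at /usr/local/sbin/check_tw in the output below), you can get the info via SNMP: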

    [root snmp]# snmpwalk localhost -v 2c -c public .1.3.6.1.4.1.2021 | grep ext
    UCD-SNMP-MIB::extIndex.1 = INTEGER: 1
    UCD-SNMP-MIB::extNames.1 = STRING: TW_RAID
    UCD-SNMP-MIB::extCommand.1 = STRING: /usr/local/sbin/check_tw
    UCD-SNMP-MIB::extResult.1 = INTEGER: 2
    UCD-SNMP-MIB::extOutput.1 = STRING: CRITICAL: Unit: u0, Type: RAID-1, Status: REBUILDING
    UCD-SNMP-MIB::extErrFix.1 = INTEGER: 0
    UCD-SNMP-MIB::extErrFixCmd.1 = STRING:
    UCD-SNMP-MIB::ssSysContext.0 = INTEGER: 2073
    UCD-SNMP-MIB::ssRawContexts.0 = Counter32: 11781783
    UCD-DLMOD-MIB::dlmodNextIndex.0 = INTEGER: 1
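
Whatever polls this host only needs extResult (the exit code of the script) and extOutput (its message). As a rough sketch, this is how the walk output above could be picked apart in Python. The function names are invented for this example, the regular expression assumes the textual format snmpwalk printed above, and reading 0/1/2 as OK/WARNING/CRITICAL assumes the script follows the usual Nagios plugin convention, which the CRITICAL message alongside extResult 2 suggests.

```python
import re

# Matches lines such as "UCD-SNMP-MIB::extResult.1 = INTEGER: 2".
EXT_LINE = re.compile(r"^UCD-SNMP-MIB::(ext[A-Za-z]+)\.(\d+) = (\w+):\s*(.*)$")

# Assumption: the exec'd script uses Nagios-style exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def parse_ext_table(walk_output):
    """Group ext* columns by row index: {1: {"extNames": "TW_RAID", ...}}."""
    rows = {}
    for line in walk_output.splitlines():
        match = EXT_LINE.match(line.strip())
        if not match:
            continue  # not part of the ext table (ssSysContext and friends)
        column, index, kind, value = match.groups()
        if kind == "INTEGER" and value.lstrip("-").isdigit():
            value = int(value)
        rows.setdefault(int(index), {})[column] = value
    return rows


def summarize(rows):
    """One 'NAME: STATE' string per ext entry, ordered by index."""
    return [
        f"{row.get('extNames', '?')}: {STATES.get(row.get('extResult'), 'UNKNOWN')}"
        for _, row in sorted(rows.items())
    ]
```

With the output above as input, summarize(parse_ext_table(...)) would yield a single entry for TW_RAID in state CRITICAL.
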