Tuesday, January 29, 2013

Zabbix: configure SNMP traps – a simpler approach



Zabbix has a well-documented guide for configuring SNMP traps. I followed the guide but gave up when I had to either install additional software (SNMPTT) or recompile snmpd with “--enable-embedded-perl” (it seems the snmpd shipped with RHEL 6 does not have it enabled by default).
I found an easier way: use the item type “Zabbix trapper” rather than “SNMP trap”. A “Zabbix trapper” item receives data from the zabbix_sender command, which is triggered by a traphandle directive in /etc/snmp/snmptrapd.conf. That is the only file that needs to be configured, unlike the “SNMP trap” type, which requires several configuration files: zabbix_server.conf, snmptrapd.conf, snmptt.ini and snmptt.conf.

Pros and Cons compared to the official “snmp trap”

- Pros:
  • Easier to implement: no need to install or compile additional software.
  • You don’t need to add each host to Zabbix. You can set up a catch-all trapper item on the “Zabbix server” host; the host that actually has the problem can be identified from the SNMP values.
  • Distributed monitoring: you can set up many snmptrapd receivers.
  • Only one-way connections are required: port 162 from the SNMP host to the snmptrapd host, and port 10051 from the snmptrapd host to the Zabbix server. The Zabbix server doesn’t need to query the SNMP host on port 161.
 For me, I only use SNMP traps to receive hardware fault alarms from servers, storage, etc. I don’t want to poll the hardware health metrics and define the thresholds again. That is neither necessary nor sufficient, because the hardware vendor has already defined all the necessary health metrics and thresholds; you just need to receive the alarms through the management interface (SUN iLOM/DELL DRAC/IBM RSA).
- Cons:
  • Maybe performance? The trap handler is a simple Perl script that reads the SNMP data from STDIN and runs zabbix_sender to pass it to the Zabbix server. I have not seen any performance impact from it.

Install and configure snmptrapd (it doesn’t need to be on the Zabbix server)

 
[root@zabbix~]# cat /etc/snmp/snmptrapd.conf 
authCommunity   log,execute public
traphandle default /opt/zabbix/bin/user_trap_receiver.pl

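[root@zabbix ~]# cat /opt/zabbix/bin/user_trap_receiver.pl
#!/usr/bin/perl
use strict;
use warnings;

my $zserver = '192.168.1.10';                  # Zabbix server address
my $tserver = 'Zabbix server';                 # host that owns the trapper item
my $trapkey = 'allsnmptrap';                   # key of the trapper item
my $zsender = '/usr/local/bin/zabbix_sender';

# snmptrapd writes the sender's hostname on the first line of STDIN,
# its address on the second, and the trap variables on the remaining lines.
my $hostname = <STDIN>;
chomp($hostname);
my $ipaddress = <STDIN>;
chomp($ipaddress);
my $output = "Trap received from Host: $hostname ($ipaddress)\n";
while (<STDIN>) {
    $output .= $_;
}

# The list form of system() bypasses the shell, so quotes inside the
# trap data cannot break the command line.
system($zsender, '-z', $zserver, '-s', $tserver, '-k', $trapkey, '-o', $output);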

Create the catch-all “Zabbix trapper” item on any host, for example on the Zabbix server. Its key must match $trapkey in the script above (allsnmptrap).
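Once the item exists, it can be checked by hand before any SNMP trap is involved. The sketch below is only an illustration: it reuses the server address, host name, key and zabbix_sender path from the handler script above, while the send_value function and the DRY_RUN switch are hypothetical names introduced here. By default it only prints the command it would run.

```shell
#!/bin/sh
# Values copied from the handler script above; adjust them to your setup.
ZSERVER='192.168.1.10'
ZHOST='Zabbix server'
ZKEY='allsnmptrap'
ZSENDER='/usr/local/bin/zabbix_sender'

# DRY_RUN is a hypothetical switch for this sketch: 1 = only print the command.
DRY_RUN="${DRY_RUN:-1}"

# Send one value to the trapper item, or print the command in dry-run mode.
send_value() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s -z %s -s "%s" -k %s -o "%s"\n' \
            "$ZSENDER" "$ZSERVER" "$ZHOST" "$ZKEY" "$1"
    else
        "$ZSENDER" -z "$ZSERVER" -s "$ZHOST" -k "$ZKEY" -o "$1"
    fi
}

send_value 'manual test of the trapper item'
```

Running it with DRY_RUN=0 sends the value for real; if the item is set up correctly, the string should then show up under Latest data for that host.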

Create a trigger for the allsnmptrap item

 It is a generic, time-based trigger: it fires on any data received and stays in the alarm state for two days, or until you clear it manually (by editing the trigger expression). You may want to customize the trigger, for example to act on the result of a string search. (NOTE: a new alarm won’t pop up while the trigger is already in the alarm state, so you need to return it to the healthy state in order to receive new alarms.)


Test

[root@zabbix ~]# snmptrap -v 1 -c public 127.0.0.1 '.1.3.6.1.6.3.1.1.5.3' '0.0.0.0' 6 33 '55' .1.3.6.1.6.3.1.1.5.3 s "teststring000"
SNMP trap was received:
 

The “(UDP: [127.0.0.1] …)” part of the received value holds the source IP of the problematic host. Here it is 127.0.0.1, because the test was run on the Zabbix server itself.

Troubleshooting

Starting the snmptrapd daemon with the “-f -Le” options prints errors to the console.
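If snmptrapd reports no errors but nothing reaches Zabbix, the handler can also be exercised without snmptrapd by piping it input in the shape snmptrapd uses for traphandle programs: hostname on the first line, transport address on the second, then one "OID value" pair per line. This is a rough sketch; the sample_trap function, the port number and the sample values are made up for illustration, and the handler path is the one used earlier in this post.

```shell
#!/bin/sh
# Path of the trap handler configured in snmptrapd.conf above.
HANDLER='/opt/zabbix/bin/user_trap_receiver.pl'

# Print made-up input in the layout snmptrapd passes to a traphandle program:
# line 1 = hostname, line 2 = transport address, rest = "OID value" pairs.
sample_trap() {
    printf '%s\n' \
        'localhost' \
        'UDP: [127.0.0.1]:12345->[127.0.0.1]:162' \
        '.1.3.6.1.6.3.1.1.5.3 "teststring000"'
}

# Feed the sample to the handler if it is installed on this machine.
if [ -x "$HANDLER" ]; then
    sample_trap | "$HANDLER"
else
    echo "handler not found or not executable: $HANDLER" >&2
fi
```

If the value arrives in Zabbix this way but real traps do not, the problem is more likely in snmptrapd.conf (community string, traphandle path) than in the handler.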


6 comments:

  1. It is not acceptable to wait two days for a trap to reset. Most devices have an OK state trap that is sent when values are no longer within the Problem state threshold.

    Take a look at these.
    .1.3.6.1.4.1.789.1.21.1.2.1.16
    .iso.org.dod.internet.private.enterprises.netapp.netapp1.storage.enclosure.enclTable.enclEntry.enclFansMaximum
    .1.3.6.1.4.1.789.1.21.1.2.1.17
    .iso.org.dod.internet.private.enterprises.netapp.netapp1.storage.enclosure.enclTable.enclEntry.enclFansPresent
    .1.3.6.1.4.1.789.1.21.1.2.1.18
    .iso.org.dod.internet.private.enterprises.netapp.netapp1.storage.enclosure.enclTable.enclEntry.enclFansFailed

    Present = OK state
    Failed = Problem state
    Maximum = Warning state

  2. What I did was to set the trigger with the following: ({Zabbix server:allsnmptrap.str(RESET)}=0)

    and then have a zabbix script that runs on the zabbix server as follows:
    /usr/local/bin/zabbix_sender -c /etc/zabbix/zabbix_agentd.conf -k allsnmptrap -o 'RESET'

    When I see the alert go red, I can check out the issue and then, via Zabbix, execute my RESET_SNMP script, and the alert goes away.

    Replies
    1. I love your solution Hubert.
      Thanks a lot for the hint!!

  3. Hello,

    What do you change in order to display the trap on separate lines? In the dashboard and in Latest data I see it on one line, and it is very difficult to see all the variables.

    Thanks

    Replies
    1. Hi Natalia,

      You probably don't have the issue any more, but the snmptt.conf file contains the OID format item. You will have to use the Perl script and enable certain Perl options in the snmptt.ini file. Add $+* in the FORMAT line to show all variables and values from the trap.

      Follow the steps given below:
      http://www.zabbix.org/wiki/Start_with_SNMP_traps_in_Zabbix

    2. Excuse me:
      "Add $+* in the FORMAT line to show all variables and values from the trap." Do this in the snmptt.conf file.
