NagioS


Pointers

Writing Plugins

remote alerts

UPDATE

I am no longer using the SimpleEventCorrelator for Nagios event correlation. I now have a much better solution; take a look at GridPanoptes.








  

SEC

I'm hooking up Nagios to the Simple Event Correlator like so:

I currently monitor many machines spread across a few datacenters, many of which run the same services. I have a Nagios check defined for each service on each server. Occasionally a service becomes unavailable on multiple machines in the same datacenter (e.g. because a back-end system in that datacenter is down), and I don't want to get paged once for every affected server. I just want one page that tells me how many servers in that datacenter are having problems.

This is where SEC comes in. First, I set up a new passive check in Nagios for each service; these passive checks are updated by SEC. Each service running on an individual server in a datacenter is dependent upon the SEC passive check for that service in that datacenter.
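As a rough sketch, the passive check and one dependency might look something like this in the Nagios object configuration. The host names (dc1-sec, web01), the service descriptions, the generic-service template and the check_dummy command are all made up for illustration; they are not my actual definitions.

```cfg
# Passive check that SEC updates (all names here are hypothetical)
define service{
        use                     generic-service
        host_name               dc1-sec
        service_description     HTTP dc1
        active_checks_enabled   0
        passive_checks_enabled  1
        check_command           check_dummy
        }

# The per-server check depends on the SEC check; one of these per server
define servicedependency{
        host_name                       dc1-sec
        service_description             HTTP dc1
        dependent_host_name             web01
        dependent_service_description   HTTP
        notification_failure_criteria   w,c
        }
```

With notification_failure_criteria set to w,c, notifications for the dependent service are suppressed while the SEC check is in a WARNING or CRITICAL state.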

I whipped up a little Perl script called nagios_tail.pl (inspired by nagtail, which didn't compile on my Solaris box). This script monitors the Nagios status.dat file for status changes to any service, and generates a line of output like this:

SERVICE, hostname, service description, status, output
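The script itself isn't posted here, so the following is only a minimal sketch of the idea, not the real nagios_tail.pl. It assumes status.dat holds service blocks of key=value lines (host_name, service_description, current_state, plugin_output) and compares two snapshots of the file. The function names parse_status and changed_lines and the sample data are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map Nagios numeric service states to names.
my @STATE_NAME = qw( OK WARNING CRITICAL UNKNOWN );

# Parse status.dat text into { "host;service" => { state, output } }.
sub parse_status {
    my ($text) = @_;
    my ( %services, $block );
    for my $line ( split /\n/, $text ) {
        if ( $line =~ /^\s*service(?:status)?\s*\{/ ) {
            $block = {};    # entering a service block
        }
        elsif ( $block && $line =~ /^\s*\}/ ) {
            my $key = join ';', $block->{host_name}, $block->{service_description};
            $services{$key} = {
                state  => $STATE_NAME[ $block->{current_state} ] || 'UNKNOWN',
                output => defined $block->{plugin_output} ? $block->{plugin_output} : '',
            };
            undef $block;    # leaving the block
        }
        elsif ( $block && $line =~ /^\s*(\w+)=(.*)$/ ) {
            $block->{$1} = $2;
        }
    }
    return \%services;
}

# Compare two snapshots; return one output line per service whose state changed.
sub changed_lines {
    my ( $old, $new ) = @_;
    my @lines;
    for my $key ( sort keys %$new ) {
        next if exists $old->{$key} && $old->{$key}{state} eq $new->{$key}{state};
        my ( $host, $desc ) = split /;/, $key, 2;
        push @lines, join ', ', 'SERVICE', $host, $desc,
            $new->{$key}{state}, $new->{$key}{output};
    }
    return @lines;
}

# Demo with two made-up snapshots of the same service.
my $before = <<'END';
service {
    host_name=web01
    service_description=HTTP dc1
    current_state=0
    plugin_output=HTTP OK
    }
END
( my $after = $before ) =~ s/current_state=0/current_state=2/;
$after =~ s/plugin_output=.*/plugin_output=Connection refused/;
print "$_\n" for changed_lines( parse_status($before), parse_status($after) );
```

A real version would re-read status.dat in a loop, keep the previous snapshot around, and print with output buffering turned off so that SEC sees each line right away.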

When SEC starts up, it spawns nagios_tail.pl. If a service goes to a WARNING or CRITICAL state on more than one server, the SEC rules flag the passive check with the current state and the number of servers having a problem. By including the datacenter name in the service description, I can ensure that I will get multiple notifications if multiple datacenters are down. Since the checks on each of the individual servers depend on the SEC check, when the SEC check goes to a WARNING or CRITICAL state, you will only get notified about the state of the SEC check instead of getting one notification per server.
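The actual correlation is done with SEC rules, which aren't shown here. Just to illustrate the counting logic, here is a rough Perl sketch (not SEC rule syntax) that tracks which servers have a problem per service and builds a PROCESS_SERVICE_CHECK_RESULT external command for the passive check. The host name dc1-sec, the function names, and the choice to report the worst state seen are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %RETURN_CODE = ( OK => 0, WARNING => 1, CRITICAL => 2, UNKNOWN => 3 );

# $state{$service}{$host} holds 'WARNING' or 'CRITICAL' for servers with a problem.
my %state;

# Record one nagios_tail.pl-style line: SERVICE, host, description, status, output
sub record_line {
    my ($line) = @_;
    my ( $tag, $host, $service, $status ) = split /,\s*/, $line, 5;
    return unless defined $status && $tag eq 'SERVICE';
    if ( $status eq 'WARNING' || $status eq 'CRITICAL' ) {
        $state{$service}{$host} = $status;
    }
    else {
        delete $state{$service}{$host};    # server recovered
    }
    return $service;
}

# Build the external command that updates the passive check for one service.
sub passive_result {
    my ( $service, $sec_host, $now ) = @_;
    my @hosts = sort keys %{ $state{$service} || {} };
    # Assumption: report CRITICAL if any server is CRITICAL, else WARNING.
    my $worst = ( grep { $state{$service}{$_} eq 'CRITICAL' } @hosts )
        ? 'CRITICAL' : 'WARNING';
    # Only flag the passive check when more than one server has a problem.
    my $status = @hosts > 1 ? $worst : 'OK';
    return sprintf '[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s: %d servers have problems',
        $now, $sec_host, $service, $RETURN_CODE{$status}, $status, scalar @hosts;
}

# Demo: two servers in the same datacenter report a problem with the same service.
record_line($_) for
    'SERVICE, web01, HTTP dc1, CRITICAL, Connection refused',
    'SERVICE, web02, HTTP dc1, WARNING, Slow response';
print passive_result( 'HTTP dc1', 'dc1-sec', time ), "\n";
```

The resulting line would be written to the Nagios external command file, at which point Nagios treats it as the result of the passive check.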

growl

# growl
define command{
        command_name    notify-by-growl
        command_line    /usr/bin/printf "%b" "Notification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/local/nagios/libexec/growl -t "***** Nagios *****" -s
        }

define command{
        command_name    host-notify-by-growl
        command_line    /usr/local/nagios/libexec/growl -t "***** Nagios *****" -m "Notification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\nHost $HOSTSTATE$ alert for $HOSTNAME$!" $CONTACTEMAIL$
        }

And the growl script:

#!/usr/bin/perl -w
use strict;
use Mac::Growl;

# Command-line options processing
BEGIN {
    use Getopt::Long qw[ :config gnu_getopt ];
    use vars qw( $verbose $message $title $nostick $stdin $appname );
    GetOptions(
        'v|verbose!'  => \$verbose,
        'm|message:s' => \$message,
        't|title:s'   => \$title,
        'n|nostick!'  => \$nostick,
        's|stdin!'    => \$stdin,
    );
}

unless ( $title )   { die "ERROR: No title specified\n" }
unless ( $appname ) { $appname = "growlpl" }

# Sticky is turned off if nostick is set.
my $sticky = $nostick ? 0 : 1;

# With -s, read the whole message from standard input.
if ( $stdin ) {
    undef $/;
    $message = <STDIN>;
}

unless ( $message ) {
    die "ERROR: No message specified\n";
}

if ( $verbose ) {
    print "TITLE: $title\n";
    print "MESSAGE:\n$message\n";
}

# Register the application with Growl, then post the notification.
Mac::Growl::RegisterNotifications($appname, [ $appname ], [ $appname ]);
Mac::Growl::PostNotification($appname, $appname, $title, $message, $sticky);




Updated Sun Jul 23, 2006 12:12 PM