This document explains the concept of Event Handlers: what they are, how to
create one, and how to apply them to both host and service checks. After
reading this User Guide, users should be able to create their own Event
Handlers and apply them to host and service checks within their Opsview
Monitor system.
Event Handlers are a feature within Opsview Monitor that moves your monitoring
solution away from a ‘detect and alert’ system towards a more proactive
monitoring tool. What does this mean? If Opsview Monitor detects that the web
service is not running on a monitored host, it can not only alert you but also
automatically restart the web service. You still know that a problem occurred,
so you can diagnose it and ensure it doesn’t happen again, but your users are
not impacted because the web server is back online within seconds of the
outage. This is done via an Event Handler.
Event Handlers are scripts (Perl, Python, etc.) that Opsview Monitor can run
automatically when it detects that a host or service check has failed (i.e.
gone into a non-‘OK’ or non-‘UP’ state). For reference, the Event Handler
commands are executed when a host or service check:
is in a “soft” error state (each retry in a “soft” error state also invokes the handler)
initially goes into a “hard” error state
initially recovers from a “soft” or “hard” error state.
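
The trigger conditions above can be sketched as a small decision function. This is a minimal sketch, not Opsview's own implementation. It assumes the handler is configured to receive the Nagios-style macro values $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ as arguments. The names should_restart and handle_event, and the systemctl restart command, are hypothetical.

```python
import subprocess


def should_restart(state, state_type, attempt):
    """Decide whether this invocation of the handler should act.

    Assumption: state is one of OK/WARNING/CRITICAL/UNKNOWN, state_type is
    SOFT or HARD, and attempt is the current check attempt number.
    """
    if state != "CRITICAL":
        # OK (a recovery), WARNING and UNKNOWN: nothing to restart.
        return False
    if state_type == "HARD":
        # Retries are exhausted, so make one more attempt to fix the problem.
        return True
    # SOFT state: the handler is invoked on every retry, so act on a single
    # retry only (here the second) to avoid restarting over and over.
    return state_type == "SOFT" and attempt == 2


def handle_event(state, state_type, attempt,
                 restart_cmd=("systemctl", "restart", "apache2")):
    """Run restart_cmd when should_restart() agrees; return True if it ran.

    restart_cmd is a hypothetical example; the user running the handler
    would also need permission to execute it.
    """
    if not should_restart(state, state_type, int(attempt)):
        return False
    subprocess.run(list(restart_cmd), check=False)
    return True
```

Acting on only one soft retry keeps the handler from restarting the service on every invocation, while the hard-state branch acts as a last attempt once the retries are used up.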
Event Handlers sit on the monitoring server and are invoked by Opsview Monitor.
To run successfully, an Event Handler must be stored within
/usr/local/nagios/libexec/eventhandlers/ on the Master/Slave, with ownership of
‘nagios:nagios’ and file permissions of 0750 so that it can be executed.
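
The ownership and permission requirements can be checked before relying on a handler. The sketch below is illustrative only: mode_problems and check_handler are hypothetical helper names, and it assumes the script is executed directly, so the owner needs the execute bit (a mode such as 0640 would be reported as a problem). It uses the Unix-only pwd and grp modules.

```python
import grp
import os
import pwd
import stat


def mode_problems(mode):
    """Return a list of problems with a permission mode such as 0o750."""
    problems = []
    if not mode & stat.S_IXUSR:
        problems.append("owner cannot execute")
    if mode & stat.S_IWOTH:
        problems.append("writable by others")
    return problems


def check_handler(path, owner="nagios", group="nagios"):
    """Check the mode and ownership of the handler script at path."""
    st = os.stat(path)
    problems = mode_problems(stat.S_IMODE(st.st_mode))
    try:
        actual = (pwd.getpwuid(st.st_uid).pw_name,
                  grp.getgrgid(st.st_gid).gr_name)
    except KeyError:
        # No passwd/group entry for the ids: fall back to the raw numbers.
        actual = (str(st.st_uid), str(st.st_gid))
    if actual != (owner, group):
        problems.append("owned by %s:%s, expected %s:%s"
                        % (actual + (owner, group)))
    return problems
```

An empty list from check_handler means the script passed these particular checks; it does not prove that the handler itself works.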
The graphic above shows the relationship between the Opsview Monitor software,
the Opsview Agent and the Event Handler. The Master/Slave runs the Event
Handler when the service changes to a non-OK state. At the same time, the
‘retry interval’ will be running, meaning the Master/Slave is likely
monitoring the server at a one-minute interval (if the default value is
unmodified). Once the Event Handler has run, the Opsview Monitor server should
therefore detect that the service is back ‘up’ and running, and the service
check should return to an ‘OK’ state (unless something is stopping the service
from restarting, such as a misconfiguration).
In the example above, we have chosen to run an Event Handler on the ‘Apache
service status’ service check. However, Event Handlers can be run on any host
or service check. For example, you may create an Event Handler that clears /tmp
or the ‘Recycle Bin’ when the ‘Disk capacity’ check changes to WARNING or
CRITICAL. Alternatively, you may wish to create an Event Handler that flashes a
series of lights red when a service check monitoring the number of ‘Severity 1
tickets’ changes from zero to one or more, in order to alert your support team
quickly.
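
As an illustration of the ‘Disk capacity’ idea, a handler could remove old files from a directory when the check reports WARNING or CRITICAL. This is a rough sketch under stated assumptions: the check state is assumed to be passed to the handler as an argument, the function names are hypothetical, and the seven-day age threshold is an arbitrary example value. It only touches regular files directly inside the directory.

```python
import os
import time


def clear_old_files(directory, max_age_seconds, now=None):
    """Delete regular files in directory older than max_age_seconds.

    Subdirectories and symlinks are left alone. Returns the sorted list of
    removed paths.
    """
    now = time.time() if now is None else now
    with os.scandir(directory) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.path)
    return sorted(removed)


def handle_disk_event(state, directory="/tmp",
                      max_age_seconds=7 * 24 * 3600):
    """Clean up only when the check state is WARNING or CRITICAL."""
    if state not in ("WARNING", "CRITICAL"):
        return []
    return clear_old_files(directory, max_age_seconds)
```

Deleting files automatically is destructive, so a real handler would likely log what it removes and restrict itself to paths that are known to be safe to clear.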