With SNMP polling, the Network Management Station (NMS), in this case Opsview Monitor, must poll various objects on various devices for information. This can take a long time to configure and fine-tune and, in very large environments, can consume a large amount of computing power.
The alternative is called 'SNMP Traps'. With SNMP Traps, instead of the router (for example) being polled for information on a regular basis by Opsview Monitor, the router itself will let Opsview Monitor know of any problems or issues via a 'trap'.
The above illustration shows how an Opsview Monitor server will regularly poll a Host for information, whether there is a problem on that Host or not. In the example below it, Opsview Monitor sits 'listening' for SNMP Traps; when the Host encounters an issue, it sends a trap to Opsview Monitor, which in turn changes a Service Check to the state of 'CRITICAL' or 'WARNING' based on the rules you define.
Devices can usually be configured to send specific types of trap such as link status changes, BGP, HSRP, and many others, making this a flexible monitoring option.
From the device perspective, a device can usually be configured to send its traps to a maximum of two destinations at a time. When a trap is received by Opsview Monitor, it is passed through a Perl-based rules engine, allowing you to match specific traps from devices and generate appropriate alerts. In order to do this, SNMP Traps must be passed from the device into Opsview Monitor via an SNMP Trap Collector.
See SNMP Traps Collector for details on the SNMP Trap Collector.
See MIBs for SNMP Traps and Gets to add your MIBs that are required for translating SNMP Traps.
Each Service Check configured to accept SNMP traps has an ordered list of rules. Each rule is evaluated in turn. If a rule is false, then the next rule is evaluated. If a rule matches as true, the specified action is taken and no more rules for that Service Check are evaluated.
An action could be either:
- submit a passive check result to Opsview Monitor with an appropriate message, or
- do nothing, and thus stop processing of any further rules
If the incoming trap does not evaluate to true for any rules, then it becomes an exception and will appear on the SNMP Trap Exceptions page. This is required so that an administrator is aware that the rules need tuning to cater for this particular trap.
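The first-match behaviour described above can be sketched in a few lines of Python. This is only an illustration of the evaluation order, not Opsview's rules engine: the `Rule` class, the `evaluate` function, the rule names and the trap dictionary are all hypothetical, and real rules are written as Perl expressions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]  # hypothetical predicate over a trap
    action: str                      # "submit" a passive result, or "stop"
    state: str = "OK"                # state submitted when action == "submit"


def evaluate(rules: List[Rule], trap: dict) -> Optional[Tuple[str, str]]:
    """Return (rule name, outcome) for the first rule that matches.

    Rules are tried in order; evaluation stops at the first match.
    None means no rule matched, so the trap would become an exception.
    """
    for rule in rules:
        if rule.matches(trap):
            if rule.action == "submit":
                return (rule.name, rule.state)
            return (rule.name, "IGNORED")  # "do nothing" still ends processing
    return None


# Hypothetical ordered rule list for one Service Check.
rules = [
    Rule("ignore-coldstart", lambda t: t["name"] == "coldStart", "stop"),
    Rule("link-down", lambda t: t["name"] == "linkDown", "submit", "CRITICAL"),
    Rule("any-link", lambda t: t["name"].startswith("link"), "submit", "WARNING"),
]
```

Because order matters, a `linkDown` trap is caught by `link-down` and never reaches the broader `any-link` rule, while a trap that matches nothing returns `None` and would be listed as an exception.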
When a trap is received, it contains information about the source IP. This is associated with a Host. A Host can have more than one SNMP trap Service Check defined. In this case, each Service Check is evaluated independently of the others. The illustration below shows a trap being evaluated in four Service Checks, represented as columns.
These columns are not ordered, so there is no guarantee which Service Check column will be evaluated first. A consequence of multiple Service Checks is that a single trap could raise multiple alerts to Opsview Monitor. However, there will only ever be one SNMP Trap Exception per trap.
One example of using multiple Service Checks is if you wanted a Service Check to show interface status, with another Service Check alerting on error log messages.
Note: Traps received will have the SNMP community value hidden so that passwords are not stored on the file system.
SNMP messages come in two major flavors: GETs and TRAPs. From the Opsview Monitor point of view, an SNMP GET is when the monitoring server requests a piece of information from a Host. An SNMP TRAP is when the Host tells Opsview Monitor that an event has happened. For example, network devices can send messages about ports going offline or coming online, or about bandwidth on a particular link reaching a specified level; servers can send TRAPs about someone logging on to or off a server, or when a new connection is made to a service (see the manuals for your particular Host). All devices should be able to send a message about a power-on event (i.e. when the system is booting up).
There are several versions of SNMP. SNMPv1 was first released in 1988. It was followed by SNMPv2 (and v2c), which improved security and amended the message format, and then by SNMPv3, which improved security and authentication even further.
It is relatively easy to configure Opsview Monitor to handle SNMPv1 and SNMPv2c, but SNMPv3 is more complicated to set up due to the extra security involved.
SNMPv1 had one type of message: the TRAP. This is a message sent from a device with no response expected. SNMPv2c and SNMPv3 use TRAPs but also introduce a new message type: the INFORM (which was reworked further in v3). The basic difference between the two is that INFORMs must be acknowledged by the receiving device. If the message is not acknowledged, it is resent after a period of time.
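The difference between an unacknowledged TRAP and an acknowledged INFORM can be illustrated with a small simulation. This is a sketch only: `send_trap`, `send_inform`, `lossy_channel` and the retry limit of three are hypothetical, no real SNMP packets are sent, and the wait between retries is left out.

```python
from typing import Callable, Optional


def send_trap(deliver: Callable[[], bool]) -> int:
    """TRAP: send once and ignore whether anything came back."""
    deliver()
    return 1


def send_inform(deliver: Callable[[], bool], max_attempts: int = 3) -> Optional[int]:
    """INFORM: resend until acknowledged; return the attempts used.

    Returns None if no acknowledgement arrived within max_attempts
    (the limit is an assumption made for this sketch).
    """
    for attempt in range(1, max_attempts + 1):
        if deliver():  # True stands for "acknowledgement received"
            return attempt
    return None


def lossy_channel(drop_first: int) -> Callable[[], bool]:
    """Simulated link that loses the first `drop_first` messages."""
    state = {"calls": 0}

    def deliver() -> bool:
        state["calls"] += 1
        return state["calls"] > drop_first

    return deliver
```

With a channel that drops the first two messages, `send_inform` succeeds on its third attempt, whereas `send_trap` returns after a single send whether or not the message arrived.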
A daemon must be set up and running on the Master and Slave Nodes to receive the TRAPs and INFORMs. After receiving one and doing some initial processing, the daemon passes the messages into Opsview Monitor. Opsview Monitor will handle both TRAPs and INFORMs in the same way; it does not hold information to differentiate between them.
Security in SNMPv3 is handled by creating users. Each user may have:
- A name (securityName)
- An authentication protocol (authProtocol)
- An authentication key (authKey)
- A privacy type (privProtocol)
- A privacy key (privKey)
Authentication uses the user's authKey to sign the message being sent with the authProtocol (MD5 or SHA).
Messages are then encrypted using the user's privKey with the privProtocol (AES or DES).
Messages are sent using one of the following securityLevel levels:
- Unauthenticated (noAuthNoPriv)
- Authenticated (authNoPriv)
- Authenticated and Encrypted (authPriv)
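As a rough guide, the securityLevel follows from which keys a user has. The helper below is a hypothetical illustration, not part of Opsview or Net-SNMP. It relies on the fact that SNMPv3 has no level with privacy but no authentication.

```python
from typing import Optional


def security_level(auth_key: Optional[str] = None,
                   priv_key: Optional[str] = None) -> str:
    """Map the keys configured for a user to an SNMPv3 securityLevel name."""
    if priv_key and not auth_key:
        # Encryption without authentication is not a valid SNMPv3 level.
        raise ValueError("privacy requires authentication")
    if auth_key and priv_key:
        return "authPriv"
    if auth_key:
        return "authNoPriv"
    return "noAuthNoPriv"
```

For example, a user with both `myPassword` and `myPassPhrase` set would send messages at the `authPriv` level.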
SNMP daemon configuration for all versions of TRAPs is normally held in /etc/snmp/snmptrapd.conf, but this location may differ on some platforms. In order to receive SNMPv3 TRAPs, a user must be created (for authentication) with the appropriate role (authorization).
Creating Users to receive TRAPs/INFORMs is done in the format:
createUser -e <engineID> <securityName> <authProtocol> <authKey> <privProtocol> <privKey>
The "-e <engineID>" option may be omitted if only INFORMs are being received. For TRAPs it must match the configuration on the device sending them.
createUser myInformUser SHA myPassword AES myPassPhrase
createUser -e 0x0011223344 myTrapUser MD5 myPassword DES myPassPhrase
Note: The engine ID can be retrieved from the device sending the traps. On Cisco IOS devices this is usually:
# show snmp engineid
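If you generate snmptrapd.conf entries from a script, a small helper can keep the argument order consistent and reject unknown protocols. The sketch below is hypothetical: `create_user_line` is not an Opsview or Net-SNMP function. It assumes the Net-SNMP ordering of optional engine ID, user name, authentication protocol and key, then privacy protocol and key, and it only accepts the protocols named above.

```python
from typing import Optional

AUTH_PROTOCOLS = {"MD5", "SHA"}
PRIV_PROTOCOLS = {"AES", "DES"}


def create_user_line(name: str, auth_protocol: str, auth_key: str,
                     priv_protocol: str, priv_key: str,
                     engine_id: Optional[str] = None) -> str:
    """Build one createUser line; engine_id is only needed for TRAPs."""
    if auth_protocol not in AUTH_PROTOCOLS:
        raise ValueError(f"unsupported authProtocol: {auth_protocol}")
    if priv_protocol not in PRIV_PROTOCOLS:
        raise ValueError(f"unsupported privProtocol: {priv_protocol}")
    parts = ["createUser"]
    if engine_id is not None:
        parts += ["-e", engine_id]  # must match the sending device for TRAPs
    parts += [name, auth_protocol, auth_key, priv_protocol, priv_key]
    return " ".join(parts)
```

Calling `create_user_line("myTrapUser", "MD5", "myPassword", "DES", "myPassPhrase", engine_id="0x0011223344")` produces a line in the same shape as the TRAP example above.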
Authorization is handled with authUser tokens.
authUser log,exec myInformUser authPriv
To configure Opsview Monitor for receiving SNMPv3 TRAPs and INFORMs sent as the user 'opsview', the following configuration may be applied to snmptrapd.conf:
createUser -e 0x8000123acd1ab43abbfff000fa opsview SHA myPassword AES myPassPhrase
authUser log,exec opsview authPriv
traphandle default /opt/opsview/snmptraps/bin/stdin2sock
Be aware that the following line will allow TRAPs to be received without any authorization:
After the SNMP trap daemon snmptrapd has been restarted, the configuration can be tested on the Opsview Master or Slave Node as follows:
snmpinform -v3 -u opsview -a SHA -A myPassword -x AES -X myPassPhrase -l authPriv localhost 1 0
snmptrap -e 0x8000123acd1ab43abbfff000fa -v3 -u opsview -a SHA -A myPassword -x AES -X myPassPhrase -l authPriv localhost 1 0
To test from a different server the reference to 'localhost' should be changed to the Opsview Master or Slave Node hostname or IP address.
After a few moments the message will be passed to Opsview Monitor and (if no rules are yet set up) recorded in the 'SNMP TRAP exceptions list'. The message may also be logged to syslog if snmptrapd is configured to do so.
There is more information about SNMP in general on this Wikipedia page.
Go to the Configuration > SNMP Traps menu.
This page will display a summary of the current SNMP Trap configuration along with details on any trap exceptions (traps that failed to match rules) and tracing detail:
The summary information of all the SNMP Trap settings will be shown in five separate boxes:
- Hosts Expecting SNMP Traps
- Hosts With Tracing Enabled
- SNMP Trap Service Checks
- SNMP Trap Exceptions
- SNMP Trap Debugging Rows
Clicking on the 'Hosts with Tracing Enabled', 'SNMP Trap Exceptions' or 'SNMP Trap Debugging Rows' box will take you to the appropriate tab. Any changes made on these tabs will be reflected in the main summary when you go back to it.
If the incoming trap does not match rules, then it becomes an exception. The Exceptions tab will display a grid with the following data:
- Host IP - Trap Exceptions Source IP
- Date/Time - Date/Time of trap returned
- Trap Name
- Reason - Reason for incoming trap not evaluating to true for any rules
Clicking on the Delete All button will remove all trap exceptions, regardless of any filter applied to the columns.
Alternatively, you can remove individual traps by clicking on the contextual menu and selecting the 'Delete' option:
The Debug tab will display a grid with the following data:
- Time - Time of debug trap being recorded
- Processing Time - Time of execution of the rules
Debug Traps can only be removed individually by clicking on the contextual menu and selecting 'Delete'; there is no option for a bulk deletion, although when you delete a trace, all associated debug traps will be removed.
Traces allow you to trace exactly how an incoming Trap is processed by the rules engine. This tab controls the list of traces that have been set up. Each trace can be in one of three states:
- Pending - the trace has been configured and is ready to start tracing
- Running - the trace is in progress. It will stop at the requested time
- Completed - the trace has finished
You can only have one pending/running trace per host.
From the grid, the contextual menu will give you two possible options:
- Delete - only available when the trace is Pending or Completed. This will remove the trace and all the debug traces associated with it
- Stop Trace - only available when the trace is in a Running state. Allows you to stop the trace ahead of the requested finish time
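The trace states and the menu options they allow can be summarised as a small lookup table. This is an illustrative sketch only: the names `ALLOWED_ACTIONS`, `allowed_actions` and `can_add_trace` are hypothetical and do not correspond to any Opsview API.

```python
from typing import Iterable, Set

# Contextual menu options available in each trace state, as listed above.
ALLOWED_ACTIONS = {
    "Pending": {"Delete"},
    "Running": {"Stop Trace"},
    "Completed": {"Delete"},
}


def allowed_actions(state: str) -> Set[str]:
    """Return the menu options offered for a trace in the given state."""
    return ALLOWED_ACTIONS[state]


def can_add_trace(existing_states: Iterable[str]) -> bool:
    """A host may have only one Pending or Running trace at a time."""
    return not any(s in ("Pending", "Running") for s in existing_states)
```

A host whose traces are all Completed can be given a new trace, while a host with a Pending or Running trace cannot.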
When setting up a new SNMP Trap Service Check to process incoming traps, tracing will work best when a rule has been added to the host to catch all traps. The easiest way to do this is to add the Service Check Network > Base > SNMP Trap Handler > SNMP Trap - Alert on any trap to the host. Without such a rule you may miss traps while running a trace.
Click on the New Trace button in the toolbar to add a new trace.
The list of hosts is based on hosts that do not already have a running or pending trace, and that have at least 1 SNMP Trap service check configured, either directly assigned to the host (via the Host Edit Service Checks tab) or via a host template.
After you have chosen the host and trace period and submitted it, the trace is saved in a Pending state, ready for Opsview to automatically start tracing.
This tab allows you to see what traps have been received based on your traces, and you can test your rules against them by replaying the traces to see what would happen. There are 3 stages:
- Choose the traces
- Select the service check's trap rules you want to test against
- Select the traps to replay, or replay all
In the toolbar, click on the Trace Selection button to choose an existing trace. You can also create a New Trace from here.
When a trace has been selected, the title of the Replay Traces Grid will be updated to the host name and the trace start time. You can select multiple traces.
The Replay Trace grid will be populated with all the traps collected from that trace.
On the Service Check Selection drop-down, select the Service Check whose trap rules you want to view.
On selection, the first grid will update with all of that Service Check's trap rules.
Note: If you change the Service Check, then all the previous replay results will be cleared.
- Single Replay Trace
To replay a single trap, click on the contextual menu and select Replay:
This will update the row with the results of the trap as if it was just received:
The Rule column holds the last rule that matched. To see all the rules, click on the Show All link:
- Replay All Traces
If you select the Replay All button, all traces in the current page of the grid will be replayed, as if you clicked Replay on each contextual menu.
Go to the Configuration > Service Checks menu.
Select "SNMP Trap" in the "Filter By" check type combo box.
- On your Service Check of choice, open the contextual menu and select edit.
Click on the 'Add New' button to associate a new rule with this Service Check
Once the 'Name' and 'Rule' fields have been filled out, the top section of the page will show the new rule.
You can see some example rules by clicking on the Trap Rules Help button. More details below.
Rules can be amended by clicking on the rule in the top section of the modal window. After you update the field values, the top grid is refreshed with the new row details once you select another part of the UI.
On the monitoring host that would receive the trap (collector or orchestrator), do the following:
1. Create a test SNMP trap file (the command in step 2 reads it from /tmp/trap) with the following contents:
192.168.15.288
UDP: [192.168.15.288]:57725->[192.168.15.211]:162
DISMAN-EVENT-MIB::sysUpTimeInstance 28:3:42:10.19
SNMPv2-MIB::snmpTrapOID.0 IF-MIB::linkDown
IF-MIB::ifIndex
Change the IP addresses as follows:
- 192.168.15.288 to a host that exists on your Opsview system, or add host 192.168.15.288 to your configuration. This IP doesn't have to actually exist on your network, but it does have to be defined in Opsview.
- 192.168.15.211 to your Orchestrator or Collector IP (whichever server should be receiving the trap from the sending device).
2. Run this command
cat /tmp/trap | /opt/opsview/snmptraps/bin/stdin2sock
You should either see a new entry on the SNMP Trap Exceptions page, or see that the SNMP Trap Service Checks under host 192.168.15.288 have changed state.
If you receive the error "Traps fifo not found: No such file or directory", the Host in question does not yet have the SNMP Traps Service Checks applied. Adding "SNMP - Accept Traps" to the Host will resolve this error.
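The manual test above can also be scripted. The sketch below builds the trap text in the same layout (source host, transport line, then one varbind per line) and pipes it to stdin2sock. The function names, the default source port and the 192.0.2.x addresses are hypothetical placeholders; only the stdin2sock path comes from this page. `send_test_trap` has to run on the Orchestrator or Collector itself.

```python
import subprocess
from typing import List, Tuple

STDIN2SOCK = "/opt/opsview/snmptraps/bin/stdin2sock"


def build_trap_text(source_ip: str, receiver_ip: str,
                    varbinds: List[Tuple[str, str]],
                    source_port: int = 57725) -> str:
    """Return trap text: source host, transport line, then 'OID value' lines."""
    lines = [
        source_ip,
        f"UDP: [{source_ip}]:{source_port}->[{receiver_ip}]:162",
    ]
    lines += [f"{oid} {value}" for oid, value in varbinds]
    return "\n".join(lines) + "\n"


def send_test_trap(trap_text: str) -> None:
    """Feed the trap text to stdin2sock, like `cat /tmp/trap | stdin2sock`."""
    subprocess.run([STDIN2SOCK], input=trap_text, text=True, check=True)


# Placeholder addresses; replace them with a Host defined in Opsview
# and the IP of your Orchestrator or Collector.
example = build_trap_text(
    "192.0.2.10",
    "192.0.2.1",
    [
        ("DISMAN-EVENT-MIB::sysUpTimeInstance", "28:3:42:10.19"),
        ("SNMPv2-MIB::snmpTrapOID.0", "IF-MIB::linkDown"),
    ],
)
```

Calling `send_test_trap(example)` on the monitoring host should then have the same effect as step 2 above.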
Click on the contextual menu of the rule that needs to be deleted and select menu item 'Delete'.
A window will then pop up to confirm the user's request; clicking 'OK' will update the grid and remove the rule from the Service Check.
Click on the contextual menu of the rule that needs to be cloned and select the menu item 'Clone'.
The box at the bottom will indicate you are attempting to create a 'New SNMP Trap Rule cloned from '.
Here the user must change the name field of the new rule, as no two rules can have the same name. Clicking away from the form will then add the new trap rule.
The syntax for SNMP Trap rules is "mini bits of Perl code". These are the possible operators that can be used:
- < - Returns true if the left argument is numerically less than the right argument
- > - Returns true if the left argument is numerically greater than the right argument
- <= - Returns true if the left argument is numerically less than or equal to the right argument
- >= - Returns true if the left argument is numerically greater than or equal to the right argument
- lt - Returns true if the left argument is stringwise less than the right argument
- gt - Returns true if the left argument is stringwise greater than the right argument
- le - Returns true if the left argument is stringwise less than or equal to the right argument
- ge - Returns true if the left argument is stringwise greater than or equal to the right argument
- == - Returns true if the left argument is numerically equal to the right argument
- != - Returns true if the left argument is numerically not equal to the right argument
- eq - Returns true if the left argument is stringwise equal to the right argument
- ne - Returns true if the left argument is stringwise not equal to the right argument
- && - Returns the logical AND of the left and right expressions
- and - Equivalent to "&&" but has a lower precedence
- || - Returns the logical OR of the left and right expressions
- or - Equivalent to "||" but has a lower precedence
- ! - Returns the logical negation of the expression to its right
- not - Equivalent to "!" but has a lower precedence
- =~ - Compares the left hand expression with the right hand regular expression
- !~ - Equivalent to "=~" but the return value is negated
Note: Any blocked code will be considered as a rule evaluation failure
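The distinction between numeric and stringwise comparison is a common source of rules that never match. The Python sketch below only mimics the two kinds of comparison to show the difference. The function names are hypothetical, and unlike Perl, `float()` raises an error on non-numeric text instead of treating it as 0.

```python
import re


def numeric_lt(left: str, right: str) -> bool:
    """Mimics a numeric comparison: both operands are compared as numbers."""
    return float(left) < float(right)


def string_lt(left: str, right: str) -> bool:
    """Mimics a stringwise comparison: character by character."""
    return str(left) < str(right)


def regex_match(text: str, pattern: str) -> bool:
    """Mimics an unanchored regular expression match anywhere in the text."""
    return re.search(pattern, text) is not None
```

Compared as numbers, 10 is not less than 9, but compared as strings "10" sorts before "9" because "1" comes before "9". Choose the numeric or stringwise operator according to the kind of value the trap carries.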