Opsview Knowledge Center

Analysis

Data analysis overview for Hosts and Host Groups in Opsview Monitor

Overview

You should now be comfortable with creating, removing and modifying Host Groups, moving Hosts between Host Groups, and creating and configuring Hosts.

After the Host Groups and Hosts have been configured, Users can begin to interpret their current status and analyze the monitored data within the 'Host Groups, Hosts and Services' section of 'Monitoring':

This section is the default view for all Host Group, Host and Service Check analysis. It supports a range of functions, including investigating Hosts and Service Checks and running 'actions' at the Host Group, Host or Service Check level (actions are covered later).

The 'Host Groups, Hosts and Services' section is split into two parts: the top half is known as the 'Navigator' and the bottom half is known as the 'Checker'.

Example 'Host Group, Hosts and Services' section with one Host selected

The Navigator contains the Host Group hierarchy, with the Hosts as the 'end point'. In the example screen below, there is a Host Group hierarchy containing four Hosts:

Example 'Navigator' with 4 Host Groups expanded to reveal 4 Hosts

To view the details of a Host, check the 'View' box next to the Host's contextual menu. Checking the box displays the checker window and populates it with the Service Checks of the selected Host(s).

The checker only contains the Service Checks of the Hosts that have been selected within the navigator. For example, if you select the Host 'opsview', the checker will appear and display all of the Service Checks for the 'opsview' Host.
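The relationship between the 'View' checkboxes and the checker can be pictured as a simple lookup. The sketch below is purely illustrative: the `SERVICE_CHECKS` mapping, the 'web01' Host, the 'Opsview Agent' check name and the `checker_rows` function are hypothetical and are not part of Opsview Monitor.

```python
# Hypothetical data: Service Checks configured per Host (illustrative only).
SERVICE_CHECKS = {
    "opsview": ["Opsview Agent", "Unix Load Average"],
    "web01": ["HTTP", "Unix Load Average"],
}


def checker_rows(selected_hosts):
    """Return (host, service check) pairs for the Hosts ticked in 'View'."""
    return [
        (host, check)
        for host in selected_hosts
        for check in SERVICE_CHECKS.get(host, [])
    ]


# Selecting only 'opsview' yields only that Host's Service Checks.
print(checker_rows(["opsview"]))
```

With no Hosts selected the function returns an empty list, which mirrors a checker with nothing to display.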

Note: The checker is not covered in detail in this section. It is covered in detail in the documentation on 'Service Groups, Service Checks and Host Templates'.

The navigator has a column on the left where the Host Groups and Hosts can be viewed and interacted with, via the contextual menu. There is also the 'View' column, where you can select Hosts to display within the checker.

The right half of the navigator contains status information for both the Hosts and their Service Checks, split into 'Host Status' and 'Service Status' columns. Host Status refers to the state of the Host as defined by the result of the Host Check Command associated with it. Service Status refers to the Service Checks running on those Hosts. In the screen below, some Hosts are 'DOWN' (denoted as 'dn'), meaning the Host Check Command has failed to get a response from the Host. There are also examples of failed Service Checks: CRITICAL (cr), UNKNOWN (un) and WARNING (wn).

Hosts that have an 'OK' status from the Host Check Command are categorised as 'UP'. In the example below there are 20 Hosts that are 'UP' and 3 Hosts that are 'DOWN'.

Example Host Group Navigator with failed Hosts
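The counts shown in the status columns are simply tallies of items per state. As a minimal sketch (the `summarise` function and the `host_states` list are hypothetical, built only to reproduce the 20 'UP' and 3 'DOWN' figures quoted above):

```python
from collections import Counter

# Abbreviations quoted in the text for non-OK states.
ABBREVIATIONS = {"DOWN": "dn", "CRITICAL": "cr", "UNKNOWN": "un", "WARNING": "wn"}


def summarise(states):
    """Count how many items are in each state."""
    return Counter(states)


# Hypothetical Host states matching the example: 20 UP and 3 DOWN.
host_states = ["UP"] * 20 + ["DOWN"] * 3
summary = summarise(host_states)
print(summary["UP"], summary["DOWN"])
```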

The 'Host Status' and 'Service Status' columns are each split into 'Handled' and 'Unhandled'. These refer to whether the non-OK status of a Host or Service Check has been acknowledged by a User. For example, when a Host fails (i.e. changes into a non-OK state), it initially goes into the 'Unhandled' column. The same logic applies to Service Checks.

To convert an Unhandled problem into a Handled problem, you must 'ACKNOWLEDGE' it. This can be done via the contextual menu at the Host Group, Host or Service Check level. In the example below, we are going to ACKNOWLEDGE all unhandled problems within the 'Monitoring Servers' Host Group by clicking 'Mass Acknowledgement' within the contextual menu:

Contextual menu against 'Monitoring Servers' Host Group

Once the menu item is clicked, a modal window will appear confirming the items that are to be acknowledged. This window is covered later in the document, along with all the other contextual menu items and their respective modal windows.

For reference, an 'UP' Host or an 'OK' Service Check is automatically considered handled by Opsview Monitor.
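The Handled/Unhandled rules described above can be summarised in a few lines. This is a minimal sketch that follows only the rules stated in this section; the `Item` class and `is_handled` function are hypothetical and do not reflect Opsview Monitor's internals.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A Host or Service Check as shown in the navigator (hypothetical model)."""
    name: str
    state: str  # e.g. "UP", "DOWN", "OK", "WARNING", "CRITICAL", "UNKNOWN"
    acknowledged: bool = False


def is_handled(item):
    """UP/OK items are always handled; problems are handled once acknowledged."""
    if item.state in ("UP", "OK"):
        return True
    return item.acknowledged


host = Item("opsview", "DOWN")
print(is_handled(host))  # False: a new problem starts as Unhandled
host.acknowledged = True
print(is_handled(host))  # True: acknowledging moves it to Handled
```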

Viewing a Host's Service Checks

This section of the documentation covers the ‘checker’, including an overview of icons and what they mean, how to add Hosts to the checker, how to filter the checker and how to clear all data from the Checker.

Overview of Icons

Within both the Navigator and Checker a series of 'badges' are used to denote a status / available action against a given Host Group, Host or Service Check. The available badges are:

Name | Explanation
Graph | This icon is displayed against a Service Check that has graphing data available. Clicking the graph icon loads the 'Investigate window' for the Service Check and focuses on the 'Graph' tab.
Active Downtime | This icon is displayed against Host Groups, Hosts and Service Checks that have scheduled downtime against them that is currently active. For example, the time is 15:23 and the downtime window is 15:00 to 16:00.
Pending Downtime | This icon is displayed against Host Groups, Hosts and Service Checks that have scheduled downtime against them that is currently pending. For example, the time is 14:23 and the downtime window is 15:00 to 16:00.
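The difference between active and pending downtime comes down to where the current time falls relative to the downtime window. The sketch below reproduces the two examples from the table. The `downtime_status` function is hypothetical, it assumes the window starts and ends on the same day, and the "expired" label and the treatment of the exact start and end times are assumptions.

```python
from datetime import time


def downtime_status(now, start, end):
    """Classify a scheduled downtime window relative to the current time."""
    if start <= now < end:
        return "active"
    if now < start:
        return "pending"
    return "expired"  # assumed label for a window that has already ended


window = (time(15, 0), time(16, 0))
print(downtime_status(time(15, 23), *window))  # active
print(downtime_status(time(14, 23), *window))  # pending
```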

Adding a Host to the Checker

To view the Service Checks of one or more Hosts, you must add them to the 'checker'. To do this, check the 'View' checkbox next to the Host in question:

Host Groups, Hosts and Service Checks with no Hosts selected

Once the checkbox has been checked, the 'Checker' window will appear. Within the checker window, Opsview Monitor will load the Service Checks of the Host that was previously selected:

Host Groups, Hosts and Service Checks with one Host selected

To view the Service Checks of multiple Hosts within the checker, simply check their relevant checkboxes as below:

Host Groups, Hosts and Service Checks with 4 Hosts selected

Filtering/Sorting within the Checker

Within the checker, Service Checks can be sorted and filtered using the contextual menus within the column headers. In the example given in Adding a Host to the Checker, we added four Hosts of the same 'type' to the checker.

We can now compare metrics across the four Hosts, for example to see whether one Host's load average is much higher than the others. To do this, click the 'Service Check' header, which sorts the data in the Checker A-Z or Z-A, as below:

Four Hosts added to the 'Checker', with the 'Service Check' column sorted alphabetically

We can also choose to filter the 'Service Check' column using the contextual menu within the column header. Searching for the term 'Unix Load' will filter the Checker results to display only the 'Unix Load Average' service checks:

Host Groups, Hosts and Services: Four Hosts added with Checker filtered to show only Service Checks with the term 'Unix Load' in the name
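The sorting and filtering described here can be thought of as ordinary list operations on the checker's rows. The following sketch is illustrative only: the `rows` data, the Host names and both functions are hypothetical, and case-insensitive substring matching is an assumption about how the search behaves.

```python
# Hypothetical checker rows: (host, service check, state).
rows = [
    ("web02", "Unix Load Average", "OK"),
    ("web01", "HTTP", "CRITICAL"),
    ("web01", "Unix Load Average", "WARNING"),
    ("web02", "HTTP", "OK"),
]


def filter_by_name(rows, term):
    """Keep rows whose Service Check name contains the search term."""
    return [r for r in rows if term.lower() in r[1].lower()]


def sort_by_check(rows, descending=False):
    """Sort rows by Service Check name (A-Z, or Z-A when descending)."""
    return sorted(rows, key=lambda r: r[1], reverse=descending)


print(sort_by_check(rows))
print(filter_by_name(rows, "Unix Load"))
```

Sorting groups identical Service Checks from different Hosts next to each other, which is what makes a side-by-side comparison of, say, load averages practical.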

Other columns, such as 'State', can be filtered using non-free-text searches.

Sharing a View

The filtered/configured 'Host Groups, Hosts and Services' section can be shared with other users via the 'share arrow', which is located on the navigation bar, as per the screen shown below:

This share icon can be used to generate a unique URL, which when navigated to will load the view as it was when the URL was generated:

This means that if you select 100 hosts from disparate host groups, and filter the 'Checker' for only 'CRITICAL' state service checks, you can generate and share a URL which when loaded by other users will give them the exact same view that you are seeing.
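One way to picture a shareable view is as a URL whose query string records the selected Hosts and the active filter. The sketch below shows that general idea only. It is not Opsview Monitor's actual URL format, which is not described here; `BASE_URL`, the `host` and `state` parameter names, and both functions are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder address; not a real Opsview URL.
BASE_URL = "https://opsview.example.invalid/monitoring"


def build_share_url(hosts, state_filter=None):
    """Encode the selected Hosts and an optional state filter into a URL."""
    params = [("host", h) for h in hosts]
    if state_filter:
        params.append(("state", state_filter))
    return BASE_URL + "?" + urlencode(params)


def load_view(url):
    """Recover the view described by a shared URL."""
    query = parse_qs(urlsplit(url).query)
    return {
        "hosts": query.get("host", []),
        "state": query.get("state", [None])[0],
    }


url = build_share_url(["web01", "web02"], state_filter="CRITICAL")
print(url)
print(load_view(url))
```

Because the URL carries the whole view, anyone who opens it reconstructs the same selection and filter without further configuration.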

Removing Data from the Checker

There are two main ways to remove data from the checker:

  • Remove each Host one by one
  • Remove all Hosts (i.e. 'Clear' the checker)

Removing Hosts individually

To remove Hosts individually, simply uncheck the checkbox next to the Host within the Navigator. This will remove the Host's service checks from the Checker.

If unchecking a Host leaves the Navigator with no 'View' checkboxes checked, the Checker window is minimized to the bottom of the screen:

Host Groups, Hosts and Services: With one Host checked

Host Groups, Hosts and Services: With no Hosts checked

Removing all Hosts

If there are a lot of Hosts added to the checker, or you simply wish to remove every Host from the Checker, there is an 'unlink' button in the top of the checker:

This 'unlink' button will uncheck all Hosts within the navigator and thus empty the Checker of all data.
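The two removal methods differ only in how many checkboxes they clear. A minimal sketch of that behaviour follows; the `ViewState` class is a hypothetical model and not part of Opsview Monitor, and it assumes the checker is minimized whenever no Host is checked, including after 'unlink'.

```python
class ViewState:
    """Hypothetical model of the 'View' checkboxes and the checker window."""

    def __init__(self):
        self.checked = set()

    def check(self, host):
        """Tick the 'View' checkbox for a Host."""
        self.checked.add(host)

    def uncheck(self, host):
        """Untick a single Host, removing its Service Checks from the checker."""
        self.checked.discard(host)

    def unlink(self):
        """Uncheck every Host at once, emptying the checker."""
        self.checked.clear()

    @property
    def checker_minimized(self):
        """Assumed: the checker is minimized when no Host is checked."""
        return not self.checked


view = ViewState()
view.check("web01")
view.check("web02")
view.uncheck("web01")
print(view.checker_minimized)  # False: one Host is still checked
view.unlink()
print(view.checker_minimized)  # True: no Hosts remain checked
```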
