Opsview Knowledge Center

Reloading

Updating the live Opsview Monitoring configuration

Overview

A 'Reload' is the reloading of the Opsview Monitor configuration files to reflect the changes made since the last time the configuration files were loaded. For example, the addition of a new Host, the modification of a Host Template, etc. When a reload is requested, the changes are written to the configuration files which are then re-loaded by Opsview Monitor for use within the user interface, dashboards and more.

To perform a Reload, click on the 'Reload' tile within the Settings menu tab. If there are pending changes, i.e. changes which will require a reload in order to be applied, the tile will change to an orange color as shown below:

Opsview Monitor menu: No reload pending

Opsview Monitor menu: Reload pending

When clicked, the tile will load a page similar to below:

Reload running

This page will display the last time a reload was applied ('Configuration last updated:'), the estimated time taken to reload (this will be '10 seconds' if a reload has never been executed), and also the number of changes to be applied.

If there are no changes to be applied, the number will be '0'. This means that no changes have been made within Opsview Monitor that require a configuration file change. If there are changes to be applied, this number will reflect the number of changes and will also become a clickable hyperlink.

This clickable link will load the Audit Log and filter it so that it only shows the category 'CONFIG', and only CONFIG Audit Log entries that have occurred since the last reload, as shown below:

This allows you to see why a reload is required, i.e. which changes you are applying by performing the reload. With this functionality, you can restrict the ability to perform a reload, and thus apply the changes, to only a few users. This suits an ITIL environment where changes must be made in a controlled manner, i.e. through the change control process, and only applied (reloaded) at an agreed time such as a Saturday evening within a downtime period.

Performing a Reload

To reload the Opsview Monitor system to use the new configuration files, simply click on the 'Reload' navigation option within the Settings tab of the overlay menu, as shown below:

Once the 'Reload' tile has been clicked, you will be presented with a screen similar to the one shown below:

Here you can click on the hyperlinked number, which will take you to the Audit Log pre-filtered to display just Audit Log entries that have been generated by configuration changes.

You can also click on the 'Apply Changes' button, which will write the changes to the configuration files and then reload the configuration files so that the changes are applied.

Whilst this is happening, you should not make any other changes, i.e. you should not continue to edit the Opsview Monitor configuration by adding new Hosts and so on. If you do, a failure may occur. Such failures are caught by the Opsview Monitor software and will be displayed on the screen.

Once the 'Apply Changes' button has been clicked, Opsview Monitor will begin the reload process as shown below:

Once the reload is successfully completed, you will be presented with a page similar to the one below:

This page will display the number of changes that have been applied, and allow you to acknowledge the reload. Once acknowledged, you will be presented with the standard Reload page, which will allow you to reload Opsview Monitor, even if no changes are pending.

Graphs

You may wish to reload Opsview Monitor, even if no changes are pending, in order to have the software detect that a newly added Service Check is generating graph data. By default, a Service Check that is generating graph data requires two reloads. The first reload applies the Service Check to a Host, at which point it is executed and begins storing performance data on the file system. The second reload is then required to ensure that Opsview Monitor detects these new files and begins displaying the data within graph configuration windows, and also within investigate modes, sparkline graphs and more.

Troubleshooting/FAQs

Reloading requires the opsviewd process to be running. If you have issues after a reboot, ensure that MySQL is started before starting Opsview Monitor.

A reload flag is set to stop multiple reloads from running simultaneously. If the reload does not complete, check for the file /usr/local/nagios/var/rw/reload_flag.
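A quick way to look for a leftover flag is to test whether the file exists and how long ago it was last modified. The following is a minimal sketch and not part of Opsview Monitor: the function name `reload_flag_status` is hypothetical, and only the path is taken from the text above. It reports on the flag only; it does not decide whether removing it is safe.

```python
import os
import time

# Path as given in the text above.
RELOAD_FLAG = "/usr/local/nagios/var/rw/reload_flag"


def reload_flag_status(path=RELOAD_FLAG, now=None):
    """Return None if the flag file is absent, else its age in seconds."""
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return None
    now = time.time() if now is None else now
    return max(0.0, now - mtime)


if __name__ == "__main__":
    age = reload_flag_status()
    if age is None:
        print("No reload flag present")
    else:
        print(f"Reload flag present, last modified {age:.0f} seconds ago")
```

A flag that is much older than the estimated reload time suggests that a reload did not complete.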

A reload will generate messages in /var/log/opsview/opsviewd.log like:

[2010/09/02 11:43:48] [opsviewd] [INFO] Running 'web_reload' with args:

[2010/09/02 11:43:51] [create_and_send_configs] [INFO] Starting overall

[2010/09/02 11:43:56] [create_and_send_configs] [INFO] Ending overall with error=0

[2010/09/02 11:44:08] [ndoutils_configdumpend] [INFO] Start

[2010/09/02 11:44:12] [ndoutils_configdumpend] [INFO] End
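The timestamps in these messages can be used to estimate how long a reload took. The sketch below assumes the line layout shown in the sample above (timestamp, component, level, message) and treats the 'web_reload' line as the start and the ndoutils_configdumpend 'End' line as the finish. That choice of markers is an assumption based on the sample, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as: [2010/09/02 11:43:48] [opsviewd] [INFO] message
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\] "
    r"\[(?P<component>[^\]]+)\] \[(?P<level>[^\]]+)\] (?P<message>.*)$"
)


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "time": datetime.strptime(m.group("ts"), "%Y/%m/%d %H:%M:%S"),
        "component": m.group("component"),
        "level": m.group("level"),
        "message": m.group("message"),
    }


def reload_elapsed_seconds(lines):
    """Seconds from the first 'web_reload' line to the last config dump 'End' line."""
    start = end = None
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        if start is None and "web_reload" in entry["message"]:
            start = entry["time"]
        elif (entry["component"] == "ndoutils_configdumpend"
              and entry["message"] == "End"):
            end = entry["time"]
    if start is None or end is None:
        return None
    return (end - start).total_seconds()
```

Applied to the five sample lines above, the start is 11:43:48 and the end is 11:44:12.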

IP 192.168.10.22 has more than 1 host associated with it: host1, host2

This occurs because more than one Host resolves to the same IP address. This is normally allowed, except when SNMP trap Service Checks are associated with those Hosts. The reason is that if an IP address is associated with multiple Hosts, then a single SNMP trap alert will be sent to all Hosts associated with that IP address. The best way to resolve this is to assign Host-specific Service Checks (such as SNMP traps or other hardware Service Checks) to only one of the Hosts that share that IP address.
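To find the affected Hosts ahead of a reload, you can group Host names by the address they resolve to. This is a minimal sketch, assuming you already have a mapping of Host name to resolved IP address from your own inventory; the function name and the input format are hypothetical and not an Opsview Monitor API.

```python
from collections import defaultdict


def hosts_sharing_ip(host_addresses):
    """Map each IP used by more than one host to the sorted list of host names."""
    by_ip = defaultdict(list)
    for host, ip in host_addresses.items():
        by_ip[ip].append(host)
    return {ip: sorted(hosts) for ip, hosts in by_ip.items() if len(hosts) > 1}


if __name__ == "__main__":
    # Illustrative data only; host3 and its address are made up.
    example = {
        "host1": "192.168.10.22",
        "host2": "192.168.10.22",
        "host3": "192.168.10.23",
    }
    for ip, hosts in hosts_sharing_ip(example).items():
        print(f"IP {ip} is shared by: {', '.join(hosts)}")
```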

Host key verification failed. Lost connection.

This means that the SSH host key of the slave has not yet been accepted on the master server. On the command line of the master, run:

send2slaves -t {slave host name}

Configuration Generation Output

The output of the last successful configuration generation will be in: /usr/local/nagios/var/rw/config_output.last_okay. If the generation ended in error, the output is in /usr/local/nagios/var/rw/config_output instead.

This includes timings for each section. If there is one section which is taking a long time, this may highlight an area for optimisation.

Reload Process Management

The reload process kicks off several parallel jobs. You can see the workflow in the debug file:

/usr/local/nagios/var/log/create_and_send_configs.debug

This shows the dependencies and the durations of the jobs.

We have seen one instance where a poorly configured DNS server on one slave was slowing down the entire reload process, as all slaves must finish successfully for dependent jobs to continue.

Unreachable Slave Nodes

In a distributed environment, one of the slave nodes may be unreachable, in which case the latest configuration cannot be sent to it. The 'Slave-node: %slavename%' Service Check verifies that the configuration on that node is up to date. It also confirms that the node has been upgraded as part of an Opsview Monitor upgrade.

The reload will still proceed if the configuration for the Master Monitoring Server has been verified, so the failure of a single slave does not stop other nodes from being updated.

Slow Reload Times

If you find that your Opsview Monitor server has a long reload time, these are the things to look at:

  • Check the reload process management debug output
  • Check the configuration generation output for any slow sections
  • Take a copy of the Opsview Monitor configuration database, run it on a different Opsview Monitor server, and compare the reload times

For further investigation, you can raise a support ticket with Opsview's Customer Success team, or if you are an Opsview Monitor Atom user, then raise it within the Opsview.com Forums.

Slow Post-reload Times

If you find that there is a large delay between the reload finishing and new status results coming in, you can check the opsviewd.log file for the following:

[2012/01/06 12:13:32] [import_ndologsd] [WARN] Import of 1325852003.546428, size=968865, took 8.60 seconds > 5 seconds

This shows that there was a large import file (968865 bytes) which took 8.6 seconds to import.
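To pick these warnings out of a longer log, the file name, size and duration can be extracted with a regular expression. The sketch below assumes the message layout is exactly as in the sample line above; the function name `slow_imports` is hypothetical.

```python
import re

# Matches the import_ndologsd warning shown above and captures its three values.
IMPORT_RE = re.compile(
    r"\[import_ndologsd\] \[WARN\] Import of (?P<name>[^,\s]+), "
    r"size=(?P<size>\d+), took (?P<took>\d+(?:\.\d+)?) seconds"
)


def slow_imports(lines):
    """Return (file name, size in bytes, seconds taken) for each slow-import warning."""
    results = []
    for line in lines:
        m = IMPORT_RE.search(line)
        if m:
            results.append(
                (m.group("name"), int(m.group("size")), float(m.group("took")))
            )
    return results
```

Sorting the result by the third element shows which imports took longest.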

After a reload, a large import file is generated (to hold all the Nagios configuration) which needs to be imported into the Runtime database.

You can enable debug mode for the ndo log files by changing the /usr/local/nagios/etc/Log4perl.conf file and uncommenting:

log4perl.logger.import_ndologsd=DEBUG

Within 30 seconds, a new directory in /usr/local/nagios/var/ndo.archive will be created with a copy of every NDO log that is sent to the database. This can be useful for debugging and replaying the import of that NDO log.

Remember to switch this debugging off!
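As a reminder aid, a small check can report whether the debug line is still active. This is a sketch only: it assumes that disabled lines in Log4perl.conf begin with '#', and the function name `debug_enabled` is hypothetical.

```python
# Setting named in the text above.
DEBUG_SETTING = "log4perl.logger.import_ndologsd=DEBUG"


def debug_enabled(config_text, setting=DEBUG_SETTING):
    """Return True if the setting appears on a line that is not commented out."""
    wanted = setting.replace(" ", "")
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # commented-out line, so the setting is inactive here
        if stripped.replace(" ", "") == wanted:
            return True
    return False
```

Pass it the contents of /usr/local/nagios/etc/Log4perl.conf; a result of True means debugging is still switched on.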
