Opsview Knowledge Center

Upgrading From Opsview 3.x or 4.x

Upgrading from the original Opsview Enterprise versions

In this section, we provide you with guidance on how to successfully upgrade your existing Opsview Monitor software installation. It's important that you have reviewed Pre-requisites to ensure that any software and hardware dependencies, along with any limitations, are fully understood prior to the upgrade taking place.

Most upgrade actions are managed during package installation, but there are certain pre- and post-upgrade steps that require manual intervention, which are described below.

Pre-upgrade Steps

Opsview Software Repositories

Using Ubuntu as an example, start by creating a new file, /etc/apt/sources.list.d/opsview.list, and add the following lines to it:

# Opsview packages
deb https://downloads.opsview.com/opsview-commercial/latest/apt/ <DIST>  main

where:

<DIST> is your distribution name, such as squeeze, lucid or precise.
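
The repository entry can also be generated rather than typed by hand. The following is a minimal sketch: write_opsview_repo is a hypothetical helper name, and the demonstration writes to a scratch file instead of /etc/apt/sources.list.d/opsview.list so that it can be run without root privileges.

```shell
#!/bin/sh
# Write the Opsview apt repository entry for a given distribution codename.
# write_opsview_repo is a hypothetical helper, not part of Opsview.
write_opsview_repo() {
    dist="$1"     # distribution codename, e.g. precise
    target="$2"   # file to write the entry to
    printf '# Opsview packages\ndeb https://downloads.opsview.com/opsview-commercial/latest/apt/ %s main\n' \
        "$dist" > "$target"
}

# Demonstration against a scratch file.
repo_file="$(mktemp)"
write_opsview_repo precise "$repo_file"
cat "$repo_file"
```

On a real system, assuming lsb_release is installed, the codename could be supplied with "$(lsb_release -sc)" and the target would be /etc/apt/sources.list.d/opsview.list, written as root.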

Updating The Package Lists

Once your configuration is complete, you will need to update the repository information by using the command line shown in the example below:

sudo apt-get update

However, if you receive the error message "The method driver /usr/lib/apt/methods/https could not be found.", you do not have the necessary transport methods installed. In that case, you will need to install apt-transport-https, as shown in the example below:

sudo apt-get install apt-transport-https

Stop All Opsview Monitor Processes

IMPORTANT - You must stop all Opsview Monitor processes prior to the upgrade from 4.x to 5.0.0 onwards:

sudo /etc/init.d/opsview-web stop
sudo /etc/init.d/opsview fullstop
sudo /etc/init.d/opsview-agent stop
sudo pkill -u nagios
sudo pkill -u opsview
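
After stopping the processes, it may be worth confirming that nothing is still running under either account. The sketch below assumes pgrep is available; has_running_processes is a hypothetical helper name.

```shell
#!/bin/sh
# Report whether any process still runs under the given account.
# has_running_processes is a hypothetical helper name.
# pgrep also exits non-zero when the account does not exist,
# which this sketch treats the same as "no processes".
has_running_processes() {
    pgrep -u "$1" > /dev/null 2>&1
}

for account in nagios opsview; do
    if has_running_processes "$account"; then
        printf '%s: processes still running\n' "$account"
    else
        printf '%s: no processes found\n' "$account"
    fi
done
```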

Updates To Sudoers Configuration

Ensure that sudo commands do not require a TTY:

visudo
# Comment out the following line if it is set
#Defaults requiretty
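
To check whether the setting is active before editing, a pattern match along these lines could be used. This is a sketch only: requiretty_enabled is a hypothetical helper, it matches only the global "Defaults requiretty" form (not per-user or per-host Defaults entries), and the demonstration runs against a scratch file rather than the real sudoers file.

```shell
#!/bin/sh
# Return success if an active (uncommented) "Defaults requiretty" line exists.
# requiretty_enabled is a hypothetical helper name.
requiretty_enabled() {
    grep -Eq '^[[:space:]]*Defaults[[:space:]]+requiretty' "$1"
}

sample="$(mktemp)"

printf 'Defaults requiretty\n' > "$sample"
if requiretty_enabled "$sample"; then
    echo "requiretty is active"
fi

printf '#Defaults requiretty\n' > "$sample"
if ! requiretty_enabled "$sample"; then
    echo "requiretty is commented out"
fi
```

Reading the real /etc/sudoers normally requires root, and any change should still be made through visudo.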

ODW Upgrades

If you are upgrading from an earlier version of Opsview Monitor, ensure that all your database tables in the Opsview Data Warehouse (ODW) have been converted to InnoDB prior to upgrading.
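
One way to find tables that still need converting is to query information_schema. The sketch below only builds the query text; non_innodb_query is a hypothetical helper, and the schema name odw is an assumption that should be replaced with the name of your ODW database.

```shell
#!/bin/sh
# Build a query listing tables in a schema that are not yet InnoDB.
# non_innodb_query is a hypothetical helper; "odw" is an assumed schema name.
non_innodb_query() {
    printf "SELECT table_name, engine FROM information_schema.tables WHERE table_schema = '%s' AND table_type = 'BASE TABLE' AND engine <> 'InnoDB';\n" "$1"
}

non_innodb_query odw
```

The output can be piped into the mysql client with suitable credentials. Any table it lists can then be converted with a statement of the form ALTER TABLE <name> ENGINE=InnoDB; run this on a backed-up database first, as conversion of large tables can take some time.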

Upgrade Process (optional)

During the upgrade process, the database schema may change. In the example below, we show typical output during the upgrade.

Mon Sep 11 17:47:00 2015: Starting for runtime-nagios
Mon Sep 11 17:47:01 2015: DB at version 3.3.0
Re-arranging configuration indexes
Updated database to version 3.3.1.

However, it is possible to override some of the steps that are undertaken during the upgrade process, which we now describe in the following list.

  • Initially, you will need to find the appropriate upgrade script for each database. Each database script has the name upgradedb_<NAME>.pl, which can be found at the Opsview Installer site.
  • You should review the upgrade steps, identify the one that does not need to be run, and note its database name and upgrade step number (runtime is split into runtime-nagios and runtime-opsview).
  • Now, create the directory, /tmp/opsview_upgrade_override.
  • And then create a file with the name, <DATABASE_NAME>-<UPGRADE_STEP_NUMBER>.
  • Finally, when the upgrade runs, that upgrade step will be skipped.

For example, the upgrade script upgradedb_runtime.pl has an upgrade step numbered 3.0.1 for runtime-nagios, which converts tables to the InnoDB format. If this database has already been converted, you can skip the step by creating an override file, as shown in the example below. That upgrade step will then not run during the upgrade process.

touch /tmp/opsview_upgrade_override/runtime-nagios-3.0.1
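
The directory creation and file naming steps listed above can be combined into one small function. This is a sketch: skip_upgrade_step is a hypothetical helper name, and the demonstration uses a scratch directory in place of /tmp/opsview_upgrade_override.

```shell
#!/bin/sh
# Create an override marker so that a given upgrade step is skipped.
# skip_upgrade_step is a hypothetical helper name.
skip_upgrade_step() {
    override_dir="$1"   # /tmp/opsview_upgrade_override on a real system
    database="$2"       # e.g. runtime-nagios
    step="$3"           # e.g. 3.0.1
    mkdir -p "$override_dir" && touch "$override_dir/$database-$step"
}

# Demonstration in a scratch location.
demo_dir="$(mktemp -d)/opsview_upgrade_override"
skip_upgrade_step "$demo_dir" runtime-nagios 3.0.1
ls "$demo_dir"
```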

During The Upgrade Process

You should upgrade the Opsview Monitor packages using your typical OS method; however, in this section we provide additional information that may help you with your particular package management system.

apt

Upgrade your Opsview Monitor installation by issuing the command shown below. This will automatically apply database credential changes using the debian-sys-maint user.

apt-get update && apt-get install opsview opsview-timeseries opsview-timeseries-enqueuer opsview-timeseries-rrd && apt-get install opsview

yum

You will need to install each package individually, as shown in the example below. This will not only upgrade Opsview Monitor, but will also upgrade the database schemas, if applicable.

yum install opsview-timeseries opsview-timeseries-enqueuer opsview-timeseries-rrd
yum install opsview opsview-core opsview-base opsview-perl opsview-web mod_auth_tkt_opsview

RPMs

If you do not have access to yum and need to install RPMs locally, then you must install all of the dependencies first, followed by the Opsview Monitor packages, as shown in the example below. You will need to review the output of the RPM installation to verify that it was successful.

rpm -U opsview-perl opsview-base opsview-core opsview-web opsview mod_auth_tkt_opsview

If you need to reinstall a failed package, then you will need to remove it first. For example, if you were attempting to install opsview-core-3.0.2.2276 and it failed, then run the command shown in the example below to remove the failed package before attempting to reinstall it.

rpm -e --nodeps opsview-core-3.0.2.2276

Post-upgrade Steps

If you are using the Apache proxy (which we recommend for all installs), check the sample configuration file at /usr/local/nagios/installer/apache_proxy.conf for any changes.

There are some manual post-install steps that need to be completed.
Note: You should follow instructions for all the versions that you upgrade through.

See the sections below for more details:

Nagios 4 Command Arguments

Nagios 4 includes improvements to the command arguments parsing to make it more 'shell-like'. This means that single quotes, double quotes and backslashes should work in a more logical way.
One side effect is that Opsview Monitor's Test Service Check feature now more closely matches with the expected results when Nagios executes plugins.

However, this means that there may be arguments that used to work for Opsview Monitor 4.2 and earlier which do not work for Opsview Monitor 4.3.0 and above. Possible results that you may get:

Output: (No output on stdout) stderr: /bin/sh: Syntax error: Unterminated quoted string
Notes: Unnecessary backslash escaping of quote symbols

Output: (No output on stdout) stderr: /bin/sh: Syntax error: ( unexpected
Notes: Check for use of parentheses; they must be quoted correctly

Output: WMIQuery failed: ConnectServer failed!:failed to lookup error code: 2147749902( reson: 317)
Notes: Unnecessary backslashes when defining the WMI query location

Output: CRITICAL: Hyper-V: not found (critical), Heartbeat: not found (critical), Service: not found (critical)
Notes: Incorrect quoting of services with spaces using the CheckServiceState check

This affects Service Checks that use a backslash (\) in the arguments. These are usually Windows agent checks.
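
The effect of a doubled backslash can be seen in an ordinary POSIX shell, which is assumed here to be a fair stand-in for the more shell-like parsing described above. In this sketch, show_arg is a hypothetical helper that prints an argument exactly as a command would receive it.

```shell
#!/bin/sh
# Print an argument exactly as a plugin would receive it.
# show_arg is a hypothetical helper name.
show_arg() { printf '%s\n' "$1"; }

# Inside single quotes the shell keeps every backslash literally,
# so a doubled backslash reaches the command as two characters.
doubled="$(show_arg 'root\\virtualization')"
single="$(show_arg 'root\virtualization')"

printf 'doubled: %s (%s characters)\n' "$doubled" "${#doubled}"
printf 'single:  %s (%s characters)\n' "$single" "${#single}"
```

Under these rules an argument written with escaping that older parsing collapsed is passed through unchanged, which is why the extra backslash has to be removed from the Service Check arguments.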

To help with the transition, there is a tool available to take you through potential issues. Save the tool as /usr/local/nagios/installer/nagios_3-4_servicechecks.pl. This will run through the Opsview Monitor database and find arguments for Service Checks, Host/Host template exceptions, timed Host/Host template exceptions that may be an issue.

On the command line as the nagios user, execute:

/usr/local/nagios/installer/nagios_3-4_servicechecks.pl

You will be prompted with a changes ('C') or list ('L') question as to whether you wish to make changes or just list would-be changes. Listing changes ('L') will go through all Service Checks and show you what would have happened. If you choose to make changes ('C'), the Opsview Monitor database will first be backed up to /usr/local/nagios/var/backups/opsview-db-[timestamp]-nagios_3-4-servicechecks.gz, and you will be prompted to confirm each change if necessary, which will then update the database. Changes will be logged to the audit log. You can use the Test Service Check feature to try your changes before confirming the change, but you should use the results from Nagios as the definitive status.

Note: Running the tool twice will give unpredictable results because the conversion is not idempotent. Take special care if running multiple times.

Note: The tool may not always calculate the correct arguments to use. If you find that you have not been given the correct value, please let us know via your support account or the forums. We will need to know the old Opsview Monitor 4.2 argument, the 4.2 output if possible, the argument that works for Opsview Monitor 4.4 and the new output.

If you have exceptions (timed or normal on Host or Host templates), you may find that the Host is listed in the first list of Service Checks. For instance:

Suspect ID: 349: Win Hyper-V
Host: ov-hv-win2k8-base OUTPUT=....
Host: bob STATE=unknown OUTPUT=....
Host: crummock STATE=critical OUTPUT=CHECK_NRPE: Socket timeout after 10 seconds.
Host: ov-dev STATE=ok OUTPUT=....
Change -H $HOSTADDRESS$ -c CheckWMI -a Query='Select * from Msvm_VirtualSystemManagementServiceSettingData' namespace='root\\virtualization'
to     -H $HOSTADDRESS$ -c CheckWMI -a Query='Select * from Msvm_VirtualSystemManagementServiceSettingData' namespace='root\virtualization'
y/n? y
Updated

In the above, crummock is listed. But as crummock has a Host exception, it will also be shown later on:

Suspect host timed override exception on host: crummock, for service check: Win Hyper-V - TESTING
Change -H $HOSTADDRESS$ -c CheckWMI -a Query='Select * from Msvm_VirtualSystemManagementServiceSettingData' namespace='root\\virtualization'
to     -H $HOSTADDRESS$ -c CheckWMI -a Query='Select * from Msvm_VirtualSystemManagementServiceSettingData' namespace='root\virtualization'
y/n? y
Updated

As the tool is trying to locate arguments across Opsview Monitor, it may find possible problems in various locations for the same Host. This is fine.

Opsview Master Housekeep Service Checks

Opsview Monitor 4.4.0 introduces new Service Checks to monitor the result of the daily housekeeping, as well as recording the time taken to run.
If you are upgrading from earlier versions of Opsview Monitor, you will need to manually add them to the relevant templates.

These Service Checks are defined as:

  1. Opsview Housekeeping Monitor

    • Plugin: check_opsview_housekeeping
    • Check Interval: 12 hours
    • Keywords: opsview-components
    • Service Group: Application - Opsview
    • Host Template: Application - Opsview Master
  2. Opsview Housekeeping Cronjob Monitor

    • Plugin: check_opsview_cronjobs
    • Check Interval: 12 hours
    • Keywords: opsview-components
    • Service Group: Application - Opsview
    • Host Template: Application - Opsview Common

Security Wallet Upgrading

When upgrading to Opsview Monitor 5.0, a key will be generated to encrypt your data.
As part of the upgrade, certain data within Opsview Monitor will be encrypted using this key:

  • Host snmp_community, snmpv3_authpassword, snmpv3_privpassword and rancid_password
  • Notification method passwords for iOS Push Notifications, Android Push Notifications and AQL

Note: Variables (formerly attributes) and Host variables (formerly Host attributes) will not be encrypted.

Security Enhanced Opsview Agent

Opsview Monitor 4.6.3 introduced a more secure method of communication between the Opsview Agent and its Unix clients. This involves the addition of two new Host variables: NRPE_CIPHERS and NRPE_CERTIFICATES.

These variables won't appear automatically on an upgraded system (they will on a new installation of Opsview Monitor), so to use this feature of the Opsview Agent you must add them manually by following the steps below. Adding them is optional: your Opsview Agent and checks will continue to work without them, but they won't be as secure.

Add these two variables:

  • NRPE_CIPHERS

    • Label Arg1: Cipher list
    • Default Arg1: ADH-AES256-SHA:ADH-AES128-SHA
  • NRPE_CERTIFICATES

    • Label Arg1: Path to certificate
    • Label Arg2: Path to private key
    • Label Arg3: Path to CA certificate

For each check_nrpe based Service Check that is to be used with the new feature, add these arguments:

-C '%NRPE_CERTIFICATES:1%' -k '%NRPE_CERTIFICATES:2%' -r '%NRPE_CERTIFICATES:3%' -y '%NRPE_CIPHERS:1%'

Upgrading Modules

As all Opsview Monitor Modules are contained within the same repository, you should also upgrade all modules that you have currently installed. You can upgrade modules by using your package manager, e.g. on Ubuntu run 'apt-get update && apt-get install opsview-reporting-module opsview-jasper'.

After running the apt-get or yum commands above, run /opt/opsview/jasper/installer/postinstall as the 'opsview' user. This will update the current Opsview Monitor reports within the reports module.

Please note, you should upgrade all modules at the same time. For example, do not upgrade Opsview Monitor and then upgrade the modules at a later date.

Also, all services must be stopped before initiating the upgrade process. To do this, run:

sudo pkill -u opsview

Opsview Agent (NRPE)

There may have been changes made to the Agent's nrpe.cfg file. This file exists on the master, slaves and any monitored Host. When Opsview Monitor is upgraded, or when a reload happens, this file does not change on slaves. This is so that any local changes won't disappear. However, it does mean that, if we have made changes to the distributed version of this file, they will need to be applied to the copies on any slaves. Compare the nrpe.cfg file on the master with the ones on slaves, and make any necessary changes.

For reference, any local changes should be made by modifying the file '/usr/local/nagios/etc/nrpe_local/local.cfg' - this file will not be changed on any upgrade or reload on the master server, slaves or monitored Host.
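
The comparison between the master's nrpe.cfg and a slave's copy can be done with diff. The sketch below is illustrative: compare_nrpe_cfg is a hypothetical helper, the two files are scratch files with made-up contents, and fetching the slave's copy to the master (for example with scp) is left out.

```shell
#!/bin/sh
# Show differences between the master's nrpe.cfg and a copy from a slave.
# compare_nrpe_cfg is a hypothetical helper; it returns 0 when the files match.
compare_nrpe_cfg() {
    diff -u "$1" "$2"
}

# Scratch files standing in for the master's file and a slave's copy.
master_cfg="$(mktemp)"
slave_cfg="$(mktemp)"
printf 'command_timeout=60\n' > "$master_cfg"
printf 'command_timeout=30\n' > "$slave_cfg"

if compare_nrpe_cfg "$master_cfg" "$slave_cfg"; then
    echo "nrpe.cfg files match"
else
    echo "nrpe.cfg files differ; review the output above"
fi
```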

Troubleshooting

Errors During Upgrade

Ensure you review the output of the upgrade to verify that it was successful. If there was an error, the upgrade may halt and further steps will need to be manually performed. These are the steps run by the postinstall as the root user:

su - nagios -c '/usr/local/nagios/installer/upgradedb.pl'
su - nagios -c '/usr/local/nagios/bin/send2slaves'
su - nagios -c '/usr/local/nagios/bin/populate_db.pl'
su - nagios -c 'OPSVIEW_NOSTART=true /usr/local/nagios/bin/rc.opsview gen_config'
/usr/local/nagios/installer/postinstall_root
su - nagios -c '/usr/local/nagios/installer/postinstall'
/sbin/chkconfig --add opsview # For Redhat, CentOS, SLES
/etc/init.d/opsview start
rm -f /usr/local/nagios/var/upgrade.lock
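
When re-running these steps by hand, it helps to stop at the first failure so that later steps do not run against a half-upgraded system. The following sketch shows one way to do that; run_steps is a hypothetical helper, and the commands passed to it here are harmless stand-ins rather than the real postinstall commands.

```shell
#!/bin/sh
# Run each command in order and stop at the first failure.
# run_steps is a hypothetical helper; each argument is one command line.
run_steps() {
    for step in "$@"; do
        printf 'Running: %s\n' "$step"
        if ! sh -c "$step"; then
            printf 'Failed: %s\n' "$step" >&2
            return 1
        fi
    done
}

# Harmless stand-ins for the real postinstall commands.
run_steps 'true' 'echo step two' || echo "stopped early"
```

Each of the commands listed above could be passed as one quoted argument, in the same order, when running as root.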

Reloads Fail Following an Upgrade

After an upgrade, a reload will automatically take place. In a distributed environment, if the newly upgraded files have not been sent to the slaves, then new configurations created by Opsview Monitor may not pass validation, and you may see errors like:

Reading configuration data...

Error in configuration file '/usr/local/nagios/tmp/nagios.27520/nagios.cfg' - Line 635 (UNKNOWN VARIABLE)

This usually means that the slave servers do not have the latest software.
To force sending the new binaries to the slave systems, run:

su - nagios -c '/usr/local/nagios/bin/send2slaves {slavename}'

There may be permission errors which mean that the sending to slaves did not work correctly during the upgrade.

Database Errors After an Upgrade

If you are encountering database errors after an upgrade, it may be that the database upgrade scripts didn't run properly. All database upgrades are handled automatically and will continue from the last database state.
To invoke the database changes, run as the nagios user:

/usr/local/nagios/installer/upgradedb.pl

This is not destructive (it will only make changes that are required), but you should only run it if you have problems.
