From 5.4.x to 6.3
Learn how to do an in-place upgrade from Opsview Monitor 5.4.2 to Opsview Monitor 6.3
Introduction
This document describes the steps required to upgrade an existing Opsview Monitor 5.4.2 system running on either a single server instance or a distributed Opsview environment (with a remote database and slaves) to the current version of Opsview Monitor.
Depending on the size and complexity of your current Opsview Monitor system, this process may take anywhere from a few hours to a full day.
You must be running a minimum of Opsview Monitor 5.4.x to perform an in-place upgrade. If you are not running version 5.4.x then you must either upgrade to that first or perform a migration to new hardware instead.
If you are running Opsview Monitor in a virtual environment then you should snapshot the servers before the upgrade is started. You then have the option to 'Roll Back', should any problems occur.
If you are not using a virtual environment, then you may wish to perform a migration to new hardware instead of an in-place upgrade.
Overview
The in-place upgrade process is made up of the following stages:
- Setup of local repo (offline only)
- Bootstrap Opsview Deploy
- Preparing to upgrade
- Gather information about the Opsview 5.4 environment.
- Manually fine-tune the opsview-deploy configuration.
- Performing the upgrade (this will include OS updates unless otherwise overridden, documented below)
- Registering Opsview collectors
Prerequisites
Before upgrading from Opsview Monitor 5.4 to Opsview Monitor 6.3, make sure that you have...
- A supported operating system.
If you are running Ubuntu 14.04 or earlier then upgrade the OS before upgrading Opsview by following the steps in OS and Opsview Upgrades.
- Shell access (as 'root') to the Opsview Master server.
- SSH access from the Opsview Master server (as root user) to itself.
- SSH access from the Opsview Master server (as root user) to all Opsview Slaves.
- This must use key-based authentication - password authentication is not supported.
- The target user (on the Opsview Slaves) must either be 'root' user or have full, password-less 'sudo' access including the ability to elevate to the 'root' user.
- Root access to the MySQL server(s) from the Opsview Master server.
- The MySQL username must be 'root'.
- The root user must have access using mysql_native_password (installation using the auth_socket plugin to access the database is not currently supported); a quick check for this is shown after this list.
- Offline upgrades - Setting up the mirror is outside the scope of this documentation, but we suggest using a tool such as reposync for CentOS and RHEL, or apt-mirror for Debian and Ubuntu.
The base URLs for mirroring our repositories are:
# RHEL/CentOS
https://downloads.opsview.com/opsview-commercial/6.3/yum/
# Ubuntu/Debian
https://downloads.opsview.com/opsview-commercial/6.3/apt
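Regarding the MySQL root access requirement above, a quick optional check that the 'root' user is set up with mysql_native_password rather than auth_socket is to query the mysql.user table; this is an illustrative sketch and assumes you can already reach the database server from the Opsview Master:
mysql -u root -p -e "SELECT user, host, plugin FROM mysql.user WHERE user = 'root';"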
Limitations
- Any remote Opsview databases must listen on 3306/TCP for inbound connections.
- External 'opsview-timeseries*' hosts are not automatically added to the inventory.
- Local 'opsview-agent' configuration must be manually reapplied when the upgrade process is complete.
- Using NAT'd IP addresses to communicate from the Opsview Master to the Opsview Slaves has not currently been tested.
- Advanced Apache configurations cannot be converted and adopted automatically.
- The deployment must be executed from the Opsview Master.
- 'opsview-timeseries' nodes which are not using the RRD backend must be manually upgraded as the InfluxDB backend is not yet supported by opsview-deploy.
- If you are using opsview-sshtunnels for communications to some or all collectors, check the SSH Tunnels page to ensure you have the correct configuration set up.
Performing the Upgrade
Disable Previous Opsview Monitor Repositories
Disable older Opsview repos!
You must comment out or remove any previous Opsview Monitor repository configuration for yum or apt.
Opsview Monitor 6.x uses a slightly different repository to versions 4 and 5. If references to either remain, older packages may be incorrectly installed or referenced.
Edit /etc/yum.repos.d/opsview.repo (CentOS/RHEL) or /etc/apt/sources.list.d/opsview.list (Debian/Ubuntu) and place a # character at the start of each line. If these files do not exist, check other files in the same directory for references to downloads.opsview.com.
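For example, one way to comment out every line (assuming the default file locations above) is:
# CentOS/RHEL
sed -i 's/^/#/' /etc/yum.repos.d/opsview.repo
# Debian/Ubuntu
sed -i 's/^/#/' /etc/apt/sources.list.d/opsview.list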
Refresh the repositories:
# CentOS/RHEL
yum clean all
yum makecache fast
# Debian/Ubuntu
apt-get clean
apt-get update
Offline Only
You will also need to take a copy of the installation script (and verify its checksum) before transferring this to the target server:
curl -sLo- https://deploy.opsview.com/6.3 > install_opsview
sha256sum install_opsview
Ensure the returned string matches the following:
d30303d3f5171c34ec56781d9748d51849e900b9ebca18266acb0c38b9970bd4
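Alternatively, sha256sum can perform the comparison for you and print 'OK' if the file matches the expected checksum:
echo "d30303d3f5171c34ec56781d9748d51849e900b9ebca18266acb0c38b9970bd4  install_opsview" | sha256sum -c -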
Bootstrap Opsview Deploy (Online)
To configure the latest Opsview Monitor repositories and install opsview-deploy, run:
curl -sLo- https://deploy.opsview.com/6.3 | sudo bash -s -- -A boot
Bootstrap Opsview Deploy (Offline Only)
Using the Opsview Monitor deploy script downloaded earlier, install the initial package set, specifying the Opsview Monitor admin <password>.
bash ./install_opsview -p <password> -O fire,boot
Gather information about Opsview Monitor 5.4 environment
As the root user, cd to the '/opt/opsview/deploy' directory and execute the 'upgrade-pre' playbook.
root@opsview-master:~# cd /opt/opsview/deploy
root@opsview-master:/opt/opsview/deploy# ./bin/opsview-deploy lib/playbooks/upgrade-pre.yml
This will create a directory at '/opt/opsview/deploy/var/upgrade/' with a symlink at '/opt/opsview/deploy/var/upgrade/latest'.
This directory will get populated with various files to be copied or parsed during the upgrade. These include:
- /usr/local/nagios/etc/opsview.conf
- /usr/local/nagios/etc/sw.key
- /usr/local/opsview-web/opsview_web_local.yml
- /opt/opsview/timeseriesrrd/etc/timeseriesrrd.yaml (original version)
- Process maps
- Custom Host Icons
- Custom Host Template icons
Finally, the 'opsview-deploy' configuration will be produced:
- /opt/opsview/deploy/etc/opsview_deploy.yml
- /opt/opsview/deploy/etc/user_secrets.yml
- /opt/opsview/deploy/etc/user_upgrade_vars.yml
- /opt/opsview/deploy/etc/user_upgrade_vars_tmp.yml
The file 'user_upgrade_vars_tmp.yml' will be removed once the upgrade is complete.
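If you want to review what was gathered before continuing, list the contents of the most recent snapshot via the 'latest' symlink:
root:/opt/opsview/deploy# ls -l /opt/opsview/deploy/var/upgrade/latest/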
Fine-tune the opsview-deploy configuration
Once the initial 'opsview-deploy' configuration is created, it may need to be tweaked to suit the installation environment as outlined below.
Before continuing, make sure that you review the 'user_upgrade_vars.yml' configuration; it contains extra information and additional steps that you may need to follow.
You should merge the settings from user_upgrade_vars.yml into user_secrets.yml and user_vars.yml once the upgrade is complete. Before proceeding with the upgrade, make sure these three files do not have any conflicts with each other. If you find a conflicting setting, merge the setting you want to keep into user_vars.yml (for configuration) or user_secrets.yml (for passwords).
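A rough way to spot top-level keys defined in more than one of these files (this only compares key names, not nested values, and skips any file that does not exist yet):
cd /opt/opsview/deploy/etc
grep -hE '^[A-Za-z0-9_]+:' user_upgrade_vars.yml user_vars.yml user_secrets.yml 2>/dev/null | cut -d: -f1 | sort | uniq -d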
Configuring Opsview Collector Clusters
If the Opsview 5.4 environment had slaves configured, these will have been added to 'opsview_deploy.yml' in the 'collector_clusters' section. Any inactive nodes will be commented out for convenience.
Each node needs to be updated with the correct SSH user. Additional variables (e.g. SSH port) can be configured as shown below:
collector_clusters:
<CLUSTER_NAME>:
collector_hosts:
<HOST_NAME>:
ip: <IP_ADDR>
user: <SSH_USER>
port: <SSH_PORT>
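Once the user (and port, if non-default) are filled in for each node, you can confirm that key-based SSH and password-less sudo work from the Opsview Master; the values below are placeholders matching the template above:
ssh -p <SSH_PORT> <SSH_USER>@<IP_ADDR> 'sudo -n true && echo "sudo OK"'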
A note on YML File Spacing
Tip:
The yml configuration files are sensitive to spacing.
Do not use tabs; use individual spaces and ensure that the alignment is retained.
Configuring the root password for the Opsview database
The MySQL root password must be added to 'user_secrets.yml':
# /opt/opsview/deploy/etc/user_secrets.yml
---
[...]
opsview_database_root_password: MyPassword
[...]
The password must be the 'root' user's password for the database server hosting the 'opsview' and 'runtime' databases.
If you do not do this then you may receive the following error when attempting an "Apply Changes" within the UI later:
"ERROR 500: (1142, "DROP command denied to user 'opsview'@'localhost' for table 'nagios_contacts_temp'")
Failed to generate configuration"
If you don't have root access to the database server, see the notes in 'user_upgrade_vars.yml' for the steps you need to follow.
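Before continuing, you can verify that the root credentials you have configured actually work against the database server hosting the 'opsview' and 'runtime' databases; the hostname below is a placeholder (omit -h for a local database):
mysql -u root -p'MyPassword' -h opsview-db.example.com -e 'SELECT 1;'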
Configuring the database hosts
You may notice this option in your 'opsview_deploy.yml':
database_hosts: {}
You MUST leave this option as-is; your database hosts are configured in 'user_upgrade_vars.yml' instead. If you change it, 'opsview-deploy' might reinstall your database server. See the notes in 'user_upgrade_vars.yml' for more information.
SNMP Trap Processing
This is a licensed feature. To have the SNMP Trap Processing software upgraded, add the following to /opt/opsview/deploy/etc/user_upgrade_vars.yml:
opsview_module_snmp_traps: True
Configuring common overrides and parameters
Any additional options can be added to 'user_vars.yml'. For example:
# /opt/opsview/deploy/etc/user_vars.yml
---
# Install and manage a local Postfix MTA on each node
opsview_manage_mta: yes
# Relay host for outgoing mail
mta_relay_host: mail.example.com
# Disable automatic updating of the OS packages (not recommended).
opsview_manage_os_updates: False
# Force installation of the named module instead of only installing if enabled by the license
opsview_module_netaudit: True
opsview_module_netflow: True
opsview_module_servicedesk_connector: True
opsview_module_reporting: True
opsview_module_snmp_traps: True
Offline Only
Amend the file /opt/opsview/deploy/etc/user_vars.yml and add the appropriate line below for your OS to specify the URL of your local Opsview Monitor package repository mirror:
# CentOS/RHEL
opsview_repository_str_yum: 'http://my.repo/$basearch/'
# Debian/Ubuntu
opsview_repository_str_apt: 'deb http://my.repo/ trusty main'
check_uri_opsview_repository: "http://my.repo/"
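You can also confirm that the mirror is reachable from the Opsview Master before deploying (the URL matches the placeholder above):
curl -fsI http://my.repo/ > /dev/null && echo "repository mirror reachable"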
Pre-Deployment Checks
Before running opsview-deploy, we recommend that you check the following items:
Manual Checks
What | Where | Why |
---|---|---|
All YAML files follow correct YAML format | opsview_deploy.yml, user_*.yml | Each YAML file is parsed each time opsview-deploy runs |
All hostnames are FQDNs | opsview_deploy.yml | If Opsview Deploy can't detect the host's domain, the fallback domain 'opsview.local' will be used instead |
SSH user and SSH port have been set on each host | opsview_deploy.yml | If these aren't specified, the default SSH client configuration will be used instead |
Any host-specific vars are applied in the host's "vars" in opsview_deploy.yml | opsview_deploy.yml, user_*.yml | Configuration in user_*.yml is applied to all hosts |
An IP address has been set on each host | opsview_deploy.yml | If no IP address is specified, the deployment host will try to resolve each host every time |
All necessary ports are allowed on local and remote firewalls | All hosts | Opsview requires various ports for inter-process communication. See: Ports |
If you have rehoming configured | user_upgrade_vars.yml | Deploy now configures rehoming automatically. See Rehoming |
If you have Ignore IP in Authentication Cookie enabled | user_upgrade_vars.yml | Ignore IP in Authentication Cookie is now controlled in Deploy. See Rehoming |
Webserver HTTP/HTTPS preference declared | user_vars.yml | In Opsview 6, HTTPS is enabled by default; to enforce HTTP-only you need to set opsview_webserver_use_ssl: False. See opsview-web-app |
For example:
---
orchestrator_hosts:
# Use an FQDN here
my-host.net.local:
# Ensure that an IP address is specified
ip: 10.2.0.1
# Set the remote user for SSH (if not default of 'root')
user: cloud-user
# Set the remote port for SSH (if not default of port 22)
port: 9022
# Additional host-specific vars
vars:
# Path to SSH private key
ansible_ssh_private_key_file: /path/to/ssh/private/key
Automated Checks
Opsview Deploy can also detect (and, in some cases, fix) issues automatically. Before executing 'setup-hosts.yml' or 'setup-everything.yml', run:
root:~# cd /opt/opsview/deploy
root:/opt/opsview/deploy# ./bin/opsview-deploy lib/playbooks/check-deploy.yml
If any potential issues are detected, a "REQUIRED ACTION RECAP" will be added to the output when the play finishes.
The automatic checks look for:
Check | Notes or Limitations | Severity |
---|---|---|
Deprecated variables | Checks for: opsview_domain, opsview_manage_etc_hosts | MEDIUM |
Connectivity to EMS server | No automatic detection of EMS URL in opsview.conf overrides | HIGH |
Connectivity to Opsview repository | No automatic detection of overridden repository URL(s) | HIGH |
Connectivity between remote hosts | Only includes LoadBalancer ports. Erlang distribution ports, for example, are not checked | MEDIUM |
FIPS crypto enabled | Checks value of /proc/sys/crypto/fips_enabled | HIGH |
SELinux enabled | SELinux will be set to permissive mode later on in the process by setup-hosts.yml, if necessary | LOW |
Unexpected umask | Checks umask in /bin/bash for 'root' and 'nobody' users. Expects either 0022 or 0002 | LOW |
Unexpected STDOUT starting shells | Checks for any data on STDOUT when running /bin/bash -l | LOW |
Availability of SUDO | Checks whether Ansible can escalate permissions (using sudo) | HIGH |
When a check fails, an 'Action' is generated. Each of these actions is formatted and displayed when the play finishes, sorted by severity at the end of the output.
The severity levels are:
Level | Meaning |
---|---|
HIGH | Will certainly prevent Opsview from installing or operating correctly |
MEDIUM | May prevent Opsview from installing or operating correctly |
LOW | Unlikely to cause issues but may contain useful information |
By default, the check_deploy role will fail if any actions are generated with MEDIUM or HIGH severity. To modify this behaviour, set the following in user_vars.yml:
# Actions at this severity or higher will result in a failure at the end of the role.
# HIGH | MEDIUM | LOW | NONE
check_action_fail_severity: MEDIUM
The following example shows two MEDIUM severity issues generated after executing the check-deploy playbook:
REQUIRED ACTION RECAP **************************************************************************************************************************************************************************************************************************
[MEDIUM -> my-host] Deprecated variable: opsview_domain
| To set the host's domain, configure an FQDN in opsview_deploy.yml.
|
| For example:
|
| >> opsview-host.my-domain.com:
| >> ip: 1.2.3.4
|
| Alternatively, you can set the domain globally by adding opsview_host_domain to your user_*.yml:
|
| >> opsview_host_domain: my-domain.com
[MEDIUM -> my-host] Deprecated variable: opsview_manage_etc_hosts
| To configure /etc/hosts, add opsview_host_update_etc_hosts to your user_*.yml:
|
| >> opsview_host_update_etc_hosts: true
|
| The options are:
| - true Add all hosts to /etc/hosts
| - auto Add any hosts which cannot be resolved to /etc/hosts
| - false Do not update /etc/hosts
Thursday 21 February 2019 17:27:31 +0000 (0:00:01.060) 0:00:01.181 *****
===============================================================================
check_deploy : Check deprecated vars in user configuration ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 1.06s
check_deploy : Check for 'become: yes' -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 0.03s
*** [PLAYBOOK EXECUTION SUCCESS] **********
Initiating the upgrade
The following steps will make changes to your Opsview environment. Ensure that you've taken adequate backups before continuing.
If you have the Reporting Module installed, now is the time to shut it down so that its packages upgrade successfully:
root:~# /etc/init.d/opsview-reporting-module stop
To upgrade the Opsview environment, run the following playbooks in the '/opt/opsview/deploy' directory:
Each playbook will finish with either a SUCCESS or a FAILURE message. If you receive a SUCCESS, proceed; if you receive a FAILURE, investigate and resolve the problem before moving forward, then re-run the same playbook to confirm the failure has been resolved.
Example:
* [PLAYBOOK EXECUTION SUCCESS] ****
or
* [PLAYBOOK EXECUTION FAILURE] ****
root:~# cd /opt/opsview/deploy
root:/opt/opsview/deploy# ./bin/opsview-deploy lib/playbooks/setup-everything.yml
root:/opt/opsview/deploy# ./bin/opsview-deploy lib/playbooks/upgrade-post.yml
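After both playbooks report SUCCESS, you can get a quick overview of the Opsview components and their current state via the watchdog (assuming the default installation path):
root:/opt/opsview/deploy# /opt/opsview/watchdog/bin/opsview-monit summary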
Upgrade Opspacks
Run the following as the opsview user:
root:/opt/opsview/deploy# su - opsview
opsview:/opt/opsview$ /opt/opsview/coreutils/bin/install_all_opspacks -f -d /opt/opsview/monitoringscripts/opspacks
This may take a moment to run.
Syncing all Plugins to Collectors
This step will copy all updated plugins on the Master to each of the Collectors and should be run as the root user:
#
# This will distribute plugins from the Master to the Collectors by running the following playbook
#
root:~# cd /opt/opsview/deploy
root:/opt/opsview/deploy# ./bin/opsview-deploy lib/playbooks/sync_monitoringscripts.yml
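As a quick spot check that the plugins reached a collector, you can count them over SSH; the collector hostname below is a placeholder and the path assumes the default install location:
root:/opt/opsview/deploy# ssh root@collector01.example.com 'ls /opt/opsview/monitoringscripts/plugins | wc -l'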
Webserver SSL Certificate
Opsview 6 provides a new location for the SSL certificates used to encrypt traffic between users and the Web UI. Because the upgrade does not detect the location of the SSL certificates previously used by Opsview 5, you must move the certificate and key into the new location manually. The new location for the certificate and key is /opt/opsview/webapp/etc/ssl/ while the old location was /usr/local/opsview-web/etc/ssl/.
As root, move your SSL certificates into the correct location while backing up the generated self-signed ones.
Note: these steps assume the default location for server certificates; your location may be different, so change /usr/local/opsview-web/etc/ssl/server.crt and /usr/local/opsview-web/etc/ssl/server.key accordingly.
cd /opt/opsview/webapp/etc/ssl/
mv server.crt{,_selfsigned}
mv server.key{,_selfsigned}
mv /usr/local/opsview-web/etc/ssl/server.crt ./server.crt
mv /usr/local/opsview-web/etc/ssl/server.key ./server.key
/opt/opsview/watchdog/bin/opsview-monit restart opsview-webserver
If you have other SSL configuration, such as intermediate CA certificates or the use of .pem files, then please see opsview-web-app for more information.
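If the web server fails to come back up, a common cause is a certificate/key mismatch. One way to confirm that the pair belong together is to compare their modulus hashes from within /opt/opsview/webapp/etc/ssl/; the two outputs should be identical:
openssl x509 -noout -modulus -in server.crt | openssl md5
openssl rsa -noout -modulus -in server.key | openssl md5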
Registering the Opsview Collectors
If you were using Opsview Slaves, these will now be Opsview Collectors and they need to be added to their respective clusters in the UI.
- Navigate to: 'Configuration' -> 'Monitoring Collectors'.
- Select each collector in the 'Unregistered Collectors' pane and choose 'Register Collector'
- Select the collector cluster to associate this collector with
- This must be the same cluster as per the 'opsview_deploy.yml' configuration
Running the initial Apply Changes (formerly known as Reload)
In order to get all the new components up and running correctly, an Apply Changes must be completed:
- Go to 'Configuration' > 'Apply Changes' to finalise all the changes made by the upgrade process.
Moving the Database Server
If you wish to move your database to an infrastructure server, refer to Moving database.
Moving to the InfluxDB Graphing Engine
If you wish to move from the RRD graphing engine to InfluxDB, refer to Timeseries Graphing Engine.
If you are already using InfluxDB then you do not need to take any further steps.
Opsview 5.X Host Templates deprecation
The following host templates are no longer relevant for Opsview 6.x, so you need to manually remove them from the Opsview Orchestrator and Collectors; you can safely delete the Templates and their Service Checks:
- Application - Opsview BSM
- Application - Opsview Common
- Application - Opsview Master (succeeded by the Host Template "Application - Opsview" with its own new service checks; you can add this template now)
- Application - Opsview NetFlow Common (succeeded by the Host Template "Opsview - Component - Results Flow")
- Application - Opsview NetFlow Master
Reporting Email Configuration
As part of the upgrade to this Opsview Monitor release, the JasperReports Server component of the Reporting Module has been upgraded from 5.1.1 to 7.1.1. With this upgrade, the file js.quartz.properties will be replaced, overwriting any contents in it. This file is used to configure the emailing capabilities of the Reporting Module and will need reconfiguring if it was previously configured.
To edit this file, open /opt/opsview/jasper/apache-tomcat/webapps/jasperserver/WEB-INF/js.quartz.properties in a text editor and edit the default "report.scheduler.mail.sender." values to match your requirements. See Reports - optional module for more details.
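For illustration, the mail-related keys in js.quartz.properties typically look like the following; the values shown are placeholders and should be replaced with your own mail server details:
report.scheduler.mail.sender.host=mail.example.com
report.scheduler.mail.sender.port=25
report.scheduler.mail.sender.protocol=smtp
report.scheduler.mail.sender.username=reports
report.scheduler.mail.sender.password=changeme
report.scheduler.mail.sender.from=opsview-reports@example.com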
Double Proxying and Rehoming
Please note that with the introduction of WebSockets in 6.2, an additional ProxyPass section is required in the proxy configuration if you are using Apache2 or NGINX as a forward or reverse proxy. See Rehoming.
NRPE overrides in nrpe_local
Custom NRPE plugins and configuration overrides are not migrated by the upgrade - these changes should be performed manually.
As the root user on the Orchestrator and Collector servers, copy the configuration files from the old location to the new one:
cp /usr/local/nagios/etc/nrpe_local/*.cfg /opt/opsview/agent/etc/nrpe_local/
chown root:opsview /opt/opsview/agent/etc/nrpe_local/*.cfg
chmod 640 /opt/opsview/agent/etc/nrpe_local/*.cfg
The paths to plugins will also need to be updated within the configuration files, which can be easily done with the following command:
sed -i 's!/usr/local/nagios/libexec!/opt/opsview/agent/plugins!' /opt/opsview/agent/etc/nrpe_local/*.cfg
All of the plugins configured within the cfg files should be copied over, too (the example below is for a custom plugin called check_exec.sh):
cp /usr/local/nagios/libexec/check_exec.sh /opt/opsview/agent/plugins/
chown root:opsview /opt/opsview/agent/plugins/check_exec.sh
...
...
After all plugins have been copied over, restart the agent and make sure the nrpe process is running as the opsview user:
service opsview-agent restart
ps -ef|grep nrpe
You can test a custom plugin called check_exec by running the following, which should produce the help output for the plugin:
sudo -iu opsview /opt/opsview/monitoringscripts/plugins/check_nrpe -H localhost -c check_exec -a '-h'
You may also need to amend the custom plugins themselves to ensure they do not reference old, obsolete paths.
/tmp files owned by Nagios
Some plugins may create temporary files within commonly used directories such as /tmp, and running the plugins as the opsview user may cause permission problems, resulting in non-OK states or error messages within the UI.
This can usually be resolved by running the following command on the Orchestrator and Collector servers:
chown -R --from=nagios:nagios opsview:opsview /tmp/* /var/tmp/*