Opsview Knowledge Center

Planning Your System

Considerations for planning your Opsview Monitor environment

In this section, we discuss several design considerations to aid in planning your Opsview Monitor system, such as scalability, resilience, disk partitioning and security. We also discuss how Opsview Monitor uses databases later in this section.

Achieving Scalability

When deploying your Opsview Monitor server, you should bear in mind the variables that may affect your system and how many Hosts can actually be monitored. As such, you need to be mindful of the following factors:

  • The number of Service Checks per Host.
  • The median interval for Service Checks.
  • The type of checks being executed, that is, quality of plugin code, local execution vs. agent queries and so on.
  • Network latency.

Typically, we recommend 300 Hosts as a comfortably manageable limit for a single Opsview Monitor server; however, this recommendation rests on a number of assumptions, listed below. Taken together, these assumptions result in approximately ten Service Checks per second being executed by the monitoring server, as the calculation following the list shows.

  • 10 Service Checks per Host (average).
  • Five minute interval per Service Check (average).
  • The majority of Service Checks are made against a remote agent; for example, Nagios® Remote Plugin Executor (NRPE) or Simple Network Management Protocol (SNMP).
  • The majority of monitored Hosts are on the same Local Area Network (LAN).
  • That your system specification is a modest physical or virtual server with 4-8 CPU cores and 16GB RAM.
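
Under these assumptions, the expected rate works out as follows, using the same formula style as the examples later in this section:

300 (hosts) * 10 (service checks) / 300 (seconds) = 10 (service checks per second)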

With appropriate tuning and use of better hardware, however, a single server can typically be made to scale well beyond 300 hosts.

Opsview Monitor's distributed architecture can also be used to monitor a large number of Hosts. As such, the Opsview Monitor system can be divided across the following components with each running on a dedicated server, as we describe in the following list.

  • Web server and application server (including the monitoring engine)
  • Database server #1: 'opsview' and 'runtime' databases
  • Database server #2: Opsview Data Warehouse ('odw')
  • Slave servers (collectors), either standalone or grouped in highly available clusters

Service Checks

When designing a system, one of the most important metrics to consider is 'service checks per second', which is a function of both the total number of checks configured and the interval between those checks. Generally, we recommend no more than around 20 service checks per second on a single machine. For example, if we have around 2000 hosts with ten checks per host on a five minute interval, this will clearly exceed our recommended checks per second, as shown in the example below:

An example configuration, which exceeds our recommended checks per second:

2000 (hosts) * 10 (service checks) / 300 (seconds) = 66 (service checks per second)

To achieve a comfortable rate and bring the checks per second back within our recommended guidelines, we would need to attach three slaves to the master server, giving a rate of 22 checks per second per slave. Moreover, if we utilize each CPU core of our slave systems to handle a separate worker thread, we can further divide the overall rate (66 checks per second) by the number of cores each slave server possesses. For example, if our slave servers each have 2 x dual-core CPUs (four cores), this reduces the rate to 16.5 checks per second per core, as we show below:

Utilizing CPU cores can further reduce checks per second:

66 (service checks per second) / 4 (number of cores) = 16.5

Achieving Resilience

The Opsview Monitor distributed architecture combines both scalability and resilience; resilience can be further enhanced by 'doubling' the components that comprise your system, as we demonstrate in the following list.

  • Master server (active)
  • Master server (standby)
  • Database server #1: 'opsview' and 'runtime', replica of 'odw'
  • Database server #2: 'odw', replica of 'opsview' and 'runtime'
  • Slave clusters:
    • Slave cluster #1
    • Slave cluster #2
    • Slave cluster #3
    • Slave cluster #4

Note: When sizing a slave cluster, you should allow for the possibility of at least one node failure. If a slave cluster is nearing capacity, the failure of one node may cause the remaining nodes to exceed capacity.
Note: It is not possible to run two Opsview Monitor master servers in active/active configuration, only active/passive. Running a second master node in either High Availability or Disaster Recovery configuration requires an HA or DR subscription from Opsview.

Disk Partitioning

In this section, we detail the sizes of the disk partitions that are needed to operate the Opsview Monitor software and its software dependencies.

Opsview System

In general, one large root partition is sufficient, although we recommend that the root and /var directories each have at least 1GB of disk space available. In the following list, we provide further information about other areas and their recommended sizes.

  • root: We recommend at least 2GB is set aside for the operating system, allowing for any upgrades and so on.
  • /boot: We recommend a separate boot partition of at least 256MB.
  • /usr/local: The majority of the Opsview Monitor software is installed here and, as such, we recommend a minimum size of 10GB. If you are running a large system which collects a lot of performance data, this partition may need additional space.
  • /opt/opsview: Some Opsview Monitor software and runtime data is stored in this location. We recommend a minimum of 1GB.

Temporary Directory

Opsview Monitor uses a temporary directory (/tmp by default) when running opsview-web and other related applications. You can set a system-level environment variable, TMPDIR=/<NEW TMP DIRECTORY>, if you wish to use an alternate area, as shown below.
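
For example, you could export the variable in the nagios user's shell profile before Opsview is started (a sketch; the directory shown is illustrative and must already exist with permissions allowing the nagios user to write to it):

export TMPDIR=/var/opsview-tmp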

Database System

The database can be located either on the master or on a separate server. In both instances, the Opsview Monitor databases and backups are located in the /var directory; we provide our recommended size below.

  • /var: We recommend that this directory has more than 100GB available when used in conjunction with the ODW; however, if ODW is not used, then 50GB is sufficient for small and medium sized systems.

Backups

Opsview has a nightly housekeeping job, invoked from cron, that runs at 03:11 every night. This includes backups, which are stored on the Opsview master server filesystem.

The scope of the backups is to be able to restore a system as soon as possible on the same (or similar) hardware and operating system.

You will need to design your own backup strategies for long term archival of data.

Opsview Nightly Backups

A cron job called rc.opsview cron_daily will be present on the master server for the nagios user. This runs daily housekeeping tasks, including backing up various parts of Opsview.

Note: The backup should reside on a different disk/filesystem to /usr/local/nagios, otherwise there is no redundancy at this level. You will also want the backup transferred to another server, to provide redundancy in case of server failure.

Long term data is not necessarily backed up. See the section below, 'Data not covered by nightly backup' for more details of data that should be backed up by local policy.

There are backup variables in opsview.defaults:

$backup_dir            # directory in which to store the backups. Default '/usr/local/nagios/var/backups'
$backup_retention_days # number of days' worth of backups to keep. Default 30

Note: All files older than $backup_retention_days within $backup_dir are removed, not just opsview backups.
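
For example, to keep backups on a separate filesystem (as recommended above) and shorten retention, you could override these defaults in /usr/local/nagios/etc/opsview.conf (a sketch; the mount point shown is illustrative):

$backup_dir = "/mnt/backup-disk/opsview-backups";
$backup_retention_days = 14;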

Invoking a Backup

You can invoke an ad hoc backup by running the following as the nagios user:

/usr/local/nagios/bin/rc.opsview backup

Data Included by Nightly Backup

Databases

The Opsview configuration database is backed up. It is also backed up after every reload.

The reports database is backed up.

From runtime, only the nagios_objects and nagios_instances tables are backed up. This is required to ensure that the primary keys for the various objects in the database remain consistent with ODW.

Files

This includes:

  • /usr/local/nagios (including Nagvis)
  • /usr/local/opsview-web
  • /usr/local/nagios/var - if this is a symlink, it will be followed

Note: Other than /usr/local/nagios/var, symlinks are not followed - if you have set up any other symlinks, you should ensure those are backed up appropriately.

Files may change while they are being backed up, but these should only contain transient information (such as log files or RRD files).

The backup excludes:

  • /usr/local/nagios/var/backup
  • /usr/local/nagios/var/mrtg
  • /usr/local/nagios/var/ndologs
  • /usr/local/nagios/nmis/database
  • /usr/local/nagios/snmp/all
  • /usr/local/nagios/tmp

Data Not Covered by Nightly Backup

ODW

Due to its size, ODW is not backed up as part of the Opsview nightly backups. It is recommended that you use a 3rd party tool to save the historical data in ODW to a separate system. You can use mysqldump, but beware that you are backing up the same data every time and that the tables are locked during the backup. You may be able to run incremental backups with a different tool.

Script for Backing up Opsview Datawarehouse to File

#!/bin/bash
# Back up the Opsview Data Warehouse (ODW) database to a compressed SQL file
# and remove backups older than the retention period.
BACKUPTARGET="/var/backups"
BACKUP_RETENTION_DAYS=14
DATE=$(date +"%Y-%m-%d-%H%M")
echo "Creating ODW backup in $BACKUPTARGET"
nice -n 19 /usr/local/nagios/bin/db_odw db_backup > "$BACKUPTARGET/opsview-datawarehouse-$DATE.sql"
echo "Compressing backup with gzip"
nice -n 19 gzip -9 "$BACKUPTARGET/opsview-datawarehouse-$DATE.sql"
echo "Removing ODW backups older than $BACKUP_RETENTION_DAYS days in $BACKUPTARGET"
find "$BACKUPTARGET" -type f -name "opsview-datawarehouse-*.gz" -mtime +"$BACKUP_RETENTION_DAYS" -exec rm {} \;
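
To run this nightly, you could schedule the script from the nagios user's crontab (a sketch; the script location /usr/local/nagios/bin/odw_backup.sh and the log path are illustrative):

# Run the ODW backup at 02:30 every night
30 2 * * * /usr/local/nagios/bin/odw_backup.sh >> /tmp/odw_backup.log 2>&1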

runtime

Again, due to size constraints, runtime is not backed up. If you use the data within runtime and require it to be stored for recovery, we recommend that you use a 3rd party tool to back this up too. The configuration tables in runtime are generated by Opsview Monitor and Nagios(R) Core, so they will not need to be restored.

Only the nagios_objects and nagios_instances tables in runtime are saved in the nightly backup, as these are required to ensure primary key consistency with ODW.

Slaves

Opsview Monitor slaves collect monitoring data for Nagios Core and send results back up to the master. Slaves do keep Nagios Core logs in /usr/local/nagios/var/archives; these are stored locally and housekept appropriately. There is no automatic backup run on the slaves, as they are generally considered to be expendable in the event of a major failure.

Restore

This assumes that you are restoring from the backup files generated by the nightly backup mentioned above. This also assumes that you are restoring to the original Opsview Monitor server. If you are interested in migrating Opsview Monitor to a different server, see Migrating to New Hardware instead.

Note: If you have a slave system, when Opsview Monitor starts, it will try to initiate communications with the slave.

su -
/etc/init.d/opsview stop
/etc/init.d/opsview-web stop
cd /
tar --gzip -xvf /{path_to_backups}/nagios-files-{datestamp}.tar.gz
mysql -u root
mysql> drop database opsview;
mysql> drop database runtime;
mysql> drop database reports;
mysql> exit
gunzip -c /{path_to_backups}/opsview-db-{timestamp}.tar.gz | mysql -u root -p{mysqlrootpassword}
gunzip -c /{path_to_backups}/runtime-db-{timestamp}.tar.gz | mysql -u root -p{mysqlrootpassword} runtime
gunzip -c /{path_to_backups}/reports-db-{timestamp}.tar.gz | mysql -u root -p{mysqlrootpassword}
su - nagios
/usr/local/nagios/installer/upgradedb.pl  # Check upgrades are okay. See below if error
rc.opsview gen_config
opsview-web start
exit # back to root

Note: If you get 'No database selected', note that you must specify runtime as the target database when restoring the runtime database backup (runtime-db-{timestamp}.tar.gz).

Note: If you get an error which says:

Upgrading Opsview part of Runtime database
DB at version 2.7.0
DBD::mysql::db do failed: Table 'opsview_contact_services' already exists [for Statement "CREATE TABLE opsview_contact_services (

then it is likely that you are trying to restore over an existing database. You need to remove the old database (taking a different backup first, if possible) before trying to restore from the backup.

Full Opsview Offline Backup

You can do a full offline backup by taking the following steps:

  • Shut down Opsview Monitor on the master
  • Shut down Opsview Monitor on the slaves
  • Backup filesystem /usr/local/nagios on all Opsview Monitor servers
  • Backup filesystem /usr/local/opsview-web on Opsview Monitor master
  • Backup filesystem /opt/opsview on Opsview Monitor master
  • Backup filesystem /var/opt/opsview on Opsview Monitor master
  • Stop mysql on database server
  • Backup mysql data files

This will back up all data regarding Opsview Monitor so you can restore to this point in time.

Exporting Data in CSV Format

Data can be exported from MySQL in CSV format for use with spreadsheet software and other databases.

The following example can be run from the Unix shell. It exports the Opsview Monitor Service Check configuration into a file called /tmp/servicechecks.csv. You may have to change this SQL to match your 'opsview' database name, if it differs from the default. Note that SELECT ... INTO OUTFILE is executed by the MySQL server, so the output file is created on the database server's filesystem.

echo "USE opsview; SELECT * INTO OUTFILE '/tmp/servicechecks.csv' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n' from servicechecks;" | mysql -u root -p

Security

In this section, we highlight several security aspects of the Opsview Monitor system.

Web authentication

Opsview Monitor's web authentication uses an authentication ticket with a shared secret and must be set to a unique value for your system.

Network

Your Opsview Monitor server should be placed in a secure location. If your server is accessible through a public network, we recommend using a firewall to restrict access to various ports; see Ports.

Agents

Opsview Monitor Agents are applications which run on a host to be monitored and which return status or performance metrics when requested. Agents can be contacted by the master or slave system (or clients) using an anonymous cipher to encrypt communication. Opsview Monitor agents only permit strong ciphers, such as ADH-128 and ADH-256, to be accepted.

For additional security, we recommend using firewall rules to restrict which servers can connect to the agent. You can also use the ‘allowed_hosts’ variable in the agent configuration to limit connections to only the monitoring servers.

Opsview agents also support Secure Sockets Layer (SSL) certificates.

Security Wallet

Opsview Monitor has a feature called Security Wallet, which allows you to avoid storing clear text passwords for external systems in the filesystem and database. The user interface will not display any stored passwords and it will not be possible to retrieve any passwords once they have been set.

Note: If you have any passwords stored in the Audit Log or any log files before an upgrade, these will not be altered; however, no passwords will be added in any new audit log entries. The master key file is stored in /usr/local/nagios/etc/sw.key and is randomly generated on installation or upgrade. If this key file is lost, then all passwords will need to be re-entered.

Nagios has been changed so that the Nagios configuration holds a special macro, of the form @SW{TYPE:ID:KEYNAME}, which is expanded just before execution of a plugin. Where possible, plugins have been updated to hide any sensitive arguments in their command line. Net-SNMP, which provides the snmpget and snmpwalk commands, has been confirmed to hide sensitive arguments in version 5.3.2.2 (CentOS 5) and 5.4.3 (Debian 7 'wheezy').

Variables

Opsview Monitor allows specific arguments to be marked as encrypted. When you choose this, a message will appear asking you to confirm the action.
When you save, the default argument value will be encrypted for the Variable object, and all related Host attributes will be encrypted as well.
If you later mark the argument as unencrypted, it will be cleared from this attribute and from all related Host attributes. There is no way in the UI to recover these arguments.

On a new install of Opsview Monitor, we will set the Password argument of the following attributes to be encrypted:

  • MSSQLCREDENTIALS
  • MYSQLCREDENTIALS
  • ORACREDENTIALS
  • VMWAREGUESTCREDENTIALS
  • VMWAREHOSTCREDENTIALS
  • WINCREDENTIALS

On an upgrade, none of the existing Variable configuration will be encrypted. We recommend you manually convert the above Variable args to be encrypted.

Database Connection Passwords

Database connection passwords, specified in /usr/local/nagios/etc/opsview.conf, can optionally be encrypted; this is the default for new installations. To convert existing passwords to encrypted form, manually follow the instructions below for each database:

Opsview DB

The process for the Opsview Monitor database is:

  • Run /usr/local/nagios/bin/db_opsview db_exists. If the credentials are correct, this will return no output with an exit code of 0
  • Note the current database password, $dbpasswd, used in the /usr/local/nagios/etc/opsview.conf file
  • Run /usr/local/nagios/bin/opsview_crypt and enter the password when prompted. This will print the encrypted password to screen for copy-and-paste
  • Edit /usr/local/nagios/etc/opsview.conf and set the new encrypted value, e.g.:
    $dbpasswd_encrypted = "53f0990f9ddfaec4769c6facbb93e66bdefcc80cb6913faa74559eeafb5863da";
    (The encrypted password is based on your randomly-generated key, so the value above will not work on other systems.)
  • Run /usr/local/nagios/bin/db_opsview db_exists to confirm the database connection works with the new password
  • Remove the previous $dbpasswd value. Again, run /usr/local/nagios/bin/db_opsview db_exists to confirm the connection still works
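
A condensed sketch of this sequence from the shell (the encrypted value is illustrative only; yours will differ because it is derived from your own key):

nagios@opsview-master$ /usr/local/nagios/bin/db_opsview db_exists && echo "credentials OK"
credentials OK
nagios@opsview-master$ /usr/local/nagios/bin/opsview_crypt
Enter text to encrypt: ********
Encrypted value:
53f0990f9ddfaec4769c6facbb93e66bdefcc80cb6913faa74559eeafb5863da
# paste the value into opsview.conf as $dbpasswd_encrypted, then re-run db_exists

The same sequence applies to the Runtime, ODW and Dashboard databases below, substituting the corresponding db_* command and configuration variable.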

Runtime DB

The process for the Runtime database is:

  • Run /usr/local/nagios/bin/db_runtime db_exists. If the credentials are correct, this will return no output with an exit code of 0
  • Note the current database password, $runtime_dbpasswd, used in the /usr/local/nagios/etc/opsview.conf file
  • Run /usr/local/nagios/bin/opsview_crypt and enter the password when prompted. This will print the encrypted password to screen for copy-and-paste
  • Edit /usr/local/nagios/etc/opsview.conf and set the new encrypted value, e.g:
    $runtime_dbpasswd_encrypted = "53f0990f9ddfaec4769c6facbb93e66bdefcc80cb6913faa74559eeafb5863da";
  • Run /usr/local/nagios/bin/db_runtime db_exists to confirm the database connection works with the new password
  • Remove the previous $runtime_dbpasswd value. Run /usr/local/nagios/bin/db_runtime db_exists to confirm the connection still works

ODW DB

The process for the ODW database is:

  • Run /usr/local/nagios/bin/db_odw db_exists. If the credentials are correct, this will return no output with an exit code of 0
  • Note the current database password, $odw_dbpasswd, used in the /usr/local/nagios/etc/opsview.conf file
  • Run /usr/local/nagios/bin/opsview_crypt and enter the password when prompted. This will print the encrypted password to screen for copy-and-paste
  • Edit /usr/local/nagios/etc/opsview.conf and set the new encrypted value, e.g:
    $odw_dbpasswd_encrypted = "53f0990f9ddfaec4769c6facbb93e66bdefcc80cb6913faa74559eeafb5863da";
  • Run /usr/local/nagios/bin/db_odw db_exists to confirm the database connection works with the new password
  • Remove the previous $odw_dbpasswd value. Again, run /usr/local/nagios/bin/db_odw db_exists to confirm the connection still works

Dashboard DB

The process for the Dashboard database is:

  • Run /usr/local/nagios/bin/db_dashboard db_exists. If the credentials are correct, this will return no output with an exit code of 0
  • Note the current database password, $dashboard_dbpasswd, used in the /usr/local/nagios/etc/opsview.conf file
  • Run /usr/local/nagios/bin/opsview_crypt and enter the password when prompted. This will print the encrypted password to screen for copy-and-paste
  • Edit /usr/local/nagios/etc/opsview.conf and set the new encrypted value, e.g.:
    $dashboard_dbpasswd_encrypted = "53f0990f9ddfaec4769c6facbb93e66bdefcc80cb6913faa74559eeafb5863da";
  • Run /usr/local/nagios/bin/db_dashboard db_exists to confirm the database connection works with the new password
  • Remove the previous $dashboard_dbpasswd value. Again, run /usr/local/nagios/bin/db_dashboard db_exists to confirm the connection still works

SMSGateway DB

The process for the SMSGateway database is:

  • Run bin/check_smsgateway. This will return SMSGATEWAY OK | queued=0 failed=0 total=0 if connection parameters are correct and there is nothing in the queue.
  • Note the current database password, $dbpasswd, used in the smsqueued.conf file (Note: This is not the same as the opsview dbpasswd)
  • Run opsview_crypt and enter the password when prompted. This will print the encrypted password to screen for copy-and-paste
  • Edit smsqueued.conf and set the new encrypted value, e.g:
    $dbpasswd_encrypted = "53f0990f9ddfaec4769c6facbb93e66bdefcc80cb6913faa74559eeafb5863da";
  • Run bin/check_smsgateway to confirm the database connection works with the new password
  • Remove the previous $dbpasswd value. Again, run bin/check_smsgateway to confirm the connection still works

You will need to do this on each Opsview Monitor system (master or slave) where smsgateway is installed.

ServiceDesk Connector DB

The process for encrypting the password for the Opsview ServiceDesk Connector database is:

  • Make sure you have run grant all on notifications.* to notifications@'%' identified by '<your new password, unencrypted>' in MySQL.
  • Run opsview_crypt and enter the new password when prompted.
  • Edit the YAML config file notifications.yml in etc/opt/opsview/notifications and fill in the encrypted_password string.
  • Send a DB notification to confirm.

Web Authentication

Opsview uses authticket to authenticate to the web application.
On a new installation, a randomly-generated secret will be used and will be encrypted.
On an upgrade, if you still have the old default shared secret (shared-secret-please-change), a new secret will be generated and encrypted; otherwise, no changes will occur.
To convert your shared secret to be encrypted:

  • Ensure your live Apache includes /usr/local/opsview-web/etc/apache-authtkt.conf
  • Run /usr/local/nagios/bin/opsview_crypt and enter the shared secret to encrypt
  • Edit /usr/local/nagios/etc/opsview.conf and set $authtkt_shared_secret_encrypted='encryptedvalue';
  • Run '/usr/local/opsview-web/bin/postinstall' to generate the new Apache configuration file
  • Restart opsview-web and apache

Nagios Results Distributor (NRD)

Opsview uses NRD to distribute results from slaves.
On a new installation, a randomly-generated secret will be used and will be encrypted.
On an upgrade, if you still have the default shared secret, no changes will occur.
To convert your shared secret to be encrypted:

  • Run /usr/local/nagios/bin/opsview_crypt and enter the shared secret to encrypt
  • Edit /usr/local/nagios/etc/opsview.conf and set $nrd_shared_secret_encrypted='encryptedvalue';
  • Reload to send settings to all the slaves

NSCA

Opsview starts an NSCA daemon on the Opsview Monitor master and slaves for integration with any existing NSCA clients you may use.
On a new installation, a randomly-generated secret will be created. However, as the secret needs to be known to external clients that Opsview Monitor does not control, the secret will not be encrypted.
To convert your shared secret to be encrypted:

  • Run /usr/local/nagios/bin/opsview_crypt and enter the shared secret to encrypt
  • Edit /usr/local/nagios/etc/opsview.conf and set
    $nsca_shared_password_encrypted='encryptedvalue';
  • Reload to generate the nsca.cfg file
  • Restart Opsview Monitor to restart the nsca daemon
  • You can now use send_nsca from any NSCA clients
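
With the daemon running, an NSCA client submits a passive service result as four tab-separated fields: host name, service description, return code and plugin output. A sketch (the host name, service name and client-side paths are illustrative and depend on your NSCA client installation):

printf "myhost\tDisk Usage\t0\tOK - disk usage within limits\n" | /usr/local/nagios/bin/send_nsca -H opsview-master -c /usr/local/nagios/etc/send_nsca.cfg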

Key and Password Reset Tool

Opsview provides a reset tool which does the following:

  • Stop Opsview Monitor processes.
  • Create a new key for encryption.
  • Back up the current key files used for encryption into the var/sw-migration/ directory.
  • Back up the databases into the var/sw-migration/ directory.
  • Re-encrypt all passwords stored in the databases into a temporary store.
  • Re-encrypt all configuration files into a temporary location.
  • Replace the old encrypted data.
  • Replace the old configuration files.
  • Generate the Nagios configuration.
  • Start opsview and opsview-web.

If your Opsview Monitor installation is using distributed monitoring (slaves), please stop Opsview Monitor on each slave node:

nagios@slave-node$ rc.opsview stop

The command can be run as follows:

nagios@opsview-master$ securewallet_reset

You will also be asked to restart the Apache HTTPD server manually.
If any Opsview Monitor modules have been installed on slave nodes, you will need to re-run the command on each node:

nagios@slave-node$ securewallet_reset

Rollback process

This process prompts for confirmation before proceeding. There is no automatic revert; if the process fails, you can reinstate the databases from the backups which the tool creates, and restore all of the configuration files from the same directory, var/sw-migration/.
Also move the etc directory back, then run the following commands:

nagios@opsview-master$ rc.opsview stop
nagios@opsview-master$ opsview-web stop

To restore the previous databases, run:

nagios@opsview-master$ /usr/local/nagios/bin/db_opsview db_restore < var/backups/sw-migration/opsview_backup.sql
nagios@opsview-master$ /usr/local/nagios/bin/db_dashboard db_restore < var/backups/sw-migration/dashboard_backup.sql

Once the databases have been restored, run the following:

nagios@opsview-master$ rc.opsview gen_config
nagios@opsview-master$ opsview-web start

The process stops if the backup steps fail, and in that case no changes are made to the system, so the restore process above is not needed after a backup failure. After the process has finished successfully, you should verify that the stored passwords, e.g. for Service Checks, SNMP or Notification Methods, are still working correctly. If there is a problem, follow the process above to restore the previous configuration and databases.

System Setting: Time Zone

You may set the Linux/Unix server to any time zone; however, we recommend you set the time zone to UTC. Opsview Monitor will show the time in the browser based on the browser's time zone.

All data that is stored in files and databases will be time stamped in UTC format for consistency.

If you have changed the time zone of your Linux/Unix server, you will need to restart your system so that all services are aware of the update.
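
For example, on systemd-based distributions you can set the time zone to UTC as follows (older distributions vary; consult your OS documentation):

timedatectl set-timezone UTC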

Users and Groups

The Opsview Monitor installation package will create the system users and groups shown in the table below, if they do not already exist. You can also create them yourself, but you must create the users and groups exactly as shown in the table, and each user must be assigned the group(s) shown.

USER     GROUP
opsview  opsview
nagios   nagios, nagcmd, opsview
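
If you create them manually, the following is a minimal sketch using standard Linux tools (user and group names are taken from the table above; command options may vary by distribution):

groupadd opsview
groupadd nagios
groupadd nagcmd
useradd -g opsview opsview
useradd -g nagios -G nagcmd,opsview nagios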

If you use an external authentication system, you may still need to change the local user and group files; we recommend that these users and groups are set up locally to remove the dependency on your external system. Additionally, the web server user will need to be in the nagcmd group; the name of this user will, of course, depend on your distribution.

The nagios user needs write permissions to its home directory. The Opsview Monitor installation will update the .profile (or .bash_profile) in the nagios user's home directory to source the profile /usr/local/nagios/bin/profile; this is required to set up several Opsview Monitor environment variables. If this is not set up on your system, you can run /usr/local/nagios/installer/set_profile as the nagios user.

Databases

Opsview Monitor uses four databases, as described in the table below:

DATABASE   PURPOSE
opsview    Monitoring configuration and access control
runtime    Status data and short-term history
odw        Long-term retention of data
dashboard  State information about the Dashboard application

Both the opsview and runtime databases must be on the same server, whereas the odw and dashboard databases can be located on separate servers. In fact, some performance improvements can be achieved by locating the odw and dashboard databases on separate servers. For more information see section Databases on a Different Server.

New installations of Opsview Monitor will use randomly-generated passwords to connect to the databases. These passwords are stored encrypted, so it will not be possible to decrypt them in order to connect to the databases yourself.

If you need to connect to the databases to run your own queries, we recommend that you create your own accounts for this purpose, as shown in the example below. You can also restrict the amount of information available to such an account by limiting the tables that can be queried.
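
For example, a restricted read-only reporting account could be created as follows (a sketch; the user name, host and password are illustrative):

CREATE USER 'reporting'@'reportserver' IDENTIFIED BY 'password';
GRANT SELECT ON odw.* TO 'reporting'@'reportserver';
FLUSH PRIVILEGES;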

Database Storage

For database storage, Opsview Monitor uses MariaDB on RHEL 7 and CentOS 7, and MySQL on all other distributions. Opsview Monitor supports the version of MySQL included as part of our supported Operating Systems (OSs); this ranges from MySQL v5.0 and v5.1 through to v5.5, and includes MariaDB v5.5. MySQL v5.6 is not currently supported. Similarly, if you use a remote database server, Opsview Monitor will only support the database server if it is based on one of our supported platforms.

MySQL Strict Mode

Opsview does not support MySQL's strict mode. Ensure that the following is not set in MySQL's my.cnf:

sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
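
You can check the currently active SQL mode from the MySQL client; if STRICT_TRANS_TABLES (or STRICT_ALL_TABLES) appears in the output, remove it from my.cnf and restart mysqld:

mysql> SELECT @@GLOBAL.sql_mode;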

Performance Tuning

It is recommended that you install MySQL prior to installing the Opsview Monitor software since you can then 'tune' your MySQL database server before Opsview Monitor creates its necessary databases.

Note: Make a note of your MySQL root password, as you will be prompted for this during the installation process.

Add these entries to the mysqld section of the my.cnf file:

innodb_file_per_table=1
innodb_flush_log_at_trx_commit=2

These options cause transactional changes to be flushed to disk once per second, as opposed to on a per-transaction basis; see InnoDB Startup Options and System Variables for more information. Finally, on a dedicated MySQL server, you should set innodb_buffer_pool_size as large as possible, leaving approximately 40% of memory free for the operating system; see below:

innodb_buffer_pool_size=1G

MySQLReport is a useful tool to evaluate the performance of MySQL. To tune MySQL, edit these values in the mysqld section of /etc/mysql/my.cnf (location is OS-dependent) and restart mysqld.
Good starting values for a small database server with 2-4GB of memory are:

  • table_cache = 768 (check tables opened/sec in mysqlreport)
  • query_cache_size = 16M (this should not be any higher due to limitations in MySQL)
  • key_buffer = 256M
  • innodb_buffer_pool_size = 1024M
  • innodb_file_per_table = 1
  • innodb_flush_log_at_trx_commit = 2
  • innodb_autoinc_lock_mode=1 # Required for replication with MySQL 5.1 or later
  • max_allowed_packet = 16M
  • binlog_format = 'MIXED' # when using binary logs in replication
  • max_connections = 150
  • tmp_table_size = 64M # To allow each connection to sort tables in memory. Maximum possible is max_connections x tmp_table_size
  • max_heap_table_size = 64M # Set to the same as tmp_table_size

You can see the current values with mysqladmin variables.
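
For example, to confirm a single setting after restarting mysqld:

mysqladmin -u root -p variables | grep innodb_buffer_pool_size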

You may want to consider starting mysqld without name resolution. See http://dev.mysql.com/doc/refman/5.0/en/dns.html for more information.
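
Name resolution is disabled with the standard skip-name-resolve option in the mysqld section of my.cnf; note that, with this enabled, MySQL grants must use IP addresses rather than host names:

[mysqld]
skip-name-resolve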

You can use the opsview_preupgrade_check script to see if there are any variables that need changing. Obviously, values will depend on the resources available on your server, so these act as guidelines only.

Download the script from our svn repository for the latest version. This needs to be copied into /usr/local/nagios/installer to run. You can specify which components to check from your Opsview Monitor system.

Note: the recommendations for MySQL server variables depend on your system and what other services run on it, so you have to exercise judgement when changing your system. Make sure that you do not over-commit resources to MySQL because if it causes the server to go into swap space, this will reduce the performance of MySQL.

Note: the crashed tables check may take a while to run. More information about the innodb parameters is available on the MySQL documentation site.

General Hints for Performance Tuning

  • Check iostat -x 5. This gives I/O statistics per disk; you could have a low overall I/O wait time that is nevertheless caused by a single disk being used 100% of the time.
  • For maximum I/O, stripe the disks so that all disks are utilized.
  • Use separate disks for data files and index files - this improves read and write times.
  • Use a fast disk for the redo logs.
  • We have also seen improvements when innodb_flush_log_at_trx_commit is set to 2, although it is possible to lose up to 1 second of data in the event of a disk failure.

Databases on a Different Server

In this section, we describe how you can set up Opsview Monitor to use databases on a different server rather than exclusively running them on the Opsview Monitor master. As a result of undertaking this process, there will be an outage of the Opsview Monitor server during backup and restore. We recommend that you undertake a routine test to better understand how long the restore process will take prior to committing.

Set up MySQL on a Database Server

The database server only requires MySQL to be installed, along with its dependencies (the opsview-agent may be installed if you are not using SNMP to monitor the server). You should ensure that mysql is listening on the appropriate port and that any other Opsview-specific configuration has been applied.

Stopping Opsview

You will need to stop Opsview Monitor to achieve a consistent snapshot of the database; in the example below, we show how to do this.

Stop Opsview for a consistent snapshot of the database.

/opt/opsview/watchdog/bin/opsview-monit stop all

Backup Your Databases

Here, we assume that you have undertaken a full database export; you may, of course, use any other application to back up your MySQL databases. Be aware that if the new database server is located on a different architecture, that is, 32- or 64-bit, you will need to export your databases logically rather than copying the raw data files, as shown in the example below:

Undertake a full database export.

mysqldump -u root -p --add-drop-database --opt --databases opsview runtime odw dashboard | gzip -c > databases.sql.gz

Restoring Your Databases

On your new server, restore the databases, as shown in the example below. You should also verify that the character set for the new databases is the same as on your previous server, since differences may cause issues when upgrading.

On your new server restore your databases.

gunzip -c databases.sql.gz | mysql -u root -p

Set up Access Controls

Locate the /usr/local/nagios/etc/opsview.conf file on the Opsview Monitor master server and update the file with the contents shown in the example below, replacing localhost with the hostname of your new database server:

Update opsview.conf with this content.

#
# This file overrides variables from opsview.defaults
# This file will not be overwritten on upgrades
#
$dbhost = "localhost";
$dbport = "3306";
$dbpasswd_encrypted = "redacted";
$odw_dbhost = "localhost";
$odw_dbport = "3306";
$odw_dbpasswd_encrypted = "redacted";
$runtime_dbhost = "localhost";
$runtime_dbport = "3306";
$runtime_dbpasswd_encrypted = "redacted";
$dashboard_dbhost = "localhost";
$dashboard_dbport = "3306";
$dashboard_dbpasswd_encrypted = "redacted";
$nsca_shared_password = "ff10ddd51-ww11sds-c8dfff4f-sss241";
$authtkt_shared_secret_encrypted = "redacted";
$nrd_shared_password_encrypted = "redacted";
1;

The values here are encrypted by default on new installations of Opsview Monitor (since 4.6.0, December 2014).

To find the encrypted string for a plaintext password, you can use the following command:

root@ov-author:~# /usr/local/nagios/bin/opsview_crypt
Enter text to encrypt: ********
Encrypted value:
fe4940b982ee95eb881c8269fa1b227c08b84dd71e286c6050682715ea11d818

Simply copy and paste this value within the quotation marks, i.e. $runtime_dbpasswd_encrypted = 'fe4940b982ee95eb881c8269fa1b227c08b84dd..

Note: Both runtime ($runtime_dbhost) and opsview ($dbhost) need to be on the same host since this enables queries (joins) to be made across both databases.

Access Control

You should also set up access controls with your new database server so that the Opsview Monitor master is allowed to connect. So, from the Opsview Monitor master run the command shown below:

Set up access controls to allow the Opsview Monitor master to connect.

/usr/local/nagios/bin/db_mysql -t > opsview_access.sql

The file, opsview_access.sql now contains all the necessary access credentials, which should be transferred to the new database and imported as follows:

mysql -u root -p < opsview_access.sql

For additional security, you may want to restrict access to the Opsview Monitor master only.

Restarting Opsview Monitor

From the Opsview Monitor master server, regenerate the Opsview Monitor configuration and restart opsview-web, as this will update third-party software applications with the new connection information, as shown below:

Restart opsview-web to ensure third-party software has the right information.

/usr/local/nagios/bin/rc.opsview gen_config
/opt/opsview/watchdog/bin/opsview-monit start opsview-web

Common Tasks

Backing up Opsview Monitor Databases and Configuration

  • Edit /usr/local/nagios/etc/opsview.conf to set the correct backup destination
    su - nagios
    /usr/local/nagios/bin/rc.opsview backup
    

Backing up Opsview Monitor Database Only

  • Ensure opsview.conf is correct
    su - nagios
    /usr/local/nagios/bin/db_opsview db_backup | gzip -c > {backup file}
    

The runtime, odw and reports databases may be backed up in the same way.

Restoring From a Database Backup

Identify the required image to restore from (the location is held in the $backup_dir variable within the opsview.conf file if you are using a full backup rather than a database-only backup).

su - nagios
gunzip -c {/path/to/nagios-db-{date}.sql.gz} | /usr/local/nagios/bin/db_opsview db_restore

If you need to upgrade the database schema because you have restored a backup from an earlier release of Opsview, you can run the following:

/usr/local/nagios/installer/upgradedb_opsview.pl

Setting MySQL root Password

We recommend you set a password for the 'root' user:

mysqladmin -u root password {password}

Granting Access to Remote User

To allow remote database connections to the Opsview Data Warehouse:

 grant all privileges on *.* to '<username>'@'<hostname>' identified by '<password>' with grant option;
 flush privileges;

Fixing Damaged Database Tables

If a database table is damaged, you may get error messages like:

Table 'service_saved_state' is marked as crashed and should be repaired

A common cause is running out of space on the /var partition, where MySQL writes its table files.

To check the tables in a database, run the following:

 mysqlcheck -p -u <user> <database>

To repair a table (from the MySQL client - note that you'll need enough free disk space for MySQL to make a new copy of the damaged table as a .TMD file):

 use <database name>;
 REPAIR TABLE <tablename>;

To check all databases, you can run the following as the MySQL root user:

mysqlcheck -A -r -u root -p

Using a Read-Only Database

From Opsview Monitor 4.6.3 onwards, you can use a separate, read-only database for certain REST-related queries. This will, for the most part, increase the responsiveness of the main REST API calls that the Dashboard uses, and reduce the load on the master database.

Here, we assume you have already replicated at least the opsview and runtime databases. You'll need to make sure that the opsview and nagios users exist on the slave MySQL database server, and have SELECT access to the opsview and runtime databases, respectively. You should allow them to connect from the Opsview Monitor master database server.

For example:

CREATE USER 'opsview'@'masterserver' IDENTIFIED BY 'password';
GRANT SELECT ON opsview.* TO 'opsview'@'masterserver';
CREATE USER 'nagios'@'masterserver' IDENTIFIED BY 'password';
GRANT SELECT ON runtime.* TO 'nagios'@'masterserver';

/usr/local/nagios/etc/opsview.defaults has database-related 'ro' variables. Copy these to opsview.conf and change them to point to the slave MySQL server. The 'opsviewro' variables are for the opsview user, and the 'runtimero' variables are for the nagios user.

Any variables that aren't copied into opsview.conf or changed will default to their respective values from the non-read-only database. Be sure to double-quote the variable values (see the other database related values around that area in opsview.defaults for some examples).

You can obtain the encrypted versions of the passwords by running /usr/local/nagios/bin/opsview_crypt.

Restart opsview-web with 'service opsview-web restart'. The Dashboard and other parts of Opsview Monitor should now use the slave MySQL database for most status views.

Troubleshooting

If Opsview Monitor doesn't start back up after a restart, check that you've changed the values of the correct 'ro' variables in opsview.conf.
