The Opsview Monitor Timeseries graphing data engine, included in Opsview Monitor from version 5.2.0, provides a flexible service for storing the data used by the graphing services in the UI.
In the default configuration, all data is stored on the master server, exactly as in previous versions. However, if you experience high IO or load on the master server, the graphing data engine can be moved to another server.
The graphing data engine is provided in four packages, all installed via your normal OS package manager. They are installed by default on the master server.
- opsview-timeseries - request dispatcher
- opsview-timeseries-enqueuer - request queuing and caching daemon
- opsview-timeseries-lib - shared libraries between the other timeseries packages
- opsview-timeseries-rrd - provides the RRD based data storage
All of these packages install under /opt/opsview, in directories named after the packages (for example, /opt/opsview/timeseries and /opt/opsview/timeseriesrrd).
Each package uses the same directory structure and they all log to syslog (usually into log files within /var/log, depending on how your system is configured).
All of the timeseries processes are stopped and started using the Opsview Monitor Watchdog. You can check them by running the following as the nagios user:
$ opsview_watchdog summary
+----------------------------------------+------------+-------------------+
| Service                                | Status     | Monitoring Status |
+----------------------------------------+------------+-------------------+
.... cut ....
| Process 'opsview-timeseriesrrdupdates' | Running    | Monitored         |
| Process 'opsview-timeseriesrrdqueries' | Running    | Monitored         |
| Process 'opsview-timeseriesenqueuer'   | Running    | Monitored         |
| Process 'opsview-timeseries'           | Running    | Monitored         |
The processes can be stopped, started and restarted individually, if required, e.g.:
$ opsview_watchdog opsview-timeseries restart
All the daemon packages (i.e. all packages except timeseries-lib) provide two configuration files within their etc directory.
- the <package name>.defaults.yaml file contains all default settings for the package. Do not modify this file: any changes to it will be lost when the package is upgraded.
- the <package name>.yaml.example file can be copied to <package name>.yaml and amended for local configuration changes - this file will not be overwritten on an upgrade.
If you need to change any of the default settings, copy the specific lines into the locally copied <package name>.yaml file.
The formatting of these files is very specific - spaces should be used to indent lines, not tabs. When changes have been made, restart the relevant daemon using opsview_watchdog as outlined in the Processes section above.
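Since tab indentation is easy to introduce by accident, a quick check before restarting a daemon may help. The following is a minimal sketch in Python, not a tool shipped with Opsview Monitor; the function name find_tab_indented_lines is hypothetical.

```python
def find_tab_indented_lines(text):
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # The indentation is everything before the first non-blank character.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad.append(number)
    return bad


# Example: the second line is indented with a tab instead of spaces.
sample = "timeseries:\n\thost: 0.0.0.0\n"
print(find_tab_indented_lines(sample))
```

To check a real file, read its contents (for example, your local <package name>.yaml) and pass the text to the function; an empty list means no tab-indented lines were found.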
There are a number of steps involved in moving Timeseries to another server.
The first step will be to manually install the 4 packages (and their prerequisites) on the new server as per the Installation section above (you cannot use the autoinstall method for this at this time). You should not remove any of the timeseries packages from the master.
If you already have graphing data on your master server, you must transfer all the files to the new timeseries server using rsync (or similar), otherwise all graphing history will be lost. By default, the master server uses the directory /usr/local/nagios/var/rrd but on a newly installed and separate Timeseries server this will be /opt/opsview/timeseriesrrd/var/data.
On the new Timeseries server you must amend /opt/opsview/timeseries/etc/timeseries.yaml to set the correct listening address (as by default Timeseries only listens on the loopback interface). To do this, amend the file as follows:
timeseries:
  host: 0.0.0.0
and restart the timeseries daemons - you can do this as root by running:
/opt/opsview/watchdog/bin/opsview-monit all restart
Then, you must amend /usr/local/nagios/etc/opsview.conf on the Opsview Monitor server to set the following variable to point to the correct server IP address and port:
$timeseries_url = 'http://192.168.10.22:1600';
and restart all Opsview Monitor daemons - you can do this as root by running:
/opt/opsview/watchdog/bin/opsview-monit all restart
Finally, you can shut down all of the Timeseries daemons on the master server:
/opt/opsview/watchdog/bin/opsview-monit opsview-timeseriesrrdupdates unmonitor
/opt/opsview/watchdog/bin/opsview-monit opsview-timeseriesrrdqueries unmonitor
/opt/opsview/watchdog/bin/opsview-monit opsview-timeseriesenqueuer unmonitor
/opt/opsview/watchdog/bin/opsview-monit opsview-timeseries unmonitor
Graphing data should now be provided from the new Timeseries server.
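If graphs do not appear, one thing worth ruling out is basic network reachability of the Timeseries port from the master server. The sketch below only tests whether a TCP connection can be opened to the host and port in the configured URL; it does not verify that the Timeseries service answers requests correctly. The function name timeseries_reachable is hypothetical, and the fallback to port 80 when the URL has no port is an assumption made for illustration.

```python
import socket
from urllib.parse import urlparse


def timeseries_reachable(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname
    port = parsed.port or 80  # assumption: fall back to the HTTP default
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, calling timeseries_reachable('http://192.168.10.22:1600') from the master server returns False if the new server is still listening only on the loopback interface or if a firewall blocks the port.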
The daemon process 'import_perfdatarrd' reads files from /usr/local/nagios/var/perfdatarrd and then passes the data on to the timeseries manager daemon on port 1600 on the configured host (localhost by default).
The Timeseries manager process launches and monitors worker processes (four by default) which are responsible for parsing and dispatching incoming requests. Write requests (adding more metrics from import_perfdatarrd) are dispatched to Timeseries Enqueuers (localhost on port 1620 by default), while the queries are dispatched to Timeseries RRD Queries (localhost port 1660 by default).
Timeseries Enqueuer passes the data to all configured RRD Updater workers simultaneously (localhost with ports 1640-1643 by default).
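The data flow described above can be summarised as a small routing table. This is an illustrative sketch only, not Opsview's implementation; the names DEFAULT_PORTS, route_request and fan_out_targets are hypothetical, while the port numbers are the defaults stated in the text.

```python
# Default ports as described in the text; all on localhost by default.
DEFAULT_PORTS = {
    "manager": 1600,
    "enqueuer": 1620,
    "rrd_updaters": [1640, 1641, 1642, 1643],
    "rrd_queries": 1660,
}


def route_request(kind):
    """Return the (component, port) a manager worker dispatches a request to."""
    if kind == "write":
        return ("enqueuer", DEFAULT_PORTS["enqueuer"])
    if kind == "query":
        return ("rrd_queries", DEFAULT_PORTS["rrd_queries"])
    raise ValueError("unknown request kind: %r" % kind)


def fan_out_targets():
    """Ports the enqueuer passes write data to (every configured updater)."""
    return list(DEFAULT_PORTS["rrd_updaters"])


print(route_request("write"), route_request("query"), fan_out_targets())
```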
The timeseries RRD update worker writes out the data into the rrd files. On an upgraded system (one that previously ran an older version of Opsview Monitor) the rrd files are stored in /usr/local/nagios/var/rrd/<hostname>/<servicename>/<metric>/value.rrd, whereas on a new Opsview Monitor 5.2 (or later) installation, the data is stored in /opt/opsview/timeseriesrrd/var/data/<hostname>/<servicename>/<metric>/value.rrd.
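To locate the file for a particular metric, the two layouts above can be expressed as a small helper. This is a sketch under the assumption that host, service and metric names are used verbatim as directory names; a real installation may encode special characters differently. The function name rrd_path and the example names are hypothetical.

```python
import posixpath

# Storage roots as given in the text.
UPGRADED_ROOT = "/usr/local/nagios/var/rrd"
NEW_INSTALL_ROOT = "/opt/opsview/timeseriesrrd/var/data"


def rrd_path(hostname, servicename, metric, upgraded=False):
    """Build <root>/<hostname>/<servicename>/<metric>/value.rrd."""
    root = UPGRADED_ROOT if upgraded else NEW_INSTALL_ROOT
    return posixpath.join(root, hostname, servicename, metric, "value.rrd")


# Hypothetical host, service and metric names, for illustration only.
print(rrd_path("web01", "HTTP", "time"))
print(rrd_path("web01", "HTTP", "time", upgraded=True))
```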
The timeseries manager, enqueuer and RRD writer daemons can all be installed on separate hosts. However, to limit network bandwidth usage it is generally better to keep the enqueuer and RRD daemons on the same machine.
When using RRD (Round Robin Database), numerical values are stored in "time buckets" so there is a single value for each of these buckets. These are the default values used by Opsview:
- Expects a 5 minute interval for values
- Will keep 5 minute buckets for the last 50 hours
- Will keep 30 minute buckets for the last 2 weeks
- Will keep 2 hour buckets for 2 months
- Will keep 1 day buckets for 2 years
This means the resolution of data gradually gets "thinned out" over time. When calculating a "bigger bucket" (such as taking six 5 minute buckets and consolidating into a single 30 minute bucket), the average value will be used.
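The consolidation arithmetic can be illustrated with a short sketch. It is a simplification for illustration: it ignores how RRDtool treats unknown values and partially filled buckets, and the function names steps_per_bucket and consolidate_average are hypothetical.

```python
STEP_SECONDS = 5 * 60  # expected interval between values


def steps_per_bucket(bucket_seconds, step_seconds=STEP_SECONDS):
    """Number of base-interval values that make up one larger bucket."""
    if bucket_seconds % step_seconds != 0:
        raise ValueError("bucket size must be a multiple of the step")
    return bucket_seconds // step_seconds


def consolidate_average(values, factor):
    """Average each complete group of `factor` consecutive values."""
    usable = len(values) - len(values) % factor
    return [sum(values[i:i + factor]) / factor for i in range(0, usable, factor)]


print(steps_per_bucket(30 * 60))       # 30 minute bucket
print(steps_per_bucket(2 * 60 * 60))   # 2 hour bucket
print(steps_per_bucket(24 * 60 * 60))  # 1 day bucket
print(consolidate_average([1, 2, 3, 4, 5, 6, 10, 10, 10, 10, 10, 10], 6))
```

With the defaults above, a 30 minute bucket is the average of 6 values, a 2 hour bucket of 24, and a 1 day bucket of 288, so short spikes become progressively less visible in older data.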
Note that the "RRD heartbeat" is set to 4200 seconds by default, which means that if no value is received for an hour and 10 minutes, there will be gaps in the data. If a value is received within this time, all the buckets in the preceding hour and 10 minutes are filled with that value.
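The heartbeat rule can be sketched as a comparison between the time elapsed since the previous value and the heartbeat. This is an illustration of the behaviour described above, not RRDtool's own code; the function name classify_interval is hypothetical, and treating an interval exactly equal to the heartbeat as acceptable is an assumption.

```python
HEARTBEAT_SECONDS = 4200  # default heartbeat from the text (70 minutes)
STEP_SECONDS = 300        # 5 minute expected interval


def classify_interval(previous_update, current_update,
                      heartbeat=HEARTBEAT_SECONDS):
    """Return 'filled' if two updates are within the heartbeat, else 'gap'."""
    elapsed = current_update - previous_update
    if elapsed <= 0:
        raise ValueError("updates must be in increasing time order")
    return "filled" if elapsed <= heartbeat else "gap"


# The heartbeat spans this many 5 minute buckets.
print(HEARTBEAT_SECONDS // STEP_SECONDS)
print(classify_interval(0, 3600))  # value arrives after one hour
print(classify_interval(0, 5400))  # value arrives after 90 minutes
```

With the defaults, the heartbeat covers 14 five-minute buckets, so a service checked once an hour still produces continuous graphs, while one checked every 90 minutes shows gaps.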