The Opsview Monitor Timeseries graphing data engine provides a very flexible service for storing data used by the graphing services in the UI.
In the default configuration, all data is stored on the master server. However, if you experience high IO or load on the master server, then the graphing data engine may be moved onto another server.
The graphing data engine is provided in four packages that are installed by default on the Orchestrator via Opsview Deploy:
- opsview-timeseries: request dispatcher
- opsview-timeseries-enqueuer: request queuing and caching daemon
- opsview-timeseries-lib: libraries shared between the other timeseries packages
- opsview-timeseries-rrd: provides the RRD-based data storage
All of these packages are installed under `/opt/opsview` and the directory names match the package names:
Each package uses the same directory structure and they all log to syslog (usually into log files within `/var/log`, depending on how your system is configured).
All of the timeseries processes are stopped and started using the Opsview Monitor Watchdog. You can check them by running the following as the `
The processes can be stopped, started and restarted individually, if required, e.g.:
All configuration should be done using [Opsview Deploy](🔗); no changes should be made manually to any timeseries configuration file.
### Moving RRD Timeseries to another server
There are a number of steps involved in moving Timeseries to another server.
The first step is to add the correct configuration into the Deploy `opsview_deploy.yml` file, such as
and then run a deploy as `root` to install the packages on the new timeseries server:
At this point you should shut down the performance data component and all of the timeseries daemons on both the existing server _and_ the new server:
On the existing server as `
On the new server as `
You must transfer all the data files from the existing timeseries server to the new timeseries server using rsync (or similar), otherwise all graphing history will be lost. By default, Timeseries RRD uses the `
After transferring the data, run a deploy to reconfigure Opsview:
This will restart all the daemons on the new timeseries server as well as reconfigure the UI. At this point, graphing data should now be provided from the new Timeseries server and reloads should work successfully.
After you have tested the graphs and reloads you can remove the timeseries packages and data from the old timeseries server.
### Data Flow
The Results-Performance component reads the results from MessageQueue and then passes the data on to the timeseries manager daemon on port 1600 on the configured host (localhost by default).
The Timeseries manager process launches and monitors worker processes (four by default) which are responsible for parsing and dispatching incoming requests. Write requests (adding more metrics from Results-Performance) are dispatched to Timeseries Enqueuers (localhost on port 1620 by default), while the queries are dispatched to Timeseries RRD Queries (localhost port 1660 by default).
The Timeseries Enqueuer passes the data to all configured RRD Updater workers simultaneously (localhost with ports 1640-1643 by default).
The timeseries RRD update worker writes out the data into the rrd files. Opsview Monitor stores RRD data in `
The timeseries manager, enqueuer and RRD writer daemons can all be installed on separate hosts. However, to limit network bandwidth usage it is generally better to keep the enqueuer and RRD daemons on the same machine.
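The routing described in this section can be summarised in a small sketch. It is only an illustration of the default ports named above, not the actual dispatcher code; the function and constant names are hypothetical.

```python
# Default endpoints as described in the text (host is localhost by default).
MANAGER = ("localhost", 1600)       # receives data from Results-Performance
ENQUEUER = ("localhost", 1620)      # handles write requests
RRD_QUERIES = ("localhost", 1660)   # handles query requests
RRD_UPDATERS = [("localhost", port) for port in range(1640, 1644)]  # 1640-1643


def dispatch(request_type):
    """Return the endpoint a manager worker would forward a request to.

    Hypothetical helper: "write" and "query" are illustrative labels only.
    """
    if request_type == "write":
        return ENQUEUER
    if request_type == "query":
        return RRD_QUERIES
    raise ValueError(f"unknown request type: {request_type!r}")


def fan_out():
    """Return every updater endpoint the enqueuer passes data to."""
    return list(RRD_UPDATERS)


if __name__ == "__main__":
    print(dispatch("write"))   # enqueuer endpoint
    print(dispatch("query"))   # RRD queries endpoint
    print(fan_out())           # all four updater endpoints
```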
### Data Storage - RRD
When using RRD (Round Robin Database), numerical values are stored in "time buckets" so there is a single value for each of these buckets. These are the default values used by Opsview:
- Expects a 5 minute interval for values
- Will keep 5 minute buckets for the last 50 hours
- Will keep 30 minute buckets for the last 2 weeks
- Will keep 2 hour buckets for 2 months
- Will keep 1 day buckets for 2 years
This means the resolution of data gradually gets "thinned out" over time. When calculating a "bigger bucket" (such as taking six 5 minute buckets and consolidating into a single 30 minute bucket), the average value will be used.
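As a rough illustration of these defaults, the sketch below works out how many buckets each retention level holds and shows the averaging step. It assumes "2 months" means 60 days and "2 years" means 730 days; the real archive sizes may be defined differently, and the helper names are hypothetical.

```python
MINUTE = 60
HOUR = 60 * MINUTE
DAY = 24 * HOUR

# (bucket size in seconds, retention in seconds).
# Assumption: 2 months = 60 days, 2 years = 730 days.
ARCHIVES = [
    (5 * MINUTE, 50 * HOUR),
    (30 * MINUTE, 14 * DAY),
    (2 * HOUR, 60 * DAY),
    (1 * DAY, 730 * DAY),
]


def rows_per_archive(archives=ARCHIVES):
    """Number of buckets kept at each resolution."""
    return [retention // bucket for bucket, retention in archives]


def consolidate(values, factor):
    """Average each complete group of `factor` values into one bigger bucket.

    A trailing incomplete group is dropped in this simplified model.
    """
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, len(values) - factor + 1, factor)
    ]


if __name__ == "__main__":
    print(rows_per_archive())
    # Six 5 minute values become one 30 minute value (their average).
    print(consolidate([10, 10, 10, 10, 10, 16], 6))
```

Under these assumptions every level holds a few hundred buckets, which is why an RRD file stays roughly constant in size regardless of how long data has been collected.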
Note that the "RRD heartbeat" is set to 4200 seconds by default, which means that if no value is received within an hour and 10 minutes, there will be gaps in the data. If a value is received within this time, all the buckets since the previous value (up to an hour and 10 minutes) will be filled with this value.
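The heartbeat rule can be modelled with a short sketch. This is a deliberately simplified model of the behaviour described above, assuming timestamps aligned to the 5 minute step; it is not how RRDtool is implemented, and `fill_buckets` is a hypothetical name.

```python
HEARTBEAT_SECONDS = 4200  # an hour and 10 minutes
STEP_SECONDS = 300        # 5 minute buckets


def fill_buckets(last_update, now, value):
    """Return (bucket_start, value) pairs for the buckets between two updates.

    If the gap is within the heartbeat, every bucket gets the new value.
    If the gap is longer, every bucket is unknown (None), which shows up
    as a gap in the graph.
    """
    if now <= last_update:
        raise ValueError("updates must move forward in time")
    within_heartbeat = (now - last_update) <= HEARTBEAT_SECONDS
    fill = value if within_heartbeat else None
    return [(start, fill) for start in range(last_update, now, STEP_SECONDS)]


if __name__ == "__main__":
    # One hour since the last value: 12 buckets, all filled.
    print(fill_buckets(0, 3600, 7.0))
    # 75 minutes since the last value: 15 buckets, all unknown.
    print(fill_buckets(0, 4500, 7.0))
```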
### Data Storage - InfluxDB
InfluxDB is a timeseries database created by InfluxData. It is part of their set of tools for collecting, storing, visualising and alerting on performance data. We do not provide InfluxDB directly; instead we provide a client component that communicates with InfluxDB to query and store data. The main difference between InfluxDB and RRD is that InfluxDB does not aggregate the data after 15 days and 30 days, so it will require a considerably larger amount of disk space than RRD.
We suggest extending the disk or moving the InfluxDB data to a dedicated disk.
Use of InfluxDB version 1.8.x is supported.
RRD will continue being the default timeseries engine.
InfluxDB has the following differences with RRD:
InfluxDB will store the raw value received, whereas RRD will apply averaging based on the intervals it is defined with. This means RRDs may return non-round numbers for things that should be round (e.g. number of bits transferred or number of users), whereas InfluxDB will return whole numbers when the granularity is small enough (there may still be fractional numbers when querying the average over a whole day). For example, this is a plugin that returns the hour it is run in. For RRD, it has an average value of 9.420 at 10:00:
InfluxDB will show the value of 10 at 10:00.
RRD has a value for all times going back to the last year, even if that is considered NULL. InfluxDB will only return NULL points when it has some data for the range requested.
For counters, RRD stores the last counter value and records the difference based on the step size. InfluxDB stores the actual values of each counter but at query time will return the derivative. If a counter is reset, this would produce a negative difference with the previous value. However, this can be a normal scenario (e.g. a device restart resets its counters), so in these cases we assume the same rate as the previous value. For an initial value that is negative, Opsview will return a NULL point.
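The counter handling described above can be sketched as follows. This is a minimal illustration of the stated rule (reuse the previous rate on a negative difference, return NULL if there is no previous rate), not the connector's actual query logic; `counter_rates` is a hypothetical name and samples are assumed to have strictly increasing timestamps.

```python
def counter_rates(samples):
    """Turn raw counter samples into per-second rates.

    `samples` is a list of (timestamp_seconds, counter_value) pairs sorted
    by time. Returns a list of (timestamp, rate) pairs where rate is None
    for a NULL point.
    """
    rates = []
    previous_rate = None
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        diff = v1 - v0
        if diff < 0:
            # Counter reset: assume the same rate as the previous value.
            # With no previous rate (initial negative value) this is None.
            rate = previous_rate
        else:
            rate = diff / (t1 - t0)
        rates.append((t1, rate))
        previous_rate = rate
    return rates


if __name__ == "__main__":
    # The counter resets between t=300 and t=600 (400 -> 50).
    print(counter_rates([(0, 100), (300, 400), (600, 50), (900, 650)]))
    # A negative first difference yields a NULL point.
    print(counter_rates([(0, 100), (300, 50)]))
```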
### Migration from RRD to InfluxDB
Make sure the system is running the latest packages of Opsview Monitor; see the [Installation/Upgrade instructions](🔗).
#### Pausing performance metrics processing
#### Extracting existing timeseries data from RRDs
#### Installing InfluxDB
#### Installing Opsview Timeseries InfluxDB Connector
Amend the Opsview Deploy configuration in `/opt/opsview/deploy/etc/user_vars.yml` by adding the following lines:
Remove the current timeseries-rrd packages
and then run a deploy as `root` to install the required packages:
#### Restoring previous timeseries data
Get the generated username and password for the timeseries connector:
and substitute `<user>:<passwd>` with the values in the following command:
#### Configure Opsview
and then run a deploy as `
and then run a reload in the UI
Authentication can be enabled on the InfluxDB database to improve security by following the instructions at https://docs.influxdata.com/influxdb/v1.8/administration/authentication_and_authorization.
To enable Opsview to communicate with InfluxDB using the authentication:
Add the following variables to `/opt/opsview/deploy/etc/user_vars.yml`:
Run the following commands as `
Drop the whole database and recreate
Drop all metrics for specific host
Drop all metrics for a specific servicecheck on specific host
NOTE: Single and double quotes are not interchangeable! See the [InfluxDB Documentation](🔗) for more information.
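To illustrate the quoting note: in InfluxQL, double quotes delimit identifiers (measurement names, tag keys) and single quotes delimit string values. The sketch below builds a `DROP SERIES` statement with each kind of quote in the right place. The measurement and tag names used here are hypothetical placeholders, not the names Opsview uses, and the escaping is a simplified assumption; check the InfluxDB documentation before running any destructive statement.

```python
def quote_identifier(name):
    """Wrap an identifier (measurement or tag key) in double quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def quote_string(value):
    """Wrap a string value (such as a tag value) in single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def drop_series_statement(measurement, tags):
    """Build a DROP SERIES statement restricted by tag values.

    `tags` maps tag keys to tag values; at least one tag is required so
    the statement cannot accidentally match a whole measurement.
    """
    if not tags:
        raise ValueError("at least one tag condition is required")
    where = " AND ".join(
        f"{quote_identifier(key)} = {quote_string(value)}"
        for key, value in tags.items()
    )
    return f"DROP SERIES FROM {quote_identifier(measurement)} WHERE {where}"


if __name__ == "__main__":
    # "metrics" and "host" are placeholder names for illustration only.
    print(drop_series_statement("metrics", {"host": "web1"}))
```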