Opsview Infrastructure Agent (Beta) Configuration

How to configure the Opsview Infrastructure Agent

Required configuration

Most configuration is provided by default in the agent.default.yml file; however, the following required configuration options must be set in a custom configuration file prior to running the agent.

Value:
    commands:
      command:
        path: /path/to/plugin
Type: string
Default: Default plugins
Description: Location of the plugin or executable to run when the command name is requested via check_nrpe.
Example configuration:
    commands:
      nsc_checkcpu:
        path: /path/to/plugin

Value:
    server:
      allowed_hosts:
Type: list of strings
Default: -
Description: The list of addresses of the clients that are allowed to use the Agent. An empty list allows any client to connect. Note: if check_client_cert is enabled, only hostnames are validated (not IP addresses).
Example configuration:
    server:
      allowed_hosts:
        - hostname.com
        - 192.1.2.3

Value:
    server:
      tls:
        ca_cert:
Type: string
Default: -
Description: Path to the CA certificate.
Example configuration:
    server:
      tls:
        ca_cert: /path/to/ca_cert
        ca_path: /path/to/ca_path
        cert_file: /path/to/server_cert
        check_client_cert: true
        key_file: /path/to/server_key
      tls_enabled: true

Value:
    server:
      tls:
        ca_path:
Type: string
Default: -
Description: Path to the CA directory.
Example configuration: See the ca_cert example above.

Value:
    server:
      tls:
        cert_file:
Type: string
Default: -
Description: Path to the server certificate.
Example configuration: See the ca_cert example above.

Value:
    server:
      tls:
        check_client_cert:
Type: boolean
Default: true
Description: Whether a client certificate is required. If it is, the client must also be specified in allowed_hosts.
Example configuration: See the ca_cert example above.

Value:
    server:
      tls:
        key_file:
Type: string
Default: -
Description: Path to the server key.
Example configuration: See the ca_cert example above.

Value:
    server:
      tls_enabled:
Type: boolean
Default: true
Description: Whether the Agent uses TLS for communications. If tls_enabled is set to false, the TLS options above are not required. Setting tls_enabled to false is not recommended, as communications will not be secure.
Example configuration: See the ca_cert example above.

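As an illustration of how the required options above fit together, a minimal custom configuration file might look like the following sketch. The command name check_example, the hostname, and every path are placeholders chosen for this example, not values shipped with the agent.

```yaml
# Minimal custom configuration sketch. All names and paths are placeholders.
commands:
  check_example:                  # hypothetical command name requested via check_nrpe
    path: /path/to/plugin
server:
  allowed_hosts:
    - monitoring.example.com      # hypothetical client hostname
  tls:
    ca_cert: /path/to/ca_cert
    ca_path: /path/to/ca_path
    cert_file: /path/to/server_cert
    check_client_cert: true
    key_file: /path/to/server_key
  tls_enabled: true
```
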
Basic configuration

The following basic configuration options can be overridden in a custom config file.

Value:
    commands:
      command:
        cache_manager:
Type: boolean
Default: false
Description: Whether the command uses the Cache Manager to save temporary state information.
Example configuration:
    commands:
      nsc_checkcpu:
        cache_manager: true
        long_running_key: $PATH$
        path: /path/to/plugin

Value:
    commands:
      command:
        long_running_key:
Type: $PATH$ | $NAME$ | custom-key
Default: $PATH$
Description: To reduce resource spikes caused by plugin startups, the specified plugin can be kept running as a long-running process that is communicated with via STDIN/STDOUT. The key identifies an instance of the long-running process.
Values:
    $PATH$ - shortcut using the path as the key (minus any arguments)
    $NAME$ - shortcut using the command name as the key
    custom-key - process key used directly
Example configuration:
    commands:
      nsc_checkcpu:
        cache_manager: true
        long_running_key: $PATH$
        path: /path/to/plugin

Value:
    execution:
      execution_timeout:
Type: integer
Default: 60
Description: The maximum time, in seconds, that a service check is allowed to run. If this time is reached, the service check is terminated and an error response is sent to the client.
Example configuration:
    execution:
      execution_timeout: 60

Value:
    logging:
      handlers:
        syslog:
          facility:
Type: string
Default: local6
Description: The syslog facility that accepts the log messages (applies to the Linux Agent only).
Example configuration:
    logging:
      handlers:
        file:
          filename: C:/Program Files/Infrastructure Agent/logs/agent.log
      loggers:
        agent:
          level: INFO
        cache:
          level: INFO
        main:
          level: INFO
        nrpe:
          level: INFO
Note: DEBUG logging produces very verbose logs.

Value:
    logging:
      handlers:
        file:
          filename:
Type: string
Default: -
Description: Path to the log file.
Example configuration: See the logging example above.

Value:
    logging:
      loggers:
        agent:
          level:
Type: ERROR | WARNING | INFO | DEBUG
Default: INFO
Description: Logging level for the main Agent process.
Example configuration: See the logging example above.

Value:
    logging:
      loggers:
        cache:
          level:
Type: ERROR | WARNING | INFO | DEBUG
Default: INFO
Description: Logging level for the Cache Manager.
Example configuration: See the logging example above.

Value:
    logging:
      loggers:
        default:
          level:
Type: ERROR | WARNING | INFO | DEBUG
Default: INFO
Description: Logging level for the Agent launcher.
Example configuration: See the logging example above.

Value:
    logging:
      loggers:
        helpers:
          level:
Type: ERROR | WARNING | INFO | DEBUG
Default: INFO
Description: Logging level for the helper functions.
Example configuration: See the logging example above.

Value:
    logging:
      loggers:
        nrpe:
          level:
Type: ERROR | WARNING | INFO | DEBUG
Default: INFO
Description: Logging level for the NRPE server.
Example configuration: See the logging example above.

Value:
    server:
      bind_address:
Type: string
Default: 0.0.0.0
Description: The address on which the NRPE server listens.
Example configuration:
    server:
      bind_address: 0.0.0.0
      port: 5666

Value:
    server:
      port:
Type: integer
Default: 5666
Description: The port on which the Agent listens for NRPE requests.
Example configuration: See the bind_address example above.

Value:
    windows_runtimes:
      runtime: /path/to/runtime
Type: string
Default: -
Description: Path to the plugin runtime.
Example configuration:
    windows_runtimes:
      python: C:/Path/To/Python.exe

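The three forms of long_running_key described above can be compared side by side. This is only a sketch: the command names, plugin paths, and the custom key are hypothetical, and the comments restate the key each form should resolve to according to the descriptions in the table.

```yaml
# Sketch of the three long_running_key forms. All names and paths are hypothetical.
commands:
  check_a:
    long_running_key: $PATH$          # key is the path minus arguments: /path/to/plugin_a
    path: /path/to/plugin_a $ARG1$
  check_b:
    long_running_key: $NAME$          # key is the command name: check_b
    path: /path/to/plugin_b
  check_c:
    long_running_key: my-custom-key   # key is used directly as written
    path: /path/to/plugin_c
```
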
Advanced configuration

The following configuration options are for advanced users and can be overridden in a custom config file.

Value:
    cachemanager:
      host:
Type: string
Default: 127.0.0.1
Description: IP address that the Cache Manager listens on, which defaults to localhost.
Example configuration:
    cachemanager:
      host: 127.0.0.1
      housekeeping_interval: 60
      max_cache_size: 1GB
      max_item_size: 0
      port: 8184
      timestamp_error_margin: 30

Value:
    cachemanager:
      housekeeping_interval:
Type: integer
Default: 60
Description: How often, in seconds, expired Cache Manager items are purged.
Example configuration: See the cachemanager example above.

Value:
    cachemanager:
      max_cache_size:
Type: string
Default: 1GB
Description: Total maximum size of the Cache Manager cache, for example 500KB, 2MB, or 1GB.
Example configuration: See the cachemanager example above.

Value:
    cachemanager:
      max_item_size:
Type: integer
Default: 0
Description: Largest size for any single item in the cache. 0 means there is no limit (subject to max_cache_size).
Example configuration: See the cachemanager example above.

Value:
    cachemanager:
      port:
Type: integer
Default: 8184
Description: Port number the Cache Manager listens on.
Example configuration: See the cachemanager example above.

Value:
    cachemanager:
      timestamp_error_margin:
Type: integer
Default: not stated (the example configurations use 30)
Description: Timestamps are required to prevent replay attacks. It is recommended that you make allowances for some delays in the system, and timestamp_error_margin is one way to account for them.
Example configuration: See the cachemanager example above.

Value:
    poller_schedule:
      command name:
Type: integer
Default: -
Description: List of commands that use the poller, each with its sampling interval in seconds. The plugin name is used to look up the configured plugins to see whether any default arguments need to be applied.
Example configuration:
    poller_schedule:
      nsc_checkcpu: 10

Value:
    process_recycle_time:
Type: integer
Default: 86400
Description: The time after which the long-running processes are automatically recycled (restarted) to avoid potential memory leaks.
Example configuration:
    process_recycle_time: 86400

Value:
    server:
      housekeeping_interval:
Type: integer
Default: 300
Description: How often the cache of looked-up addresses is flushed (non-TLS).
Example configuration:
    server:
      housekeeping_interval: 300
      max_active_connections: 15
      max_queued_connections: 30
      max_request_time: 120
      receive_data_timeout: 5
      tls:
        ca_cert: null
        ca_path: null
        cert_file: null
        check_client_cert: true
        cipher_suite: ECDH+AESGCM:ECDH+AES256:ECDH+AES128:!aNULL:!MD5:!DSS
        context_options:
        - NO_SSLv3
        - NO_TLSv1
        - NO_TLSv1_1
        key_file: null
      tls_enabled: true
      tls_handshake_timeout: 3

Value:
    server:
      max_queued_connections:
Type: integer
Default: 30
Description: Maximum number of connections that may be queued waiting to be accepted by the NRPE server.
Example configuration: See the server example above.

Value:
    server:
      max_active_connections:
Type: integer
Default: 15
Description: Maximum number of connections that may be handled concurrently by the NRPE server. This is effectively the number of commands that can be run in parallel.
Example configuration: See the server example above.

Value:
    server:
      max_request_time:
Type: integer
Default: 120
Description: Maximum time the NRPE server waits to acquire the lock that allows it to process a request. Up to max_active_connections requests may hold the lock at any one time. If the server reaches max_request_time while waiting for the lock, it should start terminating requests that are in progress, as they have either overrun or become stuck. However, this functionality has yet to be implemented. This should be set to something similar to execution_timeout; there is no advantage in setting it much longer than that.
Example configuration: See the server example above.

Value:
    server:
      receive_data_timeout:
Type: integer
Default: 5
Description: Maximum time, in seconds, that the NRPE server waits for data to arrive after the connection has been established. This is needed to mitigate a class of DoS attacks in which the client establishes a TLS connection and keeps it open but sends no data.
Example configuration: See the server example above.

Value:
    server:
      tls_handshake_timeout:
Type: integer
Default: 3
Description: TLS handshake timeout in seconds.
Example configuration: See the server example above.

Value:
    server:
      tls:
        cipher_suite:
Type: string
Default: ECDH+AESGCM:ECDH+AES256:ECDH+AES128:!aNULL:!MD5:!DSS
Description: The cipher suite used to secure a network connection.
Example configuration: See the server example above.

Value:
    server:
      tls:
        context_options:
Type: list of strings
Default:
    NO_SSLv3
    NO_TLSv1
    NO_TLSv1_1
Description: Advanced TLS contextual options.
Example configuration: See the server example above.

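The guidance on max_request_time above suggests keeping it close to the execution timeout. One way to express that in a custom config file is sketched below; the values are illustrative only and are not recommendations from the agent's documentation.

```yaml
# Sketch: keep max_request_time similar to the execution timeout.
# The values here are illustrative, not recommended settings.
execution:
  execution_timeout: 60
server:
  max_request_time: 60    # similar to execution_timeout; much longer gives no advantage
```
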
Example configurations

Linux

---
# Example Linux configuration file

cachemanager:
  host: 127.0.0.1
  housekeeping_interval: 60
  max_cache_size: 1GB
  max_item_size: 0
  port: 8184
  timestamp_error_margin: 30
commands:
  check_cpu_stats:
    cache_manager: false
    path: /opt/itrs/infrastructure-agent/plugins/check_cpu_stats $ARG1$
  check_memory:
    cache_manager: false
    path: /opt/itrs/infrastructure-agent/plugins/check_memory $ARG1$
execution:
  execution_timeout: 60
logging:
  handlers:
    syslog:
      facility: local6
  loggers:
    agent:
      level: INFO
    cache:
      level: INFO
    main:
      level: INFO
    nrpe:
      level: INFO
poller_schedule: {}
process_recycle_time: 86400
server:
  allowed_hosts: null
  bind_address: 0.0.0.0
  housekeeping_interval: 300
  max_active_connections: 15
  max_queued_connections: 30
  max_request_time: 120
  port: 5666
  receive_data_timeout: 5
  tls:
    ca_cert: /path/to/ca_cert
    ca_path: /path/to/ca_directory
    cert_file: /path/to/server_cert
    check_client_cert: true
    cipher_suite: ECDH+AESGCM:ECDH+AES256:ECDH+AES128:!aNULL:!MD5:!DSS
    context_options:
    - NO_SSLv3
    - NO_TLSv1
    - NO_TLSv1_1
    key_file: /path/to/server_key
  tls_enabled: true
  tls_handshake_timeout: 3

Windows

---
# Example Windows configuration file

cachemanager:
  host: 127.0.0.1
  housekeeping_interval: 60
  max_cache_size: 1GB
  max_item_size: 0
  port: 8184
  timestamp_error_margin: 30
commands:
  nsc_checkcpu:
    long_running_key: $PATH$
    path: C:/Program\ Files/Infrastructure\ Agent/plugins/check_windows.exe check_cpu_load $ARG1$
  nsc_checkdrivesize:
    long_running_key: $PATH$
    path: C:/Program\ Files/Infrastructure\ Agent/plugins/check_windows.exe check_drivesize $ARG1$
execution:
  execution_timeout: 60
logging:
  handlers:
    file:
      filename: C:/Program\ Files/Infrastructure\ Agent/logs/agent.log
  loggers:
    agent:
      level: INFO
    cache:
      level: INFO
    main:
      level: INFO
    nrpe:
      level: INFO
poller_schedule:
  nsc_checkcpu: 10
process_recycle_time: 86400
server:
  allowed_hosts: 
    - myallowedhost.com
    - 10.1.2.3
  bind_address: 0.0.0.0
  housekeeping_interval: 300
  max_active_connections: 15
  max_queued_connections: 30
  max_request_time: 120
  port: 5666
  receive_data_timeout: 5
  tls:
    ca_cert: C:/path/to/ca_cert
    ca_path: C:/path/to/ca_directory
    cert_file: C:/path/to/server_cert
    check_client_cert: true
    cipher_suite: ECDH+AESGCM:ECDH+AES256:ECDH+AES128:!aNULL:!MD5:!DSS
    context_options:
    - NO_SSLv3
    - NO_TLSv1
    - NO_TLSv1_1
    key_file: C:/path/to/server_key
  tls_enabled: true
  tls_handshake_timeout: 3
windows_runtimes: {}