Orchestrator

Description

The Orchestrator creates collection plans and distributes them to the Collectors within Opsview (more precisely, to the Scheduler component on each Collector). The plans contain the configuration and state information that allows the Collectors to schedule and monitor host and service checks, execute event handlers and send notifications.

  • Upon a 'reload', an individual collection plan is built for, and sent to, each configured Collector within a Cluster. If a Cluster has no active Collectors, the 'reload' will fail. These initial plans also include configuration files for other components (e.g. SNMP traps, SNMP interfaces, Notification methods and Netflow). Which files are sent depends on whether the corresponding feature is available within that Cluster.

  • If a Collector restarts, the Orchestrator will re-send the collection plan to that Collector (with updated state information).

  • If a Collector drops out from a Cluster, then the Orchestrator will re-distribute the hosts monitored by that Collector to the remaining Collectors within the Cluster.

  • If a dropped Collector comes back online, the Orchestrator will re-distribute the hosts again, attempting to return each host to its original Collector.
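The redistribution behaviour described above can be illustrated with a minimal Python sketch. It is not the Orchestrator's actual algorithm, which this page does not describe. The function name assign_hosts, the "least-loaded Collector" rule for orphaned hosts and the sample host and Collector names are all assumptions made for illustration.

```python
def assign_hosts(original, active):
    """Assign every host to an active Collector (illustrative only).

    original: dict mapping host name -> the Collector it was originally on
    active:   list of Collectors currently online in the Cluster
    """
    if not active:
        # Mirrors the documented rule that a Cluster needs an active Collector.
        raise RuntimeError("cluster has no active collectors")

    assignment = {}
    load = {collector: 0 for collector in active}
    orphans = []

    # Hosts whose original Collector is online stay where they were.
    for host, collector in sorted(original.items()):
        if collector in load:
            assignment[host] = collector
            load[collector] += 1
        else:
            orphans.append(host)

    # Assumption: orphaned hosts go to whichever Collector has the fewest hosts.
    for host in orphans:
        target = min(active, key=lambda c: (load[c], c))
        assignment[host] = target
        load[target] += 1

    return assignment


original = {"h1": "c1", "h2": "c1", "h3": "c2", "h4": "c3"}
after_drop = assign_hosts(original, ["c1", "c2"])          # c3 has dropped out
after_return = assign_hosts(original, ["c1", "c2", "c3"])  # c3 is back online
```

In this sketch, once every original Collector is active again the result equals the original mapping. That corresponds to the "back on their original Collectors" goal above.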

The Orchestrator also supports an HTTP API to allow for various real-time actions initiated from the UI (e.g. recheck host/service or set state).

Dependencies

The Orchestrator requires access to the Database, MessageQueue, DataStore and LicenseManager. Please make sure these are installed, configured and running before you attempt to run Orchestrator.

You will also need to ensure the mysql client binary is installed.

Installation

This component is deployed on the Master server when installing Opsview Monitor and must not be moved to another server.

Configuration

The user configuration options should be set in "/opt/opsview/orchestrator/etc/orchestrator.yaml". Default values are shown in "/opt/opsview/orchestrator/etc/orchestrator.defaults.yaml", but do not edit that file, since it is overwritten on package update.
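As an illustration of how a user file can override only selected defaults, the following Python sketch merges two nested dictionaries. How the component actually combines the two files is not stated here. The recursive merge and the helper name deep_merge are assumptions, and the dictionaries stand in for the parsed YAML files.

```python
import copy


def deep_merge(defaults, overrides):
    """Return a copy of defaults with values from overrides applied on top."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Stand-in for orchestrator.defaults.yaml (values taken from this page).
defaults = {
    "orchestrator": {
        "service_check_defaults": {
            "state": 0,
            "output": "Service assumed OK - no results received",
        }
    }
}

# Stand-in for orchestrator.yaml: the user overrides a single value.
user = {"orchestrator": {"service_check_defaults": {"state": 3}}}

effective = deep_merge(defaults, user)
```

Under this assumption only the overridden key changes and the remaining defaults stay in effect.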

The list below shows the options that can be set. Note that any changes made to the component configuration file will be overwritten when opsview-deploy is next run.

  • master_database_connection: The connection to the master database server.
  • opsview_database_name: The name of the Opsview configuration database.
  • runtime_database_name: The name of the Opsview runtime database.
  • bsm_queue: The message-queue configuration to send BSM recalculation messages.
  • collector_queue: The message-queue configuration to send collection plans and commands.
  • downtime_queue: The message-queue configuration to send downtime requests.
  • resultsdispatcher_queue: The message-queue configuration to send acknowledgements and set-state messages.
  • flow_request_queue: The message-queue configuration to send flow request messages.
  • flow_response_queue: The message-queue configuration to receive flow response messages.
  • orchestrator_queue: The message-queue configuration to receive command messages.
  • snmp_trap_trace_queue: The message-queue configuration to send SNMP trap messages.
  • command_queue: The message-queue configuration to send command messages.
  • license_manager: The connection to the license manager.
  • orchestrator_store: The data-store configuration for configuration data.
  • notifications_logs_store: The data-store configuration for notification logs.
  • http_server: The endpoint for the orchestrator API.
  • registry: Connection configuration for the Registry.
  • snmp_mib_dirs: A list of directories to search for MIBs.
  • logging: Component logging configuration.
  • service_check_defaults: The initial state of a Service Check when it is first added to the configuration (the state before the first check is performed).

The service_check_defaults configuration default is defined as follows:

orchestrator:
  service_check_defaults:
    state: 0
    output: 'Service assumed OK - no results received'

Allowed states are:

State   Meaning
0       OK
1       WARNING
2       CRITICAL
3       UNKNOWN
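A small Python sketch of how a service_check_defaults value could be checked against the allowed states before it is written to the configuration file. The names ServiceState and validate_service_check_defaults are hypothetical, and the requirement that output be a non-empty string is an assumption rather than a documented rule.

```python
from enum import IntEnum


class ServiceState(IntEnum):
    """The four allowed states listed above."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def validate_service_check_defaults(defaults):
    """Return (state, output) or raise ValueError for an invalid setting."""
    # ServiceState(...) raises ValueError for any value outside 0-3.
    state = ServiceState(defaults["state"])
    output = defaults["output"]
    # Assumption: an empty or non-string output is treated as invalid.
    if not isinstance(output, str) or not output:
        raise ValueError("output must be a non-empty string")
    return state, output


state, output = validate_service_check_defaults(
    {"state": 0, "output": "Service assumed OK - no results received"}
)
```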

Note: Changing the service_check_defaults has the following limitations:

  • This initial state is not shown in the history of this service as we only record from the first received result onwards.
  • If you set a non-OK state, ODW statistics will still assume an initial OK state, so service availability statistics will not be accurate until the first result is received.
  • If a service is deleted and then added back, the initial state will be the last known state of the service.

API

The API operates over HTTP and supports the following commands:

Command                   Purpose
generate                  Generate collection plans (e.g. for reload).
downtime                  Schedule or cancel downtime.
recalculate_bsm_statuses  Recalculate bsm objects.
acknowledgement           Acknowledgements, by name or object-id.
flow_query                Execute a flow query using the flow collector.
process_result            Process a manual result (set status).
get_collectors_for_hosts  Return a map of collectors for hosts.
get_machine_node_data     Return machine node data (from opsview-watchdog).
stats                     Receive general stats.
recheck                   Perform a recheck.
set_actions               Sets actions such as enabling/disabling active checks.
send_snmptrap_trace       Send set trace message to snmptraps collector
get_notification_logs     Returns notification logs from the Datastore back to the UI.
execute_remote_command    Executes a remote command on the specified collector, or will lookup the appropriate collector

Management

Configuration

DPKGs

Watchdog service files are now managed by the package. Removing the package leaves the watchdog service file behind with a .save extension; purging the package removes it. The package-managed configuration files are as follows:

  • /opt/opsview/watchdog/etc/services/opsview-orchestrator.conf

RPMs

Watchdog service files are now managed by the package. Any modifications are preserved during upgrade and removal, with the .rpmnew and .rpmsave extensions respectively. The package-managed configuration files are as follows:

  • /opt/opsview/watchdog/etc/services/opsview-orchestrator.conf

Service Administration

As root, start, stop and restart the service using:

/opt/opsview/watchdog/bin/opsview-monit <start|stop|restart> opsview-orchestrator
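For scripted administration, the command above can be wrapped in a short Python helper. This is a sketch only. The function names build_command and run_action are hypothetical, and it assumes the script runs as root on the server hosting the Orchestrator.

```python
import subprocess

OPSVIEW_MONIT = "/opt/opsview/watchdog/bin/opsview-monit"
ALLOWED_ACTIONS = ("start", "stop", "restart")


def build_command(action, service="opsview-orchestrator"):
    """Build the opsview-monit argument list for one of the documented actions."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return [OPSVIEW_MONIT, action, service]


def run_action(action):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_command(action), check=True)
```

Calling run_action("restart") runs the same command as shown above. Restricting action to the three documented values avoids passing arbitrary arguments to the binary.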