This repo contains the scripts used by Carpe Noctem Tactical Operations to monitor the status of its online services.
Every script is compliant to the Nagios Plugins standards in order to be attached to a Nagios based monitoring system.
- Python 3.5
It is recommended to install the package under a virtual environment (for example virtualenv).
Installation is possible via pip: clone or download the repository, open a terminal window and change location to the project main folder then run pip install .
The cnto-check-runner script executes a program which must abide to Nagios Plugins guidelines (see
http://nagios-plugins.org/doc/guidelines.html) and updates a single component on a configured
instance of Cachet to reflect the status detected by the script. If the status detected is WARNING
or CRITICAL the runner will execute the monitoring program again for a maximum number of times
given as argument and waiting a given amount of time between each attempt. Cachet will be updated
only if the status remains either WARNING or CRITICAL after all the retries. A final status of
CRITICAL will trigger a status of Major Outage on Cachet. A final status of WARNING will trigger
a status of Partial Outage on Cachet. As soon as OK status is detected (within the maximum number
of retries) a Operational status is forwarded to Cachet. Having custom mappings between Nagios exit codes
and Cachet statuses will be possible in the future. If the detected status is UNKNOWN Cachet will not be
updated.
This script is itself compatible with Nagios Plugins standards and will return a OK status if
Cachet has been updated (no information on the update is provided though), a CRITICAL status if
the executed program returned an incompatible exit code and a UNKNOWN status if the service status
is uncertain.
Note: if this script is used as Nagios Plugin then the debug option must be disabled.
The following scripts are provided to monitor CNTO's services and can be used with cnto-check-runner:
cnto-http-monitor: checks a resource availability over HTTP/HTTPS. By default the resource is considered to be in aOKstatus if the HTTP status code is 200 and inCRITICALstatus otherwise, different behaviours may be specified using optional arguments.cnto-ts3-monitor: checks a TeamSpeak 3 Server availability. If the server responds to a connection requests on the client port aOKstatus is triggered, if the response exceeds the timeout aCRITICALstatus is triggered, if the provided hostname/address is invalid aUNKNOWNstatus is triggered.cnto-arma3-monitor: checks an Arma3 Server availability usingA2Sserver queries. If the server responds aOKstatus is triggered, if the response exceeds the timeout aCRITICALstatus is triggered, if either one of the provided hostname/address and port is invalid aUNKNOWNstatus is triggered.