Alerting

Alert Configuration

replication-manager offers multiple ways of alerting on cluster node status changes.

External Script Configuration

An alert script can be triggered when enabled via the following config file parameters:

`monitoring-ignore-errors` (2.1)

Item	Value
Description	List of error or warning codes to be ignored.
Type	String
Default Value	"WARN0067,WARN0066"

`alert-script` (2.0)

Item	Value
Description	Full path to an alerting script.
Type	String
Default Value	""

The following arguments are passed to the script:

[x] Server URL
[x] Server previous state
[x] Server current state
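
As an illustration of what an `alert-script` target might look like, here is a minimal sketch in Python. It assumes the three arguments arrive in the order listed above; the log path and the helper names (`ALERT_LOG`, `format_alert`, `main`) are hypothetical and not part of replication-manager.

```python
#!/usr/bin/env python3
"""Hypothetical alert script: appends one line per state change to a file."""
import sys
from datetime import datetime, timezone

# Assumption: illustrative path; use any file the daemon user can write to.
ALERT_LOG = "/tmp/replication-manager-alerts.log"


def format_alert(server_url, previous_state, current_state):
    """Build a single log line from the three script arguments."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (f"{stamp} server={server_url} "
            f"previous={previous_state} current={current_state}")


def main(argv, log_path=ALERT_LOG):
    # Assumption: argv is [server URL, previous state, current state].
    if len(argv) != 3:
        print("usage: alert.py <server-url> <previous-state> <current-state>",
              file=sys.stderr)
        return 2
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(format_alert(*argv) + "\n")
    return 0


# Only act when the script is invoked with arguments.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

Saved as an executable file, its full path would be the value given to `alert-script`. Keeping the formatting in a separate function makes it easy to swap the file write for a call to a paging or chat system.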

Email Configuration

An email can be sent via postfix using the following parameters:

`mail-from` (0.7)

Item	Value
Description	Alert email sender, a valid postfix user should be used.
Type	String
Example	"user@hostname"

`mail-smtp-addr` (0.7)

Item	Value
Description	Alert email SMTP server in host:[port] format.
Type	String
Example	"localhost:25"

`mail-smtp-user` (2.1)

Item	Value
Description	User name for SMTP authentication.
Type	String
Default Value	""

`mail-smtp-password` (2.1)

Item	Value
Description	Password for SMTP authentication.
Type	String
Default Value	""

`mail-to` (0.7)

Item	Value
Description	Single email address to send the alert to.
Type	String
Example	"[email protected]"

Important note: secure mail servers are not supported.

Alerting from Slack

Slack reporting is common to all clusters.

`alert-slack-url` (2.1)

Item	Value
Description	Slack webhook URL to alert.
Type	String
Default Value	""

`alert-slack-channel` (2.1)

Item	Value
Description	Slack channel to alert.
Type	String
Default Value	"#support"

`alert-slack-user` (2.1)

Item	Value
Description	Slack user for alert.
Type	String
Default Value	""
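
Before putting a webhook URL into `alert-slack-url`, it can help to confirm that the webhook accepts messages. The sketch below builds a Slack incoming-webhook request with Python's standard library. It is not how replication-manager itself sends alerts; the function name `build_slack_request` is hypothetical, and whether Slack honours the `channel` and `username` overrides depends on the type of webhook.

```python
import json
import urllib.request


def build_slack_request(webhook_url, channel, user, text):
    """Prepare a POST request carrying a Slack incoming-webhook payload."""
    payload = {"text": text, "channel": channel}
    if user:
        # Assumption: an empty user (the default) means no override is sent.
        payload["username"] = user
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_message(webhook_url, channel="#support", user=""):
    """Send one test message; returns the HTTP status code."""
    req = build_slack_request(webhook_url, channel, user,
                              "replication-manager alert test")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

Calling `send_test_message` with the real webhook URL should make a message appear in the configured channel if the webhook is valid.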

Alerting from logs

The logs can be sent to syslog services via:

`log-syslog` (0.7)

Item	Value
Description	Duplicate messages to syslog.
Type	Boolean
Default Value	false

Users can search the logs for the tag type=state to trigger custom alerting.
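
A minimal sketch of such a log search in Python follows. It assumes only that state-change lines contain the literal token `type=state`; the rest of the line format is not specified here, so the helper `state_lines` (a hypothetical name) returns matching lines unparsed.

```python
import re

# Match the tag as a whole token, so "type=statement" is not picked up.
STATE_TAG = re.compile(r"\btype=state\b")


def state_lines(lines):
    """Yield log lines carrying the type=state tag, without the newline."""
    for line in lines:
        if STATE_TAG.search(line):
            yield line.rstrip("\n")
```

The function accepts any iterable of lines, so an open log file can be passed directly, for example `state_lines(open(path))`, and each yielded line can be handed to whatever alerting channel is in use.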

We can improve log facilities to send messages to various log analysis systems; contact signal18.io for NRE.

External status monitoring

The API provides some useful endpoints to check status.

The monitoring daemon status can be checked via the URL:
http://replication-manager-host:3000/api/status

 {"alive": "running"}
 {"alive": "starting"}

http://replication-manager-host:3000/api/clusters/{clusterName}/status

{"alive": "running"}
{"alive": "errors"}

http://replication-manager-host:3000/api/clusters/{clusterName}/topology/alerts
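
The status endpoints above can be polled from an external monitor. The following Python sketch assumes the responses have the `{"alive": ...}` shape shown above and treats only `"running"` as healthy; the helper names `status_url`, `parse_alive` and `is_healthy` are hypothetical.

```python
import json
import urllib.request


def status_url(base_url, cluster=None):
    """Return the daemon status URL, or a cluster status URL if named."""
    base = base_url.rstrip("/")
    if cluster is None:
        return f"{base}/api/status"
    return f"{base}/api/clusters/{cluster}/status"


def parse_alive(body):
    """Extract the "alive" field from a status response body."""
    return json.loads(body)["alive"]


def is_healthy(base_url, cluster=None, timeout=5):
    """True only when the endpoint answers and reports "running"."""
    try:
        url = status_url(base_url, cluster)
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_alive(resp.read().decode("utf-8")) == "running"
    except (OSError, ValueError, KeyError):
        # Unreachable host, malformed JSON, or missing field: not healthy.
        return False
```

For example, `is_healthy("http://replication-manager-host:3000", "cluster_haproxy_masterslave")` would return `False` while the cluster reports `errors`.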

Checking status with the command-line client

replication-manager-cli status

running

replication-manager-cli status --cluster=cluster_haproxy_masterslave

errors

replication-manager-cli status --cluster=cluster_haproxy_masterslave --with-errors

{
    "errors": [
        {
            "number": "ERR00021",
            "desc": "All cluster db servers down",
            "from": "TOPO"
        },
        {
            "number": "ERR00010",
            "desc": "Could not find a slave in topology",
            "from": "TOPO"
        },
        {
            "number": "ERR00012",
            "desc": "Could not find a master in topology",
            "from": "TOPO"
        }
    ],
    "warnings": [
        {
            "number": "INF00001",
            "desc": "Server 127.0.0.1:3310 is down",
            "from": "TOPO"
        }
    ]
}
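
The JSON printed by `--with-errors` is straightforward to post-process. As a sketch, assuming the output keeps the `errors`/`warnings` structure shown above, this Python helper (the name `summarize` is hypothetical) flattens it into one line per entry:

```python
import json


def summarize(report_json):
    """Turn the --with-errors JSON into one readable line per entry."""
    report = json.loads(report_json)
    lines = []
    for level in ("errors", "warnings"):
        for entry in report.get(level, []):
            # level[:-1] turns "errors" into "error", "warnings" into "warning".
            lines.append(
                f'{level[:-1]} {entry["number"]} [{entry["from"]}]: {entry["desc"]}'
            )
    return lines
```

Fed the output above, the first line produced would be `error ERR00021 [TOPO]: All cluster db servers down`.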

replication-manager-cli bootstrap --cluster=cluster_haproxy_masterslave

Can't found topology after bootstrap

The cluster is not provisioned; launch it manually or via the replication-manager-tst or replication-manager-pro release.

replication-manager-cli bootstrap --cluster=cluster_haproxy_masterslave --with-provisioning

Provisioning done

replication-manager-cli status --cluster=cluster_haproxy_masterslave

running
