Requirements
A Linux server (Ubuntu/Debian) with a running Polkadot node
Open ports: 9100 (node_exporter), 9090 (Prometheus), 9093 (Alertmanager), 9615 (Polkadot metrics endpoint)
A Telegram bot token from
Automatic Installation
source <(curl -s https://raw.githubusercontent.com/validexisinfra/polkadot/main/install-alertmanager.sh)
Manual Installation
Install Node Exporter
Node Exporter collects server-level metrics such as CPU, memory, disk, and more.
cd $HOME
sudo wget $(curl -s https://api.github.com/repos/prometheus/node_exporter/releases/latest | grep "tag_name" | awk '{print "https://github.com/prometheus/node_exporter/releases/download/" substr($2, 2, length($2)-3) "/node_exporter-" substr($2, 3, length($2)-4) ".linux-amd64.tar.gz"}')
sudo tar xvf node_exporter-*.tar.gz
sudo cp ./node_exporter-*.linux-amd64/node_exporter /usr/local/bin/
sudo useradd --no-create-home --shell /usr/sbin/nologin node_exporter
sudo rm -rf ./node_exporter*
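The download one-liner above builds the release URL from the tag_name field of the GitHub API response. As a rough sketch of what its awk expression computes, the helper below assembles the same URL from a tag. The function name node_exporter_url is hypothetical and not part of the original script, and the tag v1.8.2 in the usage note is only an example value.

```shell
#!/usr/bin/env bash
# node_exporter_url TAG: print the linux-amd64 tarball URL for a release tag.
# The tag keeps its leading "v" in the path; the file name uses the bare version.
# (Hypothetical helper, shown only to explain the awk one-liner.)
node_exporter_url() {
  local tag="$1"
  local version="${tag#v}"   # strip the leading "v": v1.8.2 -> 1.8.2
  printf 'https://github.com/prometheus/node_exporter/releases/download/%s/node_exporter-%s.linux-amd64.tar.gz\n' "$tag" "$version"
}
```

Calling node_exporter_url v1.8.2 prints the URL that wget would receive for that release.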
Create the systemd service, which runs as the dedicated node_exporter user created above:
sudo tee /etc/systemd/system/node_exporter.service > /dev/null <<EOF
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable node_exporter.service
sudo systemctl start node_exporter.service
sudo systemctl status node_exporter.service
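Once the service reports active, a quick way to confirm that metrics are being served is to read a single value from the endpoint on port 9100. The sketch below is one possible approach, not part of the original guide: metric_value is a hypothetical helper that parses the Prometheus text format with awk, and node_load1 is used as an example metric name.

```shell
#!/usr/bin/env bash
# metric_value NAME: read Prometheus text-format metrics on stdin and print
# the value of the first sample whose metric name is exactly NAME.
# Comment lines (# HELP / # TYPE) are skipped.
metric_value() {
  awk -v name="$1" '
    /^#/ { next }
    {
      metric = $1
      sub(/\{.*/, "", metric)   # drop any {label="..."} suffix
      if (metric == name) { print $NF; exit }
    }'
}
```

Usage, assuming node_exporter listens on localhost: curl -s http://localhost:9100/metrics | metric_value node_load1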
Install Prometheus
Download and install
curl -s https://api.github.com/repos/prometheus/prometheus/releases/latest \
| grep browser_download_url | grep linux-amd64.tar.gz \
| cut -d '"' -f 4 | wget -qi -
tar xvf prometheus-*.tar.gz
cd prometheus-*.linux-amd64
sudo cp prometheus promtool /usr/local/bin/
sudo mkdir -p /etc/prometheus /var/lib/prometheus
if [ -d "consoles" ]; then
sudo cp -r consoles /etc/prometheus/
fi
if [ -d "console_libraries" ]; then
sudo cp -r console_libraries /etc/prometheus/
fi
Create a dedicated user and set ownership
sudo id -u prometheus &>/dev/null || sudo useradd --no-create-home --shell /usr/sbin/nologin prometheus
sudo chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus
Prometheus Configuration
sudo tee /etc/prometheus/prometheus.yml > /dev/null <<EOF
global:
  scrape_interval: 15s
  evaluation_interval: 15s
rule_files:
  - 'rules.yml'
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
scrape_configs:
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'polkadot_node'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9615']
EOF
Create Prometheus systemd service
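# The delimiter is quoted so that $MAINPID is written literally instead of being expanded by the shell
sudo tee /etc/systemd/system/prometheus.service > /dev/null <<'EOF'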
[Unit]
Description=Prometheus Monitoring
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file /etc/prometheus/prometheus.yml \
--storage.tsdb.path /var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries \
--storage.tsdb.retention.time 30d \
--web.enable-admin-api
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
EOF
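The ExecReload line in this unit contains $MAINPID, which systemd is meant to fill in at runtime. Whether that text reaches the file depends on how the heredoc delimiter is written: an unquoted delimiter lets the shell expand variables first, while a quoted one writes the body literally. The throwaway demonstration below illustrates the difference under the assumption that MAINPID is empty in the calling shell; the file names and temp directory are made up for the example.

```shell
#!/usr/bin/env bash
# Demonstration only: writes two throwaway files under a temp directory.
demo_dir=$(mktemp -d)
MAINPID=""   # mimic an interactive shell, where MAINPID carries no value

# Unquoted delimiter: the shell substitutes $MAINPID (empty here) before tee runs.
tee "$demo_dir/unquoted.service" > /dev/null <<EOF
ExecReload=/bin/kill -HUP $MAINPID
EOF

# Quoted delimiter: the body is written literally, so systemd later sees $MAINPID.
tee "$demo_dir/quoted.service" > /dev/null <<'EOF'
ExecReload=/bin/kill -HUP $MAINPID
EOF

cat "$demo_dir/unquoted.service"   # the variable reference is gone
cat "$demo_dir/quoted.service"     # the variable reference is preserved
```

After writing the real unit, grep MAINPID /etc/systemd/system/prometheus.service shows whether the reference survived.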
Create Alert Rules
cd /etc/prometheus
sudo tee rules.yml > /dev/null <<EOF
groups:
  - name: alert_rules
    rules:
      - alert: PolkadotNodeSyncLag
        expr: (max(substrate_block_height{status="best"}) by (instance) - max(substrate_block_height{status="finalized"}) by (instance)) > 20
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node polkadot-1 lagging behind"
          description: "The finalized block on polkadot-1 is more than 20 blocks behind its best block."
      - alert: NodeDown
        expr: up{job="polkadot_node"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Node polkadot-1 down"
          description: "Node polkadot-1 has been down for more than 1 minute."
      - alert: HighDiskUsage
        expr: (node_filesystem_avail_bytes{job="node_exporter", fstype!="tmpfs", fstype!="sysfs", fstype!="proc"} / node_filesystem_size_bytes{job="node_exporter", fstype!="tmpfs", fstype!="sysfs", fstype!="proc"}) * 100 < 2
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High disk usage on polkadot-1"
          description: "Disk usage is above 98% on polkadot-1."
      - alert: PolkadotNodeNotSyncing
        expr: substrate_sub_libp2p_sync_is_major_syncing{job="polkadot_node"} == 1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node polkadot-1 still in major sync"
          description: "Node polkadot-1 has been in major sync (still catching up with the chain) for more than 5 minutes."
      - alert: PolkadotNodeHighCPUUsage
        expr: rate(process_cpu_seconds_total{job="polkadot_node"}[5m]) > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on polkadot-1"
          description: "CPU usage of the Polkadot process is above 80% of one core on polkadot-1 for more than 5 minutes."
EOF
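Before starting Prometheus, it can help to validate both files, since a YAML indentation mistake in rules.yml or prometheus.yml will stop the service from loading them. The sketch below assumes promtool was copied to /usr/local/bin during the install step; validate_prometheus_files is a hypothetical wrapper name, not something the original guide defines.

```shell
#!/usr/bin/env bash
# validate_prometheus_files [DIR]: run promtool against the rules file and the
# main config in DIR (default /etc/prometheus). Stops at the first failure.
validate_prometheus_files() {
  local dir="${1:-/etc/prometheus}"
  if ! command -v promtool > /dev/null 2>&1; then
    echo "promtool not found in PATH" >&2
    return 127
  fi
  promtool check rules "$dir/rules.yml" || return 1
  promtool check config "$dir/prometheus.yml" || return 1
  echo "Prometheus files in $dir look valid"
}
```

Running validate_prometheus_files with no argument checks the files written in the steps above.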
Start and Enable the Prometheus Service
sudo chown prometheus:prometheus rules.yml
sudo systemctl daemon-reload
sudo systemctl enable prometheus.service
sudo systemctl start prometheus.service
sudo systemctl status prometheus.service
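With Prometheus running, its HTTP API can confirm that both scrape jobs are healthy. The sketch below counts targets by health state in the /api/v1/targets response. It is a rough grep-based approach that assumes the compact JSON Prometheus normally returns (a JSON tool such as jq would be more robust), and count_targets_with_health is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# count_targets_with_health STATE: read the JSON body of /api/v1/targets on
# stdin and print how many targets report "health":"STATE" (up, down, unknown).
count_targets_with_health() {
  # grep exits non-zero when nothing matches; "|| true" keeps the count at 0.
  { grep -o "\"health\":\"$1\"" || true; } | wc -l | tr -d ' '
}
```

Usage, assuming Prometheus listens on localhost: curl -s http://localhost:9090/api/v1/targets | count_targets_with_health up. With the two jobs configured above, a healthy setup should report both targets as up.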
Install Alertmanager
Download and install
cd ~
sudo wget https://github.com/prometheus/alertmanager/releases/download/v0.24.0/alertmanager-0.24.0.linux-amd64.tar.gz
sudo tar xvf alertmanager-0.24.0.linux-amd64.tar.gz
sudo rm alertmanager-0.24.0.linux-amd64.tar.gz
sudo mkdir -p /etc/alertmanager /var/lib/prometheus/alertmanager
cd alertmanager-0.24.0.linux-amd64
sudo cp alertmanager amtool /usr/local/bin/
sudo cp alertmanager.yml /etc/alertmanager
sudo useradd --no-create-home --shell /bin/false alertmanager
sudo chown -R alertmanager:alertmanager /etc/alertmanager /var/lib/prometheus/alertmanager
sudo chown alertmanager:alertmanager /usr/local/bin/{alertmanager,amtool}
Configuration
Example configuration (replace <YOUR_TOKEN> with your token):
sudo tee /etc/alertmanager/alertmanager.yml > /dev/null <<EOF
route:
  group_by: ['alertname', 'instance', 'severity']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'telepush'
receivers:
  - name: 'telepush'
    webhook_configs:
      - url: 'https://telepush.dev/api/inlets/alertmanager/<YOUR_TOKEN>'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
EOF
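The amtool binary installed alongside Alertmanager can check this file before the service is started. The sketch below wraps that check; check_alertmanager_config is a hypothetical function name, and the default path is the one used in this guide.

```shell
#!/usr/bin/env bash
# check_alertmanager_config [FILE]: validate an Alertmanager config with amtool.
# Defaults to the path written in the step above.
check_alertmanager_config() {
  local file="${1:-/etc/alertmanager/alertmanager.yml}"
  if ! command -v amtool > /dev/null 2>&1; then
    echo "amtool not found in PATH" >&2
    return 127
  fi
  amtool check-config "$file"
}
```

Running check_alertmanager_config with no argument validates /etc/alertmanager/alertmanager.yml.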
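Create Alertmanager service
The heredoc below is unquoted, so the shell substitutes $IP_ADDRESS when the file is written. Set IP_ADDRESS to your server's IP address in the current shell before running it.
sudo tee /etc/systemd/system/alertmanager.service > /dev/null <<EOF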
[Unit]
Description=AlertManager Server Service
Wants=network-online.target
After=network-online.target
[Service]
User=root
Group=root
Type=simple
ExecStart=/usr/local/bin/alertmanager --config.file /etc/alertmanager/alertmanager.yml --web.external-url=http://$IP_ADDRESS:9093 --cluster.advertise-address='0.0.0.0:9093'
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable alertmanager
sudo systemctl start alertmanager
sudo systemctl restart prometheus.service
sudo systemctl restart alertmanager.service
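To confirm that notifications actually reach Telegram, one option is to post a synthetic alert to Alertmanager's v2 API and wait for the message. The sketch below is an illustration rather than part of the original guide: the function names, the alert name TestAlert, and the instance label manual-test are all made up for the example.

```shell
#!/usr/bin/env bash
# build_test_alert [NAME] [SEVERITY]: print a one-element alert array as JSON,
# in the shape accepted by POST /api/v2/alerts.
build_test_alert() {
  local name="${1:-TestAlert}"
  local severity="${2:-warning}"
  printf '[{"labels":{"alertname":"%s","severity":"%s","instance":"manual-test"},"annotations":{"summary":"Manual test alert"}}]\n' "$name" "$severity"
}

# send_test_alert [BASE_URL]: POST the payload to Alertmanager
# (default: the local instance on port 9093).
send_test_alert() {
  local url="${1:-http://localhost:9093}"
  build_test_alert "TestAlert" "warning" |
    curl -sS -X POST -H 'Content-Type: application/json' --data-binary @- "$url/api/v2/alerts"
}
```

After calling send_test_alert, the notification should arrive once the group_wait of 30s configured above has elapsed.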
Grafana
Install Required Dependencies
sudo apt-get install -y apt-transport-https software-properties-common wget
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
Add the Grafana Repository
echo "deb https://packages.grafana.com/enterprise/deb stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
Create a User for Grafana
sudo useradd -m -s /bin/bash grafana
sudo groupadd -f --system grafana
sudo usermod -aG grafana grafana
Install Grafana Enterprise
# Install additional utilities
sudo apt-get install -y adduser libfontconfig1
# Download and install Grafana Enterprise
wget https://dl.grafana.com/enterprise/release/grafana-enterprise_9.3.2_amd64.deb
sudo dpkg -i grafana-enterprise_9.3.2_amd64.deb
Start and Enable the Grafana Server
sudo systemctl daemon-reload
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
sudo systemctl status grafana-server
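Besides systemctl status, Grafana exposes a health endpoint at /api/health that reports whether its database is reachable. The sketch below checks that response; grafana_db_ok is a hypothetical helper name, and the check assumes the response contains a "database" field with the value "ok" when Grafana is healthy.

```shell
#!/usr/bin/env bash
# grafana_db_ok: read the JSON body of /api/health on stdin and succeed only
# if it reports "database": "ok". Whitespace is removed first so that both
# pretty-printed and compact JSON match.
grafana_db_ok() {
  tr -d ' \n' | grep -q '"database":"ok"'
}
```

Usage, assuming Grafana listens on its default port 3000: curl -s http://localhost:3000/api/health | grafana_db_ok && echo "Grafana is healthy"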
Importing Dashboards into Grafana
To set up dashboards in Grafana, you need the JSON files of the dashboards. These files can either be:
Downloaded from a public source like Grafana's Dashboard Library
Created manually by you directly in Grafana
If you're using Grafana's library, search for the dashboard by its ID and download the JSON file.
Once downloaded, you can import the JSON file into Grafana via:
Grafana UI → Dashboards → Import → Upload JSON file
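Imported dashboards need a Prometheus data source to query. One way to add it without clicking through the UI is Grafana's file-based provisioning. The sketch below is an assumption-laden example, not a step from the original guide: it assumes the default provisioning directory /etc/grafana/provisioning/datasources and the Prometheus address used earlier, and both the function name and the data source name "Prometheus" are illustrative.

```shell
#!/usr/bin/env bash
# prometheus_datasource_yaml [URL]: print a Grafana data source provisioning
# file that points at a Prometheus server (default: http://localhost:9090).
prometheus_datasource_yaml() {
  local url="${1:-http://localhost:9090}"
  cat <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $url
    isDefault: true
EOF
}
```

Usage: prometheus_datasource_yaml | sudo tee /etc/grafana/provisioning/datasources/prometheus.yaml > /dev/null, followed by sudo systemctl restart grafana-server so that Grafana picks up the file.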
Final Checks
Access Prometheus: http://<your-server-ip>:9090
Access Alertmanager: http://<your-server-ip>:9093
Access Grafana: http://<your-server-ip>:3000
Verify that your Polkadot node metrics (:9615) and alerts are visible
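The checks above can also be run from the command line. The sketch below probes each service with curl and reports which ones respond. The function names are hypothetical; the health paths /-/healthy (Prometheus and Alertmanager) and /api/health (Grafana) are the ones those services provide, and the ports are the ones used throughout this guide.

```shell
#!/usr/bin/env bash
# check_endpoint NAME URL: print OK or FAIL depending on whether URL answers
# with a successful HTTP status within 5 seconds.
check_endpoint() {
  if curl -fsS --max-time 5 -o /dev/null "$2"; then
    echo "OK   $1 ($2)"
  else
    echo "FAIL $1 ($2)"
    return 1
  fi
}

# check_stack [HOST]: probe every component installed in this guide.
# Returns 1 if at least one endpoint failed, 0 otherwise.
check_stack() {
  local host="${1:-localhost}"
  local failed=0
  check_endpoint "node_exporter"    "http://$host:9100/metrics"   || failed=1
  check_endpoint "Prometheus"       "http://$host:9090/-/healthy" || failed=1
  check_endpoint "Alertmanager"     "http://$host:9093/-/healthy" || failed=1
  check_endpoint "Polkadot metrics" "http://$host:9615/metrics"   || failed=1
  check_endpoint "Grafana"          "http://$host:3000/api/health" || failed=1
  return "$failed"
}
```

Run check_stack on the server itself, or pass the server's IP address as the first argument to test from another machine.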