@yellowmew
Cloud infrastructure, monitoring engineer. SRE

Why doesn't keepalived run the notify script on first start on the master?

On first start (AWS, ASG, so manual intervention is undesirable and everything is automated as far as possible) the service starts on two machines. One becomes the master, the other the backup; the election itself works fine.
On a state change (MASTER/BACKUP/FAULT/STOP) the notify.sh script is triggered, and it handles these states.
Problem: the first time a node enters the MASTER state, the notify script is not executed. Logs from the master:
Mar 20 11:53:50 ip-10-20-112-233 Keepalived[2770]: Opening file '/etc/keepalived/keepalived.conf'.
Mar 20 11:53:50 ip-10-20-112-233 systemd: Reloaded LVS and VRRP High Availability Monitor.
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_healthcheckers[2771]: Got SIGHUP, reloading checker configuration
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_healthcheckers[2771]: Opening file '/etc/keepalived/keepalived.conf'.
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_vrrp[2772]: Registering Kernel netlink reflector
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_vrrp[2772]: Registering Kernel netlink command channel
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_vrrp[2772]: Registering gratuitous ARP shared channel
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_vrrp[2772]: Opening file '/etc/keepalived/keepalived.conf'.
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_vrrp[2772]: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_vrrp[2772]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_vrrp[2772]: Using LinkWatch kernel netlink reflector...
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_vrrp[2772]: VRRP sockpool: [ifindex(2), proto(112), unicast(1), fd(10,11)]
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_vrrp[2772]: VRRP_Script(script1) succeeded
Mar 20 11:53:50 ip-10-20-112-233 Keepalived_vrrp[2772]: VRRP_Script(script2) succeeded
Mar 20 11:53:51 ip-10-20-112-233 Keepalived_vrrp[2772]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar 20 11:54:12 ip-10-20-112-233 dhclient[2251]: XMT: Solicit on eth0, interval 65330ms.
Mar 20 11:55:17 ip-10-20-112-233 dhclient[2251]: XMT: Solicit on eth0, interval 126690ms.
Mar 20 11:56:35 ip-10-20-112-233 systemd: Created slice User Slice of ec2-user.
Mar 20 11:56:35 ip-10-20-112-233 systemd: Starting User Slice of ec2-user.
Mar 20 11:56:35 ip-10-20-112-233 systemd-logind: New session 1 of user ec2-user.

Logs from the backup:
Mar 20 11:53:48 ip-10-20-111-140 systemd: Reloaded LVS and VRRP High Availability Monitor.
Mar 20 11:53:48 ip-10-20-111-140 Keepalived_healthcheckers[2765]: Got SIGHUP, reloading checker configuration
Mar 20 11:53:48 ip-10-20-111-140 Keepalived_healthcheckers[2765]: Opening file '/etc/keepalived/keepalived.conf'.
Mar 20 11:53:48 ip-10-20-111-140 Keepalived_vrrp[2766]: Registering Kernel netlink reflector
Mar 20 11:53:48 ip-10-20-111-140 Keepalived_vrrp[2766]: Registering Kernel netlink command channel
Mar 20 11:53:48 ip-10-20-111-140 Keepalived_vrrp[2766]: Registering gratuitous ARP shared channel
Mar 20 11:53:48 ip-10-20-111-140 Keepalived_vrrp[2766]: Opening file '/etc/keepalived/keepalived.conf'.
Mar 20 11:53:48 ip-10-20-111-140 Keepalived_vrrp[2766]: VRRP_Script(script1) considered successful on reload
Mar 20 11:53:48 ip-10-20-111-140 Keepalived_vrrp[2766]: VRRP_Script(script2) considered successful on reload
Mar 20 11:53:48 ip-10-20-111-140 Keepalived_vrrp[2766]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
Mar 20 11:53:48 ip-10-20-111-140 Keepalived_vrrp[2766]: Using LinkWatch kernel netlink reflector...
Mar 20 11:53:48 ip-10-20-111-140 Keepalived_vrrp[2766]: VRRP sockpool: [ifindex(2), proto(112), unicast(1), fd(10,11)]
Mar 20 11:53:49 ip-10-20-111-140 Keepalived_vrrp[2766]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar 20 11:53:51 ip-10-20-111-140 Keepalived_vrrp[2766]: VRRP_Instance(VI_1) Received advert with higher priority 100, ours 100
Mar 20 11:53:51 ip-10-20-111-140 Keepalived_vrrp[2766]: VRRP_Instance(VI_1) Entering BACKUP STATE
Mar 20 11:53:52 ip-10-20-111-140 root: triggering notify BACKUP event

The line "Mar 20 11:53:52 ip-10-20-111-140 root: triggering notify BACKUP event" marks the start of the script's execution. No such line appears in the master's log.
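For context, here is a minimal sketch of what a notify.sh that produces such a log line might look like. The actual script is not shown in this question, so everything below is an assumption: the function name notify_message is hypothetical, and the use of logger is inferred only from the "root: triggering notify BACKUP event" syslog entry. It relies on keepalived passing the type (GROUP or INSTANCE), the name and the new state as the first three arguments of a generic notify script.

```shell
#!/bin/sh
# Hypothetical sketch of /etc/keepalived/notify.sh (the real script is not shown).
# keepalived invokes a generic notify script as:
#   notify.sh <GROUP|INSTANCE> <name> <state>

# Build the log message for a state; return non-zero for an unknown state.
# STOP is listed because the question says the script handles it.
notify_message() {
    case "$1" in
        MASTER|BACKUP|FAULT|STOP)
            printf 'triggering notify %s event\n' "$1"
            ;;
        *)
            printf 'unknown notify state: %s\n' "$1"
            return 1
            ;;
    esac
}

# Act only when keepalived passed the expected arguments.
if [ "$#" -ge 3 ]; then
    msg=$(notify_message "$3")
    status=$?
    # Write to syslog so that the transition is visible next to keepalived's own lines.
    logger "$msg"
    exit "$status"
fi
```

With a script of this shape, a missing "triggering notify MASTER event" line in syslog would mean the script was never invoked for that transition, rather than that it failed partway through.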

keepalived 1.3.5.8 config:
vrrp_script script1{
}
vrrp_script script2{
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100 
    unicast_src_ip <LOCAL IP>
    authentication {
        auth_type PASS
        auth_pass <PASS>
    }
    unicast_peer {
    <REMOTE_IP>
    }
    track_script {
        script1
    }
    track_script {
        script2
    }
    track_interface {
        eth0
    }
    notify /etc/keepalived/notify.sh
}


If I log in to the machine manually and restart the service, everything works like clockwork from then on; the problem is specifically the first, clean start.
Is there some non-obvious behavior of the service that the documentation doesn't describe? How can I make keepalived run the script on the initial transition to the MASTER state?