Bug ID 960118
Summary corosync: service randomly fails to start with a faulty redundant channel
Classification openSUSE
Product openSUSE Distribution
Version Leap 42.1
Hardware Other
OS Other
Status NEW
Severity Normal
Priority P5 - None
Component High Availability
Assignee lmb@suse.com
Reporter zzhou@suse.com
QA Contact qa-bugs@suse.de
Found By ---
Blocker ---

Created attachment 660241 [details]
journalctl

The HA cluster runs in virtual machines and is configured with a redundant
channel for corosync.

I intentionally blocked one corosync channel from the host system:

OBS131-x220:/work/images # brctl delif virbr0 vnet2

In this case, all NICs inside the VM remain active.

The corosync service then randomly fails to start, as shown below:

Leap421-01:~ # systemctl stop pacemaker
Leap421-01:~ # systemctl start pacemaker
A dependency job for pacemaker.service failed. See 'journalctl -xn' for
details.
Leap421-01:~ # systemctl start pacemaker
Leap421-01:~ # systemctl stop pacemaker
Leap421-01:~ # systemctl start pacemaker
Leap421-01:~ # systemctl stop pacemaker
Leap421-01:~ # systemctl start pacemaker
A dependency job for pacemaker.service failed. See 'journalctl -xn' for
details.
Leap421-01:~ # systemctl start pacemaker
Leap421-01:~ # systemctl stop pacemaker
Leap421-01:~ # systemctl start pacemaker
A dependency job for pacemaker.service failed. See 'journalctl -xn' for
details.
Leap421-01:~ # systemctl start pacemaker
A dependency job for pacemaker.service failed. See 'journalctl -xn' for
details.
Leap421-01:~ # systemctl start pacemaker
Leap421-01:~ # systemctl stop pacemaker
Leap421-01:~ # systemctl start pacemaker
A dependency job for pacemaker.service failed. See 'journalctl -xn' for
details.
Leap421-01:~ # journalctl -xn
-- Logs begin at Thu 2015-11-05 16:53:16 CST, end at Tue 2015-12-22 21:01:39
CST. --
Dec 22 21:01:39 Leap421-01 corosync[8851]: [QB    ] withdrawing server sockets
Dec 22 21:01:39 Leap421-01 corosync[8851]: [SERV  ] Service engine unloaded:
corosync configuration service
Dec 22 21:01:39 Leap421-01 corosync[8851]: [QB    ] withdrawing server sockets
Dec 22 21:01:39 Leap421-01 corosync[8851]: [SERV  ] Service engine unloaded:
corosync cluster closed process group service v1.01
Dec 22 21:01:39 Leap421-01 corosync[8851]: [QB    ] withdrawing server sockets
Dec 22 21:01:39 Leap421-01 corosync[8851]: [SERV  ] Service engine unloaded:
corosync cluster quorum service v0.1
Dec 22 21:01:39 Leap421-01 corosync[8851]: [SERV  ] Service engine unloaded:
corosync profile loading service
Dec 22 21:01:39 Leap421-01 corosync[8851]: [MAIN  ] Corosync Cluster Engine
exiting normally
Dec 22 21:01:39 Leap421-01 systemd[1]: Failed to start Corosync Cluster Engine.
-- Subject: Unit corosync.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit corosync.service has failed.
-- 
-- The result is failed.
Dec 22 21:01:39 Leap421-01 systemd[1]: Dependency failed for Pacemaker High
Availability Cluster Manager.
-- Subject: Unit pacemaker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit pacemaker.service has failed.
-- 
-- The result is dependency.
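
The manual stop/start sequence above could be automated to estimate how often the start fails. The following is only a minimal sketch, not part of the original report: the function name count_start_failures and the variables UNIT, START_CMD and STOP_CMD are hypothetical, and the defaults simply mirror the systemctl commands used above.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not from the report): restart a unit
# several times and count how many start attempts fail.
# UNIT, START_CMD and STOP_CMD are illustrative names; the defaults mirror
# the manual "systemctl stop/start pacemaker" commands in the transcript.
UNIT="${UNIT:-pacemaker}"
START_CMD="${START_CMD:-systemctl start $UNIT}"
STOP_CMD="${STOP_CMD:-systemctl stop $UNIT}"

count_start_failures() {
    runs="$1"        # number of stop/start cycles to attempt
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Stop first so that every cycle begins from a stopped service.
        $STOP_CMD >/dev/null 2>&1
        # A non-zero exit status from the start command counts as a failure.
        if ! $START_CMD >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

Run as root on the affected node, for example with "count_start_failures 20", this prints the number of failed starts out of 20 cycles. Unlike the manual transcript, it does not retry a failed start before the next stop, so the counts are only roughly comparable.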

