On Tue, 2014-07-08 at 11:32 +0200, Richard Brown wrote:
> On Tue, 2014-07-08 at 11:24 +0200, steve wrote:
> > 2014/07/08 11:06:39.067080 [ 4035]: 00.ctdb: Consider installing tdbtool or at least tdbdump!
>
> I'd recommend you install tdb-tools and try again

OK. With tdb-tools:

node 1
2014/07/08 11:54:32.921389 [ 2856]: CTDB starting on node
2014/07/08 11:54:32.974367 [ 2857]: Starting CTDBD (Version 2.3) as PID: 2857
2014/07/08 11:54:32.985424 [ 2857]: Created PID file /var/run/ctdb/ctdbd.pid
2014/07/08 11:54:32.996422 [ 2857]: Set scheduler to SCHED_FIFO
2014/07/08 11:54:32.997392 [ 2857]: Set runstate to INIT (1)
2014/07/08 11:54:33.789104 [ 2857]: Freeze priority 1
2014/07/08 11:54:33.842150 [ 2857]: Freeze priority 2
2014/07/08 11:54:33.863673 [ 2857]: Freeze priority 3
2014/07/08 11:54:33.899042 [ 2857]: server/ctdb_takeover.c:3239 Released 0 public IPs
2014/07/08 11:54:33.899240 [ 2857]: Set runstate to SETUP (2)
2014/07/08 11:54:34.464217 [ 2857]: Set runstate to FIRST_RECOVERY (3)
2014/07/08 11:54:34.467391 [ 2857]: Keepalive monitoring has been started
2014/07/08 11:54:34.467923 [ 2857]: Monitoring has been started
2014/07/08 11:54:34.482718 [recoverd: 2935]: monitor_cluster starting
2014/07/08 11:54:34.525244 [recoverd: 2935]: server/ctdb_recoverd.c:3483 Initial recovery master set - forcing election
2014/07/08 11:54:34.527061 [ 2857]: Freeze priority 1
2014/07/08 11:54:34.528474 [ 2857]: Freeze priority 2
2014/07/08 11:54:34.529815 [ 2857]: Freeze priority 3
2014/07/08 11:54:34.540343 [ 2857]: This node (0) is now the recovery master
2014/07/08 11:54:35.469934 [ 2857]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:36.472449 [ 2857]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:37.474411 [ 2857]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:37.545470 [recoverd: 2935]: server/ctdb_recoverd.c:1061 Election timed out
2014/07/08 11:54:37.551942 [recoverd: 2935]: The interfaces status has changed on local node 0 - force takeover run
2014/07/08 11:54:37.554449 [recoverd: 2935]: Trigger takeoverrun
2014/07/08 11:54:37.557044 [recoverd: 2935]: Node:0 was in recovery mode. Start recovery process
2014/07/08 11:54:37.562645 [recoverd: 2935]: server/ctdb_recoverd.c:1601 Starting do_recovery
2014/07/08 11:54:37.563916 [recoverd: 2935]: Taking out recovery lock from recovery daemon
2014/07/08 11:54:37.564405 [recoverd: 2935]: Take the recovery lock
2014/07/08 11:54:37.565214 [recoverd: 2935]: ctdb_recovery_lock: Unable to open /cluster/ctbd/lockfile - (No such file or directory)
2014/07/08 11:54:37.566152 [recoverd: 2935]: Unable to get recovery lock - aborting recovery and ban ourself for 300 seconds
2014/07/08 11:54:37.567149 [recoverd: 2935]: Banning node 0 for 300 seconds
2014/07/08 11:54:37.567754 [ 2857]: Banning this node for 300 seconds
2014/07/08 11:54:37.567956 [ 2857]: This node has been banned - forcing freeze and recovery
2014/07/08 11:54:37.568059 [ 2857]: server/ctdb_takeover.c:3239 Released 0 public IPs
2014/07/08 11:54:38.476148 [ 2857]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:39.477229 [ 2857]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:40.478881 [ 2857]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:41.480322 [ 2857]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:42.481311 [ 2857]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:43.482493 [ 2857]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:43.639731 [recoverd: 2935]: Daemon has exited - shutting down client
2014/07/08 11:54:43.640344 [recoverd: 2935]: CTDB recoverd: shutting down

node 2
2014/07/08 11:54:12.635083 [ 2590]: CTDB starting on node
2014/07/08 11:54:12.695604 [ 2591]: Starting CTDBD (Version 2.3) as PID: 2591
2014/07/08 11:54:12.708577 [ 2591]: Created PID file /var/run/ctdb/ctdbd.pid
2014/07/08 11:54:12.711440 [ 2591]: Set scheduler to SCHED_FIFO
2014/07/08 11:54:12.712559 [ 2591]: Set runstate to INIT (1)
2014/07/08 11:54:13.641112 [ 2591]: Freeze priority 1
2014/07/08 11:54:13.670636 [ 2591]: Freeze priority 2
2014/07/08 11:54:13.698589 [ 2591]: Freeze priority 3
2014/07/08 11:54:13.744151 [ 2591]: server/ctdb_takeover.c:3239 Released 0 public IPs
2014/07/08 11:54:13.744348 [ 2591]: Set runstate to SETUP (2)
2014/07/08 11:54:14.296494 [ 2591]: Set runstate to FIRST_RECOVERY (3)
2014/07/08 11:54:14.301281 [ 2591]: Keepalive monitoring has been started
2014/07/08 11:54:14.301672 [ 2591]: Monitoring has been started
2014/07/08 11:54:14.336024 [recoverd: 2669]: monitor_cluster starting
2014/07/08 11:54:14.380332 [recoverd: 2669]: server/ctdb_recoverd.c:3483 Initial recovery master set - forcing election
2014/07/08 11:54:14.383895 [ 2591]: Freeze priority 1
2014/07/08 11:54:14.384816 [ 2591]: Freeze priority 2
2014/07/08 11:54:14.385612 [ 2591]: Freeze priority 3
2014/07/08 11:54:14.388271 [ 2591]: This node (1) is now the recovery master
2014/07/08 11:54:15.303181 [ 2591]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:16.304167 [ 2591]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:17.305626 [ 2591]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:17.393629 [recoverd: 2669]: server/ctdb_recoverd.c:1061 Election timed out
2014/07/08 11:54:17.401504 [recoverd: 2669]: The interfaces status has changed on local node 1 - force takeover run
2014/07/08 11:54:17.404523 [recoverd: 2669]: Trigger takeoverrun
2014/07/08 11:54:17.414122 [recoverd: 2669]: Node:1 was in recovery mode. Start recovery process
2014/07/08 11:54:17.415352 [recoverd: 2669]: server/ctdb_recoverd.c:1601 Starting do_recovery
2014/07/08 11:54:17.415863 [recoverd: 2669]: Taking out recovery lock from recovery daemon
2014/07/08 11:54:17.416346 [recoverd: 2669]: Take the recovery lock
2014/07/08 11:54:17.417857 [recoverd: 2669]: Recovery lock taken successfully
2014/07/08 11:54:17.419955 [recoverd: 2669]: ctdb_recovery_lock: Got recovery lock on '/cluster/ctdb/lockfile'
2014/07/08 11:54:17.420958 [recoverd: 2669]: Recovery lock taken successfully by recovery daemon
2014/07/08 11:54:17.422371 [recoverd: 2669]: server/ctdb_recoverd.c:1626 Recovery initiated due to problem with node 0
2014/07/08 11:54:17.431247 [recoverd: 2669]: server/ctdb_recoverd.c:1651 Recovery - created remote databases
2014/07/08 11:54:17.431810 [recoverd: 2669]: server/ctdb_recoverd.c:1658 Recovery - updated db priority for all databases
2014/07/08 11:54:17.433554 [ 2591]: Freeze priority 1
2014/07/08 11:54:17.434268 [ 2591]: Freeze priority 2
2014/07/08 11:54:17.435149 [ 2591]: Freeze priority 3
2014/07/08 11:54:17.436605 [ 2591]: server/ctdb_recover.c:989 startrecovery eventscript has been invoked
2014/07/08 11:54:17.702530 [recoverd: 2669]: server/ctdb_recoverd.c:1695 Recovery - updated flags
2014/07/08 11:54:17.706249 [recoverd: 2669]: server/ctdb_recoverd.c:1739 started transactions on all nodes
2014/07/08 11:54:17.707564 [recoverd: 2669]: server/ctdb_recoverd.c:1752 Recovery - starting database commits
2014/07/08 11:54:17.708919 [recoverd: 2669]: server/ctdb_recoverd.c:1764 Recovery - committed databases
2014/07/08 11:54:17.710593 [recoverd: 2669]: server/ctdb_recoverd.c:1814 Recovery - updated vnnmap
2014/07/08 11:54:17.712654 [recoverd: 2669]: server/ctdb_recoverd.c:1823 Recovery - updated recmaster
2014/07/08 11:54:17.719478 [recoverd: 2669]: server/ctdb_recoverd.c:1840 Recovery - updated flags
2014/07/08 11:54:17.724280 [ 2591]: server/ctdb_recover.c:612 Recovery mode set to NORMAL
2014/07/08 11:54:17.724609 [ 2591]: Thawing priority 1
2014/07/08 11:54:17.724661 [ 2591]: Release freeze handler for prio 1
2014/07/08 11:54:17.725001 [ 2591]: Thawing priority 2
2014/07/08 11:54:17.725039 [ 2591]: Release freeze handler for prio 2
2014/07/08 11:54:17.725177 [ 2591]: Thawing priority 3
2014/07/08 11:54:17.725210 [ 2591]: Release freeze handler for prio 3
2014/07/08 11:54:17.740456 [recoverd: 2669]: server/ctdb_recoverd.c:1849 Recovery - disabled recovery mode
2014/07/08 11:54:17.747439 [recoverd: 2669]: Failed to find node to cover ip 192.168.1.81
2014/07/08 11:54:17.748751 [recoverd: 2669]: Failed to find node to cover ip 192.168.1.80
2014/07/08 11:54:17.753058 [recoverd: 2669]: Disabling ip check for 9 seconds
2014/07/08 11:54:18.096692 [ 2591]: Recovery has finished
2014/07/08 11:54:18.307148 [ 2591]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:18.307710 [ 2591]: server/ctdb_monitor.c:262 wait for pending recoveries to end. Wait one more second.
2014/07/08 11:54:18.466053 [ 2591]: Set runstate to STARTUP (4)
2014/07/08 11:54:18.467942 [recoverd: 2669]: server/ctdb_recoverd.c:1873 Recovery - finished the recovered event
2014/07/08 11:54:18.470367 [recoverd: 2669]: server/ctdb_recoverd.c:1879 Recovery complete
2014/07/08 11:54:18.470421 [recoverd: 2669]: Resetting ban count to 0 for all nodes
2014/07/08 11:54:18.470447 [recoverd: 2669]: Just finished a recovery. New recoveries will now be supressed for the rerecovery timeout (10 seconds)
2014/07/08 11:54:19.308460 [ 2591]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:19.308735 [ 2591]: server/ctdb_monitor.c:262 wait for pending recoveries to end. Wait one more second.
2014/07/08 11:54:20.309499 [ 2591]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:20.309860 [ 2591]: server/ctdb_monitor.c:262 wait for pending recoveries to end. Wait one more second.
2014/07/08 11:54:21.311258 [ 2591]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:21.311583 [ 2591]: server/ctdb_monitor.c:262 wait for pending recoveries to end. Wait one more second.
2014/07/08 11:54:22.312506 [ 2591]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:22.312853 [ 2591]: server/ctdb_monitor.c:262 wait for pending recoveries to end. Wait one more second.
2014/07/08 11:54:23.314261 [ 2591]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:54:23.314539 [ 2591]: server/ctdb_monitor.c:262 wait for pending recoveries to end. Wait one more second.
2014/07/08 11:54:23.472970 [recoverd: 2669]: Daemon has exited - shutting down client
2014/07/08 11:54:23.483875 [recoverd: 2669]: CTDB recoverd: shutting down