Re: [opensuse-ha] Samba AD domain + CTDB
On Mon, 2014-07-07 at 12:22 +0200, steve wrote:
13.1 nodes with drbd and ocfs2 from the ha-factory repo.

Hi
First time here so please be gentle.
Aim: add a second, failover file server for our AD domain.

We have drbd and ocfs2 up on 2 nodes.

We want to add ctdb on top of that for failover. The documentation at
https://ctdb.samba.org/samba.html
makes no mention of AD, in particular how to join the cluster to the
domain.
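
Our best guess, adapted from a normal (non-clustered) AD member setup, would
be an smb.conf along these lines on both nodes, followed by an ordinary join
from one of them (CLUSTERSMB, EXAMPLE and EXAMPLE.LAN below are just
placeholders):

# guess only, not tested; CLUSTERSMB, EXAMPLE, EXAMPLE.LAN are placeholders
[global]
        clustering = yes
        netbios name = CLUSTERSMB
        workgroup = EXAMPLE
        realm = EXAMPLE.LAN
        security = ADS
        idmap config * : backend = tdb
        idmap config * : range = 10000-99999

# then, on one node only:
net ads join -U Administrator

But nothing in the ctdb docs confirms whether the join works the same once
clustering = yes is set, hence the questions: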

1. What next? Do we go straight to configuring ctdb?
2. Is the ctdb shipped with 13.1 OK?
3. Does it work with AD?
4. Is there anything openSUSE-specific for clustered Samba?

Thanks,
Steve



OK.

First attempt at ctdb:
We have drbd syncing fine, with the ocfs2-mounted partition on top:

cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 3c1f46cb19993f98b22fdf7e18958c21ad75176d build by SuSE Build
Service

1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
ns:85 nr:168 dw:253 dr:1919 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f
oos:0


On both nodes the ctdb config is:
/etc/ctdb/public_addresses:
192.168.1.80/24 eth0
192.168.1.81/24 eth0

/etc/ctdb/nodes:
192.168.0.10
192.168.0.11

and drbd:
global {
usage-count yes;
}
common {
protocol C;
}

resource r0 {
net {
protocol C;
allow-two-primaries yes;
}

startup {
become-primary-on both;
}
on smb1 {
device /dev/drbd1;
disk /dev/sdb1;
address 192.168.0.10:7789;
meta-disk internal;
}
on smb2 {
device /dev/drbd1;
disk /dev/sdb1;
address 192.168.0.11:7789;
meta-disk internal;
}
}
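
For completeness, the filesystem on top of drbd1 was created and mounted
roughly like this (from memory, so the exact mkfs options may have differed):

# o2cb cluster membership already configured in /etc/ocfs2/cluster.conf on both nodes
# run once, on one node only:
mkfs.ocfs2 -L cluster /dev/drbd1
# then on both nodes:
mount -t ocfs2 /dev/drbd1 /cluster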

and the ctdb sysconfig:

CTDB_RECOVERY_LOCK="/cluster/ctbd/lockfile"
CTDB_PUBLIC_INTERFACE=eth0
CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses
CTDB_LVS_PUBLIC_IP=
CTDB_MANAGES_SAMBA=yes
CTDB_SAMBA_SKIP_SHARE_CHECK=yes
CTDB_NFS_SKIP_SHARE_CHECK=yes
CTDB_MANAGES_WINBIND=yes
CTDB_MANAGES_VSFTPD=yes
CTDB_MANAGES_ISCSI=yes
CTDB_INIT_STYLE=
CTDB_SERVICE_SMB=smb
CTDB_SERVICE_NMB=nmb
CTDB_SERVICE_WINBIND=winbind
CTDB_NODES=/etc/ctdb/nodes
CTDB_NOTIFY_SCRIPT=/etc/ctdb/notify.sh
CTDB_DBDIR=/var/lib/ctdb
CTDB_DBDIR_PERSISTENT=/var/lib/ctdb/persistent
CTDB_EVENT_SCRIPT_DIR=/etc/ctdb/events.d
CTDB_SOCKET=/var/lib/ctdb/ctdb.socket
CTDB_TRANSPORT="tcp"
CTDB_MONITOR_FREE_MEMORY=100
CTDB_START_AS_DISABLED="yes"
CTDB_CAPABILITY_RECMASTER=yes
CTDB_CAPABILITY_LMASTER=yes
NATGW_PUBLIC_IP=
NATGW_PUBLIC_IFACE=
NATGW_DEFAULT_GATEWAY=
NATGW_PRIVATE_IFACE=
NATGW_PRIVATE_NETWORK=
NATGW_NODES=/etc/ctdb/natgw_nodes
CTDB_LOGFILE=/var/log/ctdb/log.ctdb
CTDB_DEBUGLEVEL=2
CTDB_OPTIONS=
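
Our understanding is that CTDB_RECOVERY_LOCK must point at a single file on
the shared ocfs2 mount, visible under the same path on both nodes. The only
checks we've tried so far (nothing fancy) are:

# what does the shared mount actually contain on each node?
ls -l /cluster
# and, once ctdbd stays up, node and public-IP state:
ctdb status
ctdb ip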

The ocfs2 stuff seems OK:
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw,relatime)
/dev/drbd1 on /cluster type ocfs2
(rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,coherency=full,user_xattr,acl)

Logs from starting ctdb:
node 1
2014/07/08 11:06:38.417962 [ 4034]: CTDB starting on node
2014/07/08 11:06:38.469001 [ 4035]: Starting CTDBD (Version 2.3) as PID:
4035
2014/07/08 11:06:38.480647 [ 4035]: Created PID
file /var/run/ctdb/ctdbd.pid
2014/07/08 11:06:38.482263 [ 4035]: Set scheduler to SCHED_FIFO
2014/07/08 11:06:38.483114 [ 4035]: Set runstate to INIT (1)
2014/07/08 11:06:39.066264 [ 4035]: 00.ctdb: WARNING: Cannot check
databases since neither
2014/07/08 11:06:39.067005 [ 4035]: 00.ctdb: 'tdbdump' nor 'tdbtool
check' is available.
2014/07/08 11:06:39.067080 [ 4035]: 00.ctdb: Consider installing
tdbtool or at least tdbdump!
2014/07/08 11:06:39.212555 [ 4035]: Freeze priority 1
2014/07/08 11:06:39.250950 [ 4035]: Freeze priority 2
2014/07/08 11:06:39.270343 [ 4035]: Freeze priority 3
2014/07/08 11:06:39.309777 [ 4035]: server/ctdb_takeover.c:3239 Released
0 public IPs
2014/07/08 11:06:39.310120 [ 4035]: Set runstate to SETUP (2)
2014/07/08 11:06:39.858728 [ 4035]: Set runstate to FIRST_RECOVERY (3)
2014/07/08 11:06:39.862470 [ 4035]: Keepalive monitoring has been
started
2014/07/08 11:06:39.862725 [ 4035]: Monitoring has been started
2014/07/08 11:06:39.903084 [recoverd: 4107]: monitor_cluster starting
2014/07/08 11:06:39.944166 [recoverd: 4107]: server/ctdb_recoverd.c:3483
Initial recovery master set - forcing election
2014/07/08 11:06:39.946208 [ 4035]: Freeze priority 1
2014/07/08 11:06:39.948072 [ 4035]: Freeze priority 2
2014/07/08 11:06:39.949359 [ 4035]: Freeze priority 3
2014/07/08 11:06:39.953538 [ 4035]: This node (0) is now the recovery
master
2014/07/08 11:06:40.864448 [ 4035]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:06:41.867073 [ 4035]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:06:42.870148 [ 4035]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:06:42.964212 [recoverd: 4107]: server/ctdb_recoverd.c:1061
Election timed out
2014/07/08 11:06:42.975829 [recoverd: 4107]: The interfaces status has
changed on local node 0 - force takeover run
2014/07/08 11:06:42.983077 [recoverd: 4107]: Trigger takeoverrun
2014/07/08 11:06:42.986293 [recoverd: 4107]: Node:0 was in recovery
mode. Start recovery process
2014/07/08 11:06:42.987372 [recoverd: 4107]: server/ctdb_recoverd.c:1601
Starting do_recovery
2014/07/08 11:06:42.988486 [recoverd: 4107]: Taking out recovery lock
from recovery daemon
2014/07/08 11:06:42.989591 [recoverd: 4107]: Take the recovery lock
2014/07/08 11:06:42.991088 [recoverd: 4107]: ctdb_recovery_lock: Unable
to open /cluster/ctbd/lockfile - (No such file or directory)
2014/07/08 11:06:42.992301 [recoverd: 4107]: Unable to get recovery lock
- aborting recovery and ban ourself for 300 seconds
2014/07/08 11:06:42.994374 [recoverd: 4107]: Banning node 0 for 300
seconds
2014/07/08 11:06:42.995048 [ 4035]: Banning this node for 300 seconds
2014/07/08 11:06:42.995122 [ 4035]: This node has been banned - forcing
freeze and recovery
2014/07/08 11:06:42.995224 [ 4035]: server/ctdb_takeover.c:3239 Released
0 public IPs
2014/07/08 11:06:43.872389 [ 4035]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:06:44.873993 [ 4035]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:06:45.875147 [ 4035]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:06:46.876290 [ 4035]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:06:47.877996 [ 4035]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:06:48.880075 [ 4035]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:06:49.184701 [recoverd: 4107]: Daemon has exited -
shutting down client
2014/07/08 11:06:49.196972 [recoverd: 4107]: CTDB recoverd: shutting
down

node 2
2014/07/08 11:04:45.733986 [ 6067]: CTDB starting on node
2014/07/08 11:04:45.797095 [ 6068]: Starting CTDBD (Version 2.3) as PID:
6068
2014/07/08 11:04:45.814858 [ 6068]: Created PID
file /var/run/ctdb/ctdbd.pid
2014/07/08 11:04:45.818567 [ 6068]: Set scheduler to SCHED_FIFO
2014/07/08 11:04:45.819687 [ 6068]: Set runstate to INIT (1)
2014/07/08 11:04:46.446776 [ 6068]: 00.ctdb: WARNING: Cannot check
databases since neither
2014/07/08 11:04:46.447162 [ 6068]: 00.ctdb: 'tdbdump' nor 'tdbtool
check' is available.
2014/07/08 11:04:46.447231 [ 6068]: 00.ctdb: Consider installing
tdbtool or at least tdbdump!
2014/07/08 11:04:46.599269 [ 6068]: Freeze priority 1
2014/07/08 11:04:46.654344 [ 6068]: Freeze priority 2
2014/07/08 11:04:46.683954 [ 6068]: Freeze priority 3
2014/07/08 11:04:46.721631 [ 6068]: server/ctdb_takeover.c:3239 Released
0 public IPs
2014/07/08 11:04:46.721781 [ 6068]: Set runstate to SETUP (2)
2014/07/08 11:04:47.342320 [ 6068]: Set runstate to FIRST_RECOVERY (3)
2014/07/08 11:04:47.346243 [ 6068]: Keepalive monitoring has been
started
2014/07/08 11:04:47.346750 [ 6068]: Monitoring has been started
2014/07/08 11:04:47.376362 [recoverd: 6140]: monitor_cluster starting
2014/07/08 11:04:47.420852 [recoverd: 6140]: server/ctdb_recoverd.c:3483
Initial recovery master set - forcing election
2014/07/08 11:04:47.422705 [ 6068]: Freeze priority 1
2014/07/08 11:04:47.429362 [ 6068]: Freeze priority 2
2014/07/08 11:04:47.430955 [ 6068]: Freeze priority 3
2014/07/08 11:04:47.441080 [ 6068]: This node (1) is now the recovery
master
2014/07/08 11:04:48.349129 [ 6068]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:04:49.351183 [ 6068]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:04:50.353470 [ 6068]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:04:50.447487 [recoverd: 6140]: server/ctdb_recoverd.c:1061
Election timed out
2014/07/08 11:04:50.459419 [recoverd: 6140]: The interfaces status has
changed on local node 1 - force takeover run
2014/07/08 11:04:50.468723 [recoverd: 6140]: Trigger takeoverrun
2014/07/08 11:04:50.471716 [recoverd: 6140]: Node:1 was in recovery
mode. Start recovery process
2014/07/08 11:04:50.473419 [recoverd: 6140]: server/ctdb_recoverd.c:1601
Starting do_recovery
2014/07/08 11:04:50.474646 [recoverd: 6140]: Taking out recovery lock
from recovery daemon
2014/07/08 11:04:50.476860 [recoverd: 6140]: Take the recovery lock
2014/07/08 11:04:50.477977 [recoverd: 6140]: Recovery lock taken
successfully
2014/07/08 11:04:50.488958 [recoverd: 6140]: ctdb_recovery_lock: Got
recovery lock on '/cluster/ctdb/lockfile'
2014/07/08 11:04:50.489972 [recoverd: 6140]: Recovery lock taken
successfully by recovery daemon
2014/07/08 11:04:50.491188 [recoverd: 6140]: server/ctdb_recoverd.c:1626
Recovery initiated due to problem with node 0
2014/07/08 11:04:50.492067 [recoverd: 6140]: server/ctdb_recoverd.c:1651
Recovery - created remote databases
2014/07/08 11:04:50.492657 [recoverd: 6140]: server/ctdb_recoverd.c:1658
Recovery - updated db priority for all databases
2014/07/08 11:04:50.493744 [ 6068]: Freeze priority 1
2014/07/08 11:04:50.494532 [ 6068]: Freeze priority 2
2014/07/08 11:04:50.495896 [ 6068]: Freeze priority 3
2014/07/08 11:04:50.503236 [ 6068]: server/ctdb_recover.c:989
startrecovery eventscript has been invoked
2014/07/08 11:04:50.873447 [recoverd: 6140]: server/ctdb_recoverd.c:1695
Recovery - updated flags
2014/07/08 11:04:50.877767 [recoverd: 6140]: server/ctdb_recoverd.c:1739
started transactions on all nodes
2014/07/08 11:04:50.878428 [recoverd: 6140]: server/ctdb_recoverd.c:1752
Recovery - starting database commits
2014/07/08 11:04:50.879417 [recoverd: 6140]: server/ctdb_recoverd.c:1764
Recovery - committed databases
2014/07/08 11:04:50.892275 [recoverd: 6140]: server/ctdb_recoverd.c:1814
Recovery - updated vnnmap
2014/07/08 11:04:50.906800 [recoverd: 6140]: server/ctdb_recoverd.c:1823
Recovery - updated recmaster
2014/07/08 11:04:50.909624 [recoverd: 6140]: server/ctdb_recoverd.c:1840
Recovery - updated flags
2014/07/08 11:04:50.910410 [ 6068]: server/ctdb_recover.c:612 Recovery
mode set to NORMAL
2014/07/08 11:04:50.910595 [ 6068]: Thawing priority 1
2014/07/08 11:04:50.910653 [ 6068]: Release freeze handler for prio 1
2014/07/08 11:04:50.911029 [ 6068]: Thawing priority 2
2014/07/08 11:04:50.911106 [ 6068]: Release freeze handler for prio 2
2014/07/08 11:04:50.911290 [ 6068]: Thawing priority 3
2014/07/08 11:04:50.911362 [ 6068]: Release freeze handler for prio 3
2014/07/08 11:04:50.929367 [recoverd: 6140]: server/ctdb_recoverd.c:1849
Recovery - disabled recovery mode
2014/07/08 11:04:50.937270 [recoverd: 6140]: Failed to find node to
cover ip 192.168.1.81
2014/07/08 11:04:50.938668 [recoverd: 6140]: Failed to find node to
cover ip 192.168.1.80
2014/07/08 11:04:50.946251 [recoverd: 6140]: Disabling ip check for 9
seconds
2014/07/08 11:04:51.275565 [ 6068]: Recovery has finished
2014/07/08 11:04:51.355096 [ 6068]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:04:51.355531 [ 6068]: server/ctdb_monitor.c:262 wait for
pending recoveries to end. Wait one more second.
2014/07/08 11:04:51.592071 [ 6068]: Set runstate to STARTUP (4)
2014/07/08 11:04:51.593962 [recoverd: 6140]: server/ctdb_recoverd.c:1873
Recovery - finished the recovered event
2014/07/08 11:04:51.595145 [recoverd: 6140]: server/ctdb_recoverd.c:1879
Recovery complete
2014/07/08 11:04:51.595199 [recoverd: 6140]: Resetting ban count to 0
for all nodes
2014/07/08 11:04:51.595224 [recoverd: 6140]: Just finished a recovery.
New recoveries will now be supressed for the rerecovery timeout (10
seconds)
2014/07/08 11:04:52.356771 [ 6068]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:04:52.357135 [ 6068]: server/ctdb_monitor.c:262 wait for
pending recoveries to end. Wait one more second.
2014/07/08 11:04:53.357738 [ 6068]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:04:53.358060 [ 6068]: server/ctdb_monitor.c:262 wait for
pending recoveries to end. Wait one more second.
2014/07/08 11:04:54.359481 [ 6068]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:04:54.359998 [ 6068]: server/ctdb_monitor.c:262 wait for
pending recoveries to end. Wait one more second.
2014/07/08 11:04:55.361130 [ 6068]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:04:55.361652 [ 6068]: server/ctdb_monitor.c:262 wait for
pending recoveries to end. Wait one more second.
2014/07/08 11:04:56.363109 [ 6068]: CTDB_WAIT_UNTIL_RECOVERED
2014/07/08 11:04:56.363409 [ 6068]: server/ctdb_monitor.c:262 wait for
pending recoveries to end. Wait one more second.
2014/07/08 11:04:56.609997 [recoverd: 6140]: Daemon has exited -
shutting down client
2014/07/08 11:04:56.614990 [recoverd: 6140]: CTDB recoverd: shutting
down

AppArmor and firewalls are out of the picture: neither is enabled.

192.168.1.80 and 192.168.1.81 are the 'out to the LAN' addresses (on eth0);
192.168.0.10 and 192.168.0.11 are on the drbd crossover link.

We're not sure what ctdb actually needs in public_addresses and what it
needs in nodes.

Any ideas on where to start sorting this out are most welcome.
Thanks folks,
Steve

