[Bug 847136] New: LK(8.1.2) Cluster node hang on Unmounting file systems during node reboot.
https://bugzilla.novell.com/show_bug.cgi?id=847136
https://bugzilla.novell.com/show_bug.cgi?id=847136#c0

Summary: LK(8.1.2) Cluster node hang on Unmounting file systems during node reboot.
Classification: openSUSE
Product: openSUSE 11.4
Version: Factory
Platform: x86-64
OS/Version: openSUSE 11.2
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: hongc@netapp.com
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.69 Safari/537.36

When trying to reboot one of the cluster nodes, the node hangs at "Unmounting file systems" and fails to shut down completely.

Editing the /etc/init.d/boot.d/K04boot.localfs file to debug showed that the host hangs at the following section:

    umount -rv $mtab -t no${tmpfs//,/,no},$nofs -O no_netdev $ulist || {
        rc_status
        UMOUNT_FAILED=true
    }
    rc_status -v1 -r

where
    $mtab=<empty>
    $tmpfs=tmpfs,ramfs,hugetlbfs,mqueue,usbfs
    $nofs=nonfs,nonfs3,nosmbfs,nocifs,noafs,noncpfs,nonsysfs,noproc,nocgroup,nocpuse...,nodevtmpfs,nodebugfs,nosecuritfyfs,nodevpts,nopstore,nofuse,nofusectl,nobinfmt_...sc,norpc_pipefs,nonfsd
    $ulist=/home/smashmnt05 /home/smashmnt03 /home/smashmnt01

Reproducible: Always

Steps to Reproduce:
2 - SLES11.2 with Steeleye LifeKeeper Cluster setup (2-node cluster)
    Kernel - 3.0.13-0.27-default
    LifeKeeper - 8.1.2
    - Host1(ictm-hog) with Dell SAS 7e
        Firmware 07.15.08.00
        Driver 09.100.00.00 (Inbox)
        BIOS 7.11.10.00
    - Host2(ictm-pig) with LSI 9300-8e
        Firmware 01.250.08.00
        Driver 3.00.00.00 (Out-of-Box)
        BIOS 8.05.00.00
2 - Storage Arrays
    - Snowmass (DELL MD36xxf)
        Firmware 98.10.00.61
        NVSRAM N26X0-810890-000
    - Snowmass (DELL MD36xxf)
        Firmware 07.84.51.60
        NVSRAM N26X0-784890-004
- Setup is direct attach: each node has 2 of the same adapter, connected directly to each storage array.
- Created a cluster host group and created 2 hosts in the cluster host group.
- Created 6 volumes from each storage array and mapped them to the host group.
- Installed and set up Steeleye LifeKeeper Cluster on the two hosts, using 2-node setup.
- Mounted the NFS file systems from the cluster on a separate host.
- Started I/O on the separate host.

Actual Results:
Node hangs during shutdown.

Expected Results:
Node shuts down completely and comes back online.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
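
Note on the -t argument in the boot.localfs excerpt: it is built with bash pattern substitution, which can be hard to read at a glance. The sketch below only prints the resulting type list so it can be inspected; it does not unmount anything. The $tmpfs value is copied from the report, while the $nofs value here is a shortened subset chosen for illustration, because the report truncates some of its entries.

```bash
#!/bin/bash
# $tmpfs as given in the report.
tmpfs="tmpfs,ramfs,hugetlbfs,mqueue,usbfs"
# Shortened, illustrative subset of $nofs (the full list in the report
# contains truncated entries, which are not guessed at here).
nofs="nonfs,nonfs3,nosmbfs,nocifs"

# ${tmpfs//,/,no} replaces every "," with ",no"; the literal "no" in
# front covers the first entry, so each type ends up prefixed with "no".
types="no${tmpfs//,/,no},$nofs"

echo "$types"
```

With these inputs the list starts with notmpfs,noramfs,... and continues with the $nofs entries, so every type in it carries a "no" prefix. Since $nofs includes nonfs, NFS-type mounts appear to be excluded from this particular umount call.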
https://bugzilla.novell.com/show_bug.cgi?id=847136
https://bugzilla.novell.com/show_bug.cgi?id=847136#c1

--- Comment #1 from Hong Chung <hongc@netapp.com> 2013-10-23 16:48:14 UTC ---
I did some more troubleshooting on our end. The issue seems to be that the separate host, which mounts the cluster file systems to run I/O, defaults to NFS version 4. After changing the mounts to NFSv3, the node reboots fine and does not run into this unmounting issue.
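
For anyone trying the NFSv3 workaround from comment #1, a minimal sketch of pinning the protocol version on the client follows. The comment does not give the exact mount command that was used, so everything here is an assumption: the server name cluster-vip and the mount point /mnt/smashmnt01 are hypothetical, and the export path is borrowed from $ulist in the original report on the guess that those file systems are the exported ones. The script only builds and prints the command (a dry run); it does not mount anything.

```bash
#!/bin/bash
# Hypothetical values: the report does not state the cluster address,
# the exported path, or the client mount point.
server="cluster-vip"
export_path="/home/smashmnt01"
mount_point="/mnt/smashmnt01"

# Build a client-side mount command with NFS pinned to version 3
# through the vers=3 mount option.
nfs3_mount_cmd() {
    printf 'mount -t nfs -o vers=3 %s:%s %s\n' "$1" "$2" "$3"
}

cmd=$(nfs3_mount_cmd "$server" "$export_path" "$mount_point")
# Printed for review only; run it as root on the client if it looks right.
echo "$cmd"
```

After mounting, running nfsstat -m on the client should list the options in effect for each NFS mount, which is one way to confirm that version 3 was actually negotiated.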