[Bug 673917] New: machine crashes when using samba/cifs
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c0

           Summary: machine crashes when using samba/cifs
    Classification: openSUSE
           Product: openSUSE 11.3
           Version: Final
          Platform: i386
        OS/Version: openSUSE 11.3
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Samba
        AssignedTo: samba-maintainers@SuSE.de
        ReportedBy: bugzilla@singvogel.com
         QAContact: samba-maintainers@SuSE.de
          Found By: ---
           Blocker: ---

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; de-DE; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 ( .NET CLR 3.5.30729; .NET4.0E)

I have a headless machine housed at a service provider (= no access to the console). The contract also includes a small amount of space on a backup server. The backup server can be accessed via a Samba share, so I'm using cifs to do my backup: a bigfile is generated on the backup server. I mount the backup server on my machine via mount.cifs. I mount the bigfile on the local machine via a loop device as ext2. I back up to this loop device via rsync (on the local machine).

There was a security update of package "samba-client-3.5.4-5.3.1" on Feb. 10th. Since then I have noticed 4 machine crashes: Feb 11th, 16th, 20th, 22nd.

I changed the time of the backup on Feb 17th, and the crashes on Feb. 20th and 22nd again happened shortly after the backup started.

I cannot provide more information, as I don't have any access to the console. My logfile (/var/log/messages) only shows this information:

Feb 22 04:50:01 ipx20178 /usr/sbin/cron[12167]: (root) CMD (/root/bin/rsync_backup.sh)
Feb 22 04:50:22 nest kernel: [172609.017730] CIFS VFS: server 192.168.10.91 of type Samba 3.2.5 returned unexpected error on SMB posix open, disabling posix open support. Check if server update available.

The script contains one sleep of 20 seconds and one of 10 seconds during startup. It seems the crash happens between:

    mount -t cifs -o username=$username,password=$password,rw \
          //$backupserver/$username $dst_dir
    sleep 20
    /sbin/losetup ${loop} ${file}

Needless to say, it doesn't happen deterministically.

Reproducible: Sometimes

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
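For readers following the report, the backup sequence described above could be sketched roughly as below. This is not the reporter's actual rsync_backup.sh. The function names, the DRY_RUN switch, the rsync options and the loop mount point are assumptions made for illustration, and the steps after losetup are inferred from the prose description. The sketch checks each command's exit status instead of relying on fixed sleeps, which is one possible design and not something the report confirms.

```shell
#!/bin/sh
# Sketch only: variable names mirror the report; everything else is hypothetical.

# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# backup_steps SERVER USER DST_DIR LOOP FILE LOOP_MNT SRC
# Expects $password in the environment, as in the reported mount command.
backup_steps() {
    backupserver=$1
    username=$2
    dst_dir=$3
    loop=$4
    file=$5
    loop_mnt=$6   # hypothetical mount point for the ext2 image
    src=$7        # hypothetical source directory for rsync

    # Stop at the first failing step rather than sleeping and hoping.
    run mount -t cifs -o "username=$username,password=$password,rw" \
        "//$backupserver/$username" "$dst_dir" || return 1
    run /sbin/losetup "$loop" "$file" || return 1
    run mount -t ext2 "$loop" "$loop_mnt" || return 1
    run rsync -a "$src" "$loop_mnt/" || return 1

    # Tear down in reverse order.
    run umount "$loop_mnt"
    run /sbin/losetup -d "$loop"
    run umount "$dst_dir"
}

# Example (prints the commands without touching the system):
# DRY_RUN=1 password=secret backup_steps backupserver user /mnt/backup \
#     /dev/loop0 /mnt/backup/bigfile /mnt/loop /home/
```

Running it with DRY_RUN=1 first shows the exact commands and their order without mounting anything.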
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c1
David Disseldorp
> So I'm using cifs to do my backup: a bigfile at the backup-server is generated at backup-server. I mount the backup-server to my machine via mount.cifs. I mount the bigfile on local machine via loop-device as ext2. I do backup to this loop-device via rsync (on local-machine).

So to clarify, your backup is writing to bigfile? This backup method appears cumbersome at best. Is it possible that bigfile is being written to or read from by any other local or remote process, i.e. could more than one CIFS client be connected to the share?

> There was a security update of package "samba-client-3.5.4-5.3.1" on Feb. 10th. Since this time I noticed 4 machine crashes: Feb 11th, 16th, 20th, 22nd

What do you mean by machine crash? Does the kernel panic? Please provide a core dump: http://opensuse-man-ja.berlios.de/opensuse-html/cha.tuning.kexec.html#cha.tu...

> I cannot provide more information, as I don't have any access to the console.

You do not have access to the server console. In this case we are concerned with client information.

> My logfile (/var/log/messages) only shows this information:
>
> Feb 22 04:50:01 ipx20178 /usr/sbin/cron[12167]: (root) CMD (/root/bin/rsync_backup.sh)
> Feb 22 04:50:22 nest kernel: [172609.017730] CIFS VFS: server 192.168.10.91 of type Samba 3.2.5 returned unexpected error on SMB posix open, disabling posix open support. Check if server update available.

Suresh [cc'ed] has looked into cases where broken_posix_open has caused problems in the past and may be able to comment. https://patchwork.kernel.org/patch/63724/
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c2
Klaus Singvogel
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c3
--- Comment #3 from David Disseldorp
Answers: ..
> Yes, "crash" means kernel panic. I see the reboot in logfile /var/log/boot.msg. The core dump feature is enabled now. I'll provide data on the next crash. What should I provide? Dump file or output of debugger?

Just the dump file should be sufficient.

> I wonder about the fact that mount.cifs is part of the "cifs-utils" package, but only "samba-client" was updated on Feb. 10th, not "cifs-utils". Hopefully there are no links (i.e. library calls) from one package to the other which caused the crashes.

The Linux CIFS client is made up of userspace and kernel components (cifs.ko).
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c4
Klaus Singvogel
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c5
Klaus Singvogel
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c6
David Disseldorp
> Last night another crash happened. Again there is no dumpfile in /var/crash/.

Do you have enough free space for the dump? You can test the crash dump manually with "echo c >/proc/sysrq-trigger". Does this produce a dumpfile?
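Before triggering a test crash it may be worth confirming that a capture kernel is actually loaded, because "echo c >/proc/sysrq-trigger" deliberately crashes the running kernel. A minimal sketch follows, assuming the kernel exposes /sys/kernel/kexec_crash_loaded (it reads "1" when a crash kernel is loaded). The function name kdump_ready and its output strings are made up for this example.

```shell
#!/bin/sh
# kdump_ready [FILE]: report whether a crash (capture) kernel is loaded.
# FILE defaults to /sys/kernel/kexec_crash_loaded; the optional argument
# exists only so the check can be exercised against an ordinary file.
kdump_ready() {
    loaded_file=${1:-/sys/kernel/kexec_crash_loaded}
    if [ ! -r "$loaded_file" ]; then
        echo "unknown"        # kernel does not expose the flag
        return 2
    fi
    if [ "$(cat "$loaded_file")" = "1" ]; then
        echo "loaded"         # a test crash should be captured
        return 0
    fi
    echo "not-loaded"         # a test crash would just panic the machine
    return 1
}

# Example: only consider the sysrq test when this prints "loaded".
# kdump_ready
```

If the function reports anything other than "loaded", a manual crash test is unlikely to leave a dump file behind.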
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c7
--- Comment #7 from Klaus Singvogel
> Do you have enough free space for the dump?

Yes, I think so:

# df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       7.9G  4.9G  2.6G  66% /
# free
             total       used       free     shared    buffers     cached
Mem:        445992     355428      90564          0      40844      74396
-/+ buffers/cache:     240188     205804
Swap:      1606496       1316    1605180

> You can test the crash dump manually with "echo c >/proc/sysrq-trigger", does
> this produce a dumpfile?

Does this reboot the machine? It's a headless machine, and a reboot triggered by a technician costs me money (around 30 USD).

PS: Needless to say, a crash happened last night, with no dumpfile. Trying to reproduce it during the day (>10 tries) does not cause a crash.
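The free-space question can be checked with simple arithmetic: in the worst case an uncompressed full dump is about the size of RAM, so the available space should exceed the memory total. The sketch below uses the figures quoted in this comment. The helper name dump_fits is hypothetical, and the rule of thumb is an assumption, not a statement about how kdump is configured on this machine.

```shell
#!/bin/sh
# dump_fits MEM_KB AVAIL_KB: compare RAM size with available disk space (both in kB).
dump_fits() {
    mem_kb=$1
    avail_kb=$2
    if [ "$avail_kb" -gt "$mem_kb" ]; then
        echo "fits"
    else
        echo "too-small"
    fi
}

# Figures from the comment: 445992 kB RAM, 2.6G available on /.
ram_kb=445992
free_kb=$((26 * 1024 * 1024 / 10))   # 2.6 GiB expressed in kB = 2726297
dump_fits "$ram_kb" "$free_kb"       # prints "fits"

# On a live system the inputs could be read like this (not run here):
# ram_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
# free_kb=$(df -Pk /var/crash | awk 'NR==2 {print $4}')
```

With roughly 436 MB of RAM against 2.6G available, disk space alone does not look like the reason for the missing dump file.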
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c8
--- Comment #8 from Suresh Jayaraman
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c9
Klaus Singvogel
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c10
--- Comment #10 from Klaus Singvogel
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c11
Suresh Jayaraman
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c12
Klaus Singvogel
> And what makes you think that this crash has something to do with CIFS?

The crashes happen during the backup, which uses cifs. I changed the crontab time of the backup from 3:40 a.m. to 5:40 a.m., and the next crash happened at 5:41 a.m. (or so). Then I changed it to 6:40 a.m., and the crash happened at 6:41 a.m. (or so). I also saw in the backup logfiles that it never ran to completion. It all began shortly after the update to samba-client-3.5.4-5.3.1.

> can you try to go back to older version of samba-client and see whether the issue is occurring?

I've now installed samba-client-3.5.4-5.1.2.i586.rpm and disabled automatic updates via zypper.
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c13
--- Comment #13 from Klaus Singvogel
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c14
--- Comment #14 from Klaus Singvogel
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c15
--- Comment #15 from Suresh Jayaraman
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c
Suresh Jayaraman
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c16
Klaus Singvogel
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c17
--- Comment #17 from Klaus Singvogel
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c18
Suresh Jayaraman
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c19
Klaus Singvogel
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c20
Suresh Jayaraman
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c21
--- Comment #21 from Klaus Singvogel
https://bugzilla.novell.com/show_bug.cgi?id=673917
https://bugzilla.novell.com/show_bug.cgi?id=673917#c22
David Disseldorp