[Bug 727997] New: fuser -n tcp <socket> fails on open network tcp socket
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c0

Summary: fuser -n tcp <socket> fails on open network tcp socket
Classification: openSUSE
Product: openSUSE 12.1
Version: Beta 1
Platform: VMWare
OS/Version: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Other
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: d.a.van.delft@gmail.com
QAContact: qa@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; nl; rv:1.9.2.23) Gecko/20110920 SUSE/3.6.23-0.2.1 Firefox/3.6.23

Doesn't happen always. Sometimes after the system has been up for a while, fuser stops working correctly:

srv096:~ # netstat -l -n -t -p | grep 7002
tcp        0      0 0.0.0.0:7002    0.0.0.0:*    LISTEN    2026/java
srv096:~ # /bin/fuser -n tcp 7002
srv096:~ # /bin/fuser 7002/tcp
srv096:~ # /usr/bin/lsof -i tcp:7002
COMMAND  PID USER       FD   TYPE DEVICE SIZE/OFF NODE NAME
java    2026 acceptatie 71u  IPv4  15182      0t0  TCP *:afs3-prserver (LISTEN)

Both netstat and lsof show the 7002 socket present, but fuser returns none, and, not shown here, with exit status 1.

Reproducible: Sometimes

Steps to Reproduce:
1.
2.
3.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c
Andreas Jaeger
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c1
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c2
--- Comment #2 from Danny van Delft
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c3
--- Comment #3 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c4
--- Comment #4 from Danny van Delft
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c5
Andreas Jaeger
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c6
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c7
--- Comment #7 from Danny van Delft
Danny, please remember to click on the "This comment provides the needed information." when providing information.
I'd do that, if I would know where to click ;-) AFAICS, I don't have that option.
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c8
Andreas Jaeger
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c9
--- Comment #9 from Danny van Delft
(In reply to comment #5)
Danny, please remember to click on the "This comment provides the needed information." when providing information.
I'd do that, if I would know where to click ;-) AFAICS, I don't have that option.
Ignore my reply, I just saw that the option is only present when NEEDINFO is on.
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c10
--- Comment #10 from Danny van Delft
Hmmm ... the problem is: I'm not able to reproduce this, and reproducing it is the requirement for debugging and fixing it.
Do you also have an strace of a run *which* works? Please specify the options
-s 1024 -f
to strace so I'm able to read the full strings used to open and read the files.
Also, an strace output of a fuser which does *not* work would be perfect for comparison (with the same options, of course).
Can do and will do. However, to get fuser to work I'd probably need to reboot the machine, after which it works again for an unknown duration, so I could not provide you with other info from the failing state after the reboot. So, before I do the reboot, is there any other info from the non-working situation, besides a fuser strace, that could be helpful? Afterward it might take a while before it fails again.
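Once both traces exist, one way to do the comparison asked for above is to list the paths that only one of the two runs touched. What follows is a minimal Python sketch, not part of the bug report: it assumes the logs were written with strace's -o option, the function names are hypothetical, and the regular expression only looks at the first quoted string argument of each call.

```python
import re
import sys

# First double-quoted argument of a syscall line,
# e.g. 2101 open("/proc/net/tcp", O_RDONLY) = 3
FIRST_PATH = re.compile(r'\b\w+\("((?:[^"\\]|\\.)*)"')


def paths_touched(lines):
    """Collect the first string argument of every traced call."""
    paths = set()
    for line in lines:
        match = FIRST_PATH.search(line)
        if match:
            paths.add(match.group(1))
    return paths


def only_in(first_log_lines, second_log_lines):
    """Paths seen in the first log but never in the second, sorted."""
    return sorted(paths_touched(first_log_lines) - paths_touched(second_log_lines))


def read_lines(path):
    """Read a log file, tolerating odd bytes in the traced strings."""
    with open(path, errors="replace") as handle:
        return handle.readlines()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare.py working.txt failing.txt
    working = read_lines(sys.argv[1])
    failing = read_lines(sys.argv[2])
    for path in only_in(working, failing):
        print("only in working run:", path)
    for path in only_in(failing, working):
        print("only in failing run:", path)
```

Paths that differ only by PID (for example under /proc/<pid>/fd) will show up as differences, so the output is a starting point for reading the traces, not a verdict.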
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c11
--- Comment #11 from Andreas Jaeger
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c12
--- Comment #12 from Andreas Jaeger
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c13
--- Comment #13 from Danny van Delft
MMh, I see a difference - for you it fails as root. So, you might want to ignore my comments.
Yep, but it fails for non-root too. Normally fuser only reports on the user-owned resources, only root can see all. To make sure this wasn't a permissions problem, I ran the fuser as root as well, and it still fails.
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c14
--- Comment #14 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c15
--- Comment #15 from Danny van Delft
Hmmm ... what does `some duration' mean? Could it be that after `some duration' processes become swapped out, including their inode information, which fuser uses to compare open inodes with the inodes found in /proc/net/tcp?
Smells like a new kernel 3.1 ``feature''
Some duration: the last time (= now) a couple of days since boot. The first and only other time I noticed, it was hours since boot. The first time I didn't pay much attention to it though, as all sorts of other networking problems were present as well. Some of them still are. Who knows, they may even be related. The system involved has 3G of mem, swap usage at the moment is 10M, I can connect and communicate with the server at 7002, but still fuser fails. Seems unlikely to me that swapping is the cause of fuser failing. Just for "fun" I killed the server at 7002 and restarted it; still no joy from fuser.
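For readers following along, the comparison mentioned above (socket inodes listed in /proc/net/tcp against the file descriptors of each process) can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not fuser's actual code: the function names are hypothetical, it matches on the "socket:[inode]" link text of /proc/<pid>/fd/* where the thread speaks of an inode/device comparison, and it only reads the IPv4 table.

```python
import os
import re

SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def inodes_for_port(proc_net_tcp_text, port):
    """Return the socket inodes whose local port equals `port`."""
    inodes = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 10:
            continue
        # local_address is HEXIP:HEXPORT; the inode is the tenth column
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port:
            inodes.add(int(fields[9]))
    return inodes


def pids_using(inodes, fd_links):
    """fd_links maps pid -> list of link targets found in /proc/<pid>/fd."""
    pids = set()
    for pid, targets in fd_links.items():
        for target in targets:
            match = SOCKET_LINK.match(target)
            if match and int(match.group(1)) in inodes:
                pids.add(pid)
    return pids


def read_fd_links(proc_root="/proc"):
    """Collect fd link targets per PID; unreadable entries are skipped."""
    links = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            names = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        targets = []
        for name in names:
            try:
                targets.append(os.readlink(os.path.join(fd_dir, name)))
            except OSError:
                pass
        links[int(entry)] = targets
    return links
```

On a live system, something like pids_using(inodes_for_port(open("/proc/net/tcp").read(), 7002), read_fd_links()) run as root would give a second opinion next to netstat and lsof when fuser returns nothing.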
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c16
--- Comment #16 from Danny van Delft
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c17
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c18
--- Comment #18 from Danny van Delft
Hmm ... just a few questions about your setup: I'd like to know which system is used for domain 0 of your VMware setup. Also I'd like to know which kernels are used for domain 0 and for the client hosted on it. Then the network setup of both the domain 0 and the client would be perfect.
The at-the-moment-problematic-machine runs under VMware VSphere 4.0. The first time I noticed this problem though, was on an openSUSE 11.3 host with 2.6.34.8-0.2-default kernel running VMWare server Version 2.0.2. I have not seen the problem again on the 11.3 host though, but it got updated to RC2 in the meantime so comparison may be invalid. What would you want to have as the network setup, something from /etc/sysconfig? I'll reboot the problematic machine shortly and supply the strace of a (hopefully) successful fuser.
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c19
--- Comment #19 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c20
--- Comment #20 from Danny van Delft
... then this is more of a VMware-specific bug, isn't it? Network setup means: are there any bridges or other uncommon devices on domain 0 and/or the client?
The client has no bridges or anything like that, just a static IP at eth0, default gw etc. The 11.3 client eth0 device is bridged to a 11.3 network interface, which never gave me any problems. For VMWare VSphere I don't know at the moment, as I don't have console access to it. I have made various VMs using openSUSE 11.1 to 12.1, Redhat ES 3, 4, 5 etc. Only openSUSE 12.1 shows the (sometimes) failing fuser symptoms, so as far as I can see it is not related to VMWare. But, as I don't have a spare hardware box around to install 12.1 on, I can't rule it out either. In the meantime, I've rebooted the VM, and now fuser works again. I'll attach the strace previously asked for, in the hope this may shed some light...
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c21
--- Comment #21 from Danny van Delft
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c22
--- Comment #22 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c23
--- Comment #23 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c24
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c25
--- Comment #25 from Danny van Delft
Please test both binaries, attachment #460937 and attachment #460939, but be aware that those binaries work differently.
I'd like to know how both behave, that is, whether one or both fail on your virtual system in the same way as the original one.
Will do, the moment it fails again. Since the reboot the day before yesterday it's still working OK. There is perhaps one relevant change: I've now rebooted with sysvinit, while previously it was systemd.
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c26
--- Comment #26 from Dr. Werner Fink
grep -E 'E(ACCES|PERM)' ~/Downloads/bug-727997/*
~/Downloads/bug-727997/fuser-fail-strace.txt:31399 stat("/data/sites/acceptatie/home/acceptatie/.gvfs", 0x609544) = -1 EACCES (Permission denied)
At first glance this should not matter. The only difference seems to be an NFS(?) share.
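The same search for permission errors can be done per system call, which keeps the PID and the arguments next to the errno. A minimal Python sketch, assuming strace lines of the form shown in the grep output above (optional PID prefix from -f, then call, arguments, "= -1 ERRNO"); the function name is hypothetical and unfinished or resumed calls are simply ignored.

```python
import re

# Matches e.g.
#   31399 stat("/some/path", 0x609544) = -1 EACCES (Permission denied)
FAILED_CALL = re.compile(
    r"^(?:(?P<pid>\d+)\s+)?"        # optional PID column written by strace -f
    r"(?P<call>\w+)\((?P<args>.*)\)"
    r"\s+=\s+-1\s+(?P<errno>E[A-Z0-9]+)"
)


def failed_calls(lines, errnos=("EACCES", "EPERM")):
    """Return (pid, call, errno, args) for calls that failed with `errnos`."""
    hits = []
    for line in lines:
        match = FAILED_CALL.match(line.strip())
        if match and match.group("errno") in errnos:
            hits.append(
                (
                    match.group("pid"),
                    match.group("call"),
                    match.group("errno"),
                    match.group("args"),
                )
            )
    return hits
```

Passing a wider tuple, for example errnos=("EACCES", "EPERM", "ENOENT"), would also list missing files, which may be worth a look when comparing a failing trace with a working one.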
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c27
--- Comment #27 from Danny van Delft
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c28
--- Comment #28 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c29
--- Comment #29 from Danny van Delft
Then I'd like to know *why* root is not allowed to do a stat(2) system call on this .gvfs file.
It seems to be normal behaviour, just google around. I also get these kinds of access errors while making a backup as root, on all kinds of machines with this .gvfs present. The associated program is /usr/lib/gvfs/gvfs-fuse-daemon.
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c30
--- Comment #30 from Danny van Delft
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c31
Danny van Delft
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c32
Dr. Werner Fink
After changing init from systemd, where all above occurred, to sysV, the machine has now been up for 6+ days without failing fuser. So perhaps it is systemd related.
Now the question arises: what happens with the VMWare client system if systemd instead of SysVinit is used ... why is fuser no longer able to do a simple inode/device comparison?
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c33
Frederic Crozat
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c34
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c35
Danny van Delft
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c36
mian hou
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c37
--- Comment #37 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c38
--- Comment #38 from mian hou
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c39
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=727997
https://bugzilla.novell.com/show_bug.cgi?id=727997#c40
Dr. Werner Fink