[Bug 627972] New: getcwd(2) returns bogus path
http://bugzilla.novell.com/show_bug.cgi?id=627972

http://bugzilla.novell.com/show_bug.cgi?id=627972#c0

           Summary: getcwd(2) returns bogus path
    Classification: openSUSE
           Product: openSUSE 11.1
           Version: Final
          Platform: x86-64
        OS/Version: Other
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Kernel
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: koenig@linux.de
         QAContact: qa@suse.de
          Found By: ---
           Blocker: ---

recently we updated the kernel from 2.6.27.45-0.1.1 to 2.6.27.48-0.1.1. my $HOME is mounted via autofs.

now, for the 1st time, I notice the following strange problem with some shell scripts: in one (of many open;) xterm and the bash within it, /bin/pwd now returns "koenig/dir" instead of the "/home/koenig/dir" shown by other bash instances which are in the same dir (note esp. the missing leading / in the broken /bin/pwd output!). in other xterm/bash instances this does not happen at all, while it's 100% reproducible in that one bash.

"strace /bin/pwd" shows that the bad string comes directly from getcwd(2):

  good bash: getcwd("/home/koenig/dir", 4096) = 17
  bad bash:  getcwd("koenig/dir", 4096) = 11

a "cd $PWD" or "cd subdir" fixes the problem for this shell, but not for the parent process/shell (same for interactive shells/tests):

  $ bash -c "/bin/pwd"
  koenig/dir
  $ bash -c "cd $PWD ; /bin/pwd"
  /home/koenig/dir
  $ bash -c "cd subdir ; /bin/pwd"
  /home/koenig/dir/subdir
  $ /bin/pwd
  koenig/dir

more facts:

"stat ." shows identical output in shells with good and broken pwd output.

that directory is quite old; it was not removed/recreated/renamed/whatsoever in the last decade.

/home is ext3fs.

this happened twice today in two different directories! in both cases the 1st error was from acroread (it claims "ERROR: Cannot determine current directory."). for the 1st problem I just did "cd $PWD" (thinking about some "rm dir ; mkdir dir" problem); for the 2nd occurrence I started some more testing to see what's going on.

we installed that new kernel on July 29, so it has only been running for 2 office days, with at least 2 instances of this new(?) behaviour.

autofs did not change recently (install date Mar 16 2009).

/proc/self/cwd shows the same broken information for that bash process:

  lrwxrwxrwx 1 koenig s+c 0 Aug 3 17:55 cwd -> koenig/dir

dmesg doesn't show any error or (to me) significant output.

a closer look at the strace output shows more weird differences: the st_ino=... return values for all stat() and fstat() calls differ. surprisingly, the strace of the "bad" /bin/pwd shows the correct st_ino values (compared with "ls -i file" and "stat file" in both a broken and a good bash instance -- all show the same inode numbers!)

any idea what's going on here? any problem in the new kernel, like some weird memory corruption or similar?

I'll run a full fsck on that partition overnight -- just in case....

--
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
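The symptom in the report (a getcwd(2) result with no leading slash, cured for that one shell by "cd $PWD") can be checked for with a few lines of shell. This is only a sketch: the function name cwd_is_absolute and the messages are made up for illustration and are not part of the report or of any package. It relies on the observation from the strace output above that /bin/pwd goes through getcwd(2).

```shell
#!/bin/sh
# Sketch only: cwd_is_absolute is a hypothetical helper name.

# Succeeds when the given path starts with "/", as a sane getcwd() result does.
cwd_is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# In the report, /bin/pwd printed whatever getcwd(2) returned,
# so its output is used here as the "physical" working directory.
physical=$(/bin/pwd 2>/dev/null) || physical=""

if cwd_is_absolute "$physical"; then
    echo "cwd looks sane: $physical"
else
    echo "getcwd() returned a suspicious path: '$physical'" >&2
    # Workaround from the report: re-enter the directory by its absolute
    # name, but only if $PWD itself still looks absolute.
    case "$PWD" in
        /*) cd "$PWD" && echo "re-entered $PWD" ;;
    esac
fi
```

Because the report notes that the cd only repairs the shell that runs it, the snippet would have to be sourced (". ./script") into the affected shell rather than executed as a child process.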
http://bugzilla.novell.com/show_bug.cgi?id=627972
http://bugzilla.novell.com/show_bug.cgi?id=627972#c
Leonardo Chiquitto
http://bugzilla.novell.com/show_bug.cgi?id=627972
http://bugzilla.novell.com/show_bug.cgi?id=627972#c1
--- Comment #1 from Leonardo Chiquitto
http://bugzilla.novell.com/show_bug.cgi?id=627972
http://bugzilla.novell.com/show_bug.cgi?id=627972#c2
Leonardo Chiquitto
http://bugzilla.novell.com/show_bug.cgi?id=627972
http://bugzilla.novell.com/show_bug.cgi?id=627972#c3
Harald Koenig
Harald, considering the previous comment, are you OK with closing it as "won't fix"?
ACK, 11.1 is more or less end-of-life anyway;)

but one comment after reading your interesting information: I did not find any evidence that autofs on my PC got restarted. I've checked /var/log/messages etc. unfortunately I did not keep the output of ps before rebooting, so I'll have to wait for the next time this happens to do more checks (or just cross fingers;)

ticket closed for now... thanks for the info!
http://bugzilla.novell.com/show_bug.cgi?id=627972
http://bugzilla.novell.com/show_bug.cgi?id=627972#c4
Leonardo Chiquitto
http://bugzilla.novell.com/show_bug.cgi?id=627972
http://bugzilla.novell.com/show_bug.cgi?id=627972#c5
--- Comment #5 from Harald Koenig
I did not find any evidence that autofs on my PC got restarted.
FYI a quick update: I just had the chance to restart autofs on a suse 11.1 system (did not want to test this on my PC right now -- one never knows...;-)

restarting autofs on that system shows this msg in syslog with the old PID of automount:

  "umount_autofs_indirect: ask umount returned busy /home"

I find the same message in my own PC's syslog file:

  Aug 3 15:18:09 atuin pm-suspend[29642]: Entering suspend. In case of problems, please check /var/log/pm-suspend.log
  Aug 3 15:18:10 atuin automount[5602]: umount_autofs_indirect: ask umount returned busy /home

so you're totally right: the restart of autofs got triggered by a test of suspend2ram for my PC, and it all was about restarting autofs.

thanks again!
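To check another machine for the same trace of an autofs restart, the syslog can be searched for the message quoted in this comment. A minimal sketch follows; the helper name find_autofs_busy is hypothetical, and /var/log/messages is assumed as the default log file only because that is the file mentioned earlier in this thread.

```shell
#!/bin/sh
# Sketch only: find_autofs_busy is a hypothetical helper name.

# Print every line of the given log file that contains the automount
# message quoted in the comment above (fixed-string match, no regex).
find_autofs_busy() {
    grep -F 'umount_autofs_indirect: ask umount returned busy' "$1"
}

# Log file to search; the default is an assumption taken from this thread.
log=${1:-/var/log/messages}

if [ -r "$log" ]; then
    find_autofs_busy "$log" || echo "no busy-umount message in $log"
else
    echo "cannot read $log" >&2
fi
```

A hit whose timestamp lies shortly before the first bogus getcwd() result would point at an autofs restart, as it did in this case.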
https://bugzilla.novell.com/show_bug.cgi?id=627972
https://bugzilla.novell.com/show_bug.cgi?id=627972#c6
Harald Koenig
Hi Harald, thanks for the bug report. I'm afraid this is a known problem (please see bug #565151).
The summary is:
The problem only happens when AutoFS is restarted. Running processes, with $CWD set to an automounted directory, will get "truncated" results from getcwd().
Here's the status for our supported openSUSE releases:
openSUSE 11.3: fixed.
RUMORS!!!

actually today I did the same suspend/resume test with my desktop PC, now running opensuse 11.3 -- and surprise: I slipped into the same bogus behaviour as last year with opensuse 11.1!!

  atuin > acroread
  ERROR: Cannot determine current directory.
  atuin > pwd
  /home/koenig/dir
  atuin > /bin/pwd ; echo
  koenigdir
  atuin > strace -e getcwd /bin/pwd
  getcwd("koenigdir", 4096) = 9

please (also?!) note the missing slash between my home dir name "koenig" and the subdir name "dir"! the 2.6.27 kernel from opensuse 11.1 at least did still print that slash, which is now missing too ;-)

  atuin > uname -a
  Linux atuin 2.6.34.7-0.7-default #1 SMP 2010-12-13 11:13:53 +0100 x86_64 x86_64 x86_64 GNU/Linux
  atuin > rpm -q autofs
  autofs-5.0.5-7.2.x86_64

Harald -- now offline for a reboot... :-(((
https://bugzilla.novell.com/show_bug.cgi?id=627972
https://bugzilla.novell.com/show_bug.cgi?id=627972#c
Leonardo Chiquitto
https://bugzilla.novell.com/show_bug.cgi?id=627972
https://bugzilla.novell.com/show_bug.cgi?id=627972#c7
Leonardo Chiquitto
https://bugzilla.novell.com/show_bug.cgi?id=627972
https://bugzilla.novell.com/show_bug.cgi?id=627972#c8
Harald Koenig
Please attach /etc/sysconfig/autofs here.
DEFAULT_BROWSE_MODE=no

that's all.... autofs gets all its data via NIS:

  atuin koenig > grep auto /etc/nsswitch.conf
  automount: nis files
  atuin koenig > ypmatch /home auto.master
  auto.home -rw,grpid,hard,intr,nodevs,nosuid
  atuin koenig > ypmatch koenig auto.home
  atuin:/net/atuin/fs1/home/&
https://bugzilla.novell.com/show_bug.cgi?id=627972
https://bugzilla.novell.com/show_bug.cgi?id=627972#c9
Leonardo Chiquitto
Please attach /etc/sysconfig/autofs here.
DEFAULT_BROWSE_MODE=no
that's all.... autofs gets all its data via NIS:
That explains why you're still seeing the problem. You need to add the following line to /etc/sysconfig/autofs:

  USE_MISC_DEVICE="yes"

This is set by default in the sysconfig file shipped with the package, but you removed it for some reason. This means AutoFS is *not* using the misc device (/dev/autofs), the feature that resolves this bug.

Although the original bug is fixed (if you have the option explicitly set to "yes"), your comments have made me realize we still have a bug in our init script: if $USE_MISC_DEVICE is not defined, it should be interpreted as "yes" by default (currently this is not the case and that's why you hit the bug again).

I'll report this in a new bug and fix it in openSUSE Factory.
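The init-script behaviour proposed in this comment (an undefined $USE_MISC_DEVICE counts as "yes") can be expressed with a default in the shell parameter expansion. This is a sketch under assumptions, not the actual openSUSE init script: the function name misc_device_enabled and the echo messages are hypothetical, and what the real script does once the option is enabled is not shown in this thread.

```shell
#!/bin/sh
# Sketch only: not the real init script; misc_device_enabled is a
# hypothetical helper name.

# Read the sysconfig file if it exists (path as given in the comment above).
if [ -r /etc/sysconfig/autofs ]; then
    . /etc/sysconfig/autofs
fi

# Succeeds when the misc device should be used. ${VAR:-yes} substitutes
# "yes" when the variable is unset or empty, which gives the default
# proposed in the comment.
misc_device_enabled() {
    [ "${USE_MISC_DEVICE:-yes}" = "yes" ]
}

if misc_device_enabled; then
    echo "misc device (/dev/autofs) would be used"
else
    echo "misc device disabled by configuration"
fi
```

With this form, a sysconfig file that contains only DEFAULT_BROWSE_MODE=no, as in the reporter's setup, would still end up with the misc device enabled, while an explicit USE_MISC_DEVICE="no" would be respected.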
https://bugzilla.novell.com/show_bug.cgi?id=627972
https://bugzilla.novell.com/show_bug.cgi?id=627972#c10
--- Comment #10 from Harald Koenig
That explains why you're still seeing the problem. You need to add the following line to /etc/sysconfig/autofs:
USE_MISC_DEVICE="yes"
ACK! with USE_MISC_DEVICE="yes" and "rcautofs restart" there are no longer getcwd() problems after suspend2ram! thanks a lot for your quick help (and the fix in #684997 -- *please* feed this change into updates for 11.4 and 11.3, too!)