[Bug 611996] New: after reboot: recovering journal
http://bugzilla.novell.com/show_bug.cgi?id=611996

http://bugzilla.novell.com/show_bug.cgi?id=611996#c0

           Summary: after reboot: recovering journal
    Classification: openSUSE
           Product: openSUSE 11.3
           Version: Factory
          Platform: x86-64
        OS/Version: Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: koenig@linux.de
         QAContact: qa@suse.de
          Found By: ---
           Blocker: ---

quite often after (clean?!) reboot I get "recovering journal" for the rootfs.

# df -T
Filesystem     Type      1K-blocks     Used  Available Use% Mounted on
/dev/sda2      ext3       20641788  4005120   15588028  21% /
devtmpfs       devtmpfs     505176      460     504716   1% /dev
tmpfs          tmpfs        510924        4     510920   1% /dev/shm
/dev/sda3      ext3      169576700   309680  160652968   1% /home

# fdisk -l

Disk /dev/sda: 200.0 GB, 200049647616 bytes
255 heads, 63 sectors/track, 24321 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0009110e

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         262     2104483+  82  Linux swap / Solaris
/dev/sda2   *         263        2873    20972857+  83  Linux
/dev/sda3            2874       24321   172281060   83  Linux

from boot.omsg and boot.msg:

--- 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ---
Kernel logging (ksyslog) stopped.
Kernel log daemon terminating.
Boot logging started on /dev/char/../tty1(/dev/console) at Sun Jun 6 15:37:32 2010
Trying manual resume from /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part1
Invoking userspace resume from /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part1
resume: libgcrypt version: 1.4.4
Trying manual resume from /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part1
Invoking in-kernel resume from /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part1
Waiting for device /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part2 to appear: ok
fsck from util-linux-ng 2.17.2
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a -C0 /dev/sda2
/dev/sda2: recovering journal
/dev/sda2: Clearing orphaned inode 886598 (uid=0, gid=0, mode=0100600, size=217016)
/dev/sda2: Clearing orphaned inode 615866 (uid=0, gid=0, mode=0100755, size=131400)
/dev/sda2: Clearing orphaned inode 544725 (uid=0, gid=0, mode=0100755, size=106096)
/dev/sda2: Clearing orphaned inode 544723 (uid=0, gid=0, mode=0100755, size=39952)
/dev/sda2: Clearing orphaned inode 541798 (uid=0, gid=0, mode=0100755, size=265864)
/dev/sda2: Clearing orphaned inode 541751 (uid=0, gid=0, mode=0100755, size=149975)
/dev/sda2: Clearing orphaned inode 541736 (uid=0, gid=0, mode=0100755, size=1670408)
/dev/sda2: Clearing orphaned inode 541899 (uid=0, gid=0, mode=0100755, size=108252)
/dev/sda2: Clearing orphaned inode 541922 (uid=0, gid=0, mode=0100755, size=42889)
/dev/sda2: Clearing orphaned inode 541961 (uid=0, gid=0, mode=0100755, size=61747)
/dev/sda2: Clearing orphaned inode 543838 (uid=0, gid=0, mode=0100755, size=52494)
/dev/sda2: Clearing orphaned inode 541834 (uid=0, gid=0, mode=0100755, size=135934)
/dev/sda2: Clearing orphaned inode 544479 (uid=0, gid=0, mode=0100755, size=47294)
/dev/sda2: clean, 147044/1313280 files, 1083196/5242880 blocks
fsck succeeded. Mounting root device read-write.
Mounting root /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part2
mount -o rw,acl,user_xattr -t ext3 /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part2 /root
Boot logging started on /dev/char/../tty1(/dev/console) at Sun Jun 6 15:37:47 2010
--- 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ---
Kernel logging (ksyslog) stopped.
Kernel log daemon terminating.
Boot logging started on /dev/char/../tty1(/dev/console) at Sun Jun 6 16:10:01 2010
Trying manual resume from /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part1
Invoking userspace resume from /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part1
resume: libgcrypt version: 1.4.4
Trying manual resume from /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part1
Invoking in-kernel resume from /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part1
Waiting for device /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part2 to appear: ok
fsck from util-linux-ng 2.17.2
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a -C0 /dev/sda2
/dev/sda2: recovering journal
/dev/sda2: Clearing orphaned inode 886599 (uid=0, gid=0, mode=0100600, size=217016)
/dev/sda2: clean, 147063/1313280 files, 1083469/5242880 blocks
fsck succeeded. Mounting root device read-write.
Mounting root /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part2
mount -o rw,acl,user_xattr -t ext3 /dev/disk/by-id/ata-ST3200822AS_5LJ0C1A7-part2 /root
Boot logging started on /dev/char/../tty1(/dev/console) at Sun Jun 6 16:10:09 2010
--- 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ------ 8< ---

-- 
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
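
To compare boots, it can help to pull the "Clearing orphaned inode" messages out of boot.msg mechanically. The following is only a minimal sketch, assuming the message format shown in the fsck output above; the function name `orphaned_inodes` and the `SAMPLE` excerpt (two lines copied from the log in this report) are illustrative choices, not part of any existing tool.

```python
import re

# Pattern for the fsck.ext3 message format seen in the boot log of this report.
# Assumption: other e2fsprogs versions may word this message differently.
ORPHAN_RE = re.compile(
    r"^(?P<dev>\S+): Clearing orphaned inode (?P<inode>\d+) "
    r"\(uid=(?P<uid>\d+), gid=(?P<gid>\d+), "
    r"mode=(?P<mode>\d+), size=(?P<size>\d+)\)"
)


def orphaned_inodes(log_text):
    """Return one dict per 'Clearing orphaned inode' line found in log_text."""
    found = []
    for line in log_text.splitlines():
        match = ORPHAN_RE.match(line.strip())
        if match:
            found.append({
                "dev": match["dev"],
                "inode": int(match["inode"]),
                "mode": match["mode"],      # kept as the octal string from the log
                "size": int(match["size"]),
            })
    return found


# Excerpt copied from the first boot log in this report.
SAMPLE = """\
/dev/sda2: recovering journal
/dev/sda2: Clearing orphaned inode 886598 (uid=0, gid=0, mode=0100600, size=217016)
/dev/sda2: Clearing orphaned inode 615866 (uid=0, gid=0, mode=0100755, size=131400)
/dev/sda2: clean, 147044/1313280 files, 1083196/5242880 blocks
"""

if __name__ == "__main__":
    for entry in orphaned_inodes(SAMPLE):
        print(entry["dev"], entry["inode"], entry["mode"], entry["size"])
```

The inode numbers collected this way can then be compared against the NODE column of an lsof listing taken just before shutdown.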
http://bugzilla.novell.com/show_bug.cgi?id=611996#c1
Jon Nelson
http://bugzilla.novell.com/show_bug.cgi?id=611996#c3
Dr. Werner Fink
http://bugzilla.novell.com/show_bug.cgi?id=611996#c4
Jan Kara
http://bugzilla.novell.com/show_bug.cgi?id=611996#c5
Dr. Werner Fink
http://bugzilla.novell.com/show_bug.cgi?id=611996#c6
--- Comment #6 from Harald Koenig
(I'm assuming that /var and /var/log belong to your root file system:)
correct assumption:

# df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda2       20641788 4196792  15396356  22% /
devtmpfs          509484     460    509024   1% /dev
tmpfs             510928       4    510924   1% /dev/shm
/dev/sda3      169576700  337140 160625508   1% /home
Harald? Could you please add the lines
    lsof / > /var/log/root.lsof
    sync
before the line
mount -no remount,ro / 2> /dev/null
in /etc/init.d/halt and attach the resulting /var/log/root.lsof
attached...

boot.msg after reboot does not explicitly show that the journal got read (this is only visible on the console screen :( ), but the time stamps suggest that it did happen again for this reboot (I can only check the console output late this evening...).

full root.lsof will be attached... this one might cause the problem ?!

    rc 4990 root DEL REG 8,2 886616 /var/run/nscd/passwd
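
For a longer root.lsof, the entries of interest are those with "DEL" in the FD column, like the one quoted above. The following is a rough sketch of filtering them out, assuming DEL lines have the eight-column layout of the quoted entry (no SIZE/OFF value); `deleted_mappings` is a hypothetical helper name, and the `cwd` line in the sample is made up purely to show that other entries are skipped.

```python
def deleted_mappings(lsof_text):
    """Return the entries of an lsof listing whose FD column reads 'DEL'.

    Assumption: DEL lines have eight whitespace-separated columns
    (COMMAND PID USER FD TYPE DEVICE NODE NAME), as in the entry
    quoted in this report.
    """
    entries = []
    for line in lsof_text.splitlines():
        # maxsplit=7 keeps a NAME containing spaces in one piece.
        fields = line.split(None, 7)
        if len(fields) == 8 and fields[3] == "DEL":
            entries.append({
                "command": fields[0],
                "pid": int(fields[1]),
                "device": fields[5],
                "inode": int(fields[6]),
                "name": fields[7],
            })
    return entries


# The DEL line is taken from this report; the cwd line is a made-up
# example of an ordinary entry that should be ignored.
SAMPLE = """\
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
rc      4990 root  cwd    DIR    8,2     4096      2 /
rc      4990 root  DEL    REG    8,2          886616 /var/run/nscd/passwd
"""

if __name__ == "__main__":
    for entry in deleted_mappings(SAMPLE):
        print(entry["command"], entry["pid"], entry["inode"], entry["name"])
```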
http://bugzilla.novell.com/show_bug.cgi?id=611996#c7
Harald Koenig
http://bugzilla.novell.com/show_bug.cgi?id=611996#c
Dr. Werner Fink
http://bugzilla.novell.com/show_bug.cgi?id=611996#c8
Dr. Werner Fink
http://bugzilla.novell.com/show_bug.cgi?id=611996#c
Dr. Werner Fink
http://bugzilla.novell.com/show_bug.cgi?id=611996#c9
--- Comment #9 from Harald Koenig
Hmmmm ... /var/run/nscd/passwd is marked as deleted, but that does not seem to have happened here. Question: does a line
fsync /var/run/nscd/passwd 2>/dev/null
before the lsof command help?
tried it (without 2>/dev/null for fast eyes;), but this does not help (as expected, see below).

/var/log/root.lsof again shows

    rc 5892 root DEL REG 8,2 886616 /var/run/nscd/passwd

and dmesg again shows a gap of ~7 seconds, which likely is the time for the ext3 journal recovery:

    [ 5.182026] PM: Resume from disk failed.
    [ 12.091046] kjournald starting. Commit interval 15 seconds
    [ 12.091210] EXT3-fs (sda2): using internal journal
    [ 12.091218] EXT3-fs (sda2): mounted filesystem with ordered data mode
    [ 13.868606] preloadtrace: systemtap: 1.1/0.147, base: ffffffffa0220000, memory: 37data/40text/12ctx/13net/396alloc kb, probes: 44

for trying to fsync(1) a deleted file, try:

    sleep 999 > /tmp/SLEEP &
    sleep 1
    lsof -p `pidof sleep` | grep SLEEP
    fsync /tmp/SLEEP
    rm /tmp/SLEEP
    lsof -p `pidof sleep` | grep SLEEP
    fsync /tmp/SLEEP

with this output for the final fsync:

    Usage: fsync file
    /tmp/SLEEP: No such file or directory

this does not perfectly match the /var/run/nscd/passwd example, as rc uses mmap (see the FD column in the lsof output showing "DEL" instead of an fd number!), but it shows that without an existing directory entry for that file fsync(1) won't work...
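
The experiment above fails because fsync(1) needs a path. The fsync(2) system call takes a file descriptor instead, so a process that still holds the deleted file open can sync it. The sketch below illustrates this on Linux, under the assumption that unlinking an open file is allowed (POSIX behaviour); `unlinked_file_demo` is a hypothetical name, and the temporary file merely stands in for /var/run/nscd/passwd.

```python
import mmap
import os
import tempfile


def unlinked_file_demo():
    """Unlink a file that is still open and mapped, then fsync it by fd.

    Returns (link_count, path_exists, first_byte) as observed after
    the unlink.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * 4096)
        mapping = mmap.mmap(fd, 4096)     # shared mapping, like the DEL entry in lsof
        os.unlink(path)                   # removes the directory entry only

        link_count = os.fstat(fd).st_nlink    # no names left, inode still allocated
        os.fsync(fd)                          # fsync(2) works without a path
        path_exists = os.path.exists(path)    # a path-based tool cannot find it
        first_byte = mapping[:1]              # the data is still reachable
        mapping.close()
    finally:
        os.close(fd)
    return link_count, path_exists, first_byte


if __name__ == "__main__":
    print(unlinked_file_demo())
```

The inode of such a file is only released once the last descriptor and mapping are gone. If a process still holds one when the root file system goes down, the inode may be left for fsck to clean up, which would be consistent with the "Clearing orphaned inode" messages in the boot log.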
http://bugzilla.novell.com/show_bug.cgi?id=611996#c
Harald Koenig
http://bugzilla.novell.com/show_bug.cgi?id=611996#c10
--- Comment #10 from Harald Koenig
http://bugzilla.novell.com/show_bug.cgi?id=611996#c11
--- Comment #11 from Harald Koenig
is there any drawback when using unscd instead of nscd ?
just FYI: yes, there is! please have a look at bug #622910 where unscd does not start anymore/at all...
http://bugzilla.novell.com/show_bug.cgi?id=611996#c12
--- Comment #12 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=611996#c13
Joerg Steffens
https://bugzilla.novell.com/show_bug.cgi?id=611996#c14
--- Comment #14 from Joerg Steffens
https://bugzilla.novell.com/show_bug.cgi?id=611996#c15
--- Comment #15 from Joerg Steffens
https://bugzilla.novell.com/show_bug.cgi?id=611996#c
zj jia
https://bugzilla.novell.com/show_bug.cgi?id=611996#c17
Michael Matz
https://bugzilla.novell.com/show_bug.cgi?id=611996#c18
Dr. Werner Fink