[Bug 273354] New: ext3 problems on cryptofs (updated from 10.2)
https://bugzilla.novell.com/show_bug.cgi?id=273354

Summary: ext3 problems on cryptofs (updated from 10.2)
Product: openSUSE 10.3
Version: Alpha 3plus
Platform: Other
OS/Version: Other
Status: NEW
Severity: Blocker
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: seife@novell.com
QAContact: qa@suse.de
CC: lnussel@novell.com, mkoenig@novell.com

For the past two days I have had lots of problems with my cryptohome, which I installed with 10.2. The filesystem is ext3. I often get things like this in the log:

May 10 10:03:28 strolchi kernel: attempt to access beyond end of device
May 10 10:03:28 strolchi kernel: sda6: rw=1, want=17592187503960, limit=10490382
May 10 10:03:28 strolchi kernel: Buffer I/O error on device sda6, logical block 8796093751979
May 10 10:03:28 strolchi kernel: lost page write due to I/O error on sda6
May 10 10:03:29 strolchi kernel: Buffer I/O error on device dm-0, logical block 8250
May 10 10:03:29 strolchi kernel: lost page write due to I/O error on dm-0
May 10 10:03:29 strolchi kernel: Aborting journal on device dm-0.
May 10 10:03:33 strolchi kernel: ext3_abort called.
May 10 10:03:33 strolchi kernel: EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
May 10 10:03:33 strolchi kernel: Remounting filesystem read-only

After that I ran fsck, which found problems and fixed them, and I re-ran fsck multiple times until it reported no more problems.
Later I got:

May 10 13:07:56 strolchi kernel: attempt to access beyond end of device
May 10 13:07:56 strolchi kernel: sda6: rw=1, want=17592192221596, limit=10490382
May 10 13:07:56 strolchi kernel: Buffer I/O error on device sda6, logical block 8796096110797
May 10 13:07:56 strolchi kernel: lost page write due to I/O error on sda6
May 10 13:07:56 strolchi kernel: Buffer I/O error on device dm-0, logical block 15266
May 10 13:07:56 strolchi kernel: lost page write due to I/O error on dm-0
May 10 13:07:56 strolchi kernel: Aborting journal on device dm-0.
May 10 13:07:56 strolchi kernel: journal commit I/O error
May 10 13:07:56 strolchi kernel: ext3_abort called.
May 10 13:07:56 strolchi kernel: EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
May 10 13:07:56 strolchi kernel: Remounting filesystem read-only

So I thought "well, e2fsck is just buggy and does not find the corruption", so I backed up the fs to a tarball, did "mkfs -j /dev/mapper/cryptotab_loop0", and restored the backup. Now I got:

May 10 17:24:55 strolchi kernel: attempt to access beyond end of device
May 10 17:24:55 strolchi kernel: sda7: rw=1, want=70368860675912, limit=123652242
May 10 17:24:55 strolchi kernel: Buffer I/O error on device sda7, logical block 8796107584488
May 10 17:24:55 strolchi kernel: lost page write due to I/O error on sda7
May 10 17:25:58 strolchi kernel: attempt to access beyond end of device
May 10 17:25:58 strolchi kernel: sda6: rw=1, want=17592189307012, limit=10490382
May 10 17:25:58 strolchi kernel: Buffer I/O error on device sda6, logical block 8796094653505
May 10 17:25:58 strolchi kernel: lost page write due to I/O error on sda6
May 10 17:25:59 strolchi kernel: Buffer I/O error on device dm-0, logical block 3613
May 10 17:25:59 strolchi kernel: lost page write due to I/O error on dm-0
May 10 17:25:59 strolchi kernel: Aborting journal on device dm-0.
May 10 17:26:00 strolchi kernel: ext3_abort called.
May 10 17:26:00 strolchi kernel: EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
May 10 17:26:00 strolchi kernel: Remounting filesystem read-only
May 10 17:59:16 strolchi syslog-ng[3056]: STATS: dropped 0

Which is interesting, because sda7 is a different partition. The machine hung hard some time after that (I was away, and when I came back it was dead, no sysrq).

My setup:

seife@strolchi:~> cat /etc/fstab
/dev/disk/by-id/edd-int13_dev80-part2 /boot                ext2     acl,user_xattr      1 2
/dev/disk/by-id/edd-int13_dev80-part5 /                    ext3     acl,user_xattr      1 1
/dev/disk/by-id/edd-int13_dev80-part7 /local               ext3     acl,user_xattr      1 2
/dev/disk/by-id/edd-int13_dev80-part3 swap                 swap     defaults            0 0
proc                                  /proc                proc     defaults            0 0
sysfs                                 /sys                 sysfs    noauto              0 0
debugfs                               /sys/kernel/debug    debugfs  noauto              0 0
usbfs                                 /proc/bus/usb        usbfs    noauto              0 0
devpts                                /dev/pts             devpts   mode=0620,gid=5     0 0
/home/seife/local/news                /var/spool/news      none     rw,bind,noauto      0 0

seife@strolchi:~> cat /etc/cryptotab
/dev/loop0 /dev/sda6 /home ext3 twofish256 noatime,user_xattr

root@strolchi:/# fdisk -l

Disk /dev/sda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         393     3156741    7  HPFS/NTFS
/dev/sda2   *         394         399       48195   83  Linux
/dev/sda3             400         530     1052257+  82  Linux swap / Solaris
/dev/sda4             531        9729    73890967+   5  Extended
/dev/sda5             531        1379     6819561   83  Linux
/dev/sda6            1380        2032     5245191   83  Linux
/dev/sda7            2033        9729    61826121   83  Linux

Disk /dev/dm-0: 5371 MB, 5371075584 bytes
255 heads, 63 sectors/track, 652 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/dm-0 doesn't contain a valid partition table

I am not sure if this is a kernel problem or a problem with the new crypto setup we are using, so I am adding Ludwig and Matthias to CC.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
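
The "attempt to access beyond end of device" messages can be cross-checked against the fdisk listing with a little arithmetic. The sketch below is not part of the original report; it only reuses numbers quoted there. Two things in it are assumptions: that "limit" is the partition size in 512-byte sectors, and that "want" is the end sector of the failed write, i.e. (logical block + 1) times the number of sectors per block. The names `fdisk_blocks`, `log_limit`, `events` and `sectors_per_block` are made up for this illustration.

```python
# Numbers copied from the kernel log and the fdisk listing in the report.
fdisk_blocks = {"sda6": 5245191, "sda7": 61826121}    # fdisk "Blocks" column, 1 KiB units
log_limit = {"sda6": 10490382, "sda7": 123652242}     # "limit=" values from the kernel log

# (device, want, logical block) from each "beyond end of device" group in the log
events = [
    ("sda6", 17592187503960, 8796093751979),
    ("sda6", 17592192221596, 8796096110797),
    ("sda7", 70368860675912, 8796107584488),
    ("sda6", 17592189307012, 8796094653505),
]


def sectors_per_block(want, block):
    """Return n if want == (block + 1) * n exactly, else None (assumed relation)."""
    n, rem = divmod(want, block + 1)
    return n if rem == 0 else None


# Assumption: limit is the partition size in sectors, two sectors per 1 KiB block.
limits_match = all(log_limit[d] == 2 * fdisk_blocks[d] for d in fdisk_blocks)

# Inferred sectors per block, and how many device sizes past the start "want" points.
ratios = [(dev, sectors_per_block(want, blk), want // log_limit[dev])
          for dev, want, blk in events]

if __name__ == "__main__":
    print("limit matches fdisk:", limits_match)
    for dev, spb, factor in ratios:
        print(f"{dev}: {spb} sectors per block, want is roughly {factor}x the device size")
```

If the assumed relations hold, the limits agree with the partition table and the requested sectors are several orders of magnitude beyond the end of either partition, which suggests the block numbers themselves were corrupted rather than the filesystem merely being slightly too large for the device.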
https://bugzilla.novell.com/show_bug.cgi?id=273354

------- Comment #1 from seife@novell.com 2007-05-10 11:08 MST -------
After removing a file that was only partially written this morning when the machine crashed, I got this in the logs:

May 10 19:05:44 strolchi kernel: EXT3-fs error (device sda7): ext3_free_blocks_sb: bit already cleared for block 2048

This is an ext3 that passed e2fsck -f without any complaints :-(
I'll recheck the FS.
https://bugzilla.novell.com/show_bug.cgi?id=273354

------- Comment #2 from seife@novell.com 2007-05-10 11:18 MST -------
It gets worse:

root@strolchi:/# e2fsck -fy /dev/sda7
e2fsck 1.40-WIP (14-Nov-2006)
Pass 1: Checking inodes, blocks, and sizes
Inode 3571769, i_size is 113599, should be 3870720. Fix? yes
Inode 3571769, i_blocks is 240, should be 248. Fix? yes
Inode 3571881, i_size is 356871, should be 3870720. Fix? yes
Inode 3571881, i_blocks is 720, should be 728. Fix? yes
Inode 3604994, i_size is 7550260, should be 8065024. Fix? yes
Inode 3604994, i_blocks is 14776, should be 14784. Fix? yes
Inode 3604990 has illegal block(s). Clear? yes
Illegal block #955404 (853594774) in inode 3604990. CLEARED.
Illegal block #955405 (851470171) in inode 3604990. CLEARED.
Illegal block #955406 (3305910490) in inode 3604990. CLEARED.
Illegal block #955407 (4138219881) in inode 3604990. CLEARED.
Illegal block #955408 (537164793) in inode 3604990. CLEARED.
Illegal block #955409 (3195108877) in inode 3604990. CLEARED.
Illegal block #955411 (3127640486) in inode 3604990. CLEARED.
Illegal block #955412 (655376392) in inode 3604990. CLEARED.
Illegal block #955413 (1176584329) in inode 3604990. CLEARED.
Illegal block #955414 (1661632691) in inode 3604990. CLEARED.
Illegal block #955415 (1771801649) in inode 3604990. CLEARED.
Too many illegal blocks in inode 3604990. Clear inode? yes
Inode 3605007 has illegal block(s). Clear? yes
Illegal block #955416 (2158665021) in inode 3605007. CLEARED.
Illegal block #955417 (3603182950) in inode 3605007. CLEARED.
Illegal block #955418 (1332491079) in inode 3605007. CLEARED.
Illegal block #955419 (376660070) in inode 3605007. CLEARED.
Illegal block #955420 (2583944012) in inode 3605007. CLEARED.
Illegal block #955421 (230910001) in inode 3605007. CLEARED.
Illegal block #955422 (1183000964) in inode 3605007. CLEARED.
Illegal block #955423 (409917580) in inode 3605007. CLEARED.
Illegal block #955424 (209926452) in inode 3605007. CLEARED.
Illegal block #955425 (873999670) in inode 3605007. CLEARED.
Illegal block #955426 (1516475233) in inode 3605007. CLEARED.
Too many illegal blocks in inode 3605007. Clear inode? yes
Inode 3605027 has illegal block(s). Clear? yes
Illegal block #955427 (1638176086) in inode 3605027. CLEARED.
Illegal block #955428 (3872417947) in inode 3605027. CLEARED.
Illegal block #955429 (3663992876) in inode 3605027. CLEARED.
Illegal block #955430 (763087902) in inode 3605027. CLEARED.
Illegal block #955431 (923568941) in inode 3605027. CLEARED.
Illegal block #955432 (1276208270) in inode 3605027. CLEARED.
Illegal block #955433 (1982902998) in inode 3605027. CLEARED.
Illegal block #955434 (1628436168) in inode 3605027. CLEARED.
Illegal block #955435 (1552379838) in inode 3605027. CLEARED.
Illegal block #955436 (3603486580) in inode 3605027. CLEARED.
Illegal block #955437 (3691147281) in inode 3605027. CLEARED.
Too many illegal blocks in inode 3605027. Clear inode? yes
Inode 3637327, i_size is 58124, should be 3870720. Fix? yes
Inode 3637327, i_blocks is 128, should be 136. Fix? yes
Inode 3653715 has illegal block(s). Clear? yes
Illegal block #955438 (2165017747) in inode 3653715. CLEARED.
Illegal block #955439 (1623197868) in inode 3653715. CLEARED.
Illegal block #955440 (3020398269) in inode 3653715. CLEARED.
Illegal block #955441 (1235207955) in inode 3653715. CLEARED.
Illegal block #955442 (3285845698) in inode 3653715. CLEARED.
Illegal block #955443 (2923361805) in inode 3653715. CLEARED.
Illegal block #955444 (2337065630) in inode 3653715. CLEARED.
Illegal block #955445 (3123219311) in inode 3653715. CLEARED.
Illegal block #955446 (2713048430) in inode 3653715. CLEARED.
Illegal block #955447 (1275518916) in inode 3653715. CLEARED.
Illegal block #955448 (3602155483) in inode 3653715. CLEARED.
Too many illegal blocks in inode 3653715. Clear inode? yes
Inode 3653721, i_size is 5155899, should be 8065024. Fix? yes
Inode 3653721, i_blocks is 10104, should be 10112. Fix? yes
Inode 3653716, i_size is 4797287, should be 8065024. Fix? yes
Inode 3653716, i_blocks is 9408, should be 9416. Fix? yes
Inode 3670174, i_size is 207708, should be 3870720. Fix? yes
Inode 3670174, i_blocks is 424, should be 432. Fix? yes
Inode 3670102 has illegal block(s). Clear? yes
Illegal block #955449 (209432663) in inode 3670102. CLEARED.
Illegal block #955450 (2871618871) in inode 3670102. CLEARED.
Illegal block #955451 (1664981202) in inode 3670102. CLEARED.
Illegal block #955452 (1637372461) in inode 3670102. CLEARED.
Illegal block #955453 (2381486013) in inode 3670102. CLEARED.
Illegal block #955454 (3042082982) in inode 3670102. CLEARED.
Illegal block #955455 (2934271922) in inode 3670102. CLEARED.
Illegal block #955456 (779155950) in inode 3670102. CLEARED.
Illegal block #955457 (2207583360) in inode 3670102. CLEARED.
Illegal block #955458 (3070752448) in inode 3670102. CLEARED.
Illegal block #955459 (889647817) in inode 3670102. CLEARED.
Too many illegal blocks in inode 3670102. Clear inode? yes
Inode 3670108, i_size is 75059, should be 3870720. Fix? yes
Inode 3670108, i_blocks is 168, should be 176. Fix? yes
Inode 3735972, i_size is 67184, should be 3870720. Fix? yes
Inode 3735972, i_blocks is 152, should be 160. Fix? yes
Restarting e2fsck from the beginning...
Segmentation fault
https://bugzilla.novell.com/show_bug.cgi?id=273354

------- Comment #3 from seife@novell.com 2007-05-10 11:19 MST -------
And now:

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000800
 printing eip:
c014f4c8
*pde = 00000000
Oops: 0000 [#1]
SMP
last sysfs file: devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
Modules linked in: nfs lockd nfs_acl sunrpc kqemu i915 drm autofs4 af_packet ipv6 snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device cpufreq_conservative cpufreq_ondemand cpufreq_userspace cpufreq_powersave acpi_cpufreq speedstep_lib freq_table bay dock pcc_acpi button battery ac apparmor twofish twofish_common cbc blkcipher ohci_hcd dm_crypt ext2 loop dm_mod fuse rfcomm hidp l2cap hci_usb bluetooth pcmcia usbhid hid ff_memless r8169 ipw2200 ohci1394 ieee80211 ieee80211_crypt firmware_class ieee1394 yenta_socket rsrc_nonstatic pcmcia_core rtc_cmos rtc_core snd_intel8x0 snd_ac97_codec i2c_i801 ac97_bus snd_pcm rtc_lib snd_timer iTCO_wdt snd soundcore iTCO_vendor_support snd_page_alloc i2c_core uhci_hcd ehci_hcd usbcore generic piix ide_core shpchp pci_hotplug intel_agp agpgart joydev parport_pc lp parport ext3 mbcache jbd sr_mod cdrom sg edd fan ata_piix libata thermal processor sd_mod scsi_mod
CPU:    0
EIP:    0060:[<c014f4c8>]    Tainted: G N VLI
EFLAGS: 00010097   (2.6.21-3-default #1)
EIP is at find_get_pages+0x32/0x55
eax: 4000082c   ebx: 00000004   ecx: e34fdf0c   edx: 00000800
esi: e34fdefc   edi: 0000000e   ebp: f7129f58   esp: e34fdec0
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Process e2fsck (pid: 5678, ti=e34fc000 task=f23d3a90 task.ti=e34fc000)
Stack: 0000000e 007051ba e34fdef4 007051ba f7129f58 c0153806 e34fdefc c1554160
       0000000e c0153e77 0000000e ffffffff 00003774 00000000 00000000 c155ef40
       c155ef60 c1554180 c155efa0 00000800 c155efc0 c155efe0 c155f000 c155f020
Call Trace:
 [<c0153806>] pagevec_lookup+0x1c/0x22
 [<c0153e77>] invalidate_mapping_pages+0xbd/0xd2
 [<c01876b2>] kill_bdev+0xd/0x20
 [<c0187bbb>] __blkdev_put+0x44/0x103
 [<c01695ef>] __fput+0xac/0x162
 [<c016718d>] filp_close+0x51/0x58
 [<c0168136>] sys_close+0x6e/0xa5
 [<c0104d14>] sysenter_past_esp+0x5d/0x89
 =======================
Code: 53 89 cb 83 ec 04 8d 40 10 8b 74 24 18 e8 b2 f5 15 00 89 f9 8d 45 04 89 f2 89 1c 24 31 db e8 a5 61 07 00 89 f1 89 c7 eb 14 8b 11 <8b> 02 f6 c4 40 74 03 8b 52 0c 90 ff 42 04 43 83 c1 04 39 fb 75
EIP: [<c014f4c8>] find_get_pages+0x32/0x55 SS:ESP 0068:e34fdec0

I think I'd better reboot now.
https://bugzilla.novell.com/show_bug.cgi?id=273354

seife@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

------- Comment #4 from seife@novell.com 2007-05-10 12:24 MST -------
ARGH. A bad memory chip. I did run memtest already this afternoon, but it did not catch it. Now I ran it again and it almost immediately failed. Sorry.
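
With the bad-memory diagnosis in mind, the numbers in the earlier logs can be re-read as possible single-bit errors. The following is an after-the-fact sketch, not something stated in the report: it checks, for each out-of-range logical block number, which single bit would have to be cleared to bring it back inside the device, and does the same for the faulting address in the oops. The helper `single_bit_fixes` and the variable names are hypothetical, the sectors-per-block value is inferred from the log rather than reported, and the link between bit 43 of a 64-bit block number and bit 11 of a 32-bit word assumes the value is held as two 32-bit words on this 32-bit kernel.

```python
# (device, limit in 512-byte sectors, want, logical block), copied from the kernel log.
events = [
    ("sda6", 10490382, 17592187503960, 8796093751979),
    ("sda6", 10490382, 17592192221596, 8796096110797),
    ("sda7", 123652242, 70368860675912, 8796107584488),
    ("sda6", 10490382, 17592189307012, 8796094653505),
]


def single_bit_fixes(value, limit):
    """Bit positions that, when cleared on their own, bring value below limit."""
    return [bit for bit in range(value.bit_length())
            if value & (1 << bit) and (value ^ (1 << bit)) < limit]


results = []
for dev, limit_sectors, want, block in events:
    spb = want // (block + 1)             # inferred sectors per block (assumption)
    limit_blocks = limit_sectors // spb   # device size in the same block units
    results.append((dev, block, single_bit_fixes(block, limit_blocks)))

# Faulting address from the oops in comment #3, treated as a 32-bit word.
oops_address = 0x00000800
oops_bits = [bit for bit in range(32) if oops_address & (1 << bit)]

if __name__ == "__main__":
    for dev, block, bits in results:
        print(f"{dev}: block {block} is back in range after clearing bit(s) {bits}")
    print("bits set in oops address:", oops_bits)
```

Every block number in the log comes back into range by clearing the same single bit, and under the two-word assumption that bit sits at the same position within a 32-bit word as the only bit set in the oops address. That is consistent with a faulty memory cell, though it does not prove it; the failing memtest run is the actual evidence.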