[opensuse-kernel] Having hard time with xfs kernel error inode 0x400018e0 background reclaim flush failed with 117
Hi all.

I'm building a new server for a customer with the 11.3 RC (actually Factory), using kernel
2.6.34-11-default #1 SMP 2010-06-24 00:46:11 +0200 x86_64 x86_64 x86_64 GNU/Linux

The hardware is one Phenom X6 and a PCIe Adaptec 5805Z (8x SAS/SATA) with eight hard drives (VelociRaptor 10k rpm, 300 GB) built as RAID6 with a 1M chunk. This gives us one LVM VG called vgadaptec of 1.63 TB.

I'm now testing filesystem performance; see the attached script runtest.sh, which explains how we do that. The test ran with ext4 without a glitch, but running it with xfs gives me lots of errors.

Start of the script:

quotaon: Enforcing group quota already on /dev/mapper/vgadaptec-lvdata_test
quotaon: Enforcing user quota already on /dev/mapper/vgadaptec-lvdata_test
Sync & clear kernel & disks buffer caches
Warming disk buffer caches
Running for 120 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 24 secs
15 72586 1226.49 MB/sec warmup 5 sec
15 146510 1210.67 MB/sec warmup 10 sec
15 220343 1205.44 MB/sec warmup 15 sec
15 244440 1001.14 MB/sec warmup 20 sec
15 244440 0.00 MB/sec execute 5 sec
15 244440 0.00 MB/sec execute 10 sec

And in dmesg:

[ 2438.067625] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
[ 2438.068607] SGI XFS Quota Management subsystem
[ 2438.069211] XFS mounting filesystem dm-5
[ 2438.073058] Ending clean XFS mount for filesystem: dm-5
[ 2438.073236] XFS quotacheck dm-5: Please wait.
[ 2438.129583] XFS quotacheck dm-5: Done.
[ 2474.128446] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2510.128193] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2546.128115] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2582.128114] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2618.128115] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2654.128040] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2690.128113] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2726.128110] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2762.128114] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2798.128116] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2834.128119] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2870.128112] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2906.128109] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2942.128121] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2978.128113] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 3014.128112] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 3050.128119] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 3086.128118] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 3122.128103] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 3158.128117] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 3194.128117] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 3230.128117] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 3266.128118] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117

Worst of all, we have no way to kill the dbench process or to issue a sync (even Alt+SysRq+S hangs), so in the end we have no choice but to hard-reset the machine.

We also have a RAID5 (built on kernel software RAID) with 5x 2 TB WD Black SATA drives, which has never crashed.

Is this a known issue? Should I open a new bug for it? Should I classify the bug as P1?

I don't know whether this is related to upstream commit 9bf729c0af67897ea8498ce17c29b0683f7f2028, but it references this error ...

I need your expertise, guys ...

--
Bruno Friedmann
openSUSE Member / tigerfoot on irc
Ioda-Net Sàrl 2830 Vellerat - Switzerland
Tél : ++41 32 435 7171
Fax : ++41 32 435 7172
gsm : ++41 78 802 6760
www.ioda-net.ch
Centre de Formation et de Coaching En Ligne www.cfcel.com
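For readers triaging a similar log: on Linux, errno 117 is EUCLEAN ("Structure needs cleaning"), which XFS uses to report filesystem corruption (EFSCORRUPTED). The following is a minimal Python sketch, not part of the original report, that summarizes such messages from a dmesg capture. The function name `summarize_reclaim_failures` and the regular expression are assumptions made for illustration; the sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Pattern for the XFS reclaim message as it appears in the dmesg excerpt.
# (Illustrative regex, written for this thread's message format only.)
RECLAIM_RE = re.compile(
    r'\[\s*(?P<ts>\d+\.\d+)\]\s+Filesystem "(?P<dev>[^"]+)": '
    r'inode (?P<inode>0x[0-9a-fA-F]+) background reclaim flush failed '
    r'with (?P<err>\d+)'
)


def summarize_reclaim_failures(dmesg_text):
    """Return (hit counts per (device, inode, errno), gaps in seconds)."""
    hits = [
        (float(m["ts"]), m["dev"], m["inode"], int(m["err"]))
        for m in RECLAIM_RE.finditer(dmesg_text)
    ]
    # How often each (device, inode, error) combination was reported.
    counts = Counter((dev, inode, err) for _, dev, inode, err in hits)
    # Time between consecutive reports, rounded to milliseconds.
    gaps = [round(b[0] - a[0], 3) for a, b in zip(hits, hits[1:])]
    return counts, gaps


sample = '''
[ 2474.128446] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2510.128193] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
[ 2546.128115] Filesystem "dm-5": inode 0x400018e0 background reclaim flush failed with 117
'''

counts, gaps = summarize_reclaim_failures(sample)
print(counts)
print(gaps)
```

On the sample, every hit is for the same inode on dm-5 and the gaps come out at about 36 seconds, which suggests a periodic background retry on one stuck inode rather than a spreading set of failures.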
Looking at the kernel commits, it seems to be related to this one:
commit 57817c68229984818fea9e614d6f95249c3fb098
Author: Dave Chinner
On Tue, Jun 29, 2010 at 1:19 PM, Bruno Friedmann wrote:
Hi all.
I'm building a new server for customer, with the 11.3 RC ( actually factory ) using kernel 2.6.34-11-default #1 SMP 2010-06-24 00:46:11 +0200 x86_64 x86_64 x86_64 GNU/Linux
Your post made me check the status of xfsprogs (i.e. userspace). Unfortunately, it looks like 11.3 is only going to have xfsprogs 3.0.1, which I think is a shame. Newer versions are out, but the most important update, I think, is 3.1.0 from January or so:

The biggest changes in xfsprogs 3.1.0 were optimizations in xfs_repair that lead to a much lower memory usage, and optional use of the blkid library for filesystem detection and retrieving storage topology information.

I believe xfs_repair prior to 3.1.0 required an amount of RAM that grows linearly with filesystem size. So the kernel side of XFS has been ready for 16TB+ partitions for a while, but with only xfsprogs 3.0.1 in 11.3, running the repair tool requires a large amount of RAM.

To my surprise, I don't even see xfsprogs 3.1.0 (or newer) in OBS yet at all. (Not even a home directory.)

Greg
--
To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-kernel+help@opensuse.org
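To check whether an installed xfsprogs is at or past the 3.1.0 release mentioned above, a version comparison along these lines may help. This is only a sketch: the helper names `parse_version` and `has_low_memory_repair` are hypothetical, and the input format "xfs_repair version X.Y.Z" is an assumption about what the tool reports, so adjust the parsing to whatever your system actually prints.

```python
import re

# Release that, per the message above, introduced the lower-memory xfs_repair.
MIN_REPAIR_VERSION = (3, 1, 0)


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    m = re.search(r"(\d+(?:\.\d+)+)", text)
    if m is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in m.group(1).split("."))


def has_low_memory_repair(version_text):
    """True if the reported version is at least MIN_REPAIR_VERSION."""
    return parse_version(version_text) >= MIN_REPAIR_VERSION


# Assumed output format; the strings below are illustrative inputs.
print(has_low_memory_repair("xfs_repair version 3.0.1"))
print(has_low_memory_repair("xfs_repair version 3.1.2"))
```

Comparing tuples of integers rather than raw strings avoids misordering versions such as 3.10.0 and 3.2.0.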
On Tue, 29 Jun 2010 15:12:01 -0400 Greg Freemyer wrote:
To my surprise, I don't even see xfsprogs 3.1.0 (or newer) in OBS yet at all. (Not even a home directory.)
seife@susi:~/buildservice/home:seife:branches:filesystems/xfsprogs> osc request list -M
42295 State:new By:seife When:2010-06-29T23:20:07
submit: home:seife:branches:filesystems/xfsprogs -> filesystems
Descr: update to 3.1.2, should be tested beyond build result ;)

I did only fix the build after updating the tarball, so maybe a runtime test would be appropriate before forwarding it to FACTORY ;)

--
Stefan Seyfried
"Any ideas, John?"
"Well, surrounding them's out."
On Tue, Jun 29, 2010 at 5:22 PM, Stefan Seyfried wrote:
On Tue, 29 Jun 2010 15:12:01 -0400 Greg Freemyer wrote:
To my surprise, I don't even see xfsprogs 3.1.0 (or newer) in OBS yet at all. (Not even a home directory.)
seife@susi:~/buildservice/home:seife:branches:filesystems/xfsprogs> osc request list -M
42295 State:new By:seife When:2010-06-29T23:20:07
submit: home:seife:branches:filesystems/xfsprogs -> filesystems
Descr: update to 3.1.2, should be tested beyond build result ;)
I did only fix the build after updating the tarball, so maybe a runtime test would be appropriate before forwarding it to FACTORY ;)
-- Stefan Seyfried
Thanks,

Is it too late for this to go into 11.3?

I'm not sure I'll need it, but for large (multi-TB) xfs filesystems it's a pretty important fix.

Greg
On 06/30/2010 02:01 AM, Greg Freemyer wrote:
On Tue, Jun 29, 2010 at 5:22 PM, Stefan Seyfried wrote:
On Tue, 29 Jun 2010 15:12:01 -0400 Greg Freemyer wrote:
To my surprise, I don't even see xfsprogs 3.1.0 (or newer) in OBS yet at all. (Not even a home directory.)
seife@susi:~/buildservice/home:seife:branches:filesystems/xfsprogs> osc request list -M
42295 State:new By:seife When:2010-06-29T23:20:07
submit: home:seife:branches:filesystems/xfsprogs -> filesystems
Descr: update to 3.1.2, should be tested beyond build result ;)
I did only fix the build after updating the tarball, so maybe a runtime test would be appropriate before forwarding it to FACTORY ;)
-- Stefan Seyfried
Thanks,
Is it too late for this to go into 11.3?
I'm not sure I'll need it, but for large (multi-TB) xfs filesystems it's a pretty important fix.
Greg
Good news: after syncing Factory to the latest state overnight, we got the RC2 kernel, which no longer shows the bug ...

--
Bruno Friedmann
participants (3)
- Bruno Friedmann
- Greg Freemyer
- Stefan Seyfried