https://bugzilla.novell.com/show_bug.cgi?id=444597
User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c66
LTC BugProxy changed:
What |Removed |Added
----------------------------------------------------------------------------
URL| |http://
--- Comment #66 from LTC BugProxy 2008-12-05 06:17:21 MST ---
=Comment: #0=================================================
Manas K. Nayak -
---Problem Description---
While executing filesystem stress tests (bonnie++, dbench, fs_inod, fs_maim,
fsstress, fsx_linux, IOZone, postmark, tiobench, all together) on all
filesystems (ext2+ext3+xfs+reiserfs) on SLES11-beta5 x86_64 using LVM
partitions on an hs21 machine, I noticed several call traces.
Call traces were noticed in /var/log/messages as below:
Nov 17 12:49:28 mhs21a kernel: 128706 total pagecache pages
Nov 17 12:49:28 mhs21a kernel: 57425 pages in swap cache
Nov 17 12:49:28 mhs21a kernel: Swap cache stats: add 674553, delete 617128,
find 2082816/2129828
Nov 17 12:49:28 mhs21a kernel: Free swap = 7861656kB
Nov 17 12:49:28 mhs21a kernel: Total swap = 9221300kB
Nov 17 12:49:28 mhs21a kernel: 1048576 pages RAM
Nov 17 12:49:28 mhs21a kernel: 35219 pages reserved
Nov 17 12:49:28 mhs21a kernel: 1114666 pages shared
Nov 17 12:49:28 mhs21a kernel: 830980 pages non-shared
Nov 17 12:49:31 mhs21a kernel: fsstress: page allocation failure. order:0,
mode:0x20,
alloc_flags:0x7, pflags:0x400140
Nov 17 12:49:31 mhs21a kernel: Pid: 22167, comm: fsstress Not tainted
2.6.27.5-2-default #1
Nov 17 12:49:31 mhs21a kernel:
Nov 17 12:49:31 mhs21a kernel: Call Trace:
Nov 17 12:49:31 mhs21a kernel: [<ffffffff8020e53e>]
show_trace_log_lvl+0x41/0x58
Nov 17 12:49:31 mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
Nov 17 12:49:31 mhs21a kernel: [<ffffffff8028e8d2>]
__alloc_pages_internal+0x3eb/0x40b
Nov 17 12:49:31 mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
Nov 17 12:49:31 mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
Nov 17 12:49:31 mhs21a kernel: [<ffffffff802b44be>] __kmalloc+0x16c/0x1a1
Nov 17 12:49:32 mhs21a kernel: [<ffffffffa04ae70d>]
reiserfs_get_block+0xca1/0xf0e [reiserfs]
Nov 17 12:49:32 mhs21a kernel: [<ffffffffa04ae9bb>]
reiserfs_get_blocks_direct_io+0x41/0x95 [reiserfs]
Nov 17 12:49:32 mhs21a kernel: [<ffffffff802e35a7>] do_direct_IO+0x147/0x369
Nov 17 12:49:32 mhs21a kernel: [<ffffffff802e3a81>]
direct_io_worker+0x174/0x309
Nov 17 12:49:32 mhs21a kernel: [<ffffffff802e3e87>]
__blockdev_direct_IO+0x271/0x2c3
Nov 17 12:49:32 mhs21a kernel: [<ffffffffa04aae27>]
reiserfs_direct_IO+0x4c/0x51 [reiserfs]
Nov 17 12:49:32 mhs21a kernel: [<ffffffff80289976>]
generic_file_direct_write+0x101/0x1b4
Nov 17 12:49:32 mhs21a kernel: [<ffffffff80289cbe>]
__generic_file_aio_write_nolock+0x295/0x37d
Nov 17 12:49:32 mhs21a kernel: [<ffffffff8028a114>]
generic_file_aio_write+0x64/0xc4
Nov 17 12:49:32 mhs21a kernel: [<ffffffff802ba8e1>] do_sync_write+0xce/0x113
Nov 17 12:49:32 mhs21a kernel: [<ffffffff802bb1b6>] vfs_write+0xad/0x156
Nov 17 12:49:32 mhs21a kernel: [<ffffffff802bb31b>] sys_write+0x45/0x6e
Nov 17 12:49:32 mhs21a kernel: [<ffffffff8022cab5>] sysenter_dispatch+0x7/0x46
Nov 17 12:49:32 mhs21a kernel: [<00000000ffffe430>] 0xffffe430
Nov 17 12:49:32 mhs21a kernel:
Nov 17 12:49:32 mhs21a kernel: Mem-Info:
Nov 17 12:49:32 mhs21a kernel: Node 0 DMA per-cpu:
Nov 17 12:49:32 mhs21a kernel: CPU 0: hi: 0, btch: 1 usd: 0
Nov 17 12:49:32 mhs21a kernel: CPU 1: hi: 0, btch: 1 usd: 0
Nov 17 12:49:32 mhs21a kernel: CPU 2: hi: 0, btch: 1 usd: 0
Nov 17 12:49:33 mhs21a kernel: CPU 3: hi: 0, btch: 1 usd: 0
Nov 17 12:49:33 mhs21a kernel: Node 0 DMA32 per-cpu:
Nov 17 12:49:33 mhs21a kernel: CPU 0: hi: 186, btch: 31 usd: 80
Nov 17 12:49:33 mhs21a kernel: CPU 1: hi: 186, btch: 31 usd: 176
Nov 17 12:49:33 mhs21a kernel: CPU 2: hi: 186, btch: 31 usd: 157
Nov 17 12:49:33 mhs21a kernel: CPU 3: hi: 186, btch: 31 usd: 97
Nov 17 12:49:33 mhs21a kernel: Node 0 Normal per-cpu:
Nov 17 12:49:33 mhs21a kernel: CPU 0: hi: 186, btch: 31 usd: 161
Nov 17 12:49:33 mhs21a kernel: CPU 1: hi: 186, btch: 31 usd: 168
Nov 17 12:49:33 mhs21a kernel: CPU 2: hi: 186, btch: 31 usd: 158
Nov 17 12:49:33 mhs21a kernel: CPU 3: hi: 186, btch: 31 usd: 136
Nov 17 12:49:33 mhs21a kernel: Active:178587 inactive:93103 dirty:8518
writeback:5145 unstable:0
Nov 17 12:09:31 mhs21a kernel: dbench: page allocation failure. order:0,
mode:0x20, alloc_flags:0x7,
pflags:0x402040
Nov 17 12:09:31 mhs21a kernel: Pid: 1493, comm: dbench Not tainted
2.6.27.5-2-default #1
Nov 17 12:09:31 mhs21a kernel:
Nov 17 12:09:32 mhs21a kernel: Call Trace:
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8020e53e>]
show_trace_log_lvl+0x41/0x58
Nov 17 12:09:32 mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8028e8d2>]
__alloc_pages_internal+0x3eb/0x40b
Nov 17 12:09:32 mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
Nov 17 12:09:32 mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
Nov 17 12:09:32 mhs21a kernel: [<ffffffff802b47b1>]
kmem_cache_alloc+0x137/0x16c
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8024c652>] send_signal+0xed/0x240
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8024d0dc>]
group_send_sig_info+0x48/0x6f
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8024d134>] kill_pid_info+0x31/0x3b
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8024553d>] it_real_fn+0x17/0x1e
Nov 17 12:09:32 mhs21a kernel: [<ffffffff80256d67>]
run_hrtimer_pending+0x78/0x13a
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8024659d>] __do_softirq+0x84/0x115
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8020ddac>] call_softirq+0x1c/0x28
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8020f177>] do_softirq+0x3c/0x81
Nov 17 12:09:32 mhs21a kernel: [<ffffffff802462b4>] irq_exit+0x3f/0x83
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8021cce0>]
smp_apic_timer_interrupt+0x92/0xaa
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8020d523>]
apic_timer_interrupt+0x83/0x90
Nov 17 12:09:32 mhs21a kernel: [<ffffffff804aa371>]
_spin_unlock_irqrestore+0x1d/0x25
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8025472f>] __wake_up_bit+0x28/0x2d
Nov 17 12:09:32 mhs21a kernel: [<ffffffff802929cc>]
shrink_page_list+0x362/0x400
Nov 17 12:09:32 mhs21a kernel: [<ffffffff80292c31>]
shrink_inactive_list+0x1a8/0x481
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8029300d>] shrink_zone+0x103/0x126
Nov 17 12:09:32 mhs21a kernel: [<ffffffff80293360>] shrink_zones+0xe2/0x119
Nov 17 12:09:32 mhs21a kernel: [<ffffffff80293fca>]
do_try_to_free_pages+0x150/0x2a3
Nov 17 12:09:32 mhs21a kernel: [<ffffffff80294201>]
try_to_free_pages+0x60/0x65
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8028e74c>]
__alloc_pages_internal+0x265/0x40b
Nov 17 12:09:32 mhs21a kernel: [<ffffffff80298426>] do_wp_page+0x2d9/0x6a4
Nov 17 12:09:32 mhs21a kernel: [<ffffffff8029a875>]
handle_mm_fault+0x46a/0x53d
Nov 17 12:09:32 mhs21a kernel: [<ffffffff804ac906>] do_page_fault+0x1ef/0x5f0
Nov 17 12:09:32 mhs21a kernel: [<ffffffff804aa61a>] error_exit+0x0/0x70
Please see the attached file "var-log-messages-hs21.out" for more information.
Contact Information = Manas K Nayak/maknayak@in.ibm.com
---uname output---
Linux mhs21a 2.6.27.5-2-default #1 SMP 2008-11-11 15:15:33 +0100 x86_64 x86_64
x86_64 GNU/Linux
Machine Type = hs21
---Debugger---
A debugger is not configured
---Steps to Reproduce---
1) Create four LVM partitions.
2) Make four file systems (ext2, ext3, xfs, reiserfs).
3) Mount each file system on its respective directory (mkdir /EXT2 /EXT3 /XFS
/REISERFS).
example:
mount /dev/mapper/TEST_VG-VOL1 /EXT2
..
NOTE: mount output:
/dev/mapper/TEST_VG-VOL1 on /EXT2 type ext2 (rw)
/dev/mapper/TEST_VG-VOL2 on /EXT3 type ext3 (rw)
/dev/mapper/TEST_VG-VOL3 on /XFS type xfs (rw)
/dev/mapper/TEST_VG-VOL4 on /REISERFS type reiserfs (rw)
4) Untar the filesystem tarball (filesys.tar.gz) in /home/test; it will create
a directory called XFS there.
5) Rename the newly created directory to match the corresponding filesystem
(e.g. "mv XFS EXT3", and likewise for all four filesystems).
6) Edit the file run_tests.sh to set the paths:
Example: vi /home/test/EXT3/run_tests.sh
SUM_FS_TEST_HOME=/home/test/EXT3; export SUM_FS_TEST_HOME
SUM_FS_TEST_TYPE=ext3; export SUM_FS_TEST_TYPE
SUM_FS_TEST_DIR=/EXT3; export SUM_FS_TEST_DIR
Set the correct paths as above for all four filesystems.
7) Execute run_tests.sh in each directory:
cd /home/test/EXT2/
./run_tests.sh
cd /home/test/EXT3/
./run_tests.sh
cd /home/test/XFS/
./run_tests.sh
cd /home/test/REISERFS/
./run_tests.sh
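The steps above can be sketched as a single script. This is a hedged sketch only: the volume names, mount points, tarball location, and run_tests.sh layout are taken from the steps in this report, not verified on a real system, so the script runs in dry-run mode (it prints each command instead of executing it).

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps above. Unset DRYRUN to actually
# execute; all paths and volume names are assumptions taken from the report.
run() { if [ -n "$DRYRUN" ]; then echo "$@"; else "$@"; fi; }

DRYRUN=1
i=1
for fs in ext2 ext3 xfs reiserfs; do
    # Steps 1-5: mount each LVM volume and unpack/rename the test suite
    DIR=$(echo "$fs" | tr '[:lower:]' '[:upper:]')   # EXT2, EXT3, XFS, REISERFS
    run mkdir -p "/$DIR"
    run mount "/dev/mapper/TEST_VG-VOL$i" "/$DIR"
    run tar -C /home/test -xzf /home/test/filesys.tar.gz
    run mv /home/test/XFS "/home/test/$DIR"
    i=$((i + 1))
done

# Step 7: run the stress suite from each test directory
for DIR in EXT2 EXT3 XFS REISERFS; do
    run cd "/home/test/$DIR"
    run ./run_tests.sh
done
```

Step 6 (editing the SUM_FS_TEST_* variables inside each run_tests.sh) still has to be done by hand, as in the original instructions.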
---Kernel - Filesystem Component Data---
Stack trace output:
no
Oops output:
no
System Dump Info:
The system is not configured to capture a system dump.
*Additional Instructions for Manas K Nayak/maknayak@in.ibm.com:
-Attach sysctl -a output to the bug.
=Comment: #1=================================================
Manas K. Nayak -
/var/log/messages output from the filesystem stress test
=Comment: #2=================================================
Manas K. Nayak -
The system was configured to capture a system dump, but it did not produce any
dump. While the system was under stress, I could not ssh to the system but was
able to ping it.
From the /var/log/messages output, it looks like call traces got generated
against all the file systems (ext2, ext3, xfs, reiserfs).
Thanks...
Manas
=Comment: #3=================================================
Manas K. Nayak -
I executed the fs-racer stress test on LVM partitions against SLES11-beta5 on
an hs21 machine for all four file systems (ext2, ext3, xfs, reiserfs
altogether) and could reproduce a similar "page allocation failure".
From /var/log/messages it looks like call traces got generated only against
the ext3 and xfs filesystems.
Here are some call traces:
Call trace against XFS:
-----------------------------
Nov 20 01:31:12 mhs21a kernel: Free swap = 9221188kB
Nov 20 01:31:12 mhs21a kernel: Total swap = 9221300kB
Nov 20 01:31:12 mhs21a kernel: 1048576 pages RAM
Nov 20 01:31:12 mhs21a kernel: 67987 pages reserved
Nov 20 01:31:12 mhs21a kernel: 410164 pages shared
Nov 20 01:31:12 mhs21a kernel: 586441 pages non-shared
Nov 20 01:31:12 mhs21a kernel: ln: page allocation failure. order:0, mode:0x20,
alloc_flags:0x7,
pflags:0x420000
Nov 20 01:31:12 mhs21a kernel: Pid: 6441, comm: ln Not tainted
2.6.27.5-2-default #1
Nov 20 01:31:12 mhs21a kernel:
Nov 20 01:31:12 mhs21a kernel: Call Trace:
Nov 20 01:31:12 mhs21a kernel: [<ffffffff8020e53e>]
show_trace_log_lvl+0x41/0x58
Nov 20 01:31:12 mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
Nov 20 01:31:12 mhs21a kernel: [<ffffffff8028e8d2>]
__alloc_pages_internal+0x3eb/0x40b
Nov 20 01:31:12 mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
Nov 20 01:31:12 mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
Nov 20 01:31:12 mhs21a kernel: [<ffffffff802b47b1>]
kmem_cache_alloc+0x137/0x16c
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa002dde2>]
scsi_pool_alloc_command+0x14/0x5b [scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa002de3b>]
scsi_host_alloc_command+0x12/0x56 [scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa002df34>]
__scsi_get_command+0xc/0x7a [scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa002dfd2>] scsi_get_command+0x30/0x96
[scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa003366f>]
scsi_setup_fs_cmnd+0x69/0xb8 [scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa0184155>] sd_prep_fn+0x65/0x860
[sd_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffff80345483>]
elv_next_request+0x153/0x20c
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa0032abe>] scsi_request_fn+0x88/0x52b
[scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffff80345639>] elv_insert+0xfd/0x2b2
Nov 20 01:31:12 mhs21a kernel: [<ffffffff80348265>] __make_request+0x41a/0x499
Nov 20 01:31:12 mhs21a kernel: [<ffffffff803468f2>]
generic_make_request+0x39f/0x3e2
Nov 20 01:31:12 mhs21a kernel: [<ffffffff803469f2>] submit_bio+0xbd/0xc4
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04df1f0>]
_xfs_buf_ioapply+0x200/0x22b [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04dff39>]
xfs_buf_iorequest+0x36/0x61 [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04cb03a>] xlog_bdstrat_cb+0x16/0x3c
[xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04c8fa6>] xlog_sync+0x24a/0x3ec
[xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04ca26f>]
xlog_state_sync+0x1a7/0x2e8 [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04ca41b>] _xfs_log_force+0x6b/0x70
[xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04ca42b>] xfs_log_force+0xb/0x2a
[xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04997a7>]
xfs_alloc_ag_vextent+0x92/0xfd [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa0499f72>]
xfs_alloc_vextent+0x2c1/0x3f7 [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04a69d2>]
xfs_bmap_btalloc+0x75a/0x9a8 [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04a98bc>] xfs_bmapi+0x8c9/0x10c0
[xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04b26d1>]
xfs_dir2_grow_inode+0xde/0x2ff [xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04b3213>]
xfs_dir2_sf_to_block+0x9c/0x53d [xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04b9ed1>]
xfs_dir2_sf_addname+0x197/0x2f1 [xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04b3052>]
xfs_dir_createname+0xdf/0x157 [xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04d91a4>] xfs_symlink+0x69f/0x886
[xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04e3a06>] xfs_vn_symlink+0x6e/0xba
[xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffff802c2cab>] vfs_symlink+0x127/0x1a1
Nov 20 01:31:13 mhs21a kernel: [<ffffffff802c577e>] sys_symlinkat+0x80/0xd6
Nov 20 01:31:13 mhs21a kernel: [<ffffffff8020c3fa>]
system_call_fastpath+0x16/0x1b
Nov 20 01:31:13 mhs21a kernel: [<00007f8c1c223f67>] 0x7f8c1c223f67
Call traces against EXT3:
--------------------------------
Nov 19 11:32:24 mhs21a kernel: Total swap = 9221300kB
Nov 19 11:32:24 mhs21a kernel: 1048576 pages RAM
Nov 19 11:32:24 mhs21a kernel: 67987 pages reserved
Nov 19 11:32:24 mhs21a kernel: 525953 pages shared
Nov 19 11:32:24 mhs21a kernel: 464543 pages non-shared
Nov 19 11:32:24 mhs21a kernel: Neighbour table overflow.
Nov 19 11:45:40 mhs21a kernel: cat: page allocation failure. order:0,
mode:0x20, alloc_flags:0x7,
pflags:0x400000
Nov 19 11:45:40 mhs21a kernel: Pid: 22650, comm: cat Not tainted
2.6.27.5-2-default #1
Nov 19 11:45:40 mhs21a kernel:
Nov 19 11:45:40 mhs21a kernel: Call Trace:
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8020e53e>]
show_trace_log_lvl+0x41/0x58
Nov 19 11:45:40 mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8028e8d2>]
__alloc_pages_internal+0x3eb/0x40b
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802b3acc>]
kmem_cache_alloc_node+0xdd/0x113
Nov 19 11:45:40 mhs21a kernel: [<ffffffff80349dba>] alloc_io_context+0x16/0x83
Nov 19 11:45:40 mhs21a kernel: [<ffffffff80349e42>]
current_io_context+0x1b/0x29
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8034794c>] get_request+0x71/0x3be
Nov 19 11:45:40 mhs21a kernel: [<ffffffff80347cc8>]
get_request_wait+0x2f/0x1b2
Nov 19 11:45:40 mhs21a kernel: [<ffffffff803481c8>] __make_request+0x37d/0x499
Nov 19 11:45:40 mhs21a kernel: [<ffffffff803468f2>]
generic_make_request+0x39f/0x3e2
Nov 19 11:45:40 mhs21a kernel: [<ffffffff803469f2>] submit_bio+0xbd/0xc4
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802dd0cd>] submit_bh+0xde/0xfe
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802df968>]
__block_write_full_page+0x1ca/0x2b2
Nov 19 11:45:40 mhs21a kernel: [<ffffffffa014904d>]
ext3_ordered_writepage+0xc0/0x134 [ext3]
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8028ea83>] __writepage+0xa/0x25
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8028f383>]
write_cache_pages+0x179/0x2c3
Nov 19 11:45:41 mhs21a kernel: [<ffffffff8028f510>] do_writepages+0x27/0x2d
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802d8ef9>]
__sync_single_inode+0x72/0x259
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802d9228>]
__writeback_single_inode+0x148/0x155
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802d96ca>]
generic_sync_sb_inodes+0x290/0x3f4
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802d9b14>]
writeback_inodes+0xa0/0x108
Nov 19 11:45:41 mhs21a kernel: [<ffffffff8028fc82>]
balance_dirty_pages+0x133/0x2bd
Nov 19 11:45:41 mhs21a kernel: [<ffffffff80287bfa>]
generic_perform_write+0x178/0x1a8
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802897cb>]
generic_file_buffered_write+0x82/0x12c
Nov 19 11:45:41 mhs21a kernel: [<ffffffff80289d72>]
__generic_file_aio_write_nolock+0x349/0x37d
Nov 19 11:45:41 mhs21a kernel: [<ffffffff8028a114>]
generic_file_aio_write+0x64/0xc4
Nov 19 11:45:41 mhs21a kernel: [<ffffffffa01465e5>] ext3_file_write+0x16/0x95
[ext3]
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802ba8e1>] do_sync_write+0xce/0x113
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802bb1b6>] vfs_write+0xad/0x156
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802bb31b>] sys_write+0x45/0x6e
Nov 19 11:45:41 mhs21a kernel: [<ffffffff8020c3fa>]
system_call_fastpath+0x16/0x1b
Nov 19 11:45:41 mhs21a kernel: [<00007f19deda5950>] 0x7f19deda5950
Nov 19 11:45:41 mhs21a kernel:
Nov 19 11:45:41 mhs21a kernel: Mem-Info:
Nov 19 11:45:41 mhs21a kernel: Node 0 DMA per-cpu:
Nov 19 11:45:41 mhs21a kernel: CPU 0: hi: 0, btch: 1 usd: 0
Nov 19 11:45:42 mhs21a kernel: CPU 1: hi: 0, btch: 1 usd: 0
Nov 19 11:45:42 mhs21a kernel: CPU 2: hi: 0, btch: 1 usd: 0
Nov 19 11:45:42 mhs21a kernel: CPU 3: hi: 0, btch: 1 usd: 0
Comparing with the earlier reported call traces, it looks like the call traces
are similar only up to the following:
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8020e53e>]
show_trace_log_lvl+0x41/0x58
Nov 19 11:45:40 mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8028e8d2>]
__alloc_pages_internal+0x3eb/0x40b
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
For more information, please see the attached /var/log/messages and dmesg
output for the fs-racer test results.
Thanks...
Manas
=Comment: #4=================================================
Manas K. Nayak -
dmesg output contains the fs-racer test results
=Comment: #5=================================================
Manas K. Nayak -
var-log-messages-FSRCAER-on-lvm.out
=Comment: #8=================================================
Manas K. Nayak -
I am trying to reproduce the bug on SLES11-beta6. I will update the results
once the test is finished.
Thanks...
Manas
=Comment: #9=================================================
Manas K. Nayak -
Hi,
I could reproduce the bug by executing the file system stress test on LVM
partitions against all the file systems (ext2+ext3+xfs+reiserfs) on
SLES11-Beta6 using an HS21.
Here is the call trace from /var/log/messages:
Dec 2 11:38:36 mhs21a kernel: fsstress: page allocation failure. order:0,
mode:0x10020,
alloc_flags:0x7, pflags:0x400140
Dec 2 11:38:36 mhs21a kernel: Pid: 9378, comm: fsstress Tainted: G
2.6.27.7-4-default #1
Dec 2 11:38:36 mhs21a kernel:
Dec 2 11:38:36 mhs21a kernel: Call Trace:
Dec 2 11:38:36 mhs21a kernel: [<ffffffff8020e53e>]
show_trace_log_lvl+0x41/0x58
Dec 2 11:38:36 mhs21a kernel: [<ffffffff804a8b07>] dump_stack+0x69/0x6f
Dec 2 11:38:36 mhs21a kernel: [<ffffffff8028ee0a>]
__alloc_pages_internal+0x3df/0x3ff
Dec 2 11:38:36 mhs21a kernel: [<ffffffff802b404c>] kmem_getpages+0x6f/0x12a
Dec 2 11:38:36 mhs21a kernel: [<ffffffff802b49af>] fallback_alloc+0x15a/0x20a
Dec 2 11:38:36 mhs21a kernel: [<ffffffff802b41d9>]
kmem_cache_alloc_node+0xd2/0x108
Dec 2 11:38:36 mhs21a kernel: [<ffffffff802b4327>] cache_grow+0x118/0x2c0
Dec 2 11:38:36 mhs21a kernel: [<ffffffff802b4a2b>] fallback_alloc+0x1d6/0x20a
Dec 2 11:38:36 mhs21a kernel: [<ffffffff802b4ea6>]
kmem_cache_alloc+0x12a/0x15f
Dec 2 11:38:37 mhs21a kernel: [<ffffffff8028a90d>] mempool_alloc+0x3f/0xf5
Dec 2 11:38:37 mhs21a kernel: [<ffffffff802e1684>] bvec_alloc_bs+0x7d/0xa4
Dec 2 11:38:37 mhs21a kernel: [<ffffffff802e16f2>] bio_alloc_bioset+0x47/0x90
Dec 2 11:38:37 mhs21a kernel: [<ffffffff802e17ab>] bio_alloc+0x10/0x1f
Dec 2 11:38:37 mhs21a kernel: [<ffffffff802e46aa>] mpage_alloc+0x22/0x78
Dec 2 11:38:37 mhs21a kernel: [<ffffffff802e4ae2>]
__mpage_writepage+0x35f/0x4c0
Dec 2 11:38:37 mhs21a kernel: [<ffffffff8028f8ad>]
write_cache_pages+0x174/0x2be
Dec 2 11:38:37 mhs21a kernel: [<ffffffff802e4766>] mpage_writepages+0x40/0x5d
Dec 2 11:38:37 mhs21a kernel: [<ffffffff8028fa33>] do_writepages+0x20/0x2d
Dec 2 11:38:37 mhs21a kernel: [<ffffffff802d96f9>]
__sync_single_inode+0x72/0x259
Dec 2 11:38:42 mhs21a kernel: [<ffffffff802d9a28>]
__writeback_single_inode+0x148/0x155
Dec 2 11:38:42 mhs21a kernel: [<ffffffff802d9ebb>]
generic_sync_sb_inodes+0x290/0x3f4
Dec 2 11:38:42 mhs21a kernel: [<ffffffff802da300>]
writeback_inodes+0x9b/0x103
Dec 2 11:38:42 mhs21a kernel: [<ffffffff802901ac>]
balance_dirty_pages+0x133/0x2bd
Dec 2 11:38:42 mhs21a kernel: [<ffffffff8028814f>]
generic_perform_write+0x178/0x1a8
Dec 2 11:38:42 mhs21a kernel: [<ffffffff80289d17>]
generic_file_buffered_write+0x82/0x12c
Dec 2 11:38:42 mhs21a kernel: [<ffffffff8028a2be>]
__generic_file_aio_write_nolock+0x349/0x37d
Dec 2 11:38:42 mhs21a kernel: [<ffffffff8028a660>]
generic_file_aio_write+0x64/0xc4
Dec 2 11:38:42 mhs21a kernel: [<ffffffff802bb0f9>] do_sync_write+0xce/0x113
Dec 2 11:38:42 mhs21a kernel: [<ffffffff802bb9ce>] vfs_write+0xad/0x156
Dec 2 11:38:46 mhs21a kernel: [<ffffffff802bbb33>] sys_write+0x45/0x6e
Dec 2 11:38:46 mhs21a kernel: [<ffffffff8022cb55>] sysenter_dispatch+0x7/0x46
Dec 2 11:38:46 mhs21a kernel: [<00000000ffffe430>] 0xffffe430
Dec 2 11:38:46 mhs21a kernel:
Dec 2 11:38:46 mhs21a kernel: Mem-Info:
Dec 2 11:38:46 mhs21a kernel: Node 0 DMA per-cpu:
Dec 2 11:38:46 mhs21a kernel: CPU 0: hi: 0, btch: 1 usd: 0
Dec 2 11:38:46 mhs21a kernel: CPU 1: hi: 0, btch: 1 usd: 0
Dec 2 11:38:46 mhs21a kernel: CPU 2: hi: 0, btch: 1 usd: 0
Dec 2 11:38:46 mhs21a kernel: CPU 3: hi: 0, btch: 1 usd: 0
Dec 2 11:38:47 mhs21a kernel: Node 0 DMA32 per-cpu:
Dec 2 11:38:47 mhs21a kernel: CPU 0: hi: 186, btch: 31 usd: 167
Dec 2 11:38:47 mhs21a kernel: CPU 1: hi: 186, btch: 31 usd: 111
Dec 2 11:38:47 mhs21a kernel: CPU 2: hi: 186, btch: 31 usd: 160
Dec 2 11:38:47 mhs21a kernel: CPU 3: hi: 186, btch: 31 usd: 179
Dec 2 11:38:47 mhs21a kernel: Node 0 Normal per-cpu:
Dec 2 11:38:47 mhs21a kernel: CPU 0: hi: 186, btch: 31 usd: 167
Thanks...
Manas
=Comment: #10=================================================
Manas K. Nayak -
While executing a samba stress test (LTP fsstress) and an NFSv4 stress test
(LTP fsstress) together on mounted shares, I have noticed similar "page
allocation failure" call traces on the NFSv4 client side for the rpciod and
kswapd processes against SLES11-beta6.
NOTE: the NFSv4 and samba server shares are mounted on LVM partitions for all
filesystems (ext2, ext3, xfs, reiserfs).
Here are the call traces from /var/log/messages:
Dec 1 23:38:49 mx3500 kernel: rpciod/2: page allocation failure. order:0,
mode:0x0,
alloc_flags:0x5, pflags:0x84208040
Dec 1 23:38:49 mx3500 kernel: Pid: 3781, comm: rpciod/2 Tainted: G
2.6.27.4-2-default #1
Dec 1 23:38:49 mx3500 kernel:
Dec 1 23:38:49 mx3500 kernel: Call Trace:
Dec 1 23:38:49 mx3500 kernel: [<ffffffff8020e42e>]
show_trace_log_lvl+0x41/0x58
Dec 1 23:38:49 mx3500 kernel: [<ffffffff804a06e4>] dump_stack+0x69/0x6f
Dec 1 23:38:49 mx3500 kernel: [<ffffffff8028dcdc>]
__alloc_pages_internal+0x3eb/0x40b
Dec 1 23:38:49 mx3500 kernel: [<ffffffff802b2d49>] kmem_getpages+0x6f/0x12a
Dec 1 23:38:49 mx3500 kernel: [<ffffffff802b36b9>] fallback_alloc+0x15a/0x20a
Dec 1 23:38:49 mx3500 kernel: [<ffffffff802b38d5>] __kmalloc+0x16c/0x1a1
Dec 1 23:38:49 mx3500 kernel: [<ffffffffa03dd8bc>] rpc_malloc+0x44/0x80
[sunrpc]
Dec 1 23:38:49 mx3500 kernel: [<ffffffffa03d77f3>] call_allocate+0xc4/0x13d
[sunrpc]
Dec 1 23:38:49 mx3500 kernel: [<ffffffffa03de023>] __rpc_execute+0x71/0x22b
[sunrpc]
Dec 1 23:38:49 mx3500 kernel: [<ffffffff802502af>] run_workqueue+0xa4/0x14c
Dec 1 23:38:49 mx3500 kernel: [<ffffffff8025042f>] worker_thread+0xd8/0xe7
Dec 1 23:38:49 mx3500 kernel: [<ffffffff802538d1>] kthread+0x47/0x73
Dec 1 23:38:49 mx3500 kernel: [<ffffffff8020d7b9>] child_rip+0xa/0x11
Dec 2 04:06:13 mx3500 kernel: kswapd0: page allocation failure. order:0,
mode:0x20,
alloc_flags:0x7, pflags:0x80a40040
Dec 2 04:06:13 mx3500 kernel: Pid: 34, comm: kswapd0 Tainted: G
2.6.27.4-2-default #1
Dec 2 04:06:15 mx3500 kernel:
Dec 2 04:06:15 mx3500 kernel: Call Trace:
Dec 2 04:06:15 mx3500 kernel: [<ffffffff8020e42e>]
show_trace_log_lvl+0x41/0x58
Dec 2 04:06:15 mx3500 kernel: [<ffffffff804a06e4>] dump_stack+0x69/0x6f
Dec 2 04:06:15 mx3500 kernel: [<ffffffff8028dcdc>]
__alloc_pages_internal+0x3eb/0x40b
Dec 2 04:06:15 mx3500 kernel: [<ffffffff802b2d49>] kmem_getpages+0x6f/0x12a
Dec 2 04:06:15 mx3500 kernel: [<ffffffff802b36b9>] fallback_alloc+0x15a/0x20a
Dec 2 04:06:15 mx3500 kernel: [<ffffffff802b2ee1>]
kmem_cache_alloc_node+0xdd/0x113
Dec 2 04:06:15 mx3500 kernel: [<ffffffff8042a153>] __alloc_skb+0x61/0x1be
Dec 2 04:06:15 mx3500 kernel: [<ffffffff8042ac03>]
__netdev_alloc_skb+0x2c/0x49
Dec 2 04:06:16 mx3500 kernel: [<ffffffffa02ac78b>]
tg3_alloc_rx_skb+0xcd/0x184 [tg3]
Dec 2 04:06:16 mx3500 kernel: [<ffffffffa02b3993>] tg3_rx+0x1d5/0x51f [tg3]
Dec 2 04:06:16 mx3500 kernel: [<ffffffffa02b3d95>] tg3_poll_work+0xb8/0xc6
[tg3]
Dec 2 04:06:16 mx3500 kernel: [<ffffffffa02b3dcd>] tg3_poll+0x2a/0x1ae [tg3]
Dec 2 04:06:16 mx3500 kernel: [<ffffffff8042d9e9>] net_rx_action+0xb5/0x208
Dec 2 04:06:16 mx3500 kernel: [<ffffffff80245a61>] __do_softirq+0x84/0x115
Dec 2 04:06:16 mx3500 kernel: [<ffffffff8020dc9c>] call_softirq+0x1c/0x28
Dec 2 04:06:16 mx3500 kernel: [<ffffffff8020f067>] do_softirq+0x3c/0x81
Dec 2 04:06:16 mx3500 kernel: [<ffffffff80245778>] irq_exit+0x3f/0x83
Dec 2 04:06:16 mx3500 kernel: [<ffffffff8020f349>] do_IRQ+0xbd/0xda
Dec 2 04:06:18 mx3500 kernel: [<ffffffff8020ca2e>] ret_from_intr+0x0/0x29
Dec 2 04:06:18 mx3500 kernel: [<ffffffff802cdc91>] prune_icache+0x6a/0x20d
Dec 2 04:06:18 mx3500 kernel: [<ffffffff802cde49>]
shrink_icache_memory+0x15/0x33
Dec 2 04:06:18 mx3500 kernel: [<ffffffff80292870>] shrink_slab+0xe4/0x157
Dec 2 04:06:18 mx3500 kernel: [<ffffffff80292df3>] balance_pgdat+0x29f/0x3f1
Dec 2 04:06:18 mx3500 kernel: [<ffffffff8029306f>] kswapd+0x12a/0x12c
Dec 2 04:06:18 mx3500 kernel: [<ffffffff802538d1>] kthread+0x47/0x73
Dec 2 04:06:18 mx3500 kernel: [<ffffffff8020d7b9>] child_rip+0xa/0x11
Dec 2 04:06:18 mx3500 kernel:
Thanks...
Manas
=Comment: #15=================================================
Antonio A. Rosales -
*** Bug 50416 has been marked as a duplicate of this bug. ***
=Comment: #16=================================================
Jonathan R. Thomas -
Looks like this occurs with different modes. There is a report on the lkml
citing the same problem
after a mode:0x10 request. We may want to test the patch mentioned here:
http://lkml.org/lkml/2008/10/28/143
=Comment: #17=================================================
Mel Gorman -
Did any of these failures result in the benchmark actually failing? I'm seeing
34 allocation failures, 18 of them unique, but I do not see out-of-memory
messages, oopses, or benchmark-related segfaults.
The segfaults I did see were related to rpcinfo and crash. I don't think NFS
is involved here, but crash segfaulting may explain why you have no dumps.
This is an indication of the system being under heavy stress but dealing with
the situation, albeit in an alarming fashion.
Let's go through some of the reported failures. Each "entry" is a page
allocation failure in the log, and the count is how many times that entry
appeared.
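The per-entry counts can be reproduced mechanically from the log. A minimal sketch; the three-line sample log below is fabricated for illustration (the real input would be /var/log/messages), and only the text after the "kernel: " prefix is compared, so timestamps don't split otherwise-identical entries:

```shell
# Count how often each distinct "page allocation failure" line appears,
# ignoring the timestamp/hostname prefix. The sample stands in for
# /var/log/messages.
log='Nov 17 12:49:31 mhs21a kernel: fsstress: page allocation failure. order:0, mode:0x20
Nov 17 12:09:31 mhs21a kernel: dbench: page allocation failure. order:0, mode:0x20
Nov 18 03:11:02 mhs21a kernel: dbench: page allocation failure. order:0, mode:0x20'

printf '%s\n' "$log" \
  | grep 'page allocation failure' \
  | sed 's/^.*kernel: //' \
  | sort | uniq -c | sort -rn
```

This prints each unique failure line prefixed by its count, most frequent first; grouping the full backtraces (as done for the "18 unique" figure above) takes more work than a one-liner.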
entry 0 count 1
===============================
mhs21a kernel: fsstress: page allocation failure. order:0, mode:0x10020,
alloc_flags:0x7,
pflags:0x400140
mhs21a kernel: Pid comm: fsstress Not tainted 2.6.27.5-2-default #1
mhs21a kernel:
mhs21a kernel: Call Trace:
mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
mhs21a kernel: [<ffffffff802b3acc>] kmem_cache_alloc_node+0xdd/0x113
mhs21a kernel: [<ffffffff802b3c1a>] cache_grow+0x118/0x2c0
mhs21a kernel: [<ffffffff802b431e>] fallback_alloc+0x1d6/0x20a
mhs21a kernel: [<ffffffff802b47b1>] kmem_cache_alloc+0x137/0x16c
mhs21a kernel: [<ffffffff8028a3ca>] mempool_alloc+0x48/0xfe
mhs21a kernel: [<ffffffff8037536a>] __sg_alloc_table+0x62/0x102
mhs21a kernel: [<ffffffffa0033201>] scsi_alloc_sgtable+0x24/0x45 [scsi_mod]
mhs21a kernel: [<ffffffffa003323d>] scsi_init_sgtable+0x1b/0x6c [scsi_mod]
mhs21a kernel: [<ffffffffa00334e5>] scsi_init_io+0x21/0x142 [scsi_mod]
<SNIP>
This is an atomic allocation, high priority, that can use zone normal but not
enter the FS. Free lists were below minimum watermarks and the per-cpu lists
on two CPUs were empty, so the allocator could not do anything. However,
glancing at mainline here, I see that when this error occurs, BLKPREP_DEFER is
returned and the request is retried later. Impact - you are delayed but you
don't die.
From the mainline code, I would have expected this warning to be suppressed,
but something different might be happening in the SLES kernel. What's the best
way to get a look at the SLES kernel source again? I forget :(
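The mode: values in these failure reports can be decoded into GFP flags, which is how "atomic, can't enter the FS" is read off the trace. A minimal sketch, assuming the 2.6.27-era bit values from include/linux/gfp.h (__GFP_WAIT=0x10, __GFP_HIGH=0x20, __GFP_IO=0x40, __GFP_FS=0x80, __GFP_NOMEMALLOC=0x10000); under that assumption mode:0x20 is __GFP_HIGH alone, i.e. GFP_ATOMIC: high priority, no waiting, no IO, no FS re-entry.

```shell
# Decode a gfp_mask hex value into flag names. The bit values are taken
# from 2.6.27-era include/linux/gfp.h and are an assumption here; later
# kernels renumbered these flags.
decode_gfp() {
    mask=$(( $1 ))
    out=""
    [ $(( mask & 0x10 )) -ne 0 ] && out="$out __GFP_WAIT"
    [ $(( mask & 0x20 )) -ne 0 ] && out="$out __GFP_HIGH"
    [ $(( mask & 0x40 )) -ne 0 ] && out="$out __GFP_IO"
    [ $(( mask & 0x80 )) -ne 0 ] && out="$out __GFP_FS"
    [ $(( mask & 0x10000 )) -ne 0 ] && out="$out __GFP_NOMEMALLOC"
    echo "${out# }"
}

decode_gfp 0x20      # the mode in most traces above: __GFP_HIGH (GFP_ATOMIC)
decode_gfp 0x10020   # as seen in comment #9: __GFP_HIGH __GFP_NOMEMALLOC
```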
I'm skipping the second unique failure because it's like the one above, except
that the call chain is in a different order, which is probably just a mistake.
entry 2 count 1
===============================
mhs21a kernel: fsstress: page allocation failure. order:0, mode:0x20,
alloc_flags:0x7, pflags:0x400040
mhs21a kernel: Pid comm: fsstress Not tainted 2.6.27.5-2-default #1
mhs21a kernel:
mhs21a kernel: Call Trace:
mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
mhs21a kernel: [<ffffffff802b3acc>] kmem_cache_alloc_node+0xdd/0x113
mhs21a kernel: [<ffffffff80349dba>] alloc_io_context+0x16/0x83
mhs21a kernel: [<ffffffff80349e42>] current_io_context+0x1b/0x29
mhs21a kernel: [<ffffffff8034794c>] get_request+0x71/0x3be
mhs21a kernel: [<ffffffff80347cc8>] get_request_wait+0x2f/0x1b2
mhs21a kernel: [<ffffffff803481c8>] __make_request+0x37d/0x499
mhs21a kernel: [<ffffffff803468f2>] generic_make_request+0x39f/0x3e2
mhs21a kernel: [<ffffffff803469f2>] submit_bio+0xbd/0xc4
mhs21a kernel: [<ffffffff802e401a>] mpage_bio_submit+0x22/0x26
mhs21a kernel: [<ffffffff802e4073>] mpage_writepages+0x55/0x5d
mhs21a kernel: [<ffffffff8028f509>] do_writepages+0x20/0x2d
mhs21a kernel: [<ffffffff802d8ef9>] __sync_single_inode+0x72/0x259
mhs21a kernel: [<ffffffff802d9228>] __writeback_single_inode+0x148/0x155
mhs21a kernel: [<ffffffff802d96ca>] generic_sync_sb_inodes+0x290/0x3f4
mhs21a kernel: [<ffffffff802d98d0>] sync_inodes_sb+0x8a/0x8f
<SNIP>
Another atomic allocation. Same type of deal whereby we are below watermarks
and there is not much the allocator can do other than fail. This time, it's
get_request_wait() that has the necessary smarts to go onto a wait-queue and
wait for IO to complete until current_io_context() gives something useful.
Same as above basically: you wait around a bit but you don't die. Again, not
sure why this warning is not suppressed - possibly an oversight.
Next useful one:
entry 4 count 1
===============================
mhs21a kernel: fsstress: page allocation failure. order:0, mode:0x20,
alloc_flags:0x7, pflags:0x420140
mhs21a kernel: Pid comm: fsstress Not tainted 2.6.27.5-2-default #1
mhs21a kernel:
mhs21a kernel: Call Trace:
mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
mhs21a kernel: [<ffffffff802b47b1>] kmem_cache_alloc+0x137/0x16c
mhs21a kernel: [<ffffffffa002dde2>] scsi_pool_alloc_command+0x14/0x5b
[scsi_mod]
mhs21a kernel: [<ffffffffa002de3b>] scsi_host_alloc_command+0x12/0x56
[scsi_mod]
mhs21a kernel: [<ffffffffa002df34>] __scsi_get_command+0xc/0x7a [scsi_mod]
mhs21a kernel: [<ffffffffa002dfd2>] scsi_get_command+0x30/0x96 [scsi_mod]
mhs21a kernel: [<ffffffffa003366f>] scsi_setup_fs_cmnd+0x69/0xb8 [scsi_mod]
mhs21a kernel: [<ffffffffa0184155>] sd_prep_fn+0x65/0x860 [sd_mod]
mhs21a kernel: [<ffffffff80345483>] elv_next_request+0x153/0x20c
mhs21a kernel: [<ffffffffa0032abe>] scsi_request_fn+0x88/0x52b [scsi_mod]
mhs21a kernel: [<ffffffff80348265>] __make_request+0x41a/0x499
mhs21a kernel: [<ffffffff803468f2>] generic_make_request+0x39f/0x3e2
mhs21a kernel: [<ffffffff803469f2>] submit_bio+0xbd/0xc4
mhs21a kernel: [<ffffffffa046b1f0>] _xfs_buf_ioapply+0x200/0x22b [xfs]
mhs21a kernel: [<ffffffffa046bf39>] xfs_buf_iorequest+0x36/0x61 [xfs]
mhs21a kernel: [<ffffffffa045703a>] xlog_bdstrat_cb+0x16/0x3c [xfs]
mhs21a kernel: [<ffffffffa0454fa6>] xlog_sync+0x24a/0x3ec [xfs]
mhs21a kernel: [<ffffffffa04568f8>] xlog_write+0x336/0x4a1 [xfs]
Same atomic request, same story with the watermarks. scsi_setup_fs_cmnd() returns
BLKPREP_DEFER in this case and the request gets replugged later. Same deal: delayed,
but nobody dies.
Entry 5 is the same (scsi_setup_fs_cmnd path) as entry 4, except kblockd hit it.
Entry 6 is the same (scsi_setup_fs_cmnd path), except XFS was the source this time;
same delaying end result.
Entry 7 is the same as entry 6, except we took a slightly different path and ended
up in the same place.
Entry 8 is scsi_setup_fs_cmnd through yet another path - direct IO this time.
Entry 9 is the same idea.
Oh, entry 10 is interesting.
entry 10 count 5
===============================
mhs21a kernel: dbench: page allocation failure. order:0, mode:0x20,
alloc_flags:0x7, pflags:0x402000
mhs21a kernel: Pid comm: dbench Not tainted 2.6.27.5-2-default #1
mhs21a kernel:
mhs21a kernel: Call Trace:
mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
mhs21a kernel: [<ffffffff802b47b1>] kmem_cache_alloc+0x137/0x16c
mhs21a kernel: [<ffffffff8024c652>] send_signal+0xed/0x240
mhs21a kernel: [<ffffffff8024d0dc>] group_send_sig_info+0x48/0x6f
mhs21a kernel: [<ffffffff8024d20c>] __kill_pgrp_info+0x42/0x67
mhs21a kernel: [<ffffffff8024d2ee>] kill_something_info+0x84/0xef
mhs21a kernel: [<ffffffff8024d3c3>] sys_kill+0x6a/0x76
mhs21a kernel: [<ffffffff8022cab5>] sysenter_dispatch+0x7/0x46
mhs21a kernel:
Atomic allocation again, but this time we can't send a signal. Userspace gets
EAGAIN, which it should handle.
Entry 11 is interesting as well
entry 11 count 1
===============================
mhs21a kernel: fsstress: page allocation failure. order:0, mode:0x20,
alloc_flags:0x7, pflags:0x400140
mhs21a kernel: Pid comm: fsstress Not tainted 2.6.27.5-2-default #1
mhs21a kernel:
mhs21a kernel: Call Trace:
mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
mhs21a kernel: [<ffffffff802b47b1>] kmem_cache_alloc+0x137/0x16c
mhs21a kernel: [<ffffffff8024c652>] send_signal+0xed/0x240
mhs21a kernel: [<ffffffff8024d0dc>] group_send_sig_info+0x48/0x6f
mhs21a kernel: [<ffffffff8024d134>] kill_pid_info+0x31/0x3b
mhs21a kernel: [<ffffffff8024553d>] it_real_fn+0x17/0x1e
mhs21a kernel: [<ffffffff80256d67>] run_hrtimer_pending+0x78/0x13a
mhs21a kernel: [<ffffffff8024659d>] __do_softirq+0x84/0x115
mhs21a kernel: [<ffffffff8020ddac>] call_softirq+0x1c/0x28
mhs21a kernel: [<ffffffff8020f177>] do_softirq+0x3c/0x81
mhs21a kernel: [<ffffffff802462b4>] irq_exit+0x3f/0x83
mhs21a kernel: [<ffffffff8021cce0>] smp_apic_timer_interrupt+0x92/0xaa
mhs21a kernel: [<ffffffff8020d523>] apic_timer_interrupt+0x83/0x90
mhs21a kernel: [<ffffffff80292ee2>] shrink_inactive_list+0x459/0x481
mhs21a kernel: [<ffffffff8029300d>] shrink_zone+0x103/0x126
mhs21a kernel: [<ffffffff80293360>] shrink_zones+0xe2/0x119
mhs21a kernel: [<ffffffff80293fca>] do_try_to_free_pages+0x150/0x2a3
mhs21a kernel: [<ffffffff80294201>] try_to_free_pages+0x60/0x65
mhs21a kernel: [<ffffffff8028e74c>] __alloc_pages_internal+0x265/0x40b
mhs21a kernel: [<ffffffff802894d2>] find_or_create_page+0x32/0x71
mhs21a kernel: [<ffffffffa046ba01>] _xfs_buf_lookup_pages+0x10b/0x2e0 [xfs]
mhs21a kernel: [<ffffffffa046c5fb>] xfs_buf_get_flags+0x6d/0x147 [xfs]
mhs21a kernel: [<ffffffffa046c6e7>] xfs_buf_read_flags+0x12/0x81 [xfs]
mhs21a kernel: [<ffffffffa0461acd>] xfs_trans_read_buf+0x47/0x2af [xfs]
mhs21a kernel: [<ffffffffa044f057>] xfs_imap_to_bp+0x3f/0xfa [xfs]
mhs21a kernel: [<ffffffffa044f24c>] xfs_itobp+0xa0/0xe7 [xfs]
mhs21a kernel: [<ffffffffa04512ce>] xfs_iread+0x79/0x1eb [xfs]
mhs21a kernel: [<ffffffffa044c8e8>] xfs_iget_core+0x3b1/0x685 [xfs]
mhs21a kernel: [<ffffffffa044cc9e>] xfs_iget+0xe2/0x188 [xfs]
mhs21a kernel: [<ffffffffa0466734>] xfs_lookup+0x79/0xa5 [xfs]
mhs21a kernel: [<ffffffffa046f3d7>] xfs_vn_lookup+0x3c/0x78 [xfs]
Might have lost a clock event there, don't know for sure.
Entry 12 similar to 11.
Entry 13 similar to 11.
Entry 14 similar to 11.
Entry 15 similar to 11.
Entry 16 is in the network subsystem. It handles the failure, but I'm not sure what
the consequences are. I would guess more delays, but that's based more on faith that
this sort of failure is handled everywhere than on evidence. Maybe packets get
dropped and resent later.
Entry 17 is a similar idea to 16.
entry 18 count 3
===============================
mhs21a kernel: fsstress: page allocation failure. order:0, mode:0x20,
alloc_flags:0x7, pflags:0x400140
mhs21a kernel: Pid comm: fsstress Not tainted 2.6.27.5-2-default #1
mhs21a kernel:
mhs21a kernel: Call Trace:
mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
mhs21a kernel: [<ffffffff802b44be>] __kmalloc+0x16c/0x1a1
mhs21a kernel: [<ffffffffa04ae70d>] reiserfs_get_block+0xca1/0xf0e [reiserfs]
mhs21a kernel: [<ffffffffa04ae9bb>] reiserfs_get_blocks_direct_io+0x41/0x95
[reiserfs]
mhs21a kernel: [<ffffffff802e35a7>] do_direct_IO+0x147/0x369
mhs21a kernel: [<ffffffff802e3a81>] direct_io_worker+0x174/0x309
mhs21a kernel: [<ffffffff802e3e87>] __blockdev_direct_IO+0x271/0x2c3
mhs21a kernel: [<ffffffffa04aae27>] reiserfs_direct_IO+0x4c/0x51 [reiserfs]
mhs21a kernel: [<ffffffff80289976>] generic_file_direct_write+0x101/0x1b4
mhs21a kernel: [<ffffffff80289cbe>]
__generic_file_aio_write_nolock+0x295/0x37d
mhs21a kernel: [<ffffffff8028a114>] generic_file_aio_write+0x64/0xc4
mhs21a kernel: [<ffffffff802ba8e1>] do_sync_write+0xce/0x113
mhs21a kernel: [<ffffffff802bb1b6>] vfs_write+0xad/0x156
mhs21a kernel: [<ffffffff802bb31b>] sys_write+0x45/0x6e
mhs21a kernel: [<ffffffff8022cab5>] sysenter_dispatch+0x7/0x46
mhs21a kernel: [<00000000ffffe430>] 0xffffe430
Ran out of time looking at this one. I suspect userspace gets a 0 and tries again,
but I don't know for 100% sure.
Bottom line: I suspect most if not all of these are harmless (unless you count
delays as harm), but they are alarming if you are reading the logs. A filesystems
expert should probably double-check this analysis to be sure these really are
harmless, and decide whether they should be suppressed.
=Comment: #18=================================================
Richard A. Lary -
What's the best way to get a look at the SLES kernel source again? I forget :(
While there may be other ways, you can have a look on one of my power systems.
Miklos, this looks like a dup of our most favorite crash; can you please
double-check?
It does look like the same bug. It should be fixed in -rc1, or you can try the
latest kotd with this change:
* Thu Dec 04 2008 mszeredi@suse.de
- patches.suse/SoN-fix-uninitialized-variable.patch: Fix use of
uninitialized variable in cache_grow() (bnc#444597).
*** This bug has been marked as a duplicate of bug 444597 ***
https://bugzilla.novell.com/show_bug.cgi?id=444597
https://bugzilla.novell.com/show_bug.cgi?id=50416
--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.