[Bug 557760] New: rpciod taking 100% of cpu, box *almost* unusable
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c0

Summary: rpciod taking 100% of cpu, box *almost* unusable
Classification: openSUSE
Product: openSUSE 11.2
Version: Final
Platform: x86-64
OS/Version: Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: jnelson-suse@jamponi.net
QAContact: qa@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.4) Gecko/20091016 SUSE/3.5.4-1.1.2 Firefox/3.5.4

While reading/writing an .iso over NFS (gig-e connection), rpciod/0 started consuming 100% of the CPU, and it stayed that way. The mount is NFSv4.

echo 't' > /proc/sysrq-trigger and looking for the process:

Nov 23 08:58:53 frank kernel: [ 3169.336539] rpciod/0 R running task 0 1471 2 0x00000008
Nov 23 08:58:53 frank kernel: [ 3169.336539] 0000000000000000 000000006521588b ffff88002c45c610 ffffffffa029b730
Nov 23 08:58:53 frank kernel: [ 3169.336539] ffffffffa0297be0 ffffc90010a43448 ffff88002c53be20 ffffffffa0297c0d
Nov 23 08:58:53 frank kernel: [ 3169.336539] 0000000000000010 000000006521588b ffff88002c53be80 ffffffff81088631
Nov 23 08:58:53 frank kernel: [ 3169.336539] Call Trace:
Nov 23 08:58:53 frank kernel: [ 3169.336539] Inexact backtrace:
Nov 23 08:58:53 frank kernel: [ 3169.336539]
Nov 23 08:58:53 frank kernel: [ 3169.336539] [<ffffffffa029b730>] ? rpc_async_schedule+0x0/0x40 [sunrpc]
Nov 23 08:58:53 frank kernel: [ 3169.336539] [<ffffffffa029b753>] ? rpc_async_schedule+0x23/0x40 [sunrpc]
Nov 23 08:58:53 frank kernel: [ 3169.336539] [<ffffffff81088631>] ? run_workqueue+0xc1/0x1f0
Nov 23 08:58:53 frank kernel: [ 3169.336539] [<ffffffff81088814>] ? worker_thread+0xb4/0x140
Nov 23 08:58:54 frank kernel: [ 3169.336539] [<ffffffff8108f390>] ? autoremove_wake_function+0x0/0x60
Nov 23 08:58:54 frank kernel: [ 3169.336539] [<ffffffff81088760>] ? worker_thread+0x0/0x140
Nov 23 08:58:54 frank kernel: [ 3169.336539] [<ffffffff8108ec26>] ? kthread+0xb6/0xc0
Nov 23 08:58:54 frank kernel: [ 3169.336539] [<ffffffff8100d70a>] ? child_rip+0xa/0x20
Nov 23 08:58:54 frank kernel: [ 3169.336539] [<ffffffff8108eb70>] ? kthread+0x0/0xc0
Nov 23 08:58:54 frank kernel: [ 3169.336539] [<ffffffff8100d700>] ? child_rip+0x0/0x20

Other processes are stuck in 'D' for more than 120s.

This appears to be a regression. Easy to reproduce. Client and server are both openSUSE 11.2, and up-to-date as of now.

I enabled all of the rpc debug flags for rpc and that was a mistake. I see a lot of this:

Nov 23 09:12:27 frank kernel: [ 3855.711413] 16588 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711446] 16589 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711482] 16590 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711517] 16591 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711550] 16592 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711582] 16593 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711619] 16594 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711653] 16595 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711686] 16596 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711722] 16597 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711756] 16598 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711788] 16599 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711821] 16600 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711853] 16601 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711886] 16602 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711918] 16603 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711951] 16604 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.711983] 16605 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.712016] 16606 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.712048] 16607 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.712079] 16608 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.712115] 16609 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.712148] 16610 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog

but a bunch of other stuff too.

Reproducible: Always

Steps to Reproduce:
1.
2.
3.

--
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
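With a task dump this repetitive, a tally is easier to read than the raw lines. The following is a minimal sketch and not part of the original report. It assumes each task line has the column layout printed by the rpc debug dump (pid, flags, status, client, request, timeout, ops, then protocol, procedure, `a:` action and `q:` queue); `TASK_RE` and `summarize_tasks` are names made up for this example.

```python
import re
from collections import Counter

# Assumed layout of one task line after the kernel timestamp:
#   pid flgs status client rqstp timeout ops proto PROC a:action q:queue
TASK_RE = re.compile(
    r"\]\s+(?P<pid>\d+)\s+(?P<flags>[0-9a-f]{4})\s+(?P<status>-?\d+)\s+"
    r"\S+\s+\S+\s+\d+\s+\S+\s+(?P<proto>\S+)\s+(?P<proc>\S+)\s+"
    r"a:(?P<action>\S+)\s+q:(?P<queue>\S+)"
)


def summarize_tasks(lines):
    """Count RPC tasks by (procedure, action, queue); non-task lines are skipped."""
    counts = Counter()
    for line in lines:
        match = TASK_RE.search(line)
        if match:
            counts[(match["proc"], match["action"], match["queue"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Nov 23 09:12:27 frank kernel: [ 3855.711413] 16588 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog",
        "Nov 23 09:12:27 frank kernel: [ 3855.711446] 16589 0281 -11 ffff88002cc1d800 (null) 0 ffffffffa035ce10 nfsv4 RENEW a:call_reserveresult q:xprt_backlog",
    ]
    for key, count in summarize_tasks(sample).items():
        print(count, *key)
```

Fed the full log, this would show at a glance how many RENEW tasks are parked on xprt_backlog compared with anything else.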
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c

Jeff Mahoney <jeffm@novell.com> changed:

           What    |Removed                                    |Added
----------------------------------------------------------------------------
           Priority|P5 - None                                  |P3 - Medium
                 CC|                                           |jeffm@novell.com
         AssignedTo|kernel-maintainers@forge.provo.novell.com  |nfbrown@novell.com
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c1

Neil Brown <nfbrown@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #1 from Neil Brown <nfbrown@novell.com> 2009-11-25 00:59:56 UTC ---
Thanks for the report. Could you please attach the entire 'sysrq' task list, and several hundred lines from the rpc debug tracing? Your excerpts are useful, but having the entire thing can be useful as well. I'll see what I can find.
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c2

--- Comment #2 from Jon Nelson <jnelson-suse@jamponi.net> 2009-12-01 02:44:54 UTC ---
I'll send what I can; however, some trimming might be in order. I have 10,000 lines here. Perhaps this is useful.

The following snippet appears *thousands* of times, repeating *exactly* with only the "now" values updating: lines 6687 through 9926 or thereabouts:

Nov 23 09:11:07 frank kernel: [ 3855.686461] RPC: worker connecting xprt ffff88002c45c000 to address: addr=192.168.2.1 port=2049 proto=tcp
Nov 23 09:11:07 frank kernel: [ 3855.686469] RPC: ffff88002c45c000 connect status 99 connected 0 sock state 7
Nov 23 09:11:07 frank kernel: [ 3855.686474] RPC: 16303 __rpc_wake_up_task (now 4295856217)
Nov 23 09:11:07 frank kernel: [ 3855.686478] RPC: 16303 disabling timer
Nov 23 09:11:07 frank kernel: [ 3855.686483] RPC: 16303 removed from queue ffff88002c45c2f0 "xprt_pending"
Nov 23 09:11:07 frank kernel: [ 3855.686487] RPC: __rpc_wake_up_task done
Nov 23 09:11:07 frank kernel: [ 3855.686492] RPC: 16303 __rpc_execute flags=0x1
Nov 23 09:11:07 frank kernel: [ 3855.686495] RPC: 16303 xprt_connect_status: retrying
Nov 23 09:11:07 frank kernel: [ 3855.686500] RPC: 16303 call_connect_status (status -11)
Nov 23 09:11:07 frank kernel: [ 3855.686504] RPC: 16303 call_transmit (status 0)
Nov 23 09:11:07 frank kernel: [ 3855.686508] RPC: 16303 xprt_prepare_transmit
Nov 23 09:11:07 frank kernel: [ 3855.686512] RPC: 16303 rpc_xdr_encode (status 0)
Nov 23 09:11:07 frank kernel: [ 3855.686517] RPC: 16303 marshaling UNIX cred ffff88002c710a40
Nov 23 09:11:07 frank kernel: [ 3855.686522] RPC: 16303 using AUTH_UNIX cred ffff88002c710a40 to wrap rpc data
Nov 23 09:11:07 frank kernel: [ 3855.686527] RPC: 16303 xprt_transmit(131252)
Nov 23 09:11:07 frank kernel: [ 3855.686532] RPC: xs_tcp_send_request(131252) = -32
Nov 23 09:11:07 frank kernel: [ 3855.686536] RPC: xs_tcp_state_change client ffff88002c45c000...
Nov 23 09:11:07 frank kernel: [ 3855.686541] RPC: state 7 conn 0 dead 0 zapped 1
Nov 23 09:11:07 frank kernel: [ 3855.686545] RPC: disconnected transport ffff88002c45c000
Nov 23 09:11:07 frank kernel: [ 3855.686551] RPC: 16303 call_status (status -32)
Nov 23 09:11:07 frank kernel: [ 3855.686555] RPC: 16303 call_bind (status 0)
Nov 23 09:11:07 frank kernel: [ 3855.686559] RPC: 16303 call_connect xprt ffff88002c45c000 is not connected
Nov 23 09:11:07 frank kernel: [ 3855.686564] RPC: 16303 xprt_connect xprt ffff88002c45c000 is not connected
Nov 23 09:11:08 frank kernel: [ 3855.686570] RPC: 16303 sleep_on(queue "xprt_pending" time 4295856217)
Nov 23 09:11:08 frank kernel: [ 3855.686575] RPC: 16303 added to queue ffff88002c45c2f0 "xprt_pending"
Nov 23 09:11:08 frank kernel: [ 3855.686580] RPC: 16303 setting alarm for 60000 ms
Nov 23 09:11:08 frank kernel: [ 3855.686584] RPC: xs_connect delayed xprt ffff88002c45c000 for 0 seconds

What follows are these lines:

Nov 23 09:12:27 frank kernel: [ 3855.702614] RPC: worker connecting xprt ffff88002c45c000 to address: addr=192.168.2.1 port=2049 proto=tcp
Nov 23 09:12:27 frank kernel: [ 3855.702622] RPC: ffff88002c45c000 connect status 99 connected 0 sock state 7
Nov 23 09:12:27 frank kernel: [ 3855.702628] RPC: 16303 __rpc_wake_up_task (now 4295856221)
Nov 23 09:12:27 frank kernel: [ 3855.702632] RPC: 16303 disabling timer
Nov 23 09:12:27 frank kernel: [ 3855.702636] RPC: 16303 removed from queue ffff88002c45c2f0 "xprt_pending"
Nov 23 09:12:27 frank kernel: [ 3855.702641] RPC: __rpc_wake_up_task done
Nov 23 09:12:27 frank kernel: [ 3855.702645] RPC: 16303 __rpc_execute flags=0x1
Nov 23 09:12:27 frank kernel: [ 3855.702649] RPC: 16303 xprt_connect_status: retrying
Nov 23 09:12:27 frank kernel: [ 3855.702653] RPC: 16303 call_connect_status (status -11)
Nov 23 09:12:27 frank kernel: [ 3855.702672] RPC: 16303 call_transmit (status 0)
Nov 23 09:12:27 frank kernel: [ 3855.702676] RPC: 16303 xprt_prepare_transmit
Nov 23 09:12:27 frank kernel: [ 3855.702680] RPC: 16303 rpc_xdr_encode (status 0)
Nov 23 09:12:27 frank kernel: [ 3855.702685] RPC: 16303 marshaling UNIX cred ffff88002c710a40
Nov 23 09:12:27 frank kernel: [ 3855.702690] RPC: 16303 using AUTH_UNIX cred ffff88002c710a40 to wrap rpc data
Nov 23 09:12:27 frank kernel: [ 3855.702695] RPC: 16303 xprt_transmit(131252)
Nov 23 09:12:27 frank kernel: [ 3855.702700] RPC: xs_tcp_send_request(131252) = -32
Nov 23 09:12:27 frank kernel: [ 3855.702705] RPC: xs_tcp_state_change client ffff88002c45c000...
Nov 23 09:12:27 frank kernel: [ 3855.702710] RPC: state 7 conn 0 dead 0 zapped 1
Nov 23 09:12:27 frank kernel: [ 3855.702714] RPC: disconnected transport ffff88002c45c000
Nov 23 09:12:27 frank kernel: [ 3855.702724] -pid- flgs status -client- --rqstp- -timeout ---ops--
Nov 23 09:12:27 frank kernel: [ 3855.702779] 15517 0001 -11 ffff88002f5dd200 (null) 0 ffffffffa035c500 nfsv4 WRITE a:call_reserveresult q:xprt_backlog
Nov 23 09:12:27 frank kernel: [ 3855.702826] 15906 0001 -11 ffff88002f5dd200 ffff880025594930 0 ffffffffa035c500 nfsv4 WRITE a:call_status q:xprt_resend
Nov 23 09:12:27 frank kernel: [ 3855.702863] 15907 0001 -11 ffff88002f5dd200 ffff8800255943f0 0 ffffffffa035c500 nfsv4 WRITE a:call_status q:xprt_resend
Nov 23 09:12:27 frank kernel: [ 3855.702897] 15908 0001 -11 ffff88002f5dd200 ffff880025594fc0 0 ffffffffa035c500 nfsv4 WRITE a:call_status q:xprt_resend
Nov 23 09:12:27 frank kernel: [ 3855.702933] 15920 0001 -11 ffff88002f5dd200 ffff8800255953b0 0 ffffffffa035c500 nfsv4 WRITE a:call_status q:xprt_resend
Nov 23 09:12:27 frank kernel: [ 3855.702970] 15922 0001 -11 ffff88002f5dd200 ffff880025594d20 0 ffffffffa035c500 nfsv4 WRITE a:call_status q:xprt_resend

... and so on as in the first comment.
Prior to lines 6587 as above, there appears to have been some corruption: Nov 23 09:11:05 frank kernel: <48002000 mffffnnecting xprt f8.2 ] RPCabling ti0 "xpr832.9.905609]connect_status (statusepare_transmit Nov 23 09:11:05 frank kernel: [ 3832.905633] RPC: 16303PC: 16303 marshaling UNIX cred ffff88002c710a40 Nov 23 09:11:05 frank kernel: [ 3832.905646] RPC: 16303 using AUTH_UNIX cred ffff88002c710a40 to wrap rpc data Nov 23 09:11:05 frank kernel: [ 3832.905654] RPC: 16303 xprt_transmit(131252) Nov 23 09:11:05 frank kernel: [ 3832.905661] RPC: xs_tcp_send_request(131252) = -32 Nov 23 09:11:05 frank kernel: [ 3832.905667] RPC: xs_tcp_state_change client ffff88002c45c000... Nov 23 09:11:05 frank kernel: [ 3832.905674] RPC: e 7 nectC: 16 RPC: 162.90505715] RPC: e 4295850522) Nov 23 09:11:05 frank kernel: 8002000 fffnnecting xprt ffff8808.2. ] Rabling0 "x832..905790] RPC: connect_status (separe_transmit Nov 23 09:11:05 frank kernel: [ 3832.905816] RPC: 16303 rpc_xdrPC: 16303 RP0a40 32.905843]32 Nov 23 09:11:06 frank kernel: [liene 7 nectC: RPC 1632.90505896] RPC: 16e 4295850522) Nov 23 09:11:06 frank kernel: <8002c4000 mfffnnecting xprt ffff88.2 ] RPC:abling timer Nov 23 09:11:06 frank kernel: <0 "xpr832..905971] connect_status (stepare_transmit Nov 23 09:11:06 frank kernel: [ 3832.905997] RPC: 16303 rpPC: 16303 RP0a4032.906025] RP32 Nov 23 09:11:06 frank kernel: [ 3liene 7 necC: RPC 162.90606077] RPC: 16e 4295850522)8000 ffff8nnecting xprt 8.2 ] RPCabling tim0 "xp832..906154connect_status (staepare_transmit Nov 23 09:11:06 frank kernel: [ 3832.906179] RPC: 16303 rpc_PC: 1630 RP0a4032.906207] 32 Nov 23 09:11:06 frank kernel: [ 383liee 7 necC: 1 RPC 1632.9006260] RPC: 1e 4295850522)8002c4000ffffnnecting xprt f8.2 ] RPabling 0 "xp832.9.90633connect_status epare_transmit Nov 23 09:11:06 frank kernel: [ 3832.906362] RPC: 16303 rpcPC: 1630 RPC0a4032.906389] 32 Nov 23 09:11:06 frank kernel: [ 38liee 7 necC: RPC 1632.9006442] RPC: 1e 4295850522)8002c4000 mffffnnecting xprt 
ff8.2 ] RPCabling 0 "x832.9.9065connect_status (epare_transmit Nov 23 09:11:06 frank kernel: [ 3832.906542] RPC: 16303PC: 1630 RPC0a40 32.906569] R32 Nov 23 09:11:06 frank kernel: [ 38liee 7 necteC: RPC: 162.9006623] RPCe 4295850522) Nov 23 09:11:06 frank kernel: 8002c45c2f0 "xprt_pendin000 mffffnnecting xprt f8.2 ] RPC:abling ti0 "x832..906699]connect_status (separe_transmit I can still attach the whole thing. Oh, look. More NFS weirdness: [jnelson@worklaptop:~] ls -lah /multimedia/trace -rw-r--r-- 1 4294967294 4294967294 997K 2009-11-30 20:43 /multimedia/trace [jnelson@worklaptop:~] sigh. -- Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.
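One way to confirm that a block of trace lines really repeats with only the "now"/"time" values changing is to strip the volatile parts and count what is left. This is a rough sketch under stated assumptions, not something used in the thread: it assumes the syslog prefix and kernel timestamp look like the uncorrupted lines quoted in this comment, and `normalize` and `repeated_lines` are hypothetical helper names.

```python
import re
from collections import Counter

# Syslog prefix plus kernel timestamp, e.g.
# "Nov 23 09:11:07 frank kernel: [ 3855.686474] "
SYSLOG_PREFIX = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+kernel:\s+\[\s*\d+\.\d+\]\s*"
)
# Jiffies values that change on every pass through the loop.
VOLATILE = re.compile(r"\b(now|time)\s+\d+")


def normalize(line):
    """Drop the timestamp prefix and mask the changing jiffies values."""
    line = SYSLOG_PREFIX.sub("", line.strip())
    return VOLATILE.sub(r"\1 N", line)


def repeated_lines(lines, min_count=2):
    """Return {normalized line: count} for lines seen at least min_count times."""
    counts = Counter(normalize(line) for line in lines if line.strip())
    return {text: n for text, n in counts.items() if n >= min_count}
```

Lines mangled by the log-buffer corruption will not match the prefix pattern and simply show up as unique entries.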
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c3

Neil Brown <nfbrown@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |jnelson-suse@jamponi.net

--- Comment #3 from Neil Brown <nfbrown@novell.com> 2009-12-08 01:12:21 UTC ---
I didn't want "the whole thing". I requested two things quite explicitly, and you have chosen to send me something different. Sigh.

It appears that the client tries to establish a connection to the server, receives EAGAIN suggesting that the connection hasn't completed yet, but then seems to proceed as if it has completed. It tries to write the first message, and receives EPIPE indicating that the connection isn't ready, which is not very surprising. Then the client tries to connect again immediately.

So there could be two problems here.
1/ the client proceeds with the connection even though it received EAGAIN.
2/ the client retries immediately after getting EPIPE.

The first seems to be deliberate, thanks to mainline commit 2a4919919a97911b0aa4b9f5ac1eab90ba87652b. I don't really understand that patch though.

The second is a problem I've seen before. I'll attach a patch that might help. Are you able to compile your own SUSE kernel with this patch? If not I'll try to get it into kotd for you to test.

NeilBrown
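To make the second problem concrete: a client that reconnects the instant a send fails can spin, whereas one that waits a growing interval between attempts cannot. The following is only an illustrative userspace sketch of that idea. It is not the attached patch and not the kernel's code; the function name, the delay values and the `try_connect` and `sleep` callables are all assumptions made for the example.

```python
import errno


def connect_with_backoff(try_connect, sleep, initial=1.0, maximum=60.0,
                         max_attempts=8):
    """Call try_connect() until it returns 0, waiting longer after each failure.

    try_connect returns 0 on success or a negative errno, kernel style.
    Returns the number of attempts used; raises ConnectionError if all fail.
    """
    delay = initial
    for attempt in range(1, max_attempts + 1):
        status = try_connect()
        if status == 0:
            return attempt
        # Any failure (for example -EAGAIN or -EPIPE): wait before retrying,
        # doubling the wait each time up to the cap.
        sleep(delay)
        delay = min(delay * 2, maximum)
    raise ConnectionError(f"gave up after {max_attempts} attempts")


if __name__ == "__main__":
    # Simulated outcomes: two failures, then success. No real sockets involved.
    outcomes = iter([-errno.EPIPE, -errno.EAGAIN, 0])
    waits = []
    attempts = connect_with_backoff(lambda: next(outcomes), waits.append)
    print(attempts, waits)
```

Passing the sleep function in as a parameter keeps the sketch testable without actually waiting; a real client would pass something like `time.sleep`.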
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c4

--- Comment #4 from Neil Brown <nfbrown@novell.com> 2009-12-08 01:13:17 UTC ---
Created an attachment (id=331476) --> (http://bugzilla.novell.com/attachment.cgi?id=331476) Patch to delay reconnects.
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c5

--- Comment #5 from Jon Nelson <jnelson-suse@jamponi.net> 2009-12-08 02:00:55 UTC ---
I sent you what I could. The rest was corrupted or just the same lines over and over again. The process list was also corrupted. The corruption did not happen on-disk but in the kernel log buffers.

I can compile my own kernel, but it will be a few days before I'm able to test. If it isn't hard, a kotd kernel would be preferred, but it is not a deal breaker.
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c6

--- Comment #6 from Neil Brown <nfbrown@novell.com> 2009-12-08 03:21:08 UTC ---
I was thinking particularly of the process list... no matter.

I have submitted a slightly revised version of the patch so it should appear in the 11.2 kotd in a day or so.
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c7

Jon Nelson <jnelson-suse@jamponi.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|jnelson-suse@jamponi.net    |

--- Comment #7 from Jon Nelson <jnelson-suse@jamponi.net> 2009-12-13 22:34:24 UTC ---
I tried the KOTD kernel today (2.6.31.7-0.0.0.8.a22d080-desktop).

While *reading* a large .iso file (over NFS!), dd reported 86.1 MB/s. That is really fantastic. I mean really really awesome. Previously, I would average 1/3 of that. Smaller files still hurt pretty bad, and my average (reading from raid6) is still only 45-50MB/s, but that's pretty reasonable I think. And what's more I didn't get any rpciod issues.

Reading from one NFS share and writing to another results in /reaaaaaly/ bad NFS performance: iptraf reports highly variable flow rates, typically below 2MB/s with occasional short bursts to 5MB/s. When reading I'm able to achieve much more than that (see above). However, rpciod is no longer consuming 100% CPU and hanging the box. If it goes funky again I'll report here. At this rate it will be many hours before it is done copying.
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c8

--- Comment #8 from Jon Nelson <jnelson-suse@jamponi.net> 2009-12-13 23:07:03 UTC ---
I thought I'd check something. To eliminate weirdness due to reading/writing at the same time, I did a simple 'dd' test: dd if=/dev/zero of=somefile to write to the NFS share. Average rate: 2.9MB/s. After dropping the (client) caches, I can read [some] large files at 80+ MB/s.
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c9

Neil Brown <nfbrown@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |jnelson-suse@jamponi.net

--- Comment #9 from Neil Brown <nfbrown@novell.com> 2009-12-14 23:34:48 UTC ---
I'm surprised that your read speeds improved. Either it is pure coincidence, or there is something very strange going on. The very low write speeds make me lean towards something very strange happening.

It might help to get a tcpdump trace (tcpdump -s 0) of traffic while a write is being attempted to see what sort of requests are being sent, and how they are responded to.

It might also be interesting to see if switching to UDP made writes go faster. I wouldn't recommend it as a work-around, but the data gathered by trying it could be helpful.
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c10

Jon Nelson <jnelson-suse@jamponi.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|jnelson-suse@jamponi.net    |

--- Comment #10 from Jon Nelson <jnelson-suse@jamponi.net> 2009-12-15 02:26:59 UTC ---
I would ignore the read speeds for now. The write speeds are pretty bad. Can NFSv4 work over UDP?

I just re-tested. On an otherwise totally idle server and client, with no other meaningful traffic on the switch whatsoever:

101911552 bytes (102 MB) copied, 59.5694 s, 1.7 MB/s

The server's storage looks like this: 4x SATA disks arranged as a raid6, lvm on top of that, and jfs (for this filesystem) on top of that. Local I/O for comparison (caches dropped): 33.2MB/s. Not awesome but substantially better than 1.7MB/s.

I gave UDP a try. It did not go well. I got a *lot* of this:

[ 2535.582524] nfs: RPC call returned error 88
[ 2535.582543] nfs: RPC call returned error 88
[ 2535.582554] nfs: RPC call returned error 88
[ 2535.582563] nfs: RPC call returned error 88

slightly less than 3000 lines worth, AND, the following gem:

jnelson@frank:~> dd if=/dev/zero of=/multimedia/foo bs=1k count=100000
dd: closing output file `/multimedia/foo': Socket operation on non-socket
jnelson@frank:~>

Whaaa? and then lots and lots of I/O errors. So I gave up on UDP (NFSv4).

How large of a tcpdump would you need?

My last (quick) test was:

101963776 bytes (102 MB) copied, 103.186 s, 988 kB/s

Mount options: rw,noatime,nodiratime,hard,intr,noacl,rsize=32768,wsize=32768

Raw TCP sends from this box: 111303.89kB/s -- 109MB/s for blasting TCP from the client to the server as fast as they can each send/receive. I don't think it's the hardware. ;-)

(as a side note, setting the MTU on the client to 8200 makes performance DROP considerably: to a mere 72MB/s)

I'm at your disposal.
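The "error 88" in the kernel log and dd's "Socket operation on non-socket" look like the same failure reported twice, assuming the RPC layer is printing a Linux errno value. The small table below is a sketch for decoding the numeric statuses that show up in this thread; the numbers and messages are the standard Linux ones, while `LINUX_ERRNO` and `describe_status` are names invented for the example.

```python
# Linux errno numbers that appear in this thread. The table is hard-coded
# because errno numbering differs between operating systems.
LINUX_ERRNO = {
    11: ("EAGAIN", "Resource temporarily unavailable"),
    32: ("EPIPE", "Broken pipe"),
    88: ("ENOTSOCK", "Socket operation on non-socket"),
    99: ("EADDRNOTAVAIL", "Cannot assign requested address"),
}


def describe_status(status):
    """Render a status such as -11 or 88 with its errno name and message.

    The sign is ignored, since the logs print some values negated.
    """
    name, text = LINUX_ERRNO.get(abs(status), ("?", "not in this table"))
    return f"{status}: {name} ({text})"


if __name__ == "__main__":
    for status in (-11, -32, 88, 99):
        print(describe_status(status))
```

Whether every number in the rpc debug output is really an errno is an assumption; positive statuses such as 132 elsewhere in the trace appear to be byte counts rather than errors.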
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c11

Neil Brown <nfbrown@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |jnelson-suse@jamponi.net

--- Comment #11 from Neil Brown <nfbrown@novell.com> 2009-12-15 04:05:38 UTC ---
Thanks for ruling out a lot of possibilities - it really does help.

There is no good reason why NFSv4 should not work over UDP (particularly on a fast local link) but the spec almost explicitly excludes UDP, and so no-one tests it, and so you get the sort of garbage that you got ... oh well.

How large a tcpdump? I would want to see a few dozen writes at least which could be as little as a few hundred packets, but should be much more... Let's say 2000 packets. That should be enough to give a reasonable picture without making the capture file too large.
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c12

Neil Brown <nfbrown@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|jnelson-suse@jamponi.net    |

--- Comment #12 from Neil Brown <nfbrown@novell.com> 2009-12-16 02:32:48 UTC ---
Jon sent me a trace out-of-band. It shows approximately 35 NFSv4 requests: PUT_FH / WRITE(32K) / GETATTR and 2 PUT_FH / COMMIT / GETATTR which is what you would expect.

It appears that there was some packet loss. The headers that I see only cover about 1Meg of writes, but the SEQ numbers show closer to 4.7Meg of writes. This is over a time of 6.4 seconds, so about 740kB/sec.

The only interesting thing I see is that none of the NFS packets started at the beginning of an IP packet - they all started in the middle. This means that the NFS client never pushed out an NFS packet before it had another packet ready to write. This is surprising.

You would expect (particularly when just writing lots of zeros), the NFS client would come to a point when it needed to wait for a reply from the server that the data was safe. When it waits, it should push out any pending writes to make sure the request has actually arrived at the server so that it is reasonable to wait. But I see no evidence of that happening.

I do see a number of IP packets with PSH set meaning they were pushed out, but they appear to be in the middle of NFS packets, suggesting that the push trigger happened at a lower level in the stack. It could be that the pushed NFS packets were among those that were not captured, but I think that is unlikely.

So my guess is that the NFS client is queuing up lots of writes but they aren't getting out into the TCP connection promptly. The NFS client then waits for a reply for a request that the TCP layer hasn't sent yet. Eventually a timer in the TCP layer triggers the PUSH, the NFS request is sent, handled, replied to, and the client continues. This would explain the very slow write throughput.

I'll see if I can find justification for the theory in the code.
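As a rough cross-check of the rate seen in the trace: about 4.7 Meg in 6.4 seconds works out to roughly 730 kB/s, which is in the same range as the dd results reported in comment 10. The sketch below just redoes that arithmetic; it assumes "Meg" means 10^6 bytes, and `rate_kb_per_s` is a made-up helper name.

```python
def rate_kb_per_s(num_bytes, seconds):
    """Average throughput in kB/s, with 1 kB taken as 1000 bytes."""
    return num_bytes / seconds / 1000.0


# Figures quoted in the thread; 4.7e6 assumes "4.7Meg" means 4.7 * 10**6 bytes.
trace_rate = rate_kb_per_s(4.7e6, 6.4)              # tcpdump trace, comment 12
dd_first_rate = rate_kb_per_s(101911552, 59.5694)   # first dd run, comment 10
dd_second_rate = rate_kb_per_s(101963776, 103.186)  # second dd run, comment 10

if __name__ == "__main__":
    for label, rate in [
        ("trace", trace_rate),
        ("dd run 1", dd_first_rate),
        ("dd run 2", dd_second_rate),
    ]:
        print(f"{label}: {rate:.0f} kB/s")
```

All three figures sit around 1 MB/s, two orders of magnitude below the raw TCP rate measured on the same link, which fits the suspicion that requests are stalling rather than the wire being slow.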
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c13

Neil Brown <nfbrown@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |jnelson-suse@jamponi.net

--- Comment #13 from Neil Brown <nfbrown@novell.com> 2009-12-16 06:16:55 UTC ---
The code looks like it is doing the right thing. I wonder if you have a network card which does segmentation offload, and there is something wrong with that offload and it isn't pushing packets out properly. Is there any chance of trying a different network card in the client machine?
http://bugzilla.novell.com/show_bug.cgi?id=557760 http://bugzilla.novell.com/show_bug.cgi?id=557760#c14

Jon Nelson <jnelson-suse@jamponi.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|jnelson-suse@jamponi.net    |

--- Comment #14 from Jon Nelson <jnelson-suse@jamponi.net> 2009-12-16 15:03:37 UTC ---
The hardware is an R8169. On boot, this is what ethtool has to say:

frank:~ # ethtool -k eth1
Offload parameters for eth1:
Cannot get device flags: Operation not supported
rx-checksumming: on
tx-checksumming: off
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off
large receive offload: off
frank:~ #

So I change it:

frank:~ # ethtool -K eth1 rx off

Don't forget the (non-NFS) TCP rates gathered in comment 10:

Raw TCP sends from this box: 111303.89kB/s -- 109MB/s for blasting TCP from the client to the server as fast as they can each send/receive. I don't think it's the hardware. ;-)

jnelson@frank:~> dd if=/dev/zero of=/multimedia/foo bs=1k count=100000
100000+0 records in
100000+0 records out
102400000 bytes (102 MB) copied, 61.5018 s, 1.7 MB/s
jnelson@frank:~>

The ethtool change made no difference.

I tried a tg3. Same exact problem/performance/whatever.

I enabled a bit of rpc debugging.
What I see are *sets* of loglines like this:

Dec 16 08:59:48 frank kernel: [ 943.207724] RPC: write space: waking waiting task on xprt ffff88002c11d000
Dec 16 08:59:48 frank kernel: [ 943.207746] RPC: 4543 xprt_prepare_transmit
Dec 16 08:59:48 frank kernel: [ 943.207749] RPC: 4543 xprt_transmit(131252)
Dec 16 08:59:48 frank kernel: [ 943.207762] RPC: 4543 xmit complete
Dec 16 08:59:48 frank kernel: [ 943.207765] RPC: 4544 xprt_prepare_transmit
Dec 16 08:59:48 frank kernel: [ 943.207769] RPC: 4544 xprt_transmit(131252)
Dec 16 08:59:49 frank kernel: [ 943.330798] RPC: 4505 xid 934fea59 complete (132 bytes received)
Dec 16 08:59:49 frank kernel: [ 943.330920] RPC: 4505 release request ffff880013524fc0
Dec 16 08:59:49 frank kernel: [ 943.330933] RPC: 4546 reserved req ffff880013524fc0 xid 9c4fea59
Dec 16 08:59:49 frank kernel: [ 943.330947] RPC: 4546 xprt_prepare_transmit
Dec 16 08:59:49 frank kernel: [ 943.330952] RPC: 4546 failed to lock transport ffff88002c11d000
Dec 16 08:59:49 frank kernel: [ 943.375795] RPC: write space: waking waiting task on xprt ffff88002c11d000
Dec 16 08:59:49 frank kernel: [ 943.375841] RPC: 4544 xprt_prepare_transmit
Dec 16 08:59:49 frank kernel: [ 943.375846] RPC: 4544 xprt_transmit(131252)
Dec 16 08:59:49 frank kernel: [ 943.375860] RPC: 4544 xmit complete
Dec 16 08:59:49 frank kernel: [ 943.375875] RPC: 4545 xprt_prepare_transmit
Dec 16 08:59:49 frank kernel: [ 943.375882] RPC: 4545 xprt_transmit(131252)
Dec 16 08:59:49 frank kernel: [ 943.513595] RPC: 4506 xid 944fea59 complete (132 bytes received)
Dec 16 08:59:49 frank kernel: [ 943.513727] RPC: 4506 release request ffff880013524e70
Dec 16 08:59:49 frank kernel: [ 943.513748] RPC: 4547 reserved req ffff880013524e70 xid 9d4fea59
Dec 16 08:59:49 frank kernel: [ 943.513754] RPC: 4547 xprt_prepare_transmit
Dec 16 08:59:49 frank kernel: [ 943.513764] RPC: 4547 failed to lock transport ffff88002c11d000
Dec 16 08:59:49 frank kernel: [ 943.552714] RPC: write space: waking waiting task on xprt ffff88002c11d000
Dec 16 08:59:49 frank kernel: [ 943.552734] RPC: 4545 xprt_prepare_transmit
Dec 16 08:59:49 frank kernel: [ 943.552737] RPC: 4545 xprt_transmit(131252)
Dec 16 08:59:49 frank kernel: [ 943.552744] RPC: 4545 xmit complete
Dec 16 08:59:49 frank kernel: [ 943.552754] RPC: 4546 xprt_prepare_transmit
Dec 16 08:59:49 frank kernel: [ 943.552758] RPC: 4546 xprt_transmit(131252)
Dec 16 08:59:49 frank kernel: [ 943.680157] RPC: 4507 xid 954fea59 complete (132 bytes received)
Dec 16 08:59:49 frank kernel: [ 943.680247] RPC: 4507 release request ffff8800135247e0
Dec 16 08:59:49 frank kernel: [ 943.680265] RPC: 4548 reserved req ffff8800135247e0 xid 9e4fea59
Dec 16 08:59:49 frank kernel: [ 943.680270] RPC: 4548 xprt_prepare_transmit
Dec 16 08:59:49 frank kernel: [ 943.680275] RPC: 4548 failed to lock transport ffff88002c11d000
Dec 16 08:59:49 frank kernel: [ 943.720442] RPC: write space: waking waiting task on xprt ffff88002c11d000
Dec 16 08:59:49 frank kernel: [ 943.720466] RPC: 4546 xprt_prepare_transmit
Dec 16 08:59:49 frank kernel: [ 943.720469] RPC: 4546 xprt_transmit(131252)
Dec 16 08:59:49 frank kernel: [ 943.720476] RPC: 4546 xmit complete
Dec 16 08:59:49 frank kernel: [ 943.720485] RPC: 4547 xprt_prepare_transmit
Dec 16 08:59:49 frank kernel: [ 943.720489] RPC: 4547 xprt_transmit(131252)
Dec 16 08:59:49 frank kernel: [ 943.841025] RPC: 4508 xid 964fea59 complete (132 bytes received)
Dec 16 08:59:49 frank kernel: [ 943.841216] RPC: 4508 release request ffff8800135242a0
Dec 16 08:59:49 frank kernel: [ 943.841229] RPC: 4549 reserved req ffff8800135242a0 xid 9f4fea59
Dec 16 08:59:49 frank kernel: [ 943.841243] RPC: 4549 xprt_prepare_transmit
Dec 16 08:59:49 frank kernel: [ 943.841247] RPC: 4549 failed to lock transport ffff88002c11d000

When I set the rpc_debug level to 2:

Dec 16 09:01:35 frank kernel: [ 1021.205749] RPC: 4681 call_status (status -11)
Dec 16 09:01:35 frank kernel: [ 1021.205751] RPC: 4681 call_transmit (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.205754] RPC: 4681 rpc_xdr_encode (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.207315] RPC: 4675 call_status (status 132)
Dec 16 09:01:36 frank kernel: [ 1021.207317] RPC: 4675 call_decode (status 132)
Dec 16 09:01:36 frank kernel: [ 1021.207321] RPC: 4675 call_decode result 131072
Dec 16 09:01:36 frank kernel: [ 1021.207326] RPC: rpc_release_client(ffff88001a189e00)
Dec 16 09:01:36 frank kernel: [ 1021.207337] RPC: 4691 call_reserveresult (status -11)
Dec 16 09:01:36 frank kernel: [ 1021.207339] RPC: 4691 call_reserve (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.207341] RPC: 4691 call_reserveresult (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.207343] RPC: 4691 call_allocate (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.207345] RPC: 4691 call_bind (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.207347] RPC: 4691 call_connect xprt ffff88002c11d000 is connected
Dec 16 09:01:36 frank kernel: [ 1021.207349] RPC: 4691 call_transmit (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.208135] RPC: 4681 call_status (status -11)
Dec 16 09:01:36 frank kernel: [ 1021.208137] RPC: 4681 call_transmit (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.208152] RPC: 4682 call_status (status -11)
Dec 16 09:01:36 frank kernel: [ 1021.208154] RPC: 4682 call_transmit (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.208165] RPC: 4682 rpc_xdr_encode (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.210237] RPC: 4676 call_status (status 132)
Dec 16 09:01:36 frank kernel: [ 1021.210240] RPC: 4676 call_decode (status 132)
Dec 16 09:01:36 frank kernel: [ 1021.210245] RPC: 4676 call_decode result 131072
Dec 16 09:01:36 frank kernel: [ 1021.210250] RPC: rpc_release_client(ffff88001a189e00)
Dec 16 09:01:36 frank kernel: [ 1021.210256] RPC: 4682 call_status (status -11)
Dec 16 09:01:36 frank kernel: [ 1021.210258] RPC: 4682 call_transmit (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.210272] RPC: 4692 call_reserveresult (status -11)
Dec 16 09:01:36 frank kernel: [ 1021.210274] RPC: 4692 call_reserve (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.210276] RPC: 4692 call_reserveresult (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.210278] RPC: 4692 call_allocate (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.210280] RPC: 4692 call_bind (status 0)
Dec 16 09:01:36 frank kernel: [ 1021.210282] RPC: 4692 call_connect xprt ffff88002c11d000 is connected
Dec 16 09:01:36 frank kernel: [ 1021.210284] RPC: 4692 call_transmit (status 0)

And when I really crank it up:

Dec 16 09:03:15 frank kernel: [ 1054.434423] RPC: 5098 call_reserveresult (status -11)
Dec 16 09:03:15 frank kernel: [ 1054.434427] RPC: 5098 call_reserve (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.434431] RPC: 5098 call_reserveresult (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.434435] RPC: 5098 call_allocate (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.434440] RPC: 5098 call_bind (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.434444] RPC: 5098 call_connect xprt ffff88002c11d000 is connected
Dec 16 09:03:15 frank kernel: [ 1054.434448] RPC: 5098 call_transmit (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.434452] RPC: 5096 call_status (status -11)
Dec 16 09:03:15 frank kernel: [ 1054.434456] RPC: 5096 call_transmit (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.434460] RPC: 5096 rpc_xdr_encode (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.437199] RPC: 5090 call_status (status 132)
Dec 16 09:03:15 frank kernel: [ 1054.437204] RPC: 5090 call_decode (status 132)
Dec 16 09:03:15 frank kernel: [ 1054.437210] RPC: 5090 call_decode result 131072
Dec 16 09:03:15 frank kernel: [ 1054.437219] RPC: rpc_release_client(ffff88001a189e00)
Dec 16 09:03:15 frank kernel: [ 1054.437235] RPC: 5096 call_status (status -11)
Dec 16 09:03:15 frank kernel: [ 1054.437239] RPC: 5096 call_transmit (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.437258] RPC: 5099 call_reserveresult (status -11)
Dec 16 09:03:15 frank kernel: [ 1054.437262] RPC: 5099 call_reserve (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.437271] RPC: 5099 call_reserveresult (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.437275] RPC: 5099 call_allocate (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.437279] RPC: 5099 call_bind (status 0)
Dec 16 09:03:15 frank kernel: [ 1054.437283] RPC: 5099 call_connect xprt ffff88002c11d000 is connected
Dec 16 09:03:16 frank kernel: [ 1054.437294] RPC: 5099 call_transmit (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.437299] RPC: 5097 call_status (status -11)
Dec 16 09:03:16 frank kernel: [ 1054.437303] RPC: 5097 call_transmit (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.437307] RPC: 5097 rpc_xdr_encode (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.439908] RPC: 5091 call_status (status 132)
Dec 16 09:03:16 frank kernel: [ 1054.439912] RPC: 5091 call_decode (status 132)
Dec 16 09:03:16 frank kernel: [ 1054.439919] RPC: 5091 call_decode result 131072
Dec 16 09:03:16 frank kernel: [ 1054.439927] RPC: rpc_release_client(ffff88001a189e00)
Dec 16 09:03:16 frank kernel: [ 1054.439943] RPC: 5097 call_status (status -11)
Dec 16 09:03:16 frank kernel: [ 1054.439947] RPC: 5097 call_transmit (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.439966] RPC: 5100 call_reserveresult (status -11)
Dec 16 09:03:16 frank kernel: [ 1054.439970] RPC: 5100 call_reserve (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.439973] RPC: 5100 call_reserveresult (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.439982] RPC: 5100 call_allocate (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.439986] RPC: 5100 call_bind (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.439990] RPC: 5100 call_connect xprt ffff88002c11d000 is connected
Dec 16 09:03:16 frank kernel: [ 1054.439995] RPC: 5100 call_transmit (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.440025] RPC: 5098 call_status (status -11)
Dec 16 09:03:16 frank kernel: [ 1054.440028] RPC: 5098 call_transmit (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.440032] RPC: 5098 rpc_xdr_encode (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.442815] RPC: 5092 call_status (status 132)
Dec 16 09:03:16 frank kernel: [ 1054.442820] RPC: 5092 call_decode (status 132)
Dec 16 09:03:16 frank kernel: [ 1054.442829] RPC: 5092 call_decode result 131072
Dec 16 09:03:16 frank kernel: [ 1054.442840] RPC: rpc_release_client(ffff88001a189e00)
Dec 16 09:03:16 frank kernel: [ 1054.442849] RPC: 5098 call_status (status -11)
Dec 16 09:03:16 frank kernel: [ 1054.442853] RPC: 5098 call_transmit (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.442877] RPC: 5101 call_reserveresult (status -11)
Dec 16 09:03:16 frank kernel: [ 1054.442885] RPC: 5101 call_reserveresult (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.442889] RPC: 5101 call_allocate (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.442899] RPC: 5101 call_bind (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.442903] RPC: 5101 call_connect xprt ffff88002c11d000 is connected
Dec 16 09:03:16 frank kernel: [ 1054.442907] RPC: 5101 call_transmit (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.442911] RPC: 5099 call_status (status -11)
Dec 16 09:03:16 frank kernel: [ 1054.442915] RPC: 5099 call_transmit (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.442925] RPC: 5099 rpc_xdr_encode (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.482089] RPC: 5099 call_status (status -11)
Dec 16 09:03:16 frank kernel: [ 1054.482092] RPC: 5099 call_transmit (status 0)
Dec 16 09:03:16 frank kernel: [ 1054.482100] RPC: 5100 call_status (status -11)
Dec 16 09:03:16 frank kernel: [ 1054.482102] RPC: 5100 call_transmit (status 0)
Dec 16 09:03:17 frank kernel: [ 1054.482104] RPC: 5100 rpc_xdr_encode (status 0)
Dec 16 09:03:17 frank kernel: [ 1054.829555] RPC: 4849 call_status (status 132)
Dec 16 09:03:17 frank kernel: [ 1054.829563] RPC: 4849 call_decode (status 132)
Dec 16 09:03:17 frank kernel: [ 1054.829577] RPC: 4849 call_decode result 131072
Dec 16 09:03:17 frank kernel: [ 1054.829592] RPC: rpc_release_client(ffff88001a189e00)
Dec 16 09:03:17 frank kernel: [ 1054.829603] RPC: 5102 call_reserveresult (status -11)
Dec 16 09:03:17 frank kernel: [ 1054.829617] RPC: 5102 call_reserve (status 0)
Dec 16 09:03:17 frank kernel: [ 1054.829622] RPC: 5102 call_reserveresult (status 0)
Dec 16 09:03:17 frank kernel: [ 1054.829633] RPC: 5102 call_allocate (status 0)
Dec 16 09:03:17 frank kernel: [ 1054.829638] RPC: 5102 call_bind (status 0)
Dec 16 09:03:17 frank kernel: [ 1054.829643] RPC: 5102 call_connect xprt ffff88002c11d000 is connected
Dec 16 09:03:17 frank kernel: [ 1054.829654] RPC: 5102 call_transmit (status 0)
Dec 16 09:03:17 frank kernel: [ 1054.871559] RPC: 5100 call_status (status -11)
Dec 16 09:03:17 frank kernel: [ 1054.871566] RPC: 5100 call_transmit (status 0)
Dec 16 09:03:17 frank kernel: [ 1054.871593] RPC: 5101 call_status (status -11)
Dec 16 09:03:17 frank kernel: [ 1054.871605] RPC: 5101 call_transmit (status 0)
Dec 16 09:03:17 frank kernel: [ 1054.871610] RPC: 5101 rpc_xdr_encode (status 0)
Dec 16 09:03:17 frank kernel: [ 1055.084218] RPC: 4850 call_status (status 132)
Dec 16 09:03:17 frank kernel: [ 1055.084225] RPC: 4850 call_decode (status 132)
Dec 16 09:03:17 frank kernel: [ 1055.084238] RPC: 4850 call_decode result 131072
Dec 16 09:03:17 frank kernel: [ 1055.084261] RPC: rpc_release_client(ffff88001a189e00)
Dec 16 09:03:17 frank kernel: [ 1055.084278] RPC: 5103 call_reserveresult (status -11)
Dec 16 09:03:17 frank kernel: [ 1055.084283] RPC: 5103 call_reserve (status 0)
Dec 16 09:03:17 frank kernel: [ 1055.084288] RPC: 5103 call_reserveresult (status 0)
Dec 16 09:03:17 frank kernel: [ 1055.084299] RPC: 5103 call_allocate (status 0)
Dec 16 09:03:17 frank kernel: [ 1055.084304] RPC: 5103 call_bind (status 0)
Dec 16 09:03:17 frank kernel: [ 1055.084309] RPC: 5103 call_connect xprt ffff88002c11d000 is connected
Dec 16 09:03:17 frank kernel: [ 1055.084319] RPC: 5103 call_transmit (status 0)

Note the timing. No lines have been altered or redacted. Hope this is useful.
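
For anyone who wants to look at the timing without eyeballing it, here is a minimal Python sketch that pulls the bracketed kernel timestamp out of each RPC line and reports the gaps above a threshold. The helper names (`parse`, `gaps`) and the 10 ms threshold are my own assumptions for illustration, not part of any tool; the sample lines are copied from the log above.

```python
import re

# Matches the bracketed kernel timestamp and the RPC message that follows it.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+RPC:\s+(.*)")

def parse(lines):
    """Return (seconds, message) tuples for every line that looks like an RPC debug line."""
    events = []
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            events.append((float(m.group(1)), m.group(2).strip()))
    return events

def gaps(events, threshold=0.01):
    """Return (gap_seconds, previous_message, next_message) for gaps above threshold.

    The 10 ms default is an arbitrary assumption; adjust it to taste.
    """
    result = []
    for (t0, m0), (t1, m1) in zip(events, events[1:]):
        if t1 - t0 > threshold:
            result.append((round(t1 - t0, 6), m0, m1))
    return result

# Four lines taken verbatim from the first excerpt above.
sample = [
    "Dec 16 08:59:48 frank kernel: [ 943.207769] RPC: 4544 xprt_transmit(131252)",
    "Dec 16 08:59:49 frank kernel: [ 943.330798] RPC: 4505 xid 934fea59 complete (132 bytes received)",
    "Dec 16 08:59:49 frank kernel: [ 943.330952] RPC: 4546 failed to lock transport ffff88002c11d000",
    "Dec 16 08:59:49 frank kernel: [ 943.375795] RPC: write space: waking waiting task on xprt ffff88002c11d000",
]

for gap, before, after in gaps(parse(sample)):
    print(f"{gap:.6f}s between '{before}' and '{after}'")
```

Feeding it the whole excerpt instead of `sample` should make the stalls stand out: in the first set the long waits sit right after each `xprt_transmit(131252)` and right after each `failed to lock transport`.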
--
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
http://bugzilla.novell.com/show_bug.cgi?id=557760
http://bugzilla.novell.com/show_bug.cgi?id=557760#c15

--- Comment #15 from Jon Nelson <jnelson-suse@jamponi.net> 2009-12-16 15:35:44 UTC ---

One thing that might be useful: with the tg3, I was getting a fair number of frame errors. All hardware offloading is turned off on both the sender and the receiver, MTU is 1500, etc.

RX packets:108084 errors:422 dropped:0 overruns:0 frame:431

Trying a completely different machine, a laptop with built-in Intel Gig-E (although the non-jumbo-frame version), I get the same bad performance (however, no frame errors). I suppose it could be the switch, but again, non-NFS traffic just *flies*.
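
To put the frame-error count in proportion, here is a small Python sketch that parses the RX counter line quoted above and works out the ratio. The helper name `rx_counters` is hypothetical, and the only input is the one line from this comment; it comes out at roughly 0.4% of received packets.

```python
import re

def rx_counters(line):
    """Parse 'name:value' pairs from an ifconfig-style RX line into a dict of ints."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", line)}

# The RX line exactly as quoted in this comment.
rx = rx_counters("RX packets:108084 errors:422 dropped:0 overruns:0 frame:431")

# Fraction of received packets that were counted as frame errors.
frame_rate = rx["frame"] / rx["packets"]
print(f"frame errors: {frame_rate:.2%} of received packets")
```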
http://bugzilla.novell.com/show_bug.cgi?id=557760
http://bugzilla.novell.com/show_bug.cgi?id=557760#c16

Swamp Workflow Management <swamp@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|                            |maint:released:11.2:29469

--- Comment #16 from Swamp Workflow Management <swamp@suse.com> 2010-01-04 10:53:23 UTC ---

Update released for: kernel-debug, kernel-debug-base, kernel-debug-base-debuginfo, kernel-debug-debuginfo, kernel-debug-debugsource, kernel-debug-devel, kernel-debug-devel-debuginfo, kernel-default, kernel-default-base, kernel-default-base-debuginfo, kernel-default-debuginfo, kernel-default-debugsource, kernel-default-devel, kernel-default-devel-debuginfo, kernel-desktop, kernel-desktop-base, kernel-desktop-base-debuginfo, kernel-desktop-debuginfo, kernel-desktop-debugsource, kernel-desktop-devel, kernel-desktop-devel-debuginfo, kernel-pae, kernel-pae-base, kernel-pae-base-debuginfo, kernel-pae-debuginfo, kernel-pae-debugsource, kernel-pae-devel, kernel-pae-devel-debuginfo, kernel-source, kernel-source-vanilla, kernel-syms, kernel-trace, kernel-trace-base, kernel-trace-base-debuginfo, kernel-trace-debuginfo, kernel-trace-debugsource, kernel-trace-devel, kernel-trace-devel-debuginfo, kernel-vanilla, kernel-vanilla-base, kernel-vanilla-base-debuginfo, kernel-vanilla-debuginfo, kernel-vanilla-debugsource, kernel-vanilla-devel, kernel-vanilla-devel-debuginfo, kernel-xen, kernel-xen-base, kernel-xen-base-debuginfo, kernel-xen-debuginfo, kernel-xen-debugsource, kernel-xen-devel, kernel-xen-devel-debuginfo, preload-kmp-default, preload-kmp-desktop

Products:
openSUSE 11.2 (debug, i586, x86_64)
http://bugzilla.novell.com/show_bug.cgi?id=557760
http://bugzilla.novell.com/show_bug.cgi?id=557760#c17

Neil Brown <nfbrown@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
       InfoProvider|                            |jnelson-suse@jamponi.net

--- Comment #17 from Neil Brown <nfbrown@novell.com> 2010-04-20 10:55:59 UTC ---

We don't seem to be making any progress here...

Is this still an issue for you? Is it possible to double-check that it is not the switch?
http://bugzilla.novell.com/show_bug.cgi?id=557760
http://bugzilla.novell.com/show_bug.cgi?id=557760#c18

Neil Brown <nfbrown@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
       InfoProvider|jnelson-suse@jamponi.net    |
         Resolution|                            |NORESPONSE

--- Comment #18 from Neil Brown <nfbrown@novell.com> 2010-05-05 23:59:13 UTC ---

Closing 'no response'. Sorry I haven't been able to help more fully. If more info becomes available, please reopen.