[Bug 879071] New: Stalls during dd on an external USB disk

older
[Bug 764864] New: Suspend to RAM...

bugzilla_noreply＠novell.com

21 May 2014 21 May '14

11:51

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c0 Summary: Stalls during dd on an external USB disk Classification: openSUSE Product: openSUSE Factory Version: 13.2 Milestone 0 Platform: x86-64 OS/Version: Other Status: NEW Severity: Normal Priority: P5 - None Component: Kernel AssignedTo: mhocko@suse.com ReportedBy: jslaby@suse.com QAContact: qa-bugs@suse.de Found By: --- Blocker: --- Created an attachment (id=591354) --> (http://bugzilla.novell.com/attachment.cgi?id=591354) /proc/meminfo and /proc/vmstat every-second snapshots I am using Tumbleweed (i.e. 13.1 + some newer packages). The kernel is 3.14.2-25.g1474ea5-desktop. When I run: $ dd if=/dev/zero of=bubak bs=$((4*1024*1024)) on a mounted USB partition over USB 2.0, the stalls begin. The disk is this (smartclt says): Model Family: Hitachi/HGST Travelstar Z7K500 Device Model: HITACHI HTS725050A7E630 Serial Number: TF1500Y9H2N95B LU WWN Device Id: 5 000cca 662cf4c6f Firmware Version: GH2ZB390 User Capacity: 500 107 862 016 bytes [500 GB] Sector Sizes: 512 bytes logical, 4096 bytes physical Rotation Rate: 7200 rpm Mounted as: type ext4 (rw,nosuid,nodev,relatime,data=ordered,uhelper=udisks2) Both noop and cfq schedulers do this. So I did (in /dev/shm/vm): while :; do TIME=`date +%s.%N` echo $TIME cat /proc/meminfo >meminfo-$TIME cat /proc/vmstat >vmstat-$TIME sleep 1 done and see: 1400672643.810787699 1400672644.817812969 1400672645.822746359 1400672646.834695436 1400672647.843612029 1400672648.853972001 1400672649.862598851 1400672650.871373444 1400672652.182838629 <-- stalled 1400672653.192253464 1400672654.200850674 1400672655.212085244 1400672656.221593476 1400672657.230324099 1400672659.141308093 <-- stalled 1400672660.149899029 1400672662.259573479 <-- stalled 1400672663.269444609 1400672664.279327742 1400672665.287736541 1400672666.295795513 1400672667.305839304 1400672668.314697122 1400672669.325072842 1400672670.334228258 1400672671.344690614 1400672672.353744075 1400672673.361565807 1400672674.371823780 1400672675.683192247 1400672676.693375982 1400672677.702441380 1400672678.712842427 1400672679.721433233 1400672680.729992233 1400672681.739432910 1400672682.745694320 1400672683.753713095 Then I interrupted dd and this loop. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

Show replies by date

bugzilla_noreply＠novell.com

21 May 21 May

12:59

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c1 Michal Hocko changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |jack@suse.com, | |mgorman@suse.com --- Comment #1 from Michal Hocko 2014-05-21 14:59:10 CEST --- Adding Mel and Jack. Mel had a systemtap script to track the latencies caused by the writeback. I am looking into meminfo and vmstat data. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

15:35

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c2 --- Comment #2 from Jan Kara 2014-05-21 15:35:17 UTC --- Created an attachment (id=591429) --> (http://bugzilla.novell.com/attachment.cgi?id=591429) Script to watch for stalled processes Can you install systemtap on your machine, debuginfo from the kernel you are running, and run this script together with the workload generating stalls? It should report what's happening in more detail... -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

22 May 22 May

13:25

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c3 Michal Hocko changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |mhocko@suse.com --- Comment #3 from Michal Hocko 2014-05-22 15:25:42 CEST --- Here are some counters from vmstat data. First line is an aboslute number and all the following lines are diff to previous. time nr_dirtied nr_dirty nr_written allocstall pgsteal_direct_dma32 pgscan_direct_dma32 pgsteal_kswapd_dma32/pgscan_kswapd_dma32[%] 1400672643.810787699 39520729 10 37833453 18685 1707363 2696767 81.71 1.00703 0 0 0 0 0 0 0.00 1.00493 4 4 0 0 0 0 0.00 1.01195 34697 26244 7000 6 219 256 43.79 1.00892 2936 -5021 10145 5 370 370 45.74 1.01036 13107 -95 10057 8 271 515 40.17 1.00863 14373 4594 9495 3 302 302 64.25 1.00877 6894 -3427 10284 8 193 910 34.88 1.31147 11541 -1310 13539 16 895 914 53.85 <<< 1.00941 14969 3980 9916 10 389 421 55.14 1.0086 4763 -5595 10430 2 176 920 31.37 1.01123 15491 5180 10169 8 512 512 54.18 1.00951 11298 942 10215 9 407 1038 56.67 1.00873 3703 -5740 10168 4 238 276 22.54 1.91098 12459 -6928 19335 10 394 1415 53.45 <<< 1.00859 19852 9406 10229 22 1096 1890 55.47 2.10967 18650 -3028 21341 9 699 1668 49.37 <<< 1.00987 7760 -2037 10243 13 488 775 44.11 1.00988 17009 6786 9863 14 745 1372 63.44 As we can see kswapd doesn't reclaim fast enough so we are permanently in the direct reclaim. kswapd effectivity (reclaimed/scanned) drops down a lot before stalls. I assume that kswapd sees a lot of dirty pages which are not pushed to the storage fast enough. The writeback speed seems to be quite constant ~10k/s so the IO doesn't seem to be the main problem here. This looks like the writer is not throttled enough wrt. to the device speed. So this looks like the old and known problem that dirty_{ratio,bytes} is too high for the slow device :/ There is a per-bdi max_ratio tunable which might help here. Jack will surely know more about this. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

13:38

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c4 --- Comment #4 from Jiri Slaby 2014-05-22 13:38:41 UTC --- (In reply to comment #3)

...

So this looks like the old and known problem that dirty_{ratio,bytes} is too high for the slow device :/

Oh, thanks. Since the stalls were unbearable, I set the limits a long time ago and it helped significantly: # cat /proc/sys/vm/dirty_bytes 209715200 # cat /proc/sys/vm/dirty_ratio 0 My sysctl.conf says: vm.dirty_writeback_centisecs = 3000 vm.laptop_mode = 5 vm.dirty_bytes = 209715200 But as you can see, I can still see the stalls under some circumstances :(. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

14:32

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c5 --- Comment #5 from Michal Hocko 2014-05-22 16:32:04 CEST --- Yeah, 200M seems to be too high for your USB device. Kernel should be more clever and distinguish slow and fast devices. This is a tricky thing though. Dave has some good points here: http://marc.info/?l=linux-mm&m=138362477725151&w=2 -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

14:45

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c6 --- Comment #6 from Michal Hocko 2014-05-22 16:45:18 CEST --- Anyway, has the systemtap script pointed to something? Maybe we are pointlessly congestion waiting... Direct reclaim is quite efficient efficient (reclaimed/scanned): [...] 1.00877 97.92 1.31147 92.40 1.00941 19.13 1.0086 100.00 1.01123 39.21 1.00951 86.23 1.00873 27.84 1.91098 57.99 1.00859 41.91 2.10967 62.97 1.00987 54.30 Interestingly the low effectiveness doesn't lead to bugger stalls. So it is possible that the reclaimer gets into congestion_wait at the times when we see bigger stalls. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

27 May 27 May

09:06

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c7 --- Comment #7 from Jiri Slaby 2014-05-27 09:06:21 UTC --- Ok, the output is now like this: time 1401181527: 65 (kswapd0) Stalled: 1192 ms: shrink_zone Guessing: IO_WritebackSlow -nr_dirty 24648 -nr_writeback 1139 -nr_vmscan_write 245818 - /sys/block/sda/stat 1728888 57441 58820653 2074616 2228003 550615 79379345 42479053 0 1452209 44568377 - /sys/block/sdb/stat 3768 1729 369338 150446 119257 766 28157544 45687048 33 385155 45849460 +nr_dirty 21617 -3031 +nr_writeback 4507 3368 +nr_vmscan_write 245818 0 + /sys/block/sda/stat 1728913 57441 58821597 2074633 2228003 550615 79379345 42479053 0 1452214 44568394 + /sys/block/sdb/stat 3768 1729 369338 150446 119329 766 28174024 45703108 151 385354 45867245 [<ffffffff8116100e>] congestion_wait+0x6e/0x110 [<ffffffff81155bca>] shrink_inactive_list+0x4aa/0x4f0 [<ffffffff81156261>] shrink_lruvec+0x2f1/0x610 [<ffffffff811565e6>] shrink_zone+0x66/0x190 [<ffffffff8160aea5>] kretprobe_trampoline+0x0/0x4b [<00000000000d6dc2>] 0xd6dc2 [<ffffffffffffffff>] 0xffffffffffffffff time 1401181531: 65 (kswapd0) Stalled: 1108 ms: shrink_zone Guessing: IO_WritebackInProgress -nr_dirty 21347 -nr_writeback 3981 -nr_vmscan_write 245878 - /sys/block/sda/stat 1729405 57441 58854533 2074856 2228007 550710 79380137 42479068 0 1452326 44568632 - /sys/block/sdb/stat 3768 1729 369338 150446 120722 767 28503640 46259247 135 389359 46438951 +nr_dirty 21512 165 +nr_writeback 3955 -26 +nr_vmscan_write 245878 0 + /sys/block/sda/stat 1729415 57441 58854989 2074861 2228007 550710 79380137 42479068 0 1452328 44568637 + /sys/block/sdb/stat 3768 1729 369338 150446 120792 767 28520232 46287995 129 389559 46467403 [<ffffffff8116100e>] congestion_wait+0x6e/0x110 [<ffffffff81155bca>] shrink_inactive_list+0x4aa/0x4f0 [<ffffffff81156261>] shrink_lruvec+0x2f1/0x610 [<ffffffff811565e6>] shrink_zone+0x66/0x190 [<ffffffff8160aea5>] kretprobe_trampoline+0x0/0x4b [<00000000000d5fc3>] 0xd5fc3 [<ffffffffffffffff>] 0xffffffffffffffff time 1401181531: 1558 (psi-plus) Stalled: 1400 ms: shrink_zone Guessing: IO_WritebackInProgress -nr_dirty 21512 -nr_writeback 3955 -nr_vmscan_write 245878 - /sys/block/sda/stat 1729415 57441 58854989 2074861 2228007 550710 79380137 42479068 0 1452328 44568637 - /sys/block/sdb/stat 3768 1729 369338 150446 120793 767 28520472 46288447 128 389560 46467531 +nr_dirty 18482 -3030 +nr_writeback 2707 -1248 +nr_vmscan_write 245878 0 + /sys/block/sda/stat 1729483 57444 58859549 2074899 2228035 550712 79381705 42479239 0 1452357 44568845 + /sys/block/sdb/stat 3768 1729 369338 150446 120932 772 28553480 46344916 91 389959 46520862 [<ffffffff8116100e>] congestion_wait+0x6e/0x110 [<ffffffff81155bca>] shrink_inactive_list+0x4aa/0x4f0 [<ffffffff81156261>] shrink_lruvec+0x2f1/0x610 [<ffffffff811565e6>] shrink_zone+0x66/0x190 [<ffffffff8160aea5>] kretprobe_trampoline+0x0/0x4b [<ffff88011e5e7b08>] 0xffff88011e5e7b08 [<ffffffff8115726a>] try_to_free_pages+0xda/0x1c0 [<ffffffff8160aea5>] kretprobe_trampoline+0x0/0x4b [<ffffffff81055920>] copy_process.part.24+0x130/0x1ba0 [<ffffffff81194497>] kmem_cache_alloc+0x207/0x460 [<ffffffff811b10ae>] alloc_file+0x1e/0xc0 [<ffffffff81057551>] do_fork+0xd1/0x300 [<ffffffff81610b09>] stub_clone+0x69/0x90 [<ffffffff816107ad>] system_call_fastpath+0x1a/0x1f [<ffffffffffffffff>] 0xffffffffffffffff -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

11:22

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c8 --- Comment #8 from Michal Hocko 2014-05-27 13:22:35 CEST --- (In reply to comment #7)

...

Ok, the output is now like this:

time 1401181527: 65 (kswapd0) Stalled: 1192 ms: shrink_zone [...] time 1401181531: 65 (kswapd0) Stalled: 1108 ms: shrink_zone [...]

These two are in the background reclaim so no stalls on other allocators. But the last one is a userspace path which ends up in the direct reclaim and stalls on the congestion wait. So this is what I expected in comment 6

...

time 1401181531: 1558 (psi-plus) Stalled: 1400 ms: shrink_zone Guessing: IO_WritebackInProgress -nr_dirty 21512 -nr_writeback 3955 -nr_vmscan_write 245878 - /sys/block/sda/stat 1729415 57441 58854989 2074861 2228007 550710 79380137 42479068 0 1452328 44568637 - /sys/block/sdb/stat 3768 1729 369338 150446 120793 767 28520472 46288447 128 389560 46467531 +nr_dirty 18482 -3030 +nr_writeback 2707 -1248 +nr_vmscan_write 245878 0 + /sys/block/sda/stat 1729483 57444 58859549 2074899 2228035 550712 79381705 42479239 0 1452357 44568845 + /sys/block/sdb/stat 3768 1729 369338 150446 120932 772 28553480 46344916 91 389959 46520862 [<ffffffff8116100e>] congestion_wait+0x6e/0x110 [<ffffffff81155bca>] shrink_inactive_list+0x4aa/0x4f0 [<ffffffff81156261>] shrink_lruvec+0x2f1/0x610 [<ffffffff811565e6>] shrink_zone+0x66/0x190 [<ffffffff8160aea5>] kretprobe_trampoline+0x0/0x4b [<ffff88011e5e7b08>] 0xffff88011e5e7b08 [<ffffffff8115726a>] try_to_free_pages+0xda/0x1c0 [<ffffffff8160aea5>] kretprobe_trampoline+0x0/0x4b [<ffffffff81055920>] copy_process.part.24+0x130/0x1ba0 [<ffffffff81194497>] kmem_cache_alloc+0x207/0x460 [<ffffffff811b10ae>] alloc_file+0x1e/0xc0 [<ffffffff81057551>] do_fork+0xd1/0x300 [<ffffffff81610b09>] stub_clone+0x69/0x90 [<ffffffff816107ad>] system_call_fastpath+0x1a/0x1f [<ffffffffffffffff>] 0xffffffffffffffff

Mel, you have said that you had a patch which should reduce congestion waits from direct reclaim, right? That being said, the primary problem is still that we shouldn't get into direct reclaim in the first place. The writer should get throttled before it dirties more memory than we can write back in the reasonable time. But a secondary problem is that direct reclaim might be throttled even when not necessary. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

12:05

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c9 --- Comment #9 from Michal Hocko 2014-05-27 14:05:54 CEST --- Just for reference. The congestion wait was called here: /* * In addition, if kswapd scans pages marked marked for * immediate reclaim and under writeback (nr_immediate), it * implies that pages are cycling through the LRU faster than * they are written so also forcibly stall. */ if ((nr_unqueued_dirty == nr_taken || nr_immediate)) congestion_wait(BLK_RW_ASYNC, HZ/10); which tells us that all pages which have been isolated from LRUs by the direct reclaimer are dirty and not under writeback. So the writeback is clearly not fast enough. I am wondering whether this stall is really appropriate for the direct reclaim because it can get throttled again right after in wait_iff_congested. Would it make sense to reduce the above congestion_wait only for kswapd? Anyway even with something like that in place we could still wait on the iff_congested so I am not sure how much helpful this would be. If we had more direct reclaimers and one would manage to free some pages then throttling another one doesn't sound seem necessary... -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

28 May 28 May

12:50

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c10 --- Comment #10 from Michal Hocko 2014-05-28 14:50:41 CEST --- Created an attachment (id=592430) --> (http://bugzilla.novell.com/attachment.cgi?id=592430) patch to test I am not sure this is a right thing to do but Jiri, could you give it a try, please? The patch would help only if direct reclaimer trips over both congestion_wait and wait_iff_congested. It is a question whether this is happening here though. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

30 May 30 May

09:40

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c11 --- Comment #11 from Jan Kara 2014-05-30 09:40:29 UTC --- Just a comment regarding writeback. Looking at nr_written numbers we seem to be writing about 40 MB/s to the disk so that looks pretty reasonably. Also the writeback code is throttling the writer so that we have ~100 MB of dirty pages - well below the 200 MB upper limit set by dirty_bytes and I'd say on the lower end of what the writeback throughput to disk requires to keep reasonable bandwidth. So writeback code / throttling works as designed here. Looking at the memory stats the machine seems relatively memory constrained - it has 4 GB of ram. Out of that 3 GB are allocated for anonymous memory, another 500 MB in shmem pages, and 140 MB for slab & page tables. So that leaves relatively tight (~350 MB) maneuvering space for page cache, free space reserves, etc. So I'm not surprised reclaim sees the whole reclaim batch of dirty pages in the LRU. I'm not enough of a reclaim expert to judge where exactly is a problem. One note regarding the stall report - the process has spent 1.4s in shrink_zone(), that's for sure. The stack trace pointing into shrink_inactive_list() is just one of the places where the process was sleeping, it may have slept also in other places under shrink_zone() (and likely it has slept several times because one sleep takes only 0.1s there). Just to keep that in mind. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

13:28

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c12 --- Comment #12 from Michal Hocko 2014-05-30 15:28:02 CEST --- (In reply to comment #11) [...]

...

Looking at the memory stats the machine seems relatively memory constrained - it has 4 GB of ram. Out of that 3 GB are allocated for anonymous memory, another 500 MB in shmem pages, and 140 MB for slab & page tables. So that leaves relatively tight (~350 MB) maneuvering space for page cache, free space reserves, etc. So I'm not surprised reclaim sees the whole reclaim batch of dirty pages in the LRU. I'm not enough of a reclaim expert to judge where exactly is a problem.

This is definitely a good observation Jan! I was so focused on the reclaim statistics that I've completely missed the whole picture. meminfo-1400672652.182838629: SwapCached: 22384 kB Active(anon): 2578936 kB Inactive(anon): 769128 kB Active(file): 77848 kB Inactive(file): 137296 kB SwapTotal: 1952764 kB SwapFree: 594984 kB Dirty: 83996 kB So the swap is full from 70%. Inactive anonymous LRU is quite low but not enough to trigger active anon aging (with ~4G the ratio is 5-6). Big chunk of the file LRU is dirty so it seems that you are really right and the stalls are a result of the memory pressure. Now the question is, does it happen only under such situations? Another obvious question is. Can we do better and are such stalls so unexpected to happen considering the load? -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

2 Jun 2 Jun

14:11

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c13 --- Comment #13 from Michal Hocko 2014-06-02 16:11:17 CEST --- One more thing. Although there might be some throttling on the writer side I guess we are not doing it aggressively enough. Here is the nr_dirtied vs nr_written comparison (nr_dirtied and nr_written are per delay. diff = nr_dirtied - nr_written and total_diff is cumulative diff) delay nr_dirtied nr_written diff total_diff 1.01195 34697 7000 27697 27701 1.00892 2936 10145 -7209 20492 1.01036 13107 10057 3050 23542 1.00863 14373 9495 4878 28420 1.00877 6894 10284 -3390 25030 1.31147 11541 13539 -1998 23032 1.00941 14969 9916 5053 28085 1.0086 4763 10430 -5667 22418 1.01123 15491 10169 5322 27740 1.00951 11298 10215 1083 28823 1.00873 3703 10168 -6465 22358 1.91098 12459 19335 -6876 15482 1.00859 19852 10229 9623 25105 2.10967 18650 21341 -2691 22414 1.00987 7760 10243 -2483 19931 we can clearly see that dirtier is ~24k pages ahead of writeback on average which is ~90M so around 2s worth of writeback. Btw. I do not see dirty throttling checking global_dirtyable_memory if dirty_bytes is used. Is this OK? -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

15:11

New subject: [Bug 879071] Stalls during dd on an external USB disk

...

One more thing. Although there might be some throttling on the writer side I guess we are not doing it aggressively enough.

Here is the nr_dirtied vs nr_written comparison (nr_dirtied and nr_written are per delay. diff = nr_dirtied - nr_written and total_diff is cumulative diff) delay nr_dirtied nr_written diff total_diff 1.01195 34697 7000 27697 27701 1.00892 2936 10145 -7209 20492 1.01036 13107 10057 3050 23542 1.00863 14373 9495 4878 28420 1.00877 6894 10284 -3390 25030 1.31147 11541 13539 -1998 23032 1.00941 14969 9916 5053 28085 1.0086 4763 10430 -5667 22418 1.01123 15491 10169 5322 27740 1.00951 11298 10215 1083 28823 1.00873 3703 10168 -6465 22358 1.91098 12459 19335 -6876 15482 1.00859 19852 10229 9623 25105 2.10967 18650 21341 -2691 22414 1.00987 7760 10243 -2483 19931

we can clearly see that dirtier is ~24k pages ahead of writeback on average which is ~90M so around 2s worth of writeback. Correct. And that is really working as designed - we have dirty_bytes set to 200 MB so writeback is free to keep upto 200 MB of dirty pages if it sees that fit. We have to keep non-trivial amount of dirty pages in order to be able to

...

Btw. I do not see dirty throttling checking global_dirtyable_memory if dirty_bytes is used. Is this OK? I am not sure. The idea has been that when dirty_bytes is set at X, then you know you can have X amount of dirty memory. However when global_dirtyable_memory() is less than say 2*dirty_bytes, it gets really strange and maybe we should take that into account. I guess you can try

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c14 --- Comment #14 from Jan Kara 2014-06-02 15:11:42 UTC --- (In reply to comment #13) form large enough IOs and keep hardware busy. The experiments have shown that you need at least a second or so worth of writeback cached to level out bumps in disk performance, IO completion bursts etc. proposing something like that upstream (or I can :). -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

15:26

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c15 --- Comment #15 from Michal Hocko 2014-06-02 17:26:52 CEST --- (In reply to comment #14)

...

...
we can clearly see that dirtier is ~24k pages ahead of writeback on average which is ~90M so around 2s worth of writeback. Correct. And that is really working as designed - we have dirty_bytes set to 200 MB so writeback is free to keep upto 200 MB of dirty pages if it sees that fit. We have to keep non-trivial amount of dirty pages in order to be able to

(In reply to comment #13) [...] form large enough IOs and keep hardware busy. The experiments have shown that you need at least a second or so worth of writeback cached to level out bumps in disk performance, IO completion bursts etc.

Thanks for the clarification.

...

...
Btw. I do not see dirty throttling checking global_dirtyable_memory if dirty_bytes is used. Is this OK? I am not sure. The idea has been that when dirty_bytes is set at X, then you know you can have X amount of dirty memory.

OK, I suspected something like this. But I am afraid that more we will encourage users to use dirty_bytes because dirty_ratio sucks with a lot of memory the more strange issues we will see.

...

However when global_dirtyable_memory() is less than say 2*dirty_bytes, it gets really strange and maybe we should take that into account. I guess you can try proposing something like that upstream (or I can :).

I guess this can be double checked by setting dirty_ratio to 4% to have the similar setting except that global_dirtyable_memory is not ignored. Could you give it a try Jiri? -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

5 Jun 5 Jun

13:08

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c16 --- Comment #16 from Michal Hocko 2014-06-05 15:07:57 CEST --- There is another report[1] which seems to confirm my theory in comment 9. Jiri, is there any chance you could test with the patch from comment 10, please? --- [1] http://marc.info/?l=linux-mm&m=140196802619773&w=2 -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

21 Jul 21 Jul

07:18

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c17 --- Comment #17 from Michal Hocko 2014-07-21 09:18:26 CEST --- any news here? -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

14:33

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c18 --- Comment #18 from Jiri Slaby 2014-07-21 14:33:08 UTC --- (In reply to comment #17)

...

any news here?

Good news actually. With the kswapd patch above, I am running 3.15.5 under high memory pressure and no stalls on dd. Brilliant! -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

14:48

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c19 --- Comment #19 from Michal Hocko 2014-07-21 16:48:45 CEST --- (In reply to comment #18)

...

(In reply to comment #17)

...
any news here?

Good news actually. With the kswapd patch above,

Do you mean patch from comment 10 or the one referenced from the upstream discussion in comment 16.

...

I am running 3.15.5 under high memory pressure and no stalls on dd. Brilliant!

In any case good to hear. We just have to find a way how to route the fix into code streams properly. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

14:53

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c20 --- Comment #20 from Jiri Slaby 2014-07-21 14:53:50 UTC --- (In reply to comment #19)

...

(In reply to comment #18)

...
(In reply to comment #17)

...
any news here?

Good news actually. With the kswapd patch above,

Do you mean patch from comment 10 or the one referenced from the upstream discussion in comment 16.

For quite some time, I have been using the latter only. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

15:03

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c21 --- Comment #21 from Michal Hocko 2014-07-21 17:03:44 CEST --- OK, I will backport http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b7... to our post 3.11 trees. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

15:29

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c22 --- Comment #22 from Michal Hocko 2014-07-21 17:29:12 CEST --- As it turned out SLE12 already has this patch. I have pushed it to openSUSE-13.1 other branches are not affected as they are older than 3.11 when this particular problem has been introduced (by e2be15f6c3ee mm: vmscan: stall page reclaim and writeback pages based on dirty/writepage pages encountered). -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

15:31

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c23 --- Comment #23 from Michal Hocko 2014-07-21 17:31:06 CEST --- Created an attachment (id=599313) --> (http://bugzilla.novell.com/attachment.cgi?id=599313) backport for openSUSE13.1 Jiri, I think this is worth backporting to 3.12 stable tree. Should I post it to the stable ML or can you pick the patch up from openSUSE-13.1 branch (this would cost you some editing). Whatever fits better for you. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

15:40

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c24 --- Comment #24 from Jiri Slaby 2014-07-21 15:39:52 UTC --- (In reply to comment #22)

...

As it turned out SLE12 already has this patch. I have pushed it to openSUSE-13.1 other branches are not affected as they are older than 3.11 when this particular problem has been introduced (by e2be15f6c3ee mm: vmscan: stall page reclaim and writeback pages based on dirty/writepage pages encountered).

I have pushed it also to the stable branch: fdb2dde88465..edc5ddf28550 stable -> stable (In reply to comment #23)

...

Jiri, I think this is worth backporting to 3.12 stable tree. Should I post it to the stable ML

Yes, post it, please. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

17:31

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c25 Michal Hocko changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED --- Comment #25 from Michal Hocko 2014-07-21 19:31:03 CEST --- (In reply to comment #24) [...]

...

(In reply to comment #23)

...
Jiri, I think this is worth backporting to 3.12 stable tree. Should I post it to the stable ML

Yes, post it, please.

done I guess we are done here -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

22 Jul 22 Jul

15:49

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c Swamp Workflow Management changed: What |Removed |Added ---------------------------------------------------------------------------- Status Whiteboard| |obs:running:2919:important -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

11 Aug 11 Aug

10:07

New subject: [Bug 879071] Stalls during dd on an external USB disk

https://bugzilla.novell.com/show_bug.cgi?id=879071 https://bugzilla.novell.com/show_bug.cgi?id=879071#c26 --- Comment #26 from Swamp Workflow Management 2014-08-11 10:07:04 UTC --- openSUSE-SU-2014:0985-1: An update that solves 14 vulnerabilities and has two fixes is now available. Category: security (important) Bug References: 768714,851686,855657,866101,867531,867723,879071,880484,882189,883518,883724,883795,884840,885422,885725,886629 CVE References: CVE-2014-0100,CVE-2014-0131,CVE-2014-2309,CVE-2014-3917,CVE-2014-4014,CVE-2014-4171,CVE-2014-4508,CVE-2014-4652,CVE-2014-4653,CVE-2014-4654,CVE-2014-4655,CVE-2014-4656,CVE-2014-4667,CVE-2014-4699 Sources used: openSUSE 13.1 (src): cloop-2.639-11.13.1, crash-7.0.2-2.13.1, hdjmod-1.28-16.13.1, ipset-6.21.1-2.17.1, iscsitarget-1.4.20.3-13.13.1, kernel-docs-3.11.10-21.3, kernel-source-3.11.10-21.1, kernel-syms-3.11.10-21.1, ndiswrapper-1.58-13.1, pcfclock-0.44-258.13.1, vhba-kmp-20130607-2.14.1, virtualbox-4.2.18-2.18.1, xen-4.3.2_01-21.1, xtables-addons-2.3-2.13.1 -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.

bugzilla_noreply＠novell.com

19 Feb 19 Feb

06:18

New subject: [Bug 879071] Stalls during dd on an external USB disk

http://bugzilla.novell.com/show_bug.cgi?id=879071 Swamp Workflow Management changed: What |Removed |Added ---------------------------------------------------------------------------- Whiteboard|obs:running:2919:important | -- You are receiving this mail because: You are on the CC list for the bug.

3361

Age (days ago)

3635

Last active (days ago)

List overview

Download

28 comments

1 participants

participants (1)

bugzilla_noreply＠novell.com

[Bug 879071] New: Stalls during dd on an external USB disk

tags

participants (1)