[Bug 719416] New: writing to usb flashdisk uses too much cpu and makes system unresponsive
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c0

Summary: writing to usb flashdisk uses too much cpu and makes system unresponsive
Classification: openSUSE
Product: openSUSE 12.1
Version: Milestone 5
Platform: x86-64
OS/Version: SuSE Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: Basesystem
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: diego.ercolani@gmail.com
QAContact: qa@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20100101 Firefox/6.0

This is an annoying bug: I have a system capable of heavy CPU load (AMD Phenom(tm) II X4 980 Processor with 8GB of RAM), but if I try to copy huge files to a flashdisk (USB pen) I notice a quick rise in system load (over 3) and CPU usage (over 60%).

The annoying thing (besides the poor use of system resources) is that while the system is copying files (1GB each), it becomes unusable and unresponsive for a while (20 seconds) from time to time.

Probably this is due to the "flush procedure" that from time to time syncs the filesystem on the flash device. But... how about the I/O scheduler?

Reproducible: Always

Steps to Reproduce:
1. have 3 files of 1 GB each
2. from terminal: cp 1.sample 2.sample 3.sample /media/FLASHDISK
3. use desktop, browse internet, etc. while copying

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c
zj jia
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c1
Ismail Donmez
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c2
--- Comment #2 from Ismail Donmez
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c3
Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c4
--- Comment #4 from Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c5
Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c6
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c7
--- Comment #7 from Mel Gorman
Mel, I think this is for you since you are solving THP stalls upstream anyway...
Still working on it. I'm hoping to have a series out later in the week. I'll post a reference here to the patches when I do.
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c8
--- Comment #8 from Mel Gorman
(In reply to comment #6)
Mel, I think this is for you since you are solving THP stalls upstream anyway...
Still working on it. I'm hoping to have a series out later in the week. I'll post a reference here to the patches when I do.
Patches were posted to the usual lists as you can see here https://lkml.org/lkml/2011/12/1/293 . I also placed the patches on http://www.csn.ul.ie/~mel/postings/thpstall-20111202/ . It would be much appreciated if they could be tested and either reported here on Bugzilla or on the lists. Thanks.
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c9
Mark Fairbairn
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c10
--- Comment #10 from Mel Gorman
I too see this. In addition the writing speed to the usb stick varies greatly when compared to the write speed in Opensuse 11.4. When transferring a video file over usb speeds vary between 4mb down to around 10kb and sometimes stop altogether for a few seconds.
Noted
System is unresponsive (unable to launch any apps) until approx 20 seconds after write to usb stick has completed.
Can you test the patches linked on comment 9 please?
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c11
--- Comment #11 from Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c12
--- Comment #12 from Mark Fairbairn
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c13
--- Comment #13 from Mel Gorman
Have done a little testing with your patched vanilla kernel. System has regained its responsiveness during transfers to USB stick.
Thanks for reporting.
Transfer speeds have also somewhat improved - although are still roughly 20% - 50% slower than speeds achieved with stock opensuse kernel in OS 11.4 (depending on usb stick used).
Ok, that is interesting. I do not have an explanation for it at the moment, but when the stall issues get ironed out, I'll try to find the time to investigate. No promises though. Writeback speeds to USB are not high on my priority list, unfortunately.
In the little testing I have done though - speeds have never fallen to the ridiculous levels achieved with the stock Opensuse 12.1 kernel (ie. between 0 and 1 mb)
The patches have stalled upstream due to lack of review but I'll get a fix pushed eventually which should worm its way via stable to opensuse. However, I would expect that the fix would only be present in 3.2-stable. If openSUSE 12.x stays on 3.1 for a long time it could be a problem as pushing all the necessary patches to 3.1-stable will be a tough sell.
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c14
--- Comment #14 from Diego Ercolani
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c15
Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c16
Marcus Meissner
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c17
--- Comment #17 from Mel Gorman
Can we get a maintenance update for kernel-desktop-3.1* without CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS ? maybe would also be useful for kernel-default as long as fixes like the ones from comment 8 are not included
This completely fell off my radar after I made comment #13. The specific fixes were upstream and I generally run either upstream or the SLES kernel on my own desktop depending on which I'm working on at the time, so I forgot about it.

FWIW, I did investigate the DD throughput problem upstream and came to the conclusion that the main differences between kernels are down to timing. Once the IO was sync, or once the time to call sync after an async dd completed was counted, the write speeds were roughly the same.

On disabling CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS, I would rather not except as an absolute last resort. It's enabled by default on other distros and upstream, and personally I would prefer to see it continue to be tested. I run it by default on my own machine, for example, and upstream any fixes.

3.1.10 is the final upstream stable kernel and these patches were always too weighty for -stable anyway. Hence, I'm attaching a backported series that incorporates my fixes and an important readahead fix that affects interactivity, particularly if the machine has an SSD. I'm running an openSUSE kernel with these patches applied on my laptop at the moment. The backport was a tad difficult and it's not quite the same as upstream because of the need to preserve KABI, but so far so good.

Bernhard, would you be willing to build kernel rpms for wider testing before this is pushed to the kernel tree please? The patch is against the opensuse git kernel tree as opposed to the upstream kernel git tree. If/when they get built, I would appreciate it if people with interactivity problems under IO would test the rpms and report here what their experience was.
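
The timing remark about dd above (async writes look faster until the sync is counted) can be illustrated with a rough sketch. This is not the benchmark used in the investigation; the function name `timed_write` and the sizes are made up for illustration, and it assumes the target path lives on the device under test.

```python
import os
import tempfile
import time


def timed_write(path, total_bytes, chunk_size=1 << 20):
    """Write total_bytes of zeros to path.

    Returns (buffered_seconds, synced_seconds): the elapsed time when the
    last write() has returned, and the elapsed time after fsync() has
    also returned.
    """
    chunk = b"\0" * chunk_size
    remaining = total_bytes
    start = time.monotonic()
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(chunk[:n])
            remaining -= n
        f.flush()  # push Python's userspace buffer into the kernel
        buffered = time.monotonic() - start
        os.fsync(f.fileno())  # wait for the kernel to write the data out
        synced = time.monotonic() - start
    return buffered, synced


if __name__ == "__main__":
    # Hypothetical target: a temporary directory; point this at the USB
    # mount to compare devices.
    with tempfile.TemporaryDirectory() as d:
        size = 64 * 1024 * 1024
        buffered, synced = timed_write(os.path.join(d, "sample.bin"), size)
        print("apparent MB/s:", size / 1e6 / max(buffered, 1e-9))
        print("synced MB/s:  ", size / 1e6 / max(synced, 1e-9))
```

On a slow device the first figure mostly measures how fast data lands in page cache, so comparing kernels on it alone can be misleading; the second figure is the one that should be roughly comparable.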
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c18
--- Comment #18 from Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c19
--- Comment #19 from Marcus Meissner
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c20
--- Comment #20 from Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c21
Michal Kubeček
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c22
Mel Gorman
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c23
Wolfgang Rosenauer
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c24
--- Comment #24 from Karl Eichwalder
Not sure if that's the same. I just had a big issue copying a lot of files to a USB mass storage. It was a MP3 collection and the load was increasing slowly with more and more other processes going stale into D state and in the end the whole cp process didn't proceed anymore.
For me it failed miserably, when copying a Garmin map file (~1GB) to a "new" Garmin device (with USB 2.0). Copying the first 200 or 300 MB was reasonably fast, but then everything slowed down and the machine sometimes even looked frozen. Interestingly, from time to time it recovered and it finally succeeded (after more than an hour). I'm not that sure about the exact numbers, because I did some testing on the machine at the same time... Hardware: x86_64, Core(TM)2 CPU 6320 @ 1.86GHz, with 3 GB RAM.
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c25
--- Comment #25 from Marcus Meissner
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c26
--- Comment #26 from Wolfgang Rosenauer
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c27
Karl Eichwalder
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c28
--- Comment #28 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c29
--- Comment #29 from Karl Eichwalder
Karl, thanks for looking into this. AFAIK 12.1 kernel update should be released soon (QA is running now) so I'm not sure it's still worth it...
Ok, in this case, we can probably skip the release notes exercise ;) [I was just bitten again, knowingly, though.]
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c30
--- Comment #30 from Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c31
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c32
--- Comment #32 from Michal Kubeček
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c33
--- Comment #33 from Mark Fairbairn
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c34
--- Comment #34 from Mel Gorman
no system freeze when transferring files.
Good news.
Reasonably good transfer speeds. However, since the kernel update I have noticed that ridiculously incorrect transfer speeds are being reported, e.g. transfer a 500mb media file (mp4) to usb stick and a transfer speed of 75mb is reported. Transfer is completed in about five seconds according to system tray. However, the transfer is not really complete and continues with no notifications for about another minute. Not sure if this is related in any way to the original bug report or applied fix.
The 75MB/sec is probably only taking into account the time needed to copy the data to page cache and deferring the sync to the flusher or the safe removal of the device (i.e. an unmount). In an earlier comment you stated that OpenSUSE 11.4 transferred at a rate of about 8-9 MB/s. For a 500MB file, that would take about a minute which roughly matches what you're seeing now. I suspect that you were doing a sync copy on 11.4 and that's the difference. What are you using to copy the file? It could be some change in the desktop manager. It's not something I would notice myself as I don't use any of the desktop environment file managers.
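
The arithmetic behind "about a minute" can be checked with a few lines. This is only a worked example using the figures quoted in this comment (500 MB file, 8-9 MB/s on 11.4, 75 MB/s reported by the tray); the helper name `transfer_seconds` is made up.

```python
def transfer_seconds(size_mb, rate_mb_per_s):
    """Time needed to move size_mb at a steady rate, in seconds."""
    return size_mb / rate_mb_per_s


if __name__ == "__main__":
    size_mb = 500
    for rate in (8, 9, 75):
        secs = transfer_seconds(size_mb, rate)
        print(f"{size_mb} MB at {rate} MB/s: {secs:.1f} s")
```

At 8-9 MB/s the file needs roughly 56 to 63 seconds, while 75 MB/s would finish in under 7 seconds, which is close to the "about five seconds" shown by the tray and consistent with that figure only covering the copy into page cache.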
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c35
--- Comment #35 from Mark Fairbairn
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c36
--- Comment #36 from Mel Gorman
yes, overall transfer speed is similar to 11.4 so my previous concerns have been addressed.
Good.
Just using a split window in dolphin to drag and drop file from home partition to usb pendrive. The transfer speed shown in the notification tray is not an issue for me - but perhaps a separate minor bug to be addressed at some stage as it does give somewhat misleading information reporting the transfer as finished when in fact it is still going on. This could potentially cause file damage / lost file if the usb pendrive is just pulled without first using the safely remove action.
Yes, I understand what you're saying. I have a vague recollection that desktop environments used to identify when they were copying to USB and use a sync copy (either using fdatasync or O_SYNC, not sure which). Maybe that regressed or maybe it is my imagination.
But again - probably a separate bug if you do not believe it is related to the latest changes in the kernel
I believe it is unrelated to the latest kernel changes and I do not consider it a kernel bug because the kernel is doing what it's meant to be doing. My recommendation would be to file a bug against dolphin saying that file copies to USB (well, media that can be unplugged in general) should be synchronous to reduce the likelihood of data loss if the user pulls the key before the data is synchronised. Is that reasonable?
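
For reference, a synchronous copy of the kind suggested above might look roughly like the sketch below. It is an illustration only: `copy_and_sync` is a made-up name, and nothing in this thread establishes how dolphin actually copies files. It uses `os.fsync`; on Linux `os.fdatasync` or opening the destination with `os.O_SYNC` would be the alternatives mentioned earlier.

```python
import os
import tempfile


def copy_and_sync(src, dst, chunk_size=1 << 20):
    """Copy src to dst and return the byte count only after fsync() returns.

    A progress indicator driven by this function would not report
    completion while the data is still sitting in page cache.
    """
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
        fout.flush()  # empty Python's userspace buffer
        os.fsync(fout.fileno())  # block until the kernel has written it out
    return copied


if __name__ == "__main__":
    # Hypothetical paths inside a temporary directory, for demonstration.
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "source.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(1024 * 1024))
        print(copy_and_sync(src, os.path.join(d, "dest.bin")), "bytes synced")
```

The trade-off is the one discussed in this thread: the copy appears slower because the reported time includes the writeback, but "finished" then means the data has been written out rather than merely queued.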
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c37
--- Comment #37 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c38
--- Comment #38 from Michal Kubeček
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c39
--- Comment #39 from Karl Eichwalder
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c40
--- Comment #40 from Karl Eichwalder
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c41
--- Comment #41 from Kai Dupke
Users are not stupid.
Users might not be stupid, but some of them do not know enough, or things might be handled in a rush. I recently got a key with only half of the presentation from a usually well-skilled, experienced contact. My assumption was that this was an accident, but such things happen. It is more likely to happen if you do things asynchronously. What do other OS vendors do? They show the progress in the window where the task was initiated and not in some totally different region of the screen.
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c42
--- Comment #42 from Michal Hocko
Created an attachment (id=488809) --> (http://bugzilla.novell.com/attachment.cgi?id=488809) [details] vmstat.log.gz
requested by Michal Hocko.
grep pswpout bug-719416_vmstat.log | uniq -c
      3 pswpout 4170
    719 pswpout 5472
     27 pswpout 5540
      5 pswpout 5612
      1 pswpout 5788
    908 pswpout 5903
    755 pswpout 6024

There is one big peak at the very beginning (after 3s) when we swap out 1300 pages, which is quite a lot. Other than that we do not swap out a lot (~100), but it is always done in bursts, which might affect interactivity.

So there is something wrong going on with swapping. We are already investigating a similar issue with SLE11SP1. I will follow up here once we come up with something.
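
The burst sizes quoted above follow from the differences between successive pswpout counter values. A small sketch of that calculation, using the `uniq -c` output from this comment as input; the function name `swapout_bursts` is made up, and treating each log line as one sample is an assumption about how the vmstat log was collected.

```python
# (repeat count, pswpout counter value) pairs, taken from the uniq -c output.
UNIQ_C_OUTPUT = [
    (3, 4170),
    (719, 5472),
    (27, 5540),
    (5, 5612),
    (1, 5788),
    (908, 5903),
    (755, 6024),
]


def swapout_bursts(runs):
    """Return (sample_index, pages_swapped_out) for each counter change.

    sample_index is the number of samples seen before the counter changed.
    """
    bursts = []
    sample_index = 0
    previous = None
    for repeat, value in runs:
        if previous is not None:
            bursts.append((sample_index, value - previous))
        sample_index += repeat
        previous = value
    return bursts


if __name__ == "__main__":
    for index, pages in swapout_bursts(UNIQ_C_OUTPUT):
        print(f"after {index} samples: {pages} pages swapped out")
```

The first change comes after 3 samples and amounts to 1302 pages, matching the peak described above; the remaining bursts are between 68 and 176 pages.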
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c43
--- Comment #43 from Michal Hocko
So there is something wrong going on with swapping. We are already investigating a similar issue with SLE11SP1. I will follow up here once we come up with something.
I will not populate this bug even more, so let's move to a separate bug (bug 763399). I suspect that your issue, Karl, is caused by excessive compaction which mangles aging of the LRU lists.
https://bugzilla.novell.com/show_bug.cgi?id=719416
https://bugzilla.novell.com/show_bug.cgi?id=719416#c44
Jiri Slaby