Hi Stephan,

On Mon, 2009-01-26 at 10:58 +0100, Stephan Kulow wrote:
* how does preload differ from sreadahead ?
Wow - thanks :-) nice comparison; so to re-frame it:

* similarities:
  + both pre-load only file data => reading the inodes, crawling the
    directory structures etc. is all done synchronously, without much
    parallelism or I/O sorting [ modulo sreadahead's 4x threads ].

* differences:
  + sreadahead lets boot continue while pre-fetching to interleave
    CPU/sleep-intensive loads [ eg. boot.udev+ in your chart ], preload
    instead defers the work so we get better seek behaviour on rotating
    media.
  + sreadahead only forces in the parts of the files we know are used,
    preload forces in the whole file - in practice this makes no
    difference [ you assert ].

another might be:
  + sreadahead-pack is slow, opening ~all files on the system to call
    'mincore' on them all [ presumably also vandalising its results to
    some degree while doing so ;-] - quick sketch of that below.
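[ As a concrete aside - this is not the sreadahead-pack code, just my own
  minimal sketch of the mincore() dance: mmap a file without touching it,
  then ask the kernel which of its pages are already resident, which is
  the per-file work a packer has to repeat across ~the whole filesystem: ]

/* sketch only: report how many pages of one file are in the page cache */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) { close(fd); return 1; }

    /* map without MAP_POPULATE so we don't drag the data in ourselves */
    void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    long page = sysconf(_SC_PAGESIZE);
    size_t pages = (st.st_size + page - 1) / page;
    unsigned char *vec = malloc(pages);

    if (vec && mincore(map, st.st_size, vec) == 0) {
        size_t resident = 0;
        for (size_t i = 0; i < pages; i++)
            resident += vec[i] & 1;
        printf("%s: %zu/%zu pages resident\n", argv[1], resident, pages);
    }

    free(vec);
    munmap(map, st.st_size);
    close(fd);
    return 0;
}

[ doing that open/mmap/mincore loop for every file on the system is a lot
  of metadata I/O in itself, hence the slowness. ]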
Other queries:

- uses boot order (by means of kernel patch for ext3)
- needs pregenerated lists from boots not using sreadahead
vs.
- uses predefined list that can be regenerated without changing boot
So - preload allows you to re-generate when booting with preload ? that sounds pretty neat - how do you elide the I/O caused by preload itself ? by process-id [ it seems the tools parse strace output to generate the preload lists ].

Reading the preload code, it looks rather nice :-) I guess my only concern is keeping the preload data itself up-to-date: apparently we don't ship it in SLED11, and eg. my /etc/preload.d/OpenOffice is obsolete.

At some level, it seems a shame that we cannot ask the kernel to dump all block-level I/O generated post boot, elide pages un-touched since they were pre-loaded [ presumably that is ~easy enough to detect ], do some quick & dirty sort on that and save it for next boot. Presumably that would require work, a new API etc. Naturally, that's not going to work wonderfully for machines with RAM-size << boot-time-working-set, but it's simple & no worse than sreadahead.
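[ Re: the strace parsing above - a rough sketch of the idea, emphatically
  not prepare_preload itself; the log name and pid argument are invented
  for the example. It pulls successful open() paths out of an
  'strace -f -e trace=open -o boot.log ...' run, skipping lines from a
  given pid so the preloader's own I/O is elided: ]

/* sketch: list files opened during boot, excluding one pid's activity */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <strace-log> <skip-pid>\n", argv[0]);
        return 1;
    }

    FILE *log = fopen(argv[1], "r");
    if (!log) { perror("fopen"); return 1; }
    long skip_pid = atol(argv[2]);

    char line[4096];
    while (fgets(line, sizeof line, log)) {
        /* with -f each line starts with the pid that made the call */
        if (atol(line) == skip_pid)
            continue;

        /* look for:  open("/some/path", O_RDONLY) = 3  */
        char *call = strstr(line, "open(\"");
        if (!call)
            continue;
        char *path = call + strlen("open(\"");
        char *end = strchr(path, '"');
        if (!end)
            continue;
        *end = '\0';

        /* skip calls that failed */
        if (strstr(end + 1, "= -1"))
            continue;

        printf("%s\n", path);   /* pipe through sort -u for the raw list */
    }
    fclose(log);
    return 0;
}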
That's different to my laptop hdd. According to the sources, sreadahead doesn't need the kernel patch, it only optimizes the order. But still, calling sreadahead early doesn't improve my boot time. And it's much harder to deploy too, more suited to showcases ;)
Heh :-) well, I can believe sreadahead is optimised for SSDs. Having said that - I don't see why preload shouldn't work just as nicely on SSDs - but there (of course) it would surely make perfect sense to do the I/O at a low priority in the background ?

As a crazy idea - do you think SSDs and rotating media are converses ? i.e. if in an SSD world it makes sense to run preload at a really low I/O priority, perhaps in a rotating world it makes sense to run preload at an incredibly high priority [ while letting boot continue in parallel at the world's lowest I/O prio ] ?
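[ In syscall terms that knob is just ioprio_set - a sketch only; glibc has
  no wrapper so it goes through syscall(), and the constants below are
  copied from the kernel's ioprio definitions: the idle class for the
  background/SSD case vs. the RT class (CAP_SYS_ADMIN needed) for the
  grab-the-disk-first rotating case: ]

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

/* values as in the kernel's ioprio.h */
#define IOPRIO_CLASS_SHIFT  13
#define IOPRIO_CLASS_RT      1
#define IOPRIO_CLASS_IDLE    3
#define IOPRIO_WHO_PROCESS   1
#define IOPRIO_PRIO_VALUE(cls, data) (((cls) << IOPRIO_CLASS_SHIFT) | (data))

static int set_self_ioprio(int cls, int level)
{
    /* pid 0 == the calling process */
    return syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0,
                   IOPRIO_PRIO_VALUE(cls, level));
}

int main(void)
{
    /* background prefetch on an SSD: get out of everyone's way */
    if (set_self_ioprio(IOPRIO_CLASS_IDLE, 0) < 0)
        perror("ioprio_set idle");

    /* ... or the rotating-media converse - grab the disk first:
     *     set_self_ioprio(IOPRIO_CLASS_RT, 0);
     */
    return 0;
}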
* do you take the moblin route:
  + of running preload asynchronously at the lowest I/O priority
  + of growing /sys/proc/sda/queue/nr_requests to 1024+
    [ supposedly so the fairness code works ;-) ]
Doesn't change _anything_ - it defaults to 128 and the queue never gets that long. At least neither with sreadahead nor with preload. And yes, I tested (of course I used the correct name - around /sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/block/sda/queue/nr_requests).
Ah well - some missing punctuation fluff ;-) mine is:

  /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/nr_requests

Out of interest, how long does the queue get ?
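[ Aside, for anyone reproducing the experiment - a trivial sketch that
  bumps the queue depth to 1024; it assumes the disk is sda and that the
  short /sys/block/sda/queue/nr_requests path reaches the same attribute
  as those long /sys/devices/... ones: ]

/* sketch: write the new queue depth into sysfs (run as root) */
#include <stdio.h>

int main(void)
{
    const char *path = "/sys/block/sda/queue/nr_requests";
    FILE *f = fopen(path, "w");
    if (!f) {
        perror(path);
        return 1;
    }
    fprintf(f, "1024\n");
    return fclose(f) == 0 ? 0 : 1;
}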
But nice anecdote.
It was an anecdote ? let me make it one: Arjan said this was a good idea, what should I know ? ;-)

Looking at your boot-chart, it seems you're using blktrace to profile the first few preload runs, and stapio for the later ones, yet prepare_preload seems to work on strace output - is there a new way to prepare the preload output ?

Thanks,

	Michael.

--
 michael.meeks@novell.com  <><, Pseudo Engineer, itinerant idiot

--
To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-factory+help@opensuse.org