Lubos Lunak said the following on 02/16/2010 05:16 PM:
- Just because I'm not a kernel developer does not mean I'm clueless or even stupid. In fact, I usually provide evidence for facts presented in my posts, and the major factor of KDE startup time at the time of writing that was inefficient filesystem layout of data.
Tell me, since I don't have a time machine: was it the case back when you wrote that in 2006 that KDM and KDE used the directives in /etc/preload/kdm and /etc/preload.d/kde? I ask because, as I've said, the overhead of opening a file is often a lot greater than that of reading it, no matter how fragmented it is, because of the name resolution needed to reach the i-node. In general there's a good chance the file, however fragmented, is in the same cylinder group; but its name segments could be all over the place. The idea behind the preload is that if the files are all opened, and the name paths are therefore in the name-path cache, then the relevant applications will start faster. The cost of this is a long, long delay in the initial (first) start-up - which is what you describe in your article.

I'm not saying your omission was in ignorance back in 2006, since I don't have a time machine to go back, install - what was it, 10 point something? - and look to see whether preloading was in use. Perhaps someone knows. But I do note that you didn't mention preloading except as an aside towards the end, and even then in a rather deprecatory manner, and you seem to think it's a matter of 'caching'. Well, if you call the algorithm whereby pages no longer in use are queued for release 'a cache', then yes; but as I've tried to explain, it's not really a cache like a buffer cache, because file I/O is no longer buffered - it's mapped and loaded on demand.

You are quite correct when you say "But do you know any today's application that reads just one file?" Looking at /etc/preload.d/kde gives a good indication of what files KDE is going to use. You go on to say "The thing that should be talked instead should be linearizing, i.e. making sure that related files (not one, files) are one contiguous area of the disk."
Again correct; that's what modern file systems, starting with the Berkeley Fast File System of early-1980s vintage, are about: "cylinder grouping" of files and putting the file data near the i-node. The contrast in head motion between the old V7 file system - which had the i-nodes at the beginning of the disk and the data at the end - and the FFS was dramatic. Heck, even the 'inverted V7 FS', where the layout went

+===============+-----------+-----------+==========================+
| system data   | sys-inode | usr-inode | usr data                 |
+===============+-----------+-----------+==========================+

reduced head motion.

However, you do not mention the load-on-demand nature of virtual memory, which is very important. Once the files are opened and mapped, the VM takes over. As I said ...

<quote>
We "map the file". And the libraries. So a program is 'loaded'. Well, no, it's mapped in. Switch to user space and go to "__start()". Oops! Page fault - bring in that page. There was a time the smarts said bring in the next few as well (http://en.wikipedia.org/wiki/Paging#Anticipatory_paging), but let's face it, the first thing the program does is call its initialization code, which is way over there ... more page faults ... then the command line scanner ... over there.
</quote>

(This is explained at http://en.wikipedia.org/wiki/Demand_paging#Unix_implementation

<quote>
The operating system maps the executable file (and its dependent libraries) into the newly created program's virtual address space, without actually allocating any physical RAM for the contents of those files. Since executable code is usually read-only and shared, the program literally runs from the page cache.
</quote>)

So there is a fair bit of crazy-paging. And as I went on to say, the paging algorithm has a 'principle of locality' built in, so it tends to retain those initial pages even when they have run their one-time-only code.
See http://en.wikipedia.org/wiki/Demand_paging and the various links from there for various views of what that is all about.

A really, really good -hypothetical- application developer would have a really, really good -hypothetical- VM system call at the end of such routines to tell the VM that there was no need to retain the page of code the routine was in. But then I'm contaminated by having written for the VAX as well :-(

I don't think you fully realise the why and wherefore of preload and the cost of file name resolution. Name-paths get cached, yes. Personally I think that the preloading and name caching adequately answer your question "why does second start-up of KDE need only roughly one quarter of time the first start-up needs?" I might also mention in passing that file systems which use a b-tree directory structure let the kernel do faster name lookup.
- Just because you think filesystems are only about files does not mean it's the only thing that is read from the disk or that filesystem fragmentation is only about having each single file in a single continuous area. I explained that in the blog post.
Indeed, it's not "just" - as I keep saying, name lookup, opening the file, and manipulation of i-nodes are a big issue. Which is why there is pre-loading. I don't think you explained that in the post. But then again, that was 2006, this is now.

There's also a follow-up: http://rudd-o.com/en/linux-and-free-software/about-boot-time-optimization-in...

I must admit here that I'm confused: are we talking about boot-to-get-to-the-login-prompt - that is, getting the kernel into memory, running init, running all the appropriate files in /etc/init.d - or are we talking about STARTUP of something like KDE **after login**?

But fewer files? (Item #1) Eliminating the preload won't eliminate the need to open those files; it will just defer it. Maybe that will make some kinds of start-up feel snappier, at the cost of 'pauses' later.

But smaller files? (Item #2) Does that mean 64-bit code will take longer? Should we go back to 16-bit machines?

Item #6 is really playing around with the virtual memory and swapping/paging. There's one argument which says that code pages don't need to be paged out, just freed, because they are code and can be paged in again when needed from the original file - which is mapped anyway. So the cost of writing code pages to disk doesn't make sense. The other argument revolves around somehow determining the working set - which doesn't make sense in a heavily multi-tasked setting - and keeping it around as a contiguous block.

Some history of the Linux VM is at http://www.usenix.org/event/usenix01/freenix01/full_papers/riel/riel_html/in...

Tuning the VM:
http://www.cyberciti.biz/faq/tag/linux-swappiness/
http://www.cyberciti.biz/faq/linux-kernel-tuning-virtual-memory-subsystem/

-- 
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org