Re: [Bulk] Re: [Bulk] Re: [opensuse] OT: Developer jobs (was defrag)

17 Feb 2010

Carlos E. R. said the following on 02/16/2010 08:10 PM:
...
On Tuesday, 2010-02-16 at 18:59 -0500, Anton Aylward wrote:
...
I might also mention in passing that file systems which use a b-tree
directory structure let the kernel do faster name lookup.
reiserfs?
Among others ... :-)
...
What we are talking about reminds me of databases. If something needs
to load dozens of files, and there is a cost to finding them, then perhaps
we are using the wrong tools.
Or perhaps the strategy would be to write a single file compiling the
contents of all those files (configuration or data). But then there would
be a need to detect whether one of those files has changed and the
compendium has to be regenerated, and that would be costly if it had to
run on every start.
So that would be a database :-)
Not quite.
It's a library module.
All those separately compiled bits put in one file with an index.
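
To make the "one file with an index" idea concrete, here is a minimal
Python sketch. The layout (a 4-byte length, a JSON index, then the
members back to back) and the names pack() and lookup() are made up for
illustration; this is not the format that ar or the linker actually use.

```python
import io
import json
import struct

def pack(members):
    """Concatenate named byte blobs behind an index of (offset, length).

    Toy layout, assumed for this example only:
    4-byte big-endian index length, JSON index, then the blobs.
    """
    index = {}
    body = io.BytesIO()
    for name, data in members.items():
        index[name] = (body.tell(), len(data))
        body.write(data)
    header = json.dumps(index).encode("utf-8")
    return struct.pack(">I", len(header)) + header + body.getvalue()

def lookup(archive, name):
    """Fetch one member via the index, without scanning the others."""
    (header_len,) = struct.unpack(">I", archive[:4])
    index = json.loads(archive[4:4 + header_len].decode("utf-8"))
    offset, length = index[name]
    start = 4 + header_len + offset
    return archive[start:start + length]

blob = pack({"open.o": b"code for open", "close.o": b"code for close"})
print(lookup(blob, "close.o"))   # prints b'code for close'
```

One name lookup in the index replaces one directory lookup per file,
which is the saving being discussed.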

The trouble is that you don't want ALL the possible code in the one
library - it gets to be humongous and difficult to maintain and
regenerate, so you break it up into bits that are related - or not
related depending on how you look at it.  (e.g. gnome libraries vs kde
libraries.)

So, go look at what gets pre-loaded in /etc/preload.d/kde for example.

Another way to do it is to zip (or tar.gz) up all those graphics files
that make up the "themes" and decoration for KDE ...
...
...
Item #6 is really playing around with the Virtual memory and
swapping/paging.  There's one argument which says that code pages don't
need to be paged-out, just freed, because they are code and can be paged
in - again - when needed from the original file - which is mapped
anyway.  So the cost of writing code pages to disk doesn't make sense.
Actually, that is what windows code does.
Actually that's what Linux does, as I understand it.
And what most OS's do.
...
But, methinks, if the code needs to be re-read several times over the
course of a run, perhaps it would be faster to load it instead from a
pagefile/swapspace, which is contiguous by design.
It all depends.
What are you reloading?
And how much?

Paging from the code files (see the references I gave) which are already
memory mapped, isn't that expensive.  This is paging, not
roll-in/roll-out.  After all, the file has already been opened, the
kernel has the handle and inode and all that.
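
As a rough illustration of reading through a mapping instead of loading
a whole file, here is a small Python sketch using the standard mmap
module. read_mapped() and the throwaway demo file are hypothetical names
for this example only; when pages really come in is up to the kernel.

```python
import mmap
import os
import tempfile

def read_mapped(path, offset, length):
    """Map a file read-only and return one slice of it.

    Only the pages backing the slice need to be resident; the kernel
    brings them in on demand (it may also read ahead).
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]

# Demo on a temporary file standing in for an executable.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * 8192 + b"needle")
    demo_path = tmp.name

print(read_mapped(demo_path, 8192, 6))   # prints b'needle'
os.unlink(demo_path)
```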

Now if your system is so heavily loaded that you are in effect rolling
out ALL the dirty pages of a process - in effect swapping it out - it
means you are badly memory starved, and that's a problem to be solved by
other means.

Let's take a real simple example of VM.
All pages of user space are in a queue.
They get tagged as code or data, and get tagged if they are written to.
Code never gets written to.

Initially all pages are free.

Init runs and loads a program.  Well actually it opens the executable
and maps it to virtual memory and jumps to the "known location" of
"__start()".

There's nothing there so we get a page fault to read that page in.

Take a page off the front of the queue and map it.  Run the code.

Lather, rinse, repeat as more pages fault.

But stack and heap have been allocated and they fault and get allocated
as well.

Eventually all the physical memory is used.

Whenever a page is referenced, it's moved back to the start of the queue.
So the least used pages drift to the end.

When a page is needed and the list is full, the "least recently used"
page is taken off the list and deallocated from its old map.

Now if it was a code page, it wasn't written to, and we can get it back
any time, just the way it came in.

But if it was a 'data' page we need to make a copy on disk.  That's what
the swap area (or file) is for.
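
The queue described above can be sketched in a few lines of Python.
This is a toy model under the same simplifications as the text, not how
the Linux kernel is implemented; PageQueue, touch() and the page names
are invented for the example.

```python
from collections import OrderedDict

class PageQueue:
    """Toy LRU queue of physical frames; most recently used at the front."""

    def __init__(self, frames):
        self.frames = frames
        self.resident = OrderedDict()   # page name -> dirty flag
        self.swap_writes = 0            # dirty pages copied out to swap
        self.dropped = 0                # clean pages simply freed

    def touch(self, page, write=False):
        """Reference a page; return True if that caused a page fault."""
        if page in self.resident:
            self.resident[page] = self.resident[page] or write
            self.resident.move_to_end(page, last=False)   # back to the front
            return False
        if len(self.resident) >= self.frames:
            # Evict the least recently used page from the end of the queue.
            _victim, dirty = self.resident.popitem(last=True)
            if dirty:
                self.swap_writes += 1   # data page: needs a copy on disk
            else:
                self.dropped += 1       # code page: can be re-read later
        self.resident[page] = write
        self.resident.move_to_end(page, last=False)
        return True

vm = PageQueue(frames=2)
vm.touch("code0")
vm.touch("heap0", write=True)
vm.touch("code1")   # evicts code0 (clean): no disk write
vm.touch("code0")   # evicts heap0 (dirty): one write to swap
print(vm.dropped, vm.swap_writes)   # prints 1 1
```

The clean page costs nothing to evict; only the written-to page needs
the swap area.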

That's how it began.  You can find various explanations in more detail
on the 'Net.  Try Wikipedia for a start.

There are a LOT of parameters you can play with in there.
See /proc/sys/vm for some of the tunable parameters under Linux.
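
A quick way to see what is in there, sketched in Python.  The helper
name read_tunables() is made up for this example; it only reads, and it
skips anything it isn't allowed to open.

```python
import os

def read_tunables(directory="/proc/sys/vm"):
    """Return {name: value} for every readable regular file in directory."""
    values = {}
    if not os.path.isdir(directory):
        return values            # not Linux, or /proc is not mounted
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as f:
                values[name] = f.read().strip()
        except OSError:
            pass                 # some entries are write-only or root-only
    return values

for name, value in read_tunables().items():
    print(name, "=", value)
```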
...
...
The other argument revolves around somehow determining the working set
and - which doesn't make sense in a heavily multi-tasked setting -
keeping it around as a contiguous block.
No, it doesn't. Other tasks may run in between and break the nicely
pre-designed load sequence.
Not only that, but many of those blocks are shared.

That's what the pre-loading is about.

OMG!  I've just looked in /etc/preload.d/kde and most of the entries
don't exist!  Think of all the wasted time trying to look them up!

Time for a shell script to do some pruning.
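
A sketch of the pruning logic, in Python rather than shell.  It assumes
the preload list holds one file path per line, with "#" starting a
comment; that format is my assumption and should be checked against the
real file.  prune() is a made-up helper and it only reports, it does not
rewrite anything.

```python
import os

def prune(lines, exists=os.path.exists):
    """Split preload entries into (kept lines, missing paths).

    Assumes one path per line; blank lines and '#' comments are kept.
    """
    kept, missing = [], []
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            kept.append(line)
        elif exists(entry):
            kept.append(line)
        else:
            missing.append(entry)
    return kept, missing

sample = ["/\n", "# a comment\n", "/no/such/file/hopefully\n"]
kept, missing = prune(sample)
print(missing)
```

To try it on the real list, pass open("/etc/preload.d/kde").readlines()
as lines and write kept back out only after looking at missing.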
