On Tuesday 16 September 2014 18:41:09 Claudio Freire wrote:
<stefan.bruens@rwth-aachen.de> wrote:
On Tuesday 16 September 2014 16:54:13 Claudio Freire wrote:
That's not the case. It does use cached reads, but it takes about a minute to cache the whole thing in random order, whereas it takes only a few seconds in sequential order.
I'm having a hard time following rpmdb.c's code. I see it uses plain db3 cursors, which should scan the file sequentially instead of hopping all over the place. If that's truly the case, then db3 is the one that needs fixing. If instead rpmdb.c is creating other cursors in parallel and seeking to other parts of the Packages database (which seems unlikely, but I can't rule it out because I haven't fully figured out the code yet), then rpmdb.c is the one that needs fixing.
I think a simple test case should clear this. I'll try to make one with python (it has a nice and neat interface to db3 that's easier to use than C for this).
The Packages db is in DB_HASH format - this has several implications:
1) A linear scan of the database is a random access pattern of the backing store, i.e. the disk.
2) bdb *does* a mmap of database files, but not for DB_HASH databases.
Um... are you sure about that?
Sure - no ...
I thought the only difference between HASH and BTREE was that the iterating order of cursors was random (by key) in HASH, but it doesn't mean it will be random I/O.
It depends. rpm opens both "Name" and "Packages". If it iterates over the keys found in Name to get the list of packages and uses each one as a key into Packages, you will get random I/O on Packages. Iterating directly over the Packages keys should indeed give a linear access pattern. But this is just guesswork ...
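To make the guess above concrete, here is a minimal toy model in Python. It does not touch rpm or Berkeley DB at all: the page count, the `bucket_page` hash and the `count_seeks` helper are all hypothetical stand-ins, assumed only for illustration. It merely shows why fetching records by key from a hash-organized file tends to hop between pages, while walking the file in its own page order does not.

```python
import hashlib

PAGE_COUNT = 64  # hypothetical number of bucket pages in the toy "Packages" file


def bucket_page(key: int) -> int:
    """Map a package number to a bucket page (toy stand-in for a hash layout)."""
    digest = hashlib.sha1(str(key).encode()).digest()
    return int.from_bytes(digest[:4], "big") % PAGE_COUNT


def count_seeks(pages):
    """Count transitions that neither stay on the same page nor move to the next one."""
    seeks = 0
    prev = None
    for page in pages:
        if prev is not None and page not in (prev, prev + 1):
            seeks += 1
        prev = page
    return seeks


# Toy "Packages": 2000 records keyed by package number.
package_ids = list(range(1, 2001))

# (a) Walking Packages directly: assume records come back in on-disk page order.
direct_order = sorted(bucket_page(k) for k in package_ids)

# (b) Taking keys from another source (such as "Name") and fetching each record
#     by key: the page visited depends on the hash of whichever key comes next.
indexed_order = [bucket_page(k) for k in package_ids]

direct_seeks = count_seeks(direct_order)
indexed_seeks = count_seeks(indexed_order)
print("direct scan seeks: ", direct_seeks)
print("keyed lookup seeks:", indexed_seeks)
```

In this model the keyed lookups jump pages on almost every fetch, while the direct walk jumps at most once per page. Whether rpm actually behaves like case (b) is exactly the open question; only a test against the real database would settle it.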
Do you have a pointer to documentation? I can't seem to find any relevant details about the access methods in the documentation I find by googling.
I haven't found any documentation about BDB internals, just read a little bit of source code.

Regards,

Stefan

-- 
Stefan Brüns / Bergstraße 21 / 52062 Aachen
phone: +49 241 53809034 mobile: +49 151 50412019