Re: [opensuse-factory] How many seconds does "time rpm -qa | wc" cost it?

16 Sep 2014

On Tue, Sep 16, 2014 at 5:43 PM, Carlos E. R. <carlos.e.r@opensuse.org> wrote:
...
On 2014-09-16 21:54, Claudio Freire wrote:
...
On Mon, Sep 15, 2014 at 8:43 PM, Carlos E. R. wrote:
...
On 2014-09-16 01:29, Claudio Freire wrote:
...
On Mon, Sep 15, 2014 at 8:25 PM, Carlos E. R.
...
...
So it is obvious! We simply run the cp to /dev/null ahead
of the query in the script, and done.
Yes, it's a nice band-aid if the system has enough memory.
Not so much if it doesn't.
True.
It is a hack, or band-aid, as you say. The real problem is how
the database engine is coded: it is made, apparently, to minimize
RAM use, doing non-sequential and non-cached disk reads.
That's not the case. It does use cached reads, but it takes about
a minute to cache the whole thing in random order, whereas it takes
only a few seconds in sequential order.
Wrong.
Why do you say that?

The fact that the hack works proves it does use the kernel's buffer cache.

In fact, it was one of the first things I checked with strace, whether
it opened in direct mode or not. It does not.
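For anyone who wants to see what "direct mode" means at the descriptor level, here is a minimal sketch, assuming Linux. It only inspects a descriptor opened by the script itself; it says nothing about what rpm does, for which strace remains the right tool. The helper name has_o_direct is hypothetical.

```python
import fcntl
import os
import tempfile

# O_DIRECT is Linux-specific; fall back to 0 where the constant is missing.
O_DIRECT = getattr(os, "O_DIRECT", 0)


def has_o_direct(fd):
    """Return True if the descriptor's status flags include O_DIRECT."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    return bool(O_DIRECT) and bool(flags & O_DIRECT)


# A file opened with plain O_RDONLY goes through the kernel's page cache,
# so the O_DIRECT bit should not be set on it.
with tempfile.NamedTemporaryFile() as tmp:
    plain_fd = os.open(tmp.name, os.O_RDONLY)
    try:
        plain_is_direct = has_o_direct(plain_fd)
    finally:
        os.close(plain_fd)
```

A descriptor without O_DIRECT has its reads served through the buffer cache, which is the behaviour described above.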
...
With the proposed hack, it takes about 3 seconds to cache the whole
thing, then another 3 to do the whole query - compared to 90 seconds
before the hack.
It does not matter how the database is accessed, once it is loaded in
RAM. Of course, caching it as it is randomly accessed is wrong, unless
the database engine is permanently running, as mysql might do.
It doesn't have to keep running. As the success of the cp shows, it
only needs to put all the data into the OS buffer cache, which happens
with each pread. The only difference between read and pread is that
pread takes an explicit offset and doesn't modify the file descriptor's
pointer. Everything else the kernel does to cache reads applies, as
demonstrated by the fact that the hack works.
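The read/pread difference can be checked with a small sketch, assuming a Unix system where Python's os.pread is available. The scratch file and variable names are made up for illustration; this is not taken from rpm or db3.

```python
import os
import tempfile

# Create a small scratch file and work on the raw file descriptor.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"0123456789")
    os.lseek(fd, 0, os.SEEK_SET)  # rewind to the start

    # pread: read 4 bytes at offset 6, without touching the file pointer.
    chunk_pread = os.pread(fd, 4, 6)
    offset_after_pread = os.lseek(fd, 0, os.SEEK_CUR)

    # read: read 4 bytes at the current pointer, which then advances.
    chunk_read = os.read(fd, 4)
    offset_after_read = os.lseek(fd, 0, os.SEEK_CUR)
finally:
    os.close(fd)
    os.unlink(path)
```

Both calls go through the same cached read path in the kernel; only the handling of the file offset differs.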
...
Look:
Telcontar:~ # echo 3 > /proc/sys/vm/drop_caches
Telcontar:~ # time cp /var/lib/rpm/Packages /dev/null
real    0m3.532s
user    0m0.004s
sys     0m0.245s
Telcontar:~ # time rpm -qa | wc -l
6154
real    0m3.668s
user    0m2.670s
sys     0m0.206s
Telcontar:~ # echo 3 > /proc/sys/vm/drop_caches
Telcontar:~ # time rpm -qa | wc -l
6154
real    1m23.203s
user    0m2.912s
sys     0m1.692s
Telcontar:~ #
What does it prove?

The first run proves the reads are cached. Otherwise the cp wouldn't
help; it would hurt.
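For use from a script, the cp-to-/dev/null step could be sketched like this. It is a rough equivalent, not what cp does internally; warm_cache and the 1 MiB chunk size are assumptions chosen for illustration.

```python
def warm_cache(path, chunk_size=1 << 20):
    """Read a file sequentially and discard the data.

    Roughly equivalent to `cp path /dev/null`: the point is only to pull
    the file into the kernel's buffer cache in sequential order.
    Returns the number of bytes read.
    """
    total = 0
    # buffering=0 gives a raw file object, so each read() maps to a
    # plain read of up to chunk_size bytes.
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            total += len(block)
    return total
```

Calling warm_cache("/var/lib/rpm/Packages") ahead of the rpm query would play the role of the cp in the transcript above. As already noted, it only helps if the system has enough free memory to keep the file cached.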

On Tue, Sep 16, 2014 at 6:01 PM, Stefan Brüns
<stefan.bruens@rwth-aachen.de> wrote:
...
On Tuesday 16 September 2014 16:54:13 Claudio Freire wrote:
...
That's not the case. It does use cached reads, but it takes about a
minute to cache the whole thing in random order, whereas it takes only
a few seconds in sequential order.
I'm having a hard time following rpmdb.c's code. I see it uses plain
db3 cursors, which should be scanning the file sequentially instead of
hopping all over the place. If that's truly the case, then db3 is the
one that needs fixing. If instead rpmdb.c is creating other cursors in
parallel and seeking to other parts of the Packages database, then
rpmdb.c is the one in need of fixing. I can't rule that out because I
haven't fully figured out the code yet, but it seems unlikely.
I think a simple test case should clear this. I'll try to make one
with python (it has a nice and neat interface to db3 that's easier to
use than C for this).
The Packages db is in DB_HASH format - this has several implications:
1) A linear scan of the database is a random access pattern of the backing
store, i.e. the disk.
2) bdb *does* a mmap of database files, but not for DB_HASH databases.
Um... are you sure about that?

I thought the only difference between HASH and BTREE was that the
iterating order of cursors was random (by key) in HASH, but it doesn't
mean it will be random I/O.

Do you have a pointer to documentation? I can't seem to find any
relevant details about the access methods in the documentation I find by
googling.