Re: [opensuse] Re: [opensuse-factory] How many seconds does "time rpm -qa | wc" cost it?
On 2014-09-16 01:08, Carlos E. R. wrote:
On 2014-09-15 23:10, Bernhard Voelker wrote:
Or force the system to cache it by copying it to null...
Look, it is as simple as this:
cer@Telcontar:~> time cp /var/lib/rpm/Packages /dev/null
Oh. I see now that Christian Boltz hit on the same idea, but I read his post after writing mine. Unfortunately someone sent the thread to the wrong mailing list.

If I do as he does:

    echo 3 > /proc/sys/vm/drop_caches
    cp /var/lib/rpm/Packages /dev/null   -- 0m3.747s
    rpm -qa | wc -l                      -- 0m3.426s

    echo 3 > /proc/sys/vm/drop_caches
    rpm -qa | wc -l                      -- 1m22.089s

So it is obvious! We simply run the cp to null thing ahead of the query in the script, and done.

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)

-- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
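For a script that cannot shell out to cp, the same warm-up can be sketched in Python. This is only an illustration of the idea above, not something posted in the thread: the function name `warm_page_cache` and the 1 MiB chunk size are assumptions, and the only thing it relies on is that a sequential read pulls the file into the kernel page cache, as cp to /dev/null does.

```python
def warm_page_cache(path, chunk_size=1 << 20):
    """Read `path` front to back and throw the data away.

    Equivalent in effect to `cp path /dev/null`: the point is the side
    effect of the kernel caching the file, not the bytes themselves.
    Returns the number of bytes read.
    """
    total = 0
    # buffering=0 gives a raw file object, so each read() is one read(2)
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

Calling `warm_page_cache("/var/lib/rpm/Packages")` before the query would play the role of the cp line. As with cp, it only helps if there is enough free memory to hold the file.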
On Mon, Sep 15, 2014 at 8:25 PM, Carlos E. R.
So it is obvious! We simply run the cp to null thing ahead of the query on the script, and done.
Yes, it's a nice band-aid if the system has enough memory. Not so much if it doesn't.
On 2014-09-16 01:29, Claudio Freire wrote:
On Mon, Sep 15, 2014 at 8:25 PM, Carlos E. R.
So it is obvious! We simply run the cp to null thing ahead of the query on the script, and done.
Yes, it's a nice band-aid if the system has enough memory.
Not so much if it doesn't.
True. It is a hack, or band-aid, as you say.

The real problem is how the database engine is coded: it is made, apparently, to minimize RAM use, doing non-sequential and non-cached disk reads. When I wrote a small database "engine" long ago, I did the same thing: it worked on disk, with a small memory footprint. It was how we were taught to do those things at the time... but it is 2014 now: if memory is available, use it to your advantage, somehow. No, I do not know the proper "somehow".

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On Mon, Sep 15, 2014 at 8:43 PM, Carlos E. R.
It is a hack, or band-aid, as you say. The real problem is how the database engine is coded: it is made, apparently, to minimize ram, doing non-sequential and non-cached disk reads.
That's not the case. It does use cached reads, but it takes about a minute to cache the whole thing in random order, whereas it takes only a few seconds in sequential order.

I'm having a hard time following rpmdb.c's code. I see it uses plain db3 cursors, which should be sequentially scanning the file instead of hopping all over the place. If that's truly the case, then db3 is the one that needs fixing. If it's rpmdb.c creating other cursors in parallel and seeking other parts of the packages database (which seems unlikely, but I can't rule it out because I haven't fully figured out the code yet), then rpmdb.c is the one in need of fixing.

I think a simple test case should clear this up. I'll try to make one with python (it has a nice and neat interface to db3 that's easier to use than C for this).
On 2014-09-16 21:54, Claudio Freire wrote:
That's not the case. It does use cached reads, but it takes about a minute to cache the whole thing in random order, whereas it takes only a few seconds in sequential order.
Wrong. With the proposed hack, it takes about 3 seconds to cache the whole thing, then another 3 to do the whole query - compared to 90 seconds before the hack.

It does not matter how the database is accessed, once it is loaded in RAM. Of course, caching it as it is randomly accessed is wrong, unless the database engine is permanently running, as mysql might do.

Look:

    Telcontar:~ # echo 3 > /proc/sys/vm/drop_caches
    Telcontar:~ # time cp /var/lib/rpm/Packages /dev/null

    real    0m3.532s
    user    0m0.004s
    sys     0m0.245s
    Telcontar:~ # time rpm -qa | wc -l
    6154

    real    0m3.668s
    user    0m2.670s
    sys     0m0.206s
    Telcontar:~ # echo 3 > /proc/sys/vm/drop_caches
    Telcontar:~ # time rpm -qa | wc -l
    6154

    real    1m23.203s
    user    0m2.912s
    sys     0m1.692s
    Telcontar:~ #

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On Tue, Sep 16, 2014 at 5:43 PM, Carlos E. R.
That's not the case. It does use cached reads, but it takes about a minute to cache the whole thing in random order, whereas it takes only a few seconds in sequential order.
Wrong.
Why do you say so? The fact that the hack works proves it does use the kernel's buffer cache. In fact, it was one of the first things I checked with strace: whether it opened the file in direct mode or not. It does not.
With the proposed hack, it takes about 3 seconds to cache the whole thing, then another 3 to do the whole query - compared to 90 seconds before the hack.
It does not matter how the database is accessed, once it is loaded in RAM. Of course, caching it as it is randomly accessed is wrong, unless the database engine is permanently running, as mysql might do.
It doesn't have to keep running. As the success of the cp notes, it only needs to put all the data into the OS buffer cache, which happens with each pread. The only difference between read and pread, is that pread doesn't modify the file descriptor's pointer. Everything else the kernel does to cache reads applies, as demonstrated by the fact that the hack works.
Look:
    Telcontar:~ # echo 3 > /proc/sys/vm/drop_caches
    Telcontar:~ # time cp /var/lib/rpm/Packages /dev/null

    real    0m3.532s
    user    0m0.004s
    sys     0m0.245s
    Telcontar:~ # time rpm -qa | wc -l
    6154

    real    0m3.668s
    user    0m2.670s
    sys     0m0.206s
    Telcontar:~ # echo 3 > /proc/sys/vm/drop_caches
    Telcontar:~ # time rpm -qa | wc -l
    6154

    real    1m23.203s
    user    0m2.912s
    sys     0m1.692s
    Telcontar:~ #
What does it prove?
The first run proves the reads are cached; otherwise the cp wouldn't help, it would hurt.
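The read/pread point above can be checked with a few lines of Python. This is a minimal sketch, assuming a Unix system (os.pread is not available elsewhere); the helper name `offsets_after_reads` and the sample data are made up for the illustration. It only shows the file-offset behaviour. It does not measure caching.

```python
import os
import tempfile


def offsets_after_reads(data=b"0123456789abcdef"):
    """Return (offset after os.pread, offset after os.read) on a scratch file."""
    with tempfile.TemporaryFile() as f:
        f.write(data)
        f.flush()
        fd = f.fileno()
        os.lseek(fd, 0, os.SEEK_SET)                 # start from offset 0

        os.pread(fd, 4, 8)                           # read 4 bytes at offset 8
        after_pread = os.lseek(fd, 0, os.SEEK_CUR)   # descriptor offset unchanged

        os.read(fd, 4)                               # read 4 bytes at current offset
        after_read = os.lseek(fd, 0, os.SEEK_CUR)    # descriptor offset advanced by 4
    return after_pread, after_read
```

Both calls go through the same page cache; the only visible difference is whether the descriptor's position moves.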
On Tue, Sep 16, 2014 at 6:01 PM, Stefan Brüns
The Packages db is in DB_HASH format - this has several implications:
1) A linear scan of the database is a random access pattern of the backing store, i.e. the disk.
2) bdb *does* a mmap of database files, but not for DB_HASH databases.
Um... are you sure about that?

I thought the only difference between HASH and BTREE was that the iterating order of cursors was random (by key) in HASH, but it doesn't mean it will be random I/O.

Do you have a pointer to documentation? I can't seem to find any relevant details on the access methods in the documentation I find by googling.
On Tuesday 16 September 2014 18:41:09 Claudio Freire wrote:
The Packages db is in DB_HASH format - this has several implications:
1) A linear scan of the database is a random access pattern of the backing store, i.e. the disk.
2) bdb *does* a mmap of database files, but not for DB_HASH databases.
Um... are you sure about that?
Sure - no ...
I thought the only difference between HASH and BTREE was that the iterating order of cursors was random (by key) in HASH, but it doesn't mean it will be random I/O.
It depends. rpm opens "Name" and "Packages". If it iterates over the keys found in Name for the list of packages and uses each one as a key to access Packages, you will get random IO on Packages. Directly iterating over the Packages keys should give linear access patterns, indeed. But this is just guesswork ...
Do you have a pointer to documentation? I can't seem to find any relevant details on the access methods on the documentation I find by googling.
I haven't found any documentation about BDB internals, just read a little bit of source code.
Regards,
Stefan
-- Stefan Brüns / Bergstraße 21 / 52062 Aachen phone: +49 241 53809034 mobile: +49 151 50412019
On Tuesday 16 September 2014 16:54:13 Claudio Freire wrote:
That's not the case. It does use cached reads, but it takes about a minute to cache the whole thing in random order, whereas it takes only a few seconds in sequential order.
I'm having a hard time following rpmdb.c's code. I see it uses plain db3 cursors, which should be sequentially scanning the file instead of hopping all over the place. If it's truly the case, then it's db3 the one that needs fixing. If it's rpmdb.c creating other cursors in parallel and seeking other parts of the packages database, which I can't rule out because I couldn't fully figure out the code yet, but seems unlikely, it's rpmdb.c the one in need of fixing.
I think a simple test case should clear this. I'll try to make one with python (it has a nice and neat interface to db3 that's easier to use than C for this).
The Packages db is in DB_HASH format - this has several implications:
1) A linear scan of the database is a random access pattern of the backing store, i.e. the disk.
2) bdb *does* a mmap of database files, but not for DB_HASH databases.
Regards,
Stefan
On Tuesday 16 September 2014 23:01:40 Stefan Brüns wrote:
The Packages db is in DB_HASH format - this has several implications:
1) A linear scan of the database is a random access pattern of the backing store, i.e. the disk.
2) bdb *does* a mmap of database files, but not for DB_HASH databases.
Regards,
Stefan
According to
    /usr/lib/rpm/rpmdb_stat -d /var/lib/rpm/Packages
the hash has 19 buckets, and the database contains 3042 keys. Thus every bucket has 150 elements on average. I doubt this hash has better access times than a BTree ...

The disk access pattern is horrible, see the attached graphic ...

Regards,
Stefan
On Tue, 16 Sep 2014 23:49:53 +0200
Stefan Brüns
The Packages db is in DB_HASH format - this has several implications:
1) A linear scan of the database is a random access pattern of the backing store, i.e. the disk.
As we do not care about real order in which entries are returned - is it possible to scan in order in which they are stored? I.e. iterate over one hash bucket then next etc?
2) bdb *does* a mmap of database files, but not for DB_HASH databases.
I do not see how mmap is relevant or can help here. If anything it can make things worse by pretending you can do fast access in random size and order.
Regards,
Stefan
According to /usr/lib/rpm/rpmdb_stat -d /var/lib/rpm/Packages the hash has 19 buckets, and the database contains 3042 keys. Thus every bucket has 150 elements average. I doubt this hash has better access times than a BTree ...
The disk access pattern is horrible, see the attached graphic ...
This seems to confirm what you said above and that hash function seems to be good :) It looks like it cycles through hash buckets on linear scan.
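Why the visiting order matters so much on a cold cache can be illustrated with a toy model. This is a sketch under loud assumptions: it is not how Berkeley DB lays out or walks its pages, the page numbers are made up, and "seek" here just means "the next page read is not the one directly after the previous one". The names `count_seeks` and `compare_orders` are hypothetical.

```python
import random


def count_seeks(page_order):
    """Count steps where the next page does not directly follow the previous one."""
    return sum(1 for prev, cur in zip(page_order, page_order[1:]) if cur != prev + 1)


def compare_orders(num_pages=1000, seed=0):
    """Return (seeks for a front-to-back read, seeks for a scattered read)."""
    sequential = list(range(num_pages))       # what cp to /dev/null does
    scattered = sequential[:]
    random.Random(seed).shuffle(scattered)    # stand-in for a hash-ordered scan
    return count_seeks(sequential), count_seeks(scattered)
```

In this model the sequential read needs no seeks at all, while the scattered order needs one for almost every page. On a rotating disk that difference is consistent with the gap between the few-second cp and the minute-long cold query reported earlier in the thread.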
On Tue, Sep 16, 2014 at 11:33 PM, Andrei Borzenkov
According to /usr/lib/rpm/rpmdb_stat -d /var/lib/rpm/Packages the hash has 19 buckets, and the database contains 3042 keys. Thus every bucket has 150 elements average. I doubt this hash has better access times than a BTree ...
The disk access pattern is horrible, see the attached graphic ...
This seems to confirm what you said above and that hash function seems to be good :) It looks like it cycles through hash buckets on linear scan.
Or it could be that it's scanning the hash index sequentially but accessing something else in tandem with a join, and that's random.
On 09/17/2014 05:57 PM, Claudio Freire wrote:
Or [...]
... has someone checked what's the difference to other distros? {open,}SUSE seems to be the only one to be hit by this issue.
Have a nice day,
Berny
On Wed, Sep 17, 2014 at 1:58 PM, Bernhard Voelker
On 09/17/2014 05:57 PM, Claudio Freire wrote:
Or [...]
... has someone checked what's the difference to other distros? {open,}SUSE seems to be the only one to be hit by this issue.
Comparing with an amazon linux:

openSUSE:

    ~> /usr/lib/rpm/rpmdb_stat -d /var/lib/rpm/Packages
    Wed Sep 17 14:58:57 2014  Local time
    61561           Hash magic number
    9               Hash version number
    Little-endian   Byte order
                    Flags
    52787           Number of pages in the database
    4096            Underlying database page size
    0               Specified fill factor
    4030            Number of keys in the database
    4030            Number of data items in the database
    24              Number of hash buckets
    23200           Number of bytes free on bucket pages (76% ff)
    52311           Number of overflow pages
    8141662         Number of bytes free in overflow pages (96% ff)
    8               Number of bucket overflow pages
    22417           Number of bytes free in bucket overflow pages (31% ff)
    0               Number of duplicate pages
    0               Number of bytes free in duplicate pages (0% ff)
    435             Number of pages on the free list
    ~> time rpm -qa | wc -l
    4029

    real    0m1.173s
    user    0m1.048s
    sys     0m0.145s
    ~> echo 3 | sudo tee /proc/sys/vm/drop_caches
    3
    ~> time rpm -qa | wc -l
    4029

    real    0m31.426s
    user    0m1.522s
    sys     0m1.019s

Amazon Linux:

    $ /usr/lib/rpm/rpmdb_stat -d /var/lib/rpm/Packages
    Wed Sep 17 17:58:37 2014  Local time
    61561           Hash magic number
    9               Hash version number
    Little-endian   Byte order
                    Flags
    8920            Number of pages in the database
    4096            Underlying database page size
    0               Specified fill factor
    508             Number of keys in the database
    508             Number of data items in the database
    3               Number of hash buckets
    2977            Number of bytes free on bucket pages (75% ff)
    6161            Number of overflow pages
    1056910         Number of bytes free in overflow pages (95% ff)
    1               Number of bucket overflow pages
    2642            Number of bytes free in bucket overflow pages (35% ff)
    0               Number of duplicate pages
    0               Number of bytes free in duplicate pages (0% ff)
    2753            Number of pages on the free list
    [ec2-user@ip-10-146-204-60 ~]$ echo 3 | sudo tee /proc/sys/vm/drop_caches
    3
    $ time rpm -qa | wc -l
    507

    real    0m3.306s
    user    0m0.304s
    sys     0m0.076s
    $ time rpm -qa | wc -l
    507

    real    0m0.258s
    user    0m0.232s
    sys     0m0.020s

The Amazon Linux server has the same issue, but not nearly as bad, because the database is considerably smaller (and the I/O subsystem considerably better).
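A few ratios make the two rpmdb_stat dumps easier to compare. The sketch below only does arithmetic on the numbers quoted above; the function name `summarize` and the choice of ratios are assumptions made for this illustration, not part of rpmdb_stat.

```python
def summarize(pages, overflow_pages, keys, buckets, page_size=4096):
    """Derive a few ratios from the rpmdb_stat figures quoted in the thread."""
    return {
        "file_mib": pages * page_size / 2**20,        # total database size
        "keys_per_bucket": keys / buckets,            # average bucket load
        "overflow_fraction": overflow_pages / pages,  # share of overflow pages
    }


# Figures copied from the two dumps above
opensuse = summarize(pages=52787, overflow_pages=52311, keys=4030, buckets=24)
amazon = summarize(pages=8920, overflow_pages=6161, keys=508, buckets=3)
```

On these inputs the openSUSE file comes out at roughly 206 MiB with about 168 keys per bucket and about 99% of its pages being overflow pages; the Amazon Linux file is roughly 35 MiB with about 169 keys per bucket and about 69% overflow pages. Berkeley DB uses overflow pages for items too large to fit on a regular page, so one possible reading is that most of the file consists of large package headers stored in overflow chains, but that interpretation goes beyond what the thread establishes.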
On Wed, Sep 17, 2014 at 12:57 PM, Claudio Freire
Or it could be that it's scanning the hash index sequentially but accessing something else in tandem with a join, and that's random.
Ok, I reproduced it with a test python script, it's a bsddb3 thing.

    ~> python bsdcreate.py
    Berkeley DB 4.8.30: (July 23, 2013)
    ~> echo 3 | sudo tee /proc/sys/vm/drop_caches
    3
    ~> time python bsdtest.py

    real    0m28.912s
    user    0m0.220s
    sys     0m0.595s
    claudiofreire@klaumpp:~> time python bsdtest.py

    real    0m0.165s
    user    0m0.100s
    sys     0m0.064s

Attached is the script.

If I change HASH into BTREE, I get:

    ~> rm fruit
    ~> python bsdcreate.py
    Berkeley DB 4.8.30: (July 23, 2013)
    ~> echo 3 | sudo tee /proc/sys/vm/drop_caches
    3
    ~> time python bsdtest.py

    real    0m17.458s
    user    0m0.180s
    sys     0m0.450s
    ~> time python bsdtest.py

    real    0m0.164s
    user    0m0.092s
    sys     0m0.072s

So, some improvement, but not much.

If I do an fadvise first for the whole file (you gotta install python-fadvise[0] for that), I get:

    ~> echo 3 | sudo tee /proc/sys/vm/drop_caches
    3
    ~> time python bsdtest.py

    real    0m3.560s
    user    0m0.100s
    sys     0m0.115s
    ~> time python bsdtest.py

    real    0m0.164s
    user    0m0.080s
    sys     0m0.084s

So, fadvise works for full scans. :-)

[0] https://github.com/lamby/python-fadvise
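The attached script is not reproduced in the archive, so the following is not it. It is a minimal sketch of the fadvise step only, and it swaps the python-fadvise package used above for the standard library's os.posix_fadvise (Python 3.3 or later, Unix only). The function name `advise_willneed` is an assumption.

```python
import os


def advise_willneed(path):
    """Hint to the kernel that the whole file will be needed soon.

    Returns the file size in bytes. The call is advisory: the kernel may
    start reading the file into the page cache, but nothing is guaranteed.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0 and len=0 mean "from the start through the end of the file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)
```

In a test like the one above, this would be called on the database file right before opening it for the full scan.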
On Wed, Sep 17, 2014 at 4:22 PM, Claudio Freire
Ok, I reproduced it with a test python script, it's a bsddb3 thing.
Um... should I put all this instead on a BNC?
On Wednesday 17 September 2014 17.27:26 Claudio Freire wrote:
Um... should I put all this instead on a BNC?
I would say yes, stored for long term memory
--
Bruno Friedmann
Ioda-Net Sàrl www.ioda-net.ch
openSUSE Member & Board
GPG KEY : D5C9B751C4653227
irc: tigerfoot
On Thu, Sep 18, 2014 at 4:31 AM, Bruno Friedmann
Um... should I put all this instead on a BNC?
I would say yes, stored for long term memory
Done.
# 897353
On Thu, Sep 18, 2014 at 2:39 PM, Claudio Freire
Done.
# 897353
WONTFIX right off the bat. I guess nobody cares enough about it.
On Tue, 23 Sep 2014 04:05, Claudio Freire
WONTFIX right out of the bat.
I guess nobody cares enough for it.
Do you have any possibility to contact rpm-upstream with the issue, and give them the (here) gathered info?
Thanks for the work, btw.
- Yamaban.
On Tue, Sep 23, 2014 at 4:57 AM, Yamaban
Do you have any possibility to contact rpm-upstream with the issue, and give them the (here) gathered info?
Thanks for the work, btw.
I guess it'd be possible, but I'm not in regular contact with them (i.e. not subscribed to any ML and not planning to), so I can't promise to keep up with the bug reporting there.
I can report it, but follow-up tasks would be best effort only.
On Tue, Sep 23, 2014 at 2:03 PM, Claudio Freire
I guess it'd be possible, but I'm not in regular contact with them (ie: not subscribed to any ML and not planning to), so I can't promise to keep up with the bug reporting there.
I can report it, but follow-up tasks would be best effort only.
Besides, I consider this a bsddb3 bug, not really an rpm one. All RPM can do is either fadvise the file, or switch to another db implementation.
On 2014-09-23 19:04, Claudio Freire wrote:
Besides, I consider this a bsddb3 bug, not really an rpm one. All RPM can do, is either fadvise the file, or switch to another db implementation.
Really, this would be something the openSUSE maintainer would be in a better position to do, but he does not want to. He just rejected the Bugzilla. He did point to a project he has to create a newrpmdb, though.

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On 2014-09-16 01:29, Claudio Freire wrote:
On Mon, Sep 15, 2014 at 8:25 PM, Carlos E. R.
So it is obvious! We simply run the cp to null thing ahead of the query on the script, and done.
Yes, it's a nice band-aid if the system has enough memory.
Not so much if it doesn't.
I have added that "cp" hack to the script "/etc/init.d/rpmconfigcheck", and now it runs in seconds, even after deleting "/var/adm/rpmconfigcheck". This thing has been delaying my boot by a minute or more for years...

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
participants (8)
- Andrei Borzenkov
- Bernhard Voelker
- Bruno Friedmann
- Carlos E. R.
- Carlos E. R.
- Claudio Freire
- Stefan Brüns
- Yamaban