[opensuse-factory] How many seconds does "time rpm -qa | wc" cost it? (was: Please review and help with tests about etckeeper)

Hi all:

How many seconds does "time rpm -qa | wc" take on your system?

I put etckeeper-1.13 into openSUSE Factory's official repository:
https://build.opensuse.org/package/show?project=openSUSE%3AFactory&package=etckeeper

I also added this patch:
https://build.opensuse.org/package/view_file/openSUSE:Factory/etckeeper/etckeeper-avoid-packagelist.patch?expand=1
https://github.com/joeyh/etckeeper/pull/17

When ZYpp installs or removes packages, etckeeper's ZYpp plugin runs "rpm -qa" twice to work out the list of changed packages, e.g.:

    before install: rpm -qa | sort >before_packagelist
    after install:  rpm -qa | sort >after_packagelist
    diff -u before_packagelist after_packagelist >changed_packageslist

But these "rpm -qa" calls are slow (sometimes over 30 seconds). After 30 seconds the ZYpp plugin times out, so etckeeper's autocommits sometimes fail.

So I wrote the patch in gh#joeyh/etckeeper#17. It provides a choice of whether or not to build the "changed packages list".

But etckeeper's author (Mr. Joey Hess) said:
https://github.com/joeyh/etckeeper/pull/17#issuecomment-55059127

then that seems very poor, since etckeeper could take a while to run for any number of reasons, including the system being busy.

So we ran "time rpm -qa | wc" in various environments:
http://lists.opensuse.org/opensuse-ja/2014-09/msg00012.html
http://lists.opensuse.org/opensuse-ja/2014-09/msg00015.html
http://lists.opensuse.org/opensuse-ja/2014-09/msg00014.html
http://lists.opensuse.org/opensuse-ja/2014-09/msg00016.html
http://lists.opensuse.org/opensuse-ja/2014-09/msg00017.html
http://lists.opensuse.org/opensuse-ja/2014-09/msg00020.html

Most systems finish within 2 seconds, but some take over 15 seconds.

a) Bare metal
   OS: openSUSE 13.1
   CPU: Core i7-4930K
   RAM: 64GB
   HDD or SSD: ?

    time rpm -qa | wc
       5333    5333  211441

    real 0m16.909s
    user 0m1.188s
    sys  0m0.276s

b) VM (I am using VirtualBox 4.3.12 r93733)
   Host:
     Operating System: Windows 7 Ultimate 64-bit (6.1, Build 7601) Service Pack 1 (7601.win7sp1_gdr.140303-2144)
     Language: Japanese (Regional Setting: Japanese)
     System Manufacturer: Dell Inc.
     System Model: XPS 8300
     BIOS: BIOS Date: 03/28/12 09:12:57 Ver: 04.06.04
     Processor: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz (8 CPUs), ~3.4GHz
     Memory: 16384MB RAM
     HDD: SATA 2 TB (NTFS)
     (details: https://dl.dropboxusercontent.com/u/86335040/DxDiag20140911.txt )
   Guest:
     OS: openSUSE 13.1 64 bit
     RAM: 1 GB
     HDD: 60 GB (.vdi)
     Format: LVM+ext4

    time rpm -qa | wc
       1917    1917   65019

    real 1m52.677s
    user 0m1.641s
    sys  0m0.302s

I can understand that "rpm -qa" is slow in a VM, but I cannot understand why it is slow on some bare-metal machines.

I want to explain to Joey that "rpm -qa" is slow in some environments, even ones that are in common use.

Please tell me how many seconds "time rpm -qa | wc" takes on your system, and tell me about your environment, e.g.:

a) Bare metal
   OS:
   CPU:
   RAM:
   HDD or SSD: (size)
   Filesystem:

    time rpm -qa | wc
       5333    5333  211441

    real 0m16.909s
    user 0m1.188s
    sys  0m0.276s

b) VM
   Host:
     OS:
     CPU:
     RAM:
     HDD or SSD: (size)
     Filesystem:
     Virtualization-Software: (name and version)
   Guest:
     OS:
     RAM:
     Disk: (size)
     Filesystem:

    time rpm -qa | wc
       1917    1917   65019

    real 1m52.677s
    user 0m1.641s
    sys  0m0.302s

Thank you!

--
1xx <ItSANgo@gmail.com> <https://twitter.com/ItSANgo>
Mitsutoshi NAKANO <bkbin005@rinku.zaq.ne.jp> <http://d.hatena.ne.jp/Itisango/>
--
To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
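For readers following along, the before/after comparison described above can be sketched roughly as below. This is only an illustration, not etckeeper's actual plugin code: changed_packages is a hypothetical helper name, and it uses comm instead of the "diff -u" from the message so that the result is a plain list of removed ("-") and added ("+") packages.

```shell
#!/bin/sh
# Hypothetical helper: print packages that differ between two "rpm -qa"
# snapshots. "-" marks entries only in the first file, "+" only in the second.
changed_packages() {
    before=$1
    after=$2
    # comm needs sorted input; LC_ALL=C keeps sort and comm in agreement.
    LC_ALL=C sort "$before" > "$before.sorted"
    LC_ALL=C sort "$after"  > "$after.sorted"
    LC_ALL=C comm -23 "$before.sorted" "$after.sorted" | sed 's/^/-/'
    LC_ALL=C comm -13 "$before.sorted" "$after.sorted" | sed 's/^/+/'
    rm -f "$before.sorted" "$after.sorted"
}

# Usage around a package transaction (each snapshot costs one full "rpm -qa"):
#   rpm -qa > /tmp/before_packagelist
#   ... install or remove packages ...
#   rpm -qa > /tmp/after_packagelist
#   changed_packages /tmp/before_packagelist /tmp/after_packagelist
```

Whatever tool does the comparison, the cost is dominated by the two full "rpm -qa" runs, which is why the timings requested in this thread matter.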

[10:12:19 alin@abaddon:~]: time rpm -qa | wc
   4730    4730  187545

real 0m1.438s
user 0m1.392s
sys  0m0.080s

Factory...
CPU model name: Intel(R) Core(TM) i7-3537U CPU @ 2.00GHz
RAM: 8GiB
Disk: SSD

I ran it quite a few times and the result is consistent.

Alin

On Monday 15 Sep 2014 18:04:32 1xx wrote:
Hi all:
How many seconds does "time rpm -qa | wc" cost it in your OS?
--
Without Questions there are no Answers!
Dr. Alin Marin ELENA
http://alin.elenaworld.net/

A bit slow on the first run (or if anything has happened to the rpmdb), but quick on repeat runs:

time rpm -qa | wc
   4580    4580  166891
real 0m4.199s
user 0m2.027s
sys  0m0.257s

bruno@c-3po:~$ time rpm -qa | wc
   4580    4580  166891
real 0m1.716s
user 0m1.635s
sys  0m0.119s

bruno@c-3po:~$ time rpm -qa | wc
   4580    4580  166891
real 0m1.703s
user 0m1.623s
sys  0m0.115s

bruno@c-3po:~$ time rpm -qa | wc
   4580    4580  166891
real 0m1.736s
user 0m1.684s
sys  0m0.104s

CPU: Intel(R) Core(TM) i7-2820QM CPU @ 2.30GHz
RAM: 16GB
Disk: SSD (Samsung SSD 840 Series), encrypted LVM + ext4

On 2014-09-15 11:14, Alin Marin Elena wrote:
[10:12:19 alin@abaddon:~]: time rpm -qa | wc 4730 4730 187545
real 0m1.438s user 0m1.392s sys 0m0.080s
factory... cpu model name : Intel(R) Core(TM) i7-3537U CPU @ 2.00GHz RAM 8GiB ssd
run it quite few times and is consistent Alin
--
Bruno Friedmann
Ioda-Net Sàrl
Le Paigre 45
2947 Charmoille - Switzerland
Tél : ++41 32 435 7171
Fax : ++41 32 435 7172
gsm : ++41 78 802 6760
web : www.ioda-net.ch

1xx wrote:
Hi all:
How many seconds does "time rpm -qa | wc" cost it in your OS?
# time rpm -qa | wc
    437     437   12827

real 0m1.254s
user 0m1.052s
sys  0m0.208s

Environment: HP Proliant DL580 G2, 4 CPUs, 12GB RAM.

--
Per Jessen, Zürich (16.5°C)
http://www.hostsuisse.com/ - dedicated server rental in Switzerland.

On Mon, 2014-09-15 at 12:06 +0200, Per Jessen wrote:
1xx wrote:
Hi all:
How many seconds does "time rpm -qa | wc" cost it in your OS?
# time rpm -qa | wc 437 437 12827
real 0m1.254s user 0m1.052s sys 0m0.208s
Environment: HP Proliant DL580 G2, 4 CPUs, 12Gb RAM.
linux-fkkt:/home/oneukum/suseexpanded # time rpm -qa | wc
   4196    4196  172185

real 0m4.661s
user 0m1.612s
sys  0m0.204s

HP Aladin Laptop, 24 GB RAM, rpm db on SSD

HTH
Oliver

On Monday 2014-09-15 11:04, 1xx wrote:
But these "rpm -qa" are slowly. (sometimes over 30 sec.)
I can understand that "rpm -qa"s are slow on VM. But I can not understand that "rpm -qa" is slow on some bare-metal machines.
rpmdb has terrible performance -- especially for dead-simple lists of installed packages (NEVR).

db being in the read cache: ~0.7s to show the list, on Intel i7 (as others have mentioned and I confirm).
What does this tell us: It uses way too much CPU.

db being sourced from SSD: add ~0.6s.
What does this tell us: It spends too much time reading things/reads too much.

db being fragmented (i.e. before calling rpm --rebuilddb): add ~0.7s.
What does this tell us: the DB naturally fragments, and read performance suffers.

db being read from a rotating disk: add LOTS of seconds.
What does this tell us: terrible read patterns all over the place.

Perhaps someone should explore splitting data up into more tables, or adding indexes, or replacing the Berkeley DB backend by LMDB or SQLite.

On Monday 2014-09-15 12:38, Jan Engelhardt wrote:
On Monday 2014-09-15 11:04, 1xx wrote:
But these "rpm -qa" are slowly. (sometimes over 30 sec.)
I can understand that "rpm -qa"s are slow on VM. But I can not understand that "rpm -qa" is slow on some bare-metal machines.
rpmdb has terrible performance -- especially for dead-simple lists of installed packages (NEVR).
Perhaps someone should explore splitting data up into more tables, or adding indexes, or replacing the Berkeley DB backend by LMDB or SQLite.
The root of the problem is how it's stored. There is /var/lib/rpm/Packages, which in my case takes up some 142 MB for just 2270 packages. The utility "db48_dump" verifies this: there are 2270(+1) entries in that Berkeley file. On average, this means that each package has a value string that is almost 64K in size.

That means you're either
(a) reading the full ~64K for every package, or
(b) reading some part of the string (to retrieve NEVR), and then seeking past the rest,

and that would be my theory on why rpm spends so much time in reads.
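The per-package figure can be checked with a quick calculation. This is a sketch under one assumption: that the exact Packages size of 142385152 bytes, which "filefrag -v" reports on ares08 elsewhere in this thread, belongs to the same database as the "142 MB / 2270(+1) entries" mentioned here. avg_record_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Average bytes of Packages file per database entry (integer division).
avg_record_bytes() {
    size_bytes=$1
    entries=$2
    echo $(( size_bytes / entries ))
}

# 142385152 bytes / 2271 entries
avg_record_bytes 142385152 2271   # prints 62697
```

That is about 61 KiB per entry, consistent with the "almost 64K" estimate, although part of the file is Berkeley DB overhead and free pages rather than header data.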

On 15/09/14 at #4, Jan Engelhardt wrote:
Perhaps someone should explore splitting data up into more tables, or adding indexes, or replacing the Berkeley DB backend by LMDB or SQLite.
I agree that the rpm database should be backed by something like sqlite... however, I am not sure how well that idea will be received :-)

--
Cristian
"I don't know the key to success, but the key to failure is trying to please everybody."

On Mon, Sep 15, 2014 at 7:37 PM, Cristian Rodríguez <crrodriguez@opensuse.org> wrote:
On 15/09/14 at #4, Jan Engelhardt wrote:
Perhaps someone should explore splitting data up into more tables, or adding indexes, or replacing the Berkeley DB backend by LMDB or SQLite.
I agree that the rpm database should be backed by something like sqlite..however I am not sure how well that idea will be received :-)
I wouldn't do that, unless you can accept frequent reconstruction of the db.

RPM's database should be resilient, and while sqlite isn't as bad as plain text files, it's still not something people use as a production, crash-resilient database.

MySQL databases not only need mysql (good luck making it a dependency of rpm), but are also quite big by default when using innodb (which is the least you can do if you want crash safety).

All in all, bdb isn't so bad. But a change in the access pattern for rpm -qa might be welcome. I'm sure some incarnations of bdb support sequential scan, at least in some db configurations. Not sure what's applicable for rpm though.

On 15/09/14 at #4, Claudio Freire wrote:
All in all, bdb isn't so bad.
That would be true if Oracle hadn't relicensed bdb to AGPL, making it incompatible with pretty much everything else... now we need a replacement in the medium term anyway.

--
Cristian
"I don't know the key to success, but the key to failure is trying to please everybody."

Thank you, everyone!

2014-09-16 7:55 GMT+09:00 Claudio Freire <klaussfreire@gmail.com>:
On Mon, Sep 15, 2014 at 7:37 PM, Cristian Rodríguez <crrodriguez@opensuse.org> wrote:
On 15/09/14 at #4, Jan Engelhardt wrote:
Perhaps someone should explore splitting data up into more tables, or adding indexes, or replacing the Berkeley DB backend by LMDB or SQLite.
I agree that the rpm database should be backed by something like sqlite..however I am not sure how well that idea will be received :-)
I wouldn't do that, unless you can accept frequent reconstruction of the db.
RPM's database should be resilient, and while sqlite isn't as bad a plain text files, it's still not something people use as production, crash-resilient databases.
MySQL databases, not only need mysql (good luck making it a dependency of rpm) but also are quite big by default when using innodb (which is the least you can do if you want crash safety).
All in all, bdb isn't so bad. But a change in the access pattern for rpm -qa might be welcome. I'm sure some incarnations of bdb support sequential scan at least in some db configuraitons. Not sure what's applicable for rpm though.
Can we resolve this issue on the RPM or Berkeley DB side? (This issue = "rpm -qa" is slow in some environments.)

--
1xx <ItSANgo@gmail.com> <https://twitter.com/ItSANgo>
Mitsutoshi NAKANO <bkbin005@rinku.zaq.ne.jp> <http://d.hatena.ne.jp/Itisango/>

On 2014-09-16 07:29, 1xx wrote:
Can we resolve this issue from RPM or Berkeley DB? (This issue = "rpm -qa" are slow in some environments)
Just add this code before calling it:

    cp /var/lib/rpm/Packages /dev/null
    cp /var/lib/rpm/Basenames /dev/null
    cp /var/lib/rpm/Providename /dev/null
    cp /var/lib/rpm/Requirename /dev/null

(which is a hack, but one that just works)

The other proposed solutions mean recoding rpm itself, or whatever library manages the databases. Something like that should be done, but it is not something you can do if you are not an rpm developer ;-)

And the code should check for installed memory in the machine, or be optional with a config option or switch. Just parse this output:

    Telcontar:/etc/init.d # grep MemTotal /proc/meminfo
    MemTotal:        8193508 kB
    Telcontar:/etc/init.d #

and make a decision beyond a certain size.

Otherwise, you could consider using "rpmqpack" instead of "rpm -qa", if you do not need versions.

About rpm or database coding, Cristian suggested opening the database file with «fopen(3) with "m" mode.. or mmap()».

--
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 "Bottle" at Telcontar)
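Put together, the preload hack plus the memory check might look roughly like this. It is a sketch only: the function names and the 2 GiB threshold in MIN_MEM_KB are illustrative assumptions, not values anyone proposed in the thread, and "cat" is used in place of "cp" purely to read the files.

```shell
#!/bin/sh
# Hypothetical threshold: only preload when the machine has at least this
# much RAM (in kB). 2 GiB is an arbitrary illustrative value.
MIN_MEM_KB=${MIN_MEM_KB:-2097152}

# Print MemTotal in kB from a meminfo-style file (default: /proc/meminfo).
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Read the main rpmdb files sequentially so they land in the page cache.
preload_rpmdb() {
    dbdir=${1:-/var/lib/rpm}
    for f in Packages Basenames Providename Requirename; do
        if [ -r "$dbdir/$f" ]; then
            cat "$dbdir/$f" > /dev/null
        fi
    done
}

# Preload only on machines with enough memory, then list the packages.
list_packages() {
    if [ "$(mem_total_kb)" -ge "$MIN_MEM_KB" ]; then
        preload_rpmdb
    fi
    rpm -qa
}
```

A caller would replace a bare "rpm -qa" with list_packages. On a machine with little RAM the preload is skipped, since pushing a few hundred MB through the page cache could evict more useful data.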

On Tue, Sep 16, 2014 at 9:01 AM, Carlos E. R. <carlos.e.r@opensuse.org> wrote:
About rpm or database coding, Cristian suggested opening the database file with «fopen(3) with "m" mode.. or mmap()».
It won't make much of a difference. The access pattern has to change.

One rather easy way to preload the database prior to a full scan is to posix_fadvise the whole file. The kernel may ignore such a large fadvise, so I'm not sure it will work. But if it does, it will be a very easy workaround.

On 2014-09-15 11:04, 1xx wrote:
Hi all:
How many seconds does "time rpm -qa | wc" cost it in your OS?
About a minute and a half. Not "seconds".

Telcontar:~ # time rpm -qa | wc
   6154    6154  240152

real 1m20.028s
user 0m2.875s
sys  0m1.735s
Telcontar:~ #
Telcontar:~ # time rpm -qa | wc -l
6154

real 0m2.877s
user 0m2.665s
sys  0m0.213s
Telcontar:~ #

This is on 13.1 on real hardware, with 8 GiB RAM, a quad-core Core 2 CPU (Q9550 @ 2.83GHz), and reasonably good rotating disks, almost idling.

Notice the huge difference between the first and second runs, which shows that the lag is in the disk and database, as Jan Engelhardt says.

Telcontar:~ # l -h /var/lib/rpm/Packages
-rw-r--r-- 1 root root 348M Sep 15 13:49 /var/lib/rpm/Packages

--
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 "Bottle" at Telcontar)

Hello,

On Monday, 15 September 2014, 1xx wrote:
How many seconds does "time rpm -qa | wc" cost it in your OS?
First attempt:

cb@geeko:~> time rpm -qa | wc -l
3071

real 1m37.325s
user 0m1.997s
sys  0m0.767s

Second attempt, now with everything in the cache:

cb@geeko:~> time rpm -qa | wc -l
3071

real 0m1.747s
user 0m1.607s
sys  0m0.177s

I just did an rpm --rebuilddb, which shrank /var/lib/rpm from 227 MB to 121 MB (biggest saving: Packages shrank from 200 MB to 104 MB) and also reduces the time for rpm -qa:

# echo 3 > /proc/sys/vm/drop_caches
cb@geeko:~> time rpm -qa | wc -l
3071

real 0m21.569s
user 0m1.715s
sys  0m0.322s

Needless to say, the limiting factor is the hard disk ;-) (actually, it's a RAID1 with two disks in my > 4 years old laptop - and yes, I should buy a new one with an SSD ;-)

Nevertheless, the performance with a cold cache is quite bad - I just tried cat $300_mb_file > /dev/null and it took 4 seconds. Why does reading /var/lib/rpm/ (227 MB) take 1:37 minutes (or still 21 seconds after --rebuilddb)?

Maybe a good workaround would be

    cat /var/lib/rpm/Packages > /dev/null ; rpm -qa

which I expect to be much faster than a simple rpm -qa.

Yes, this is a serious proposal ;-)

# echo 3 > /proc/sys/vm/drop_caches
cb@geeko:~> time (cat /var/lib/rpm/Packages >/dev/null ; rpm -qa|wc -l )
3071

real 0m5.837s
user 0m1.599s
sys  0m0.293s

No comment :-/

Regards,

Christian Boltz
--
We are not aware of any studies on how many of the "RSS is dead" blog posts were read via news feed.
[http://www.heise.de/newsticker/meldung/Facebook-Twitter-und-der-Tod-von-RSS-...]
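To put numbers on the cold-cache timings above, the "real" values can be converted to seconds and compared. A small sketch: to_seconds and speedup are hypothetical helper names, and the input format is assumed to be the "XmY.YYYs" form that the shell's "time" prints in this thread.

```shell
#!/bin/sh
# Convert a "time"-style duration such as 1m37.325s into seconds.
to_seconds() {
    echo "$1" | LC_ALL=C awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of two durations (slow / fast), rounded to one decimal place.
speedup() {
    LC_ALL=C awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

# Cold cache after --rebuilddb: plain "rpm -qa" vs. "cat Packages; rpm -qa".
speedup 0m21.569s 0m5.837s    # prints 3.7
```

So on this one laptop the sequential preload makes the cold-cache run about 3.7 times faster than the run after --rebuilddb, and about 16.7 times faster than the first run at 1m37.325s. That is a single measurement on a single disk setup, not a general result.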

On Mon, Sep 15, 2014 at 4:42 PM, Christian Boltz <opensuse@cboltz.de> wrote:
Nevertheless, the performance with cold cache is quite bad - I just tried cat $300_mb_file > /dev/null and it took 4 seconds. Why does reading /var/lib/rpm/ (227 MB) take 1:37 minutes (or still 21 seconds after --rebuilddb)?
Because it's random I/O.

I'd suggest a very simple optimization for rpmdb: if there's enough memory (simply by checking the amount of "cached" in /proc/meminfo), just read the whole database sequentially to pre-load it.

A better optimization would be to use a sequential scan when listing all packages, but bdb doesn't support that in all of the db types.

On 15.09.2014 at 21:42, Christian Boltz wrote:
tried cat $300_mb_file > /dev/null and it took 4 seconds. Why does reading /var/lib/rpm/ (227 MB) take 1:37 minutes (or still 21 seconds after --rebuilddb)?
Because the files are usually fragmented like hell:

susi:/var/lib/rpm # shake -vvp .|grep -v alternatives|awk '{print $4" "$5" "$8}'
FRAGC CRUMBC NAME
3     0      ./Name
30    5      ./Packages
55    47     ./Basenames
20    13     ./Providename
3     0      ./Obsoletename
8     0      ./Installtid
12    8      ./Requirename
2     0      ./Group
19    4      ./Dirnames
12    5      ./Sha1header
7     0      ./Sigmd5
1     0      ./Triggername
1     0      ./Conflictname
1     0      ./Pubkeys

Now I'm not 100% sure what the difference between FRAGC (Fragments) and CRUMBC (Crumbs) in shake's terms is, but high numbers => lots of fragmentation.

Not that I'd care anymore, after journald forced SSDs onto all my machines... :-)
Maybe a good workaround would be cat /var/lib/rpm/Packages > /dev/null ; rpm -qa which I expect to be much faster than a simple rpm -qa
No, the problem is the fragmentation, not the bad read pattern of rpm. "cat" will take long and rpm will be faster, overall it will be not much gain. I tried stuff like that before :-)
Yes, this is a serious proposal ;-)
# echo 3 > /proc/sys/vm/drop_caches
cb@geeko:~> time (cat /var/lib/rpm/Packages >/dev/null ; rpm -qa|wc -l ) 3071
real 0m5.837s user 0m1.599s sys 0m0.293s
hm, ok, this contradicts my experience. Well, maybe rpm has gotten worse :-)

Have fun,

seife
--
Stefan Seyfried
"Your mail is 7 pages of printout. Do you seriously expect people that do openSUSE in their free time to read that? Little less Castro, little more JFK..." -- coolo

On Mon, Sep 15, 2014 at 4:58 PM, Stefan Seyfried <stefan.seyfried@googlemail.com> wrote:
tried cat $300_mb_file > /dev/null and it took 4 seconds. Why does reading /var/lib/rpm/ (227 MB) take 1:37 minutes (or still 21 seconds after --rebuilddb)?
Because the files are usually fragmented like hell:
susi:/var/lib/rpm # shake -vvp .|grep -v alternatives|awk '{print $4" "$5" "$8}'
FRAGC CRUMBC NAME
3     0      ./Name
30    5      ./Packages
55    47     ./Basenames
20    13     ./Providename
3     0      ./Obsoletename
8     0      ./Installtid
12    8      ./Requirename
2     0      ./Group
19    4      ./Dirnames
12    5      ./Sha1header
7     0      ./Sigmd5
1     0      ./Triggername
1     0      ./Conflictname
1     0      ./Pubkeys
Now I'm not 100% sure what the difference between FRAGC (Fragments) and CRUMBC (Crumbs) in shake's terms is, but high numbers => lots of fragmentation.
Not that I'd care anymore, after journald forced SSDs onto all my machines... :-)
time rpm -qa | wc -l
4029

real 0m30.762s
user 0m1.488s
sys  0m0.996s

iostat -x -m -d 5 while running the above:

Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sda        14.00     0.80     2.20     0.60     0.06     0.01    50.29     0.01     4.93     3.64     9.67     4.93     1.38
sda       159.40     2.20    43.80     1.40     1.25     0.01    57.10     0.69    13.72    13.24    28.71     5.26    23.76
sda       122.40     1.20   446.20     5.60     7.16     0.17    33.21     2.87     6.50     3.01   284.25     2.15    97.24
sda        29.80     7.60   369.00    11.60     5.87     0.07    31.97     1.96     5.14     4.71    18.97     2.58    98.16
sda         0.60     3.40   443.20     1.60     7.87     0.02    36.33     1.24     2.80     2.73    19.62     2.17    96.36
sda         0.60     0.60   522.40     0.60     7.93     0.00    31.09     1.52     2.90     2.89    11.67     1.84    96.02
sda         6.00     0.00   425.20     0.20     7.79     0.01    37.56     1.41     3.32     3.27   103.00     2.24    95.20
sda         0.80     0.40   433.00     0.80     5.88     0.00    27.77     1.36     3.15     3.11    25.50     2.07    89.78
sda         0.00     0.60     2.80     0.60     0.20     0.00   123.76     0.02     7.06     4.86    17.33     5.65     1.92

See the avgqu-sz of practically 1 throughout? That's synchronous random I/O of ~16k reads each.

Now,

time cat /var/lib/rpm/Packages > /dev/null

real 0m3.080s
user 0m0.008s
sys  0m0.113s

Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sda         0.80     1.20     7.20     1.40     0.27     0.01    67.91     0.07     8.07     8.36     6.57     5.26     4.52
sda        19.80     2.00   204.40     1.80    41.51     0.02   412.55     3.10    15.01     7.53   865.44     3.00    61.88
sda         5.20     3.20     2.60     1.40     0.04     0.03    33.60     0.05    12.30     8.15    20.00     7.25     2.90

That's sequential I/O.

I don't have shake, and cnf doesn't find it. But I bet my /var/lib/rpm is just as fragmented; I install/update stuff all the time and have never rebuilt the rpmdb manually.

It's all about the I/O pattern. If I replace the cat with dd and a blocksize of 16k, it's the same.

In fact, let's strace:

sudo strace -c rpm -qa:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.85    0.168894           3     52370           pread
  0.10    0.000177           0      4029           write
  0.04    0.000064           0      8067           rt_sigprocmask
  0.00    0.000007           0       130        37 open
  0.00    0.000000           0       147           read
  0.00    0.000000           0        97           close
  0.00    0.000000           0        15         4 stat
  0.00    0.000000           0        90           fstat
  0.00    0.000000           0         1           poll
  0.00    0.000000           0         2           lseek
  0.00    0.000000           0       127           mmap
  0.00    0.000000           0        38           mprotect
  0.00    0.000000           0        62           munmap
  0.00    0.000000           0        28           brk
  0.00    0.000000           0         2           rt_sigaction
  0.00    0.000000           0         4         3 access
  0.00    0.000000           0         5           mremap
  0.00    0.000000           0         1           socket
  0.00    0.000000           0         1           connect
  0.00    0.000000           0         1           sendto
  0.00    0.000000           0         1           recvmsg
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           uname
  0.00    0.000000           0        10           fcntl
  0.00    0.000000           0         4           getdents
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         1           getuid
  0.00    0.000000           0         1           getgid
  0.00    0.000000           0         2         1 statfs
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         1           futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         4         2 openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.169142                 65247        47 total

Extract:

pread(3, "\0\0\0\0\1\0\0\0\205\310\0\0\0\0\0\0\206\310\0\0\1\0\346\17\0\7\0\0\0\200\0\0"..., 4096, 210259968) = 4096
pread(3, "\0\0\0\0\1\0\0\0\206\310\0\0\205\310\0\0\207\310\0\0\1\0\346\17\0\7\26\366S\26\26\365"..., 4096, 210264064) = 4096
pread(3, "\0\0\0\0\1\0\0\0\207\310\0\0\206\310\0\0\210\310\0\0\1\0\346\17\0\007718c57"..., 4096, 210268160) = 4096
pread(3, "\0\0\0\0\1\0\0\0\210\310\0\0\207\310\0\0\211\310\0\0\1\0\346\17\0\7\0root\0"..., 4096, 210272256) = 4096
pread(3, "\0\0\0\0\1\0\0\0\211\310\0\0\210\310\0\0\212\310\0\0\1\0\346\17\0\7\0\0\0\1\0\0"..., 4096, 210276352) = 4096
pread(3, "\0\0\0\0\1\0\0\0\212\310\0\0\211\310\0\0\213\310\0\0\1\0\346\17\0\0072pdf-0"..., 4096, 210280448) = 4096
pread(3, "\0\0\0\0\1\0\0\0\213\310\0\0\212\310\0\0\214\310\0\0\1\0\346\17\0\7\0\0004$\0\0"..., 4096, 210284544) = 4096
pread(3, "\0\0\0\0\1\0\0\0\214\310\0\0\213\310\0\0\215\310\0\0\1\0\346\17\0\007280\0cf"..., 4096, 210288640) = 4096
pread(3, "\0\0\0\0\1\0\0\0\215\310\0\0\214\310\0\0\216\310\0\0\1\0\346\17\0\7\0\0\0\0\0\0"..., 4096, 210292736) = 4096
pread(3, "\0\0\0\0\1\0\0\0\216\310\0\0\215\310\0\0\217\310\0\0\1\0\346\17\0\7yle1 i"..., 4096, 210296832) = 4096
pread(3, "\0\0\0\0\1\0\0\0\217\310\0\0\216\310\0\0\220\310\0\0\1\0\346\17\0\7.pyc\0o"..., 4096, 210300928) = 4096
pread(3, "\0\0\0\0\1\0\0\0\220\310\0\0\217\310\0\0\0\0\0\0\1\0\16\3\0\7\0\1\0\0\0\1"..., 4096, 210305024) = 4096
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(1, "python-rst2pdf-0.93-10.2.noarch\n", 32python-rst2pdf-0.93-10.2.noarch

I've checked a few offsets. Within one such snippet (this one seems to be for python-rst2pdf) the reads are sequential. But two consecutive snippets are not sequential at all. As a result, kernel read-ahead doesn't read ahead, and I/O is random.

The fix: traverse the bigger database (which seems to be /var/lib/rpm/Packages) in sequential order. How? Well, that depends on exactly how that db is created.

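The offset check described in the message above can be automated. The following is only a sketch, assuming the strace output was saved as text lines. The helper names (`pread_offsets`, `sequential_runs`) are made up for illustration. The first three sample offsets come from the extract in the thread, while the last two are hypothetical and stand in for the start of the next package's snippet.

```python
import re

# Tail of an strace pread line: pread(fd, "buf"..., count, offset) = returned
PREAD_RE = re.compile(r'^pread(?:64)?\(\d+, .*, (\d+), (\d+)\)\s*=\s*(\d+)')

def pread_offsets(lines):
    """Yield (offset, bytes_read) for every successful pread line."""
    for line in lines:
        m = PREAD_RE.match(line.strip())
        if m:
            yield int(m.group(2)), int(m.group(3))

def sequential_runs(reads):
    """Merge reads into runs where each read starts where the previous one ended."""
    runs = []
    for offset, length in reads:
        if runs and runs[-1][0] + runs[-1][1] == offset:
            runs[-1][1] += length          # continues the current run
        else:
            runs.append([offset, length])  # a jump: start a new run
    return [tuple(r) for r in runs]

SAMPLE = [
    'pread(3, "..."..., 4096, 210259968) = 4096',
    'pread(3, "..."..., 4096, 210264064) = 4096',
    'pread(3, "..."..., 4096, 210268160) = 4096',
    'write(1, "python-rst2pdf-0.93-10.2.noarch\\n", 32) = 32',
    # Hypothetical offsets for the next snippet (not taken from the thread):
    'pread(3, "..."..., 4096, 4096000) = 4096',
    'pread(3, "..."..., 4096, 4100096) = 4096',
]

runs = sequential_runs(pread_offsets(SAMPLE))
for start, length in runs:
    print(f"run at offset {start}: {length} bytes")
```

Many short runs with large gaps between them would match the pattern described above: sequential inside one snippet, random between snippets.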
2014-09-16 4:58 GMT+09:00 Stefan Seyfried <stefan.seyfried@googlemail.com>:
On 15.09.2014 at 21:42, Christian Boltz wrote:
susi:/var/lib/rpm # shake -vvp .|grep -v alternatives|awk '{print $4" "$5" "$8}'
How can I get the "shake" command? I searched software.opensuse.org, but the "shake" package there is not the same as yours.

--
1xx <ItSANgo@gmail.com> <https://twitter.com/ItSANgo>
Mitsutoshi NAKANO <bkbin005@rinku.zaq.ne.jp> <http://d.hatena.ne.jp/Itisango/>

On Monday 2014-09-15 23:26, 1xx wrote:
2014-09-16 4:58 GMT+09:00 Stefan Seyfried <stefan.seyfried@googlemail.com>:
On 15.09.2014 at 21:42, Christian Boltz wrote:
susi:/var/lib/rpm # shake -vvp .|grep -v alternatives|awk '{print $4" "$5" "$8}'
How can I get "shake" command? I searched in software.opensuse.org, but "shake" package is not same as yours.
Just use the regular tools:

23:31 ares08:~ > xfs_bmap /var/lib/rpm/Packages
/var/lib/rpm/Packages:
        0: [0..23]: 118144480..118144503
        1: [24..278095]: 130080616..130358687
23:31 ares08:~ > filefrag /var/lib/rpm/Packages
/var/lib/rpm/Packages: 2 extents found
23:31 ares08:~ > filefrag -v /var/lib/rpm/Packages
Filesystem type is: 58465342
File size of /var/lib/rpm/Packages is 142385152 (34762 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       2:   14768060..  14768062:      3:
   1:        3..   34761:   16260077..  16294835:  34759:   14768063: eof
/var/lib/rpm/Packages: 2 extents found
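The two tools report the same two extents in different units: xfs_bmap counts in 512-byte blocks, while the filefrag -v output above uses 4096-byte blocks. As a rough cross-check, the sketch below converts the xfs_bmap numbers and compares them. The extent tuples are copied from the output above, and the helper name `to_fs_blocks` is made up for illustration.

```python
SECTOR = 512   # xfs_bmap reports offsets in 512-byte units
BLOCK = 4096   # block size reported by filefrag -v in the output above

# (first_logical, last_logical, first_physical, last_physical)
xfs_bmap_extents = [
    (0, 23, 118144480, 118144503),
    (24, 278095, 130080616, 130358687),
]
filefrag_extents = [
    (0, 2, 14768060, 14768062),
    (3, 34761, 16260077, 16294835),
]

def to_fs_blocks(extent, ratio=BLOCK // SECTOR):
    """Convert an inclusive extent given in 512-byte units to 4096-byte blocks."""
    lo, hi, plo, phi = extent
    return (lo // ratio, (hi + 1) // ratio - 1,
            plo // ratio, (phi + 1) // ratio - 1)

converted = [to_fs_blocks(e) for e in xfs_bmap_extents]
file_size = (xfs_bmap_extents[-1][1] + 1) * SECTOR

print(converted == filefrag_extents)  # do both tools describe the same layout?
print(file_size, file_size // BLOCK)  # bytes, and 4096-byte blocks
```

Both views agree on a file of 142385152 bytes in two extents, so the file itself is barely fragmented on this machine.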

2014-09-16 6:32 GMT+09:00 Jan Engelhardt <jengelh@inai.de>:
On Monday 2014-09-15 23:26, 1xx wrote:
Just use the regular tools:
23:31 ares08:~ > filefrag /var/lib/rpm/Packages
/var/lib/rpm/Packages: 2 extents found
23:31 ares08:~ > filefrag -v /var/lib/rpm/Packages
I was able to use the "filefrag" command. Thank you!

--
1xx <ItSANgo@gmail.com> <https://twitter.com/ItSANgo>
Mitsutoshi NAKANO <bkbin005@rinku.zaq.ne.jp> <http://d.hatena.ne.jp/Itisango/>

On 09/15/2014 11:39 PM, 1xx wrote:
I could use "filefrag" command.
Fragmentation may partly be the culprit for "rpm -qa" being slow, but even after de-fragmenting the files with something like

$ for f in /var/lib/rpm/[A-Z]* ; do e4defrag $f ; done

the command is not much faster. The big boost comes when the files are already cached, as Christian already demonstrated. Therefore, the culprit is the sheer number of pread(2) calls reading 4k blocks from /var/lib/rpm/Packages.

Have a nice day,
Berny
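To make the access pattern concrete, here is a small sketch that reads a scratch file the way the strace suggests (one pread per 4k block, in scattered order) and then front to back in larger chunks. It is an illustration only: the file is a temporary one rather than the rpm database, the function names are made up, and because the data was just written it sits in the page cache, so the sketch shows call counts and not the cold-cache timings discussed in this thread.

```python
import os
import random
import tempfile

BLOCK = 4096
NBLOCKS = 256

def read_blocks(fd, offsets, size=BLOCK):
    """Issue one pread per offset, in the order given; return total bytes read."""
    return sum(len(os.pread(fd, size, off)) for off in offsets)

def read_sequential(fd, total, chunk=64 * BLOCK):
    """Read the same range front to back in larger chunks; return (bytes, calls)."""
    done = calls = 0
    while done < total:
        data = os.pread(fd, min(chunk, total - done), done)
        if not data:
            break
        done += len(data)
        calls += 1
    return done, calls

with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(os.urandom(NBLOCKS * BLOCK))
    tmp.flush()

    # Scattered 4k reads, similar in spirit to jumping between package records.
    offsets = list(range(0, NBLOCKS * BLOCK, BLOCK))
    random.Random(0).shuffle(offsets)
    scattered_bytes = read_blocks(tmp.fileno(), offsets)

    # The same data, read in order with far fewer system calls.
    sequential_bytes, sequential_calls = read_sequential(tmp.fileno(), NBLOCKS * BLOCK)

print(len(offsets), "scattered calls vs", sequential_calls, "sequential calls")
```

On a cold cache, the scattered variant is the one that defeats kernel read-ahead. Timing it meaningfully would require dropping the page cache first, which this sketch deliberately does not do.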

Thank you, everyone!

2014-09-15 18:04 GMT+09:00 1xx <itsango@gmail.com>:
I put etckeeper-1.13 in openSUSE Factory's official repository. https://build.opensuse.org/package/show?project=openSUSE%3AFactory&package=e...
And I added patch of https://build.opensuse.org/package/view_file/openSUSE:Factory/etckeeper/etck... https://github.com/joeyh/etckeeper/pull/17 .
When ZYpp install or remove, etckeeper makes all package list as rpm -qa twice for getting changed packages list in ZYpp plugin. eg)
before install: rpm -qa | sort >before_packagelist
after install: rpm -qa | sort >after_packagelist
diff -u before_packagelist after_packagelist >changed_packageslist
But these "rpm -qa" are slowly. (sometimes over 30 sec.) Over 30 sec., ZYpp plugin gets timeout, so sometimes etckeeper's autocommits fail.
So I wrote a patch of gh#joeyh/etckeeper#17. This patch provide a choice of whether or not make "changed packages list"
But etckeeper's author (Mr. Joey Hess) said: https://github.com/joeyh/etckeeper/pull/17#issuecomment-55059127
then that seems very poor, since etckeeper could take a while to run for any number of reasons, including the system being busy.
So we execute "time rpm -qa | wc" in various environment.
I posted a comment to gh#joeyh/etckeeper#17 . https://github.com/joeyh/etckeeper/pull/17#issuecomment-55693937

--
1xx <ItSANgo@gmail.com> <https://twitter.com/ItSANgo>
Mitsutoshi NAKANO <bkbin005@rinku.zaq.ne.jp> <http://d.hatena.ne.jp/Itisango/>
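For readers unfamiliar with the before/after comparison quoted in this message, the idea can be sketched as a set difference over two package lists. This is not etckeeper's actual code, which uses the rpm -qa, sort and diff pipeline quoted above. The function name and the package names below are hypothetical.

```python
def changed_packages(before, after):
    """Return (removed, added) package names between two 'rpm -qa' style lists."""
    before_set, after_set = set(before), set(after)
    removed = sorted(before_set - after_set)
    added = sorted(after_set - before_set)
    return removed, added

# Hypothetical package lists, standing in for the two 'rpm -qa' runs.
before = ["bash-4.2-1.x86_64", "etckeeper-1.12-1.noarch", "zsh-5.0-1.x86_64"]
after = ["bash-4.2-1.x86_64", "etckeeper-1.13-1.noarch", "vim-7.4-1.x86_64"]

removed, added = changed_packages(before, after)
for name in removed:
    print("-" + name)
for name in added:
    print("+" + name)
```

Either way, the expensive part is producing the two lists, since each one requires a full pass over the rpm database.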
participants (12)
- 1xx
- Alin Marin Elena
- Bernhard Voelker
- Bruno Friedmann
- Carlos E. R.
- Christian Boltz
- Claudio Freire
- Cristian Rodríguez
- Jan Engelhardt
- Oliver Neukum
- Per Jessen
- Stefan Seyfried