[opensuse-kde] Understanding what Baloo is doing
I'm starting a new thread because the other one is filled to the brim with pissing and moaning. I hope everyone keeps that over there, and OUT of this thread.

So, I did a clean install of openSUSE 13.1 (full format of all partitions) and then updated to KDE4 Current. I enabled Baloo to index my home (a clean home with nothing in it) and a 1TB storage drive (with about 600GB of data: music, video and documents). All other mount points are explicitly excluded. It has been running for several hours now and I'm seeing:

- baloo_file_extractor pegging one CPU core at 100% and keeping it there
- baloo_file running a second core at about 95% and keeping it there

So why would Baloo be churning for so many hours, and is there a way to see what Baloo might be stuck on?

My system has enough gigahertz to drive all this without any noticeable impact. I am wondering, though: is this "normal", or have I got something on my 1TB drive that is a troublemaker for Baloo? Or something else? Or is this even considered "a problem"?

C.

--
openSUSE 13.1 x86_64, KDE 4.13
--
To unsubscribe, e-mail: opensuse-kde+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-kde+owner@opensuse.org
On Friday, 2 May 2014 11:01:24, C wrote:
> So why would Baloo be churning for so many hours... and is there a way to see what Baloo might be stuck on?

I think I have a similar problem. See https://bugs.kde.org/show_bug.cgi?id=333655
There are also some hints there on how to track it down. The bottleneck isn't CPU or RAM, but the HDD... Check atop and iotop.

Best,
V.
--
Vojtěch Zeisek
Komunita openSUSE GNU/Linuxu
Community of the openSUSE GNU/Linux
http://www.opensuse.org/
http://trapa.cz/
On Fri, May 2, 2014 at 11:07 AM, Vojtěch Zeisek <vojtech.zeisek@opensuse.org> wrote:
> The bottleneck isn't CPU or RAM, but HDD... Check atop, iotop.

I looked into atop and it at least reports the same as I'm seeing elsewhere:

  PID  TID RUID     EUID     THR SYSCPU USRCPU VGROW RGROW RDDSK WRDSK ST EXC S CPUNR  CPU CMD            1/2
18827    - username username   1  5.19s  4.75s    0K    0K    0K    0K --   - R     0 100% baloo_file_ext
 1333    - username username   2  4.56s  4.18s    0K    0K    0K    0K --   - R     1  90% baloo_file

C.
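The original question, how to see what Baloo might be stuck on, can be approached on Linux by looking at which files the indexer processes currently hold open. The following is a minimal sketch, assuming a Linux `/proc` filesystem and permission to read the processes' `fd` directories; it is not a Baloo tool, and the helper names `find_pids` and `open_paths` are made up for illustration.

```python
import os


def find_pids(name_prefix, proc_root="/proc"):
    """Return PIDs whose command name (read from <pid>/comm) starts with name_prefix.

    The kernel truncates comm to 15 characters, so baloo_file_extractor
    shows up shortened; a prefix match on "baloo_file" covers both processes.
    """
    pids = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return pids  # no /proc available on this system
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                comm = fh.read().strip()
        except OSError:
            continue  # process exited or is not readable
        if comm.startswith(name_prefix):
            pids.append(int(entry))
    return sorted(pids)


def open_paths(pid, proc_root="/proc"):
    """Return the filesystem paths a process has open (sockets and pipes are skipped)."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    paths = set()
    try:
        fds = os.listdir(fd_dir)
    except OSError:
        return []  # process gone or permission denied
    for fd in fds:
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue
        if target.startswith("/"):
            paths.add(target)
    return sorted(paths)


if __name__ == "__main__":
    for pid in find_pids("baloo_file"):
        print(pid)
        for path in open_paths(pid):
            print("   ", path)
```

Running this a few times while the CPU is pegged should show whether the same file stays open for a long time, which would make it a candidate for exclusion.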
On Friday 02 of May 2014 11:07:28 Vojtěch Zeisek wrote:
> The bottleneck isn't CPU or RAM, but HDD... Check atop, iotop.

The next KDE:Current update should contain some upstream optimizations in this area, so you might want to test again later today once it is published ;-)

Cheers,
Hrvoje
On Fri, May 2, 2014 at 12:41 PM, šumski <hrvoje.senjan@gmail.com> wrote:
> Next KDE:Current update should contain some upstream optimizations in this area, so you might want to test later during the day once it publishes ;-)

I've just now updated and rebooted for good measure (yes, I know I didn't need to, but this way I'm sure that everything is loading properly post update).

Baloo had finished whatever it was doing sometime in the past 2 hours (so it ran the indexing process for around 8 hours over a drive with 600GB of media). Prior to the update there was no indexing occurring. After the update, still quiet. Whatever Baloo was doing to chew up the CPU and I/O is not starting up again.

I could try clearing the Baloo database file. I'm guessing that would trigger Baloo to re-index? Will that work to reset things and put Baloo back to its initial state, to see if the CPU load and I/O still happen?

C.
On Friday, 2 May 2014 17:56:36, C wrote:
> Whatever Baloo was doing to chew up the CPU and I/O is not starting up again.

I have the same experience. After the upgrade, Baloo is still consuming 30-40 % of CPU, but not so much I/O, so the computer is usable, although it is not perfect. ;-) Definitely a change for the better, thanks devs!

All the best,
V.
On 05/02/2014 07:59 PM, Vojtěch Zeisek wrote:
> Definitely change for better, thanks devs!

Not for me. My PC: Dell Latitude E6510, RAM = 8 GB, GPU = GT218 NVS 3100M, CPU = i7 Q 720 @ 1.60GHz, OS = openSUSE 13.1, KDE = 4.13.0.

I used the new advanced Baloo configuration tool, and it works very well for starting and stopping Baloo indexing. But even after leaving Baloo indexing active all night, the laptop was unusable whenever indexing was active. My system monitor, ksysguard, doesn't show more than 10% CPU usage for Baloo, but the laptop is unusable all the same.. :-) :-) :-)
On Saturday 03 May 2014 11:16:19 yahoo-pier_andreit wrote:
> even leaving baloo indexing active for all the night, the laptop was unusable when baloo indexing is active

Do you have a hard drive or an SSD?

So it still hasn't finished indexing after a whole night? Wow, there must be some files on there that are really hard to index. How big is your drive, and do you happen to have huge text files on it?

Does anybody know how to debug this and find out which files cause the problem for Pier? This seems like something Vishesh (the Baloo developer) would want to know: which files cause Search to keep indexing for a whole night...
On Saturday, 3 May 2014 13:34:50, Jos Poortvliet wrote:
> So it still hasn't finished indexing after a whole night? Wow, there must be some files on there that are really hard to index. How big is your drive and do you happen to have huge text files on it?

Well, I let it run for 3 nights, the last night with the updated Baloo, and it has not finished... :-/ My patience is limited... Now it consumes "only" up to 40 % of CPU and the I/O lags are not so terrible (the computer is usable, but far from its normal state), but it is still running. As it doesn't show any progress (another nice feature to implement! :-), I have no idea how long to wait...

I have an SSD for the system and a Seagate hybrid HDD (with 8 GB of SSD flash as a sort of cache) for /home. Also 16 GB RAM and an i7 CPU, 8x3.4 GHz. I have a lot of pictures there (including raw files from a digital camera), a lot of music in Ogg, thousands of PDFs and a huge amount of weird (mostly text) files, like DNA sequences, huge matrices from permutations (I'm a biologist) and so on... Well, it might not be so typical, but is it really so weird...? ;-)

So, the situation is now much better than it was at the beginning, but still not so good...

Have a nice day,
Vojtěch
On 2014-05-03 13:40, Vojtěch Zeisek wrote:
> thousands of PDFs and huge amount of weird (mostly text) files, like DNA sequences, huge matrices from permutations (I'm biologist) and so on...

Those would appear to be "text", but (my educated guess) they are almost random data. If they are large, they will make any content indexer go berserk.

Unless the indexer can be redesigned to skip those files, or that content, automatically, you will have to tell it to skip those folders.

--
Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 "Bottle" at Telcontar)
On Saturday, May 03, 2014 22:25:08 Carlos E. R. wrote:
> Those would appear to be "text", but (my educated guess) they are almost random data. If they are large, they will make any content indexer to go berserk.

Why is indexing one large file more of a problem than indexing the same amount of text in multiple small(er) files?

Regards
mararm
On 2014-05-03 23:10, mararm wrote:
> Why is indexing one large file more of a problem than indexing the same amount of text in multiple small(er) files?

No, the problem in this case (IMHO) is not the size, but the lack of patterns in those files, because they are like random data.

When indexing the contents of text files, you want to be able to search later for a particular word or sentence, and find it fast. So you would (very rough approximation) create a list of sentences and their locations. If the data is that random, you would have to create location indexes for millions of separate words...

That's just a guess. I have never studied this type of thing, so I can only make guesses.

Just think of compressing a file of random data: it cannot be compressed. This must be similar. If this data (of Vojtěch) can be indexed, it will take a lot of CPU to do so, or disk space, or both.

--
Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 "Bottle" at Telcontar)
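The guess above (random-looking data forces an index to track far more separate words) can be illustrated with a toy count of distinct terms. This is only a sketch of the general idea behind word-based indexes, not of how Baloo works internally; the vocabulary size, word length and helper names are arbitrary assumptions.

```python
import random
import string


def distinct_terms(words):
    """Number of distinct terms a word-based index would need an entry for."""
    return len(set(words))


def prose_like(n_words, vocabulary_size, rng):
    """Draw n_words from a small fixed vocabulary, roughly as natural language does."""
    vocab = [f"word{i}" for i in range(vocabulary_size)]
    return [rng.choice(vocab) for _ in range(n_words)]


def random_like(n_words, word_length, rng):
    """Draw n_words as random letter strings, so almost every token is new."""
    letters = string.ascii_lowercase
    return [
        "".join(rng.choice(letters) for _ in range(word_length))
        for _ in range(n_words)
    ]


rng = random.Random(0)  # fixed seed so repeated runs give the same counts
prose = prose_like(100_000, 5_000, rng)
noise = random_like(100_000, 12, rng)

# The prose-like sample can never exceed its 5,000-word vocabulary, while the
# random sample needs close to one index entry per token.
print("prose-like distinct terms:", distinct_terms(prose))
print("random-like distinct terms:", distinct_terms(noise))
```

With the same number of tokens, the term list for the random sample ends up far larger, which is the kind of growth in CPU time and disk space the message speculates about.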
On Saturday 03 May 2014 13:40:14 Vojtěch Zeisek wrote:
> I have a lot of pictures there (including raw files from digital camera), a lot of music in ogg, thousands of PDFs and huge amount of weird (mostly text) files, like DNA sequences, huge matrices from permutations (I'm biologist) and so on...

Yeah, that goes crazy. They put in a few fixes, but as Carlos also pointed out, huge text files (any flat text file over ~50 MB) cause problems. Baloo will detect this and put them on a blacklist of files for the indexer, but the detection is essentially a matter of noticing that indexing takes too long and then identifying the file (which takes even longer). So this can easily keep your system busy for a looooong time until it has put them all on the blacklist.

I suggest you blacklist the folders with the evil big text files...

Again, some fixes are in progress for this, but I'm not sure if Vishesh can really fix this.
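To decide which folders to blacklist, it helps to know where the big text files actually live. The sketch below walks a directory tree and groups files over a size threshold by folder. The 50 MB default follows the rough figure mentioned in the thread; the suffix list and the function name `large_files_by_folder` are assumptions for illustration and say nothing about how Baloo itself classifies files.

```python
import os
from collections import defaultdict


def large_files_by_folder(root, min_bytes=50 * 1024 * 1024,
                          suffixes=(".txt", ".csv", ".fasta")):
    """Map each folder under root to its (name, size) pairs for large text-like files.

    The suffix list is a guess at what "flat text" might look like on disk;
    adjust it to match your own data.
    """
    suffixes = tuple(s.lower() for s in suffixes)
    found = defaultdict(list)
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not name.lower().endswith(suffixes):
                continue
            path = os.path.join(folder, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable
            if size > min_bytes:
                found[folder].append((name, size))
    return dict(found)


# Example use (not run here, since walking a whole home directory can take a while):
#   for folder, items in large_files_by_folder(os.path.expanduser("~")).items():
#       print(folder, items)
```

The folders in the result are the candidates to exclude in the search settings before letting the indexer run again.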
Hi,

On Wednesday, 7 May 2014 18:20:31, you wrote:
> So this can easily keep your system busy for a looooong time until it has put them all on the black list.

Actually, I don't think this is the only source of problems. As I see in https://bugs.kde.org/show_bug.cgi?id=333655, there are more people having such issues, and I don't believe all of them are scientists with very weird data on their disks. :-) Moreover, as soon as I no longer need those weird text matrices, I zip them to save space. BTW, how does Baloo treat zipped files? And DNA data DO contain words! :-D Well, I don't think DNA "words" are any weirder from the point of view of an index engine than texts in Czech or some other such strange language. :-D No, seriously, there has to be another issue...

> I suggest you blacklist the folders with the evil big text files...

They are dispersed across various folders. That's why I would like the possibility to blacklist file types. ;-)

> Again, some fixes are going on for this, but I'm not sure if Vishesh can really fix this.

Seriously, back in the time of openSUSE 12.3 I was using Beagle as a search engine on an Intel Celeron 1.6 GHz with about 2 GB RAM (if I remember well). With a similar sort of data, it worked perfectly. It might not have been so feature rich, but it worked. It didn't make my computer unusable and I could use all the types of search I liked... Since then, the changes have only been for the worse... :-/

Bye,
Vojtěch
Well, I have another similarly bad experience with it. I upgraded my netbook with an Intel Atom. It is not a very fast computer, but it doesn't contain such a huge amount of huge and weird data. Normally it works fine, but with Baloo it is a disaster. No, I don't believe in that software very much... And waiting for some bugs to be fixed shows that it won't be so easy to make it work really well...

V.

On Wednesday, 7 May 2014 18:20:31, you wrote:
> Again, some fixes are going on for this, but I'm not sure if Vishesh can really fix this.
On Sunday, 11 May 2014 10:55:35, Vojtěch Zeisek wrote:
> Well, I have next same bad experience with it. I upgrade my netbook with Intel Atom. Well, it is not very fast computer, but it doesn't contain such

I have done some tests myself on an old dual-core Atom. It was usable (albeit a little slow) with both PIM and file search on. With Nepomuk it was impossible to use at all.

--
Luca Beltrame - KDE Forums team
KDE Science supporter
GPG key ID: 6E1A4E79
On Sunday, May 11, 2014 12:40:29 PM Luca Beltrame wrote:
> I have done some tests myself on an old dual core Atom. It was usable (albeit a little slow) with both PIM and file search on. With Nepomuk it was impossible to use at all.

I am not ranting or whining, just curious. It is funny: I have a netbook with a 2-core Intel Atom and 2 GB RAM. I made the same test and got the reverse outcome from yours. I mean, it just worked better with Nepomuk than with Baloo. I did not even try to use PIM with Baloo. The system randomly freezes the mouse or pointer for a while. Maybe I am missing something here. Any advice?

I am retracing my steps for another try from scratch.

Best,
Rick Chung
On Saturday, May 03, 2014 13:34:50 Jos Poortvliet wrote:
> How big is your drive and do you happen to have huge text files on it?

How big a file would you consider "huge"?

And which files does Baloo consider "text"? Something with its name ending in .txt? Everything 'file' says is a text file? How about, for instance, XML files or Perl code?

Regards
mararm
On Saturday 03 May 2014 14:52:58 mararm wrote:
> How big a file would you consider "huge"?
>
> And, what files baloo consideres "text"? Something with it's name ending in .txt? Everything 'file' says is a text file? How about for instance XML-Files oder Perl-Code?

They are text files (but probably excluded, as a bunch of file types are blacklisted; there is no need to index all code). Text = anything encoded as ASCII, I suppose :D

Big is >50 MB, I believe.
On Wednesday 07 May 2014 18:21:23 Jos Poortvliet wrote:
On Saturday 03 May 2014 14:52:58 mararm wrote:
And, what files baloo consideres "text"? Something with it's name ending in .txt? Everything 'file' says is a text file? How about for instance XML-Files oder Perl-Code?
They are text files (but probably excepted as a bunch of filetypes is black listed
Interesting to know. To get that list I assume I have to read the source code?
- no need to index all code).
I disagree. I have quite a lot of perl code and most of my uses of full text search are to find things in that code or its documentation.
Text = anything encoded as ASCII I suppose :D
I surely hope not. This is 2014. We have Unicode. Almost all of what I consider a text file on my storage is UTF-8 encoded.
Big is >50 MB, I believe.
OK, thanks. Regards mararm -- You will be held hostage by a radical group.
On 05/10/2014 09:40 AM, mararm wrote:
Text = anything encoded as ASCII I suppose :D
I surely hope not. This is 2014. We have Unicode. Almost all of what I consider a text file on my storage is UTF-8 encoded.
I hope so too, for different reasons. If that were the only criterion, it would undermine one of the principles of *NIX. Please, let us differentiate between .txt, .conf, .xml, .html, .pj, .rb, .sh, .mm, man source files, .ps and a plethora of other files "encoded as ASCII" -- or similar. I realise many of these can be identified by some kind of signature as well as by suffix, but that's not always a given. There's no reason Perl, Ruby or other interpreter source files have to begin with a hash-bang. As for UTF-8, why not? The interpreters can handle it, can't they? And while I may be a native English speaker and would like to have the comments in interpreter scripts in English, at least for 'published' scripts, I see no reason why other languages that require other character sets should not exist. -- "The wide world is all about you: you can fence yourselves in, but you cannot for ever fence it out." -- JRR Tolkien
On 05/03/2014 01:34 PM, Jos Poortvliet wrote:
Sorry for replying only to Jos; Ctrl+R in my Thunderbird replied only to him and not to the list..:-) :-) :-)
On Saturday 03 May 2014 11:16:19 yahoo-pier_andreit wrote:
On 05/02/2014 07:59 PM, Vojtěch Zeisek wrote:
On Fri, 2 May 2014 17:56:36, C wrote:
On Fri, May 2, 2014 at 12:41 PM, šumski <hrvoje.senjan@gmail.com> wrote:
Next KDE:Current update should contain some upstream optimizations in this area, so you might want to test later during the day once it publishes ;-)
I've just now updated and rebooted just for good measure (yah I know I didn't need to, but this way I'm sure that everything is loading properly post update).
Baloo had finished whatever it was doing sometime in the past 2 hours (so it ran the indexing process for around 8 hours over a drive with 600GB of media). Prior to the update there was no indexing occurring. After the update.. still quiet. Whatever Baloo was doing to chew up the CPU and I/O is not starting up again.
I have the same experience. After the upgrade, Baloo is still consuming 30-40% of CPU, but not so much I/O, so the computer is usable, although it is not perfect. ;-) Definitely a change for the better, thanks devs! All the best, V.
Not for me. My PC: Dell Latitude E6510, RAM = 8 GB, GPU = GT218 NVS 3100M, CPU = i7 Q 720 @ 1.60GHz, OS = openSUSE 13.1, KDE = 4.13.0. I used the new advanced Baloo configuration tool, and it works very well for starting and stopping Baloo indexing. But even after leaving Baloo indexing active all night, the laptop was unusable whenever indexing was active. My system monitor (ksysguard) doesn't show more than 10% CPU usage for Baloo, but the laptop is unusable all the same..:-) :-) :-)
Do you have a hard drive or SSD?
So it still hasn't finished indexing after a whole night? Wow, there must be some files on there that are really hard to index. How big is your drive and do you happen to have huge text files on it?
I don't know whether it has finished or not, because I don't have any real monitor of what Baloo is doing; I can only see that it uses some "% of CPU" or is in "disk sleep" in the system monitor. I have one hard disk drive, no SSD: 1 TB, 400 GB occupied. The biggest files seem to be (I have a lot of folders and I don't know how to look inside every folder) 4.4 GB .iso files and a 10 GB crypto file. I don't think I have huge text files.
Does anybody know how to debug this and find out what files cause the problem for Pier? This seems like something Vishesh (Baloo developer) wants to know - what files cause Search to be indexing for a whole night...
Tell me what I can do and I will do it..:-) :-) :-) :-)
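One way to look for candidate files, as a rough sketch rather than anything Baloo itself ships: list the regular files above a size threshold and keep only those that the `file` utility reports with a text/* MIME type. The function name `list_large_text_files` is made up for this example, and the 50M default is only the ">50 MB" guess from earlier in the thread; what Baloo treats as "text" may well differ from what `file` says (XML, for instance, is reported as text/xml by some versions of `file` and application/xml by others).

```shell
#!/bin/sh
# Hypothetical helper (not part of Baloo): print files under DIR that are
# larger than MIN_SIZE and that `file` classifies as some kind of text.
# MIN_SIZE uses find's -size syntax, e.g. 50M or 20M.
# Assumption: file names do not contain newlines.
list_large_text_files() {
    dir=$1
    min_size=${2:-50M}
    find "$dir" -type f -size +"$min_size" -print |
    while IFS= read -r path; do
        # -b drops the "name:" prefix so only the MIME type is compared.
        case $(file -b --mime-type "$path") in
            text/*) printf '%s\n' "$path" ;;
        esac
    done
}

# Example: list_large_text_files /path/to/storage 20M
```

Any file this prints is only a suspect; checking it against what the extractor is actually working on is still needed.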
On Friday 02 of May 2014 17:56:36 C wrote:
I could try clearing the Baloo database file. I'm guessing that would trigger Baloo to re-index? Will that work to reset things and put Baloo back to initial status to see if the CPU load and I/O is still happening?
I guess it's up to you =) But if you want to start again, stop any Baloo client (e.g. dolphin, gwenview, akonadi) and baloo_file itself, remove the contents of ~/.local/share/baloo, remove the first run=false key from ~/.kde4/share/config/baloofilerc, and proceed as usual...
Cheers, Hrvoje
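The reset steps above could be scripted roughly as follows. This is only a sketch: the function names are invented for the example, the paths are passed in as arguments rather than hard-coded, and stopping the Baloo clients and baloo_file is left to the user (pkill is one option, but that is an assumption, not something the thread prescribes).

```shell
#!/bin/sh
# Hypothetical helpers for the reset described above (not part of Baloo).

# Remove everything inside the index directory but keep the directory itself.
clear_baloo_index() {
    index_dir=$1
    [ -d "$index_dir" ] || return 0
    # -mindepth/-delete are GNU/BSD find extensions, available on openSUSE.
    find "$index_dir" -mindepth 1 -delete
}

# Drop the "first run=..." line from a baloofilerc-style file, in place.
drop_first_run_key() {
    rc_file=$1
    [ -f "$rc_file" ] || return 0
    tmp_file=$(mktemp) || return 1
    # Write through a temp file, then copy back so permissions are kept.
    sed '/^first run=/d' "$rc_file" > "$tmp_file" && cat "$tmp_file" > "$rc_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}

# Usage, once the Baloo clients and baloo_file have been stopped:
#   clear_baloo_index "$HOME/.local/share/baloo"
#   drop_first_run_key "$HOME/.kde4/share/config/baloofilerc"
```

Keeping the two steps as separate functions makes it easy to try them on a copy of the files first.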
C.
On Friday 2 May 2014 11:01:24, C wrote:
So why would Baloo be churning for so many hours... and is there a way to see what Baloo might be stuck on?
Use pgrep or equivalent to look up the full command line of baloo_file_extractor: its arguments will be a list of numbers. Use "balooshow <number>" to gain insight into the type of file being extracted. Usually some file types, in particular text (if large, > 20M), can stress the extractor a bit (a more efficient solution is being investigated as we speak - it's under review). -- Luca Beltrame - KDE Forums team KDE Science supporter GPG key ID: 6E1A4E79
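The two steps above can be sketched as a small shell helper. The function names `extractor_ids` and `show_extractor_files` are invented for this example, and it assumes a pgrep that supports -a (print the full command line), as in current procps. What balooshow prints for each number depends on the state of the index, so treat this as a starting point rather than a guaranteed diagnosis.

```shell
#!/bin/sh
# Read `pgrep -a`-style lines on stdin ("PID command args...") and print
# the numeric arguments of baloo_file_extractor, one per line.
extractor_ids() {
    awk '$2 ~ /baloo_file_extractor$/ {
        for (i = 3; i <= NF; i++) if ($i ~ /^[0-9]+$/) print $i
    }'
}

# Run balooshow on every ID the running extractor was started with.
show_extractor_files() {
    # -f matches against the full command line: the kernel truncates the
    # process name to 15 characters ("baloo_file_extr"), so a plain
    # `pgrep baloo_file_extractor` would not match anything.
    pgrep -af baloo_file_extractor | extractor_ids |
    while IFS= read -r id; do
        balooshow "$id"
    done
}

# Example: show_extractor_files
```

The parsing is kept separate from the pgrep call so it can be checked against saved output without a running extractor.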
On Fri, 2 May 2014 11:13:26, Luca Beltrame wrote:
On Friday 2 May 2014 11:01:24, C wrote:
So why would Baloo be churning for so many hours... and is there a way to see what Baloo might be stuck on?
Use pgrep or equivalent to look up the full command line list of baloo_file_extractor: it'll be a list of numbers.
Use "balooshow <number>" to gain insight into the type of file being extracted. Usually some file types, in particular text (if large, > 20M), can stress the extractor a bit (a more efficient solution is being investigated as we speak - it's under review).
It could explain my problems, as I have some large text matrices (usually zipped, but the original files could be several GB) with calculations... Thank you for that, V. -- Vojtěch Zeisek Komunita openSUSE GNU/Linuxu Community of the openSUSE GNU/Linux http://www.opensuse.org/ http://trapa.cz/
On 05/02/2014 10:13 AM, Luca Beltrame wrote:
Use "balooshow <number>" to gain insight into the type of file being extracted. Usually some file types, in particular text (if large, > 20M), can stress the extractor a bit (a more efficient solution is being investigated as we speak - it's under review).
so maybe baloo should not have been forced on the users like this?
On Fri, May 2, 2014 at 11:18 AM, Mathias Homann <Mathias.Homann@opensuse.org> wrote:
On 05/02/2014 10:13 AM, Luca Beltrame wrote:
Use "balooshow <number>" to gain insight into the type of file being extracted. Usually some file types, in particular text (if large, > 20M), can stress the extractor a bit (a more efficient solution is being investigated as we speak - it's under review).
so maybe baloo should not have been forced on the users like this?
Please take your whining to the other thread. C. -- openSUSE 13.1 x86_64, KDE 4.13
On Fri, May 2, 2014 at 11:13 AM, Luca Beltrame <lbeltrame@kde.org> wrote:
On Friday 2 May 2014 11:01:24, C wrote:
So why would Baloo be churning for so many hours... and is there a way to see what Baloo might be stuck on?
Use pgrep or equivalent to look up the full command line list of baloo_file_extractor: it'll be a list of numbers.
Use "balooshow <number>" to gain insight into the type of file being extracted. Usually some file types, in particular text (if large, > 20M), can stress the extractor a bit (a more efficient solution is being investigated as we speak - it's under review).
I don't know pgrep so well... not sure if I'm using it right:

    pgrep -l baloo
    1333 baloo_file
    18540 baloo_file_extr

That didn't look right so I tried this:

    pgrep -a baloo
    1333 /usr/bin/baloo_file
    18540 /usr/bin/baloo_file_extractor 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 223 222 221 220 219

That looks more like what you were referring to... but balooshow gives me this:

    balooshow 239
    QSqlQuery::prepare: database not open
    Object::connect: No such signal org::freedesktop::UPower::DeviceAdded(QDBusObjectPath)
    Object::connect: No such signal org::freedesktop::UPower::DeviceRemoved(QDBusObjectPath)
    No index information found

It's the same for all numbers in that list. I think I'm having a PEBKAC moment. :-P

C. -- openSUSE 13.1 x86_64, KDE 4.13
On Friday, 2 May 2014, 11:01:24, C wrote:
I'm starting a new thread because the other is filled to the brim with pissing and moaning - I hope everyone keeps that over there.. and OUT of this thread.
So.. I did a clean install of openSUSE 13.1 (full format of all partitions) and then updated to KDE4 Current. I enabled Baloo to index my home (clean home with nothing in it) and a 1TB storage drive (with about 600GB of data... music, video and documents). All other mount points are explicitly excluded. It's been running now for several hours and I'm seeing:
- baloo_file_extractor and it's pegging one CPU core at 100% and keeping it there - baloo_file is running a second core at about 95% and keeping it there.
So why would Baloo be churning for so many hours... and is there a way to see what Baloo might be stuck on?
My system has enough gigahertz to drive all this without any noticeable impact. I am wondering though is this... "normal" or have I got something in my 1TB drive that is a troublemaker for Baloo? or something else? Or is this even considered "a problem"?
C.
Just a little piece of factual data: I have uninstalled baloo-pim, baloo-file and baloo-tools, and both my wife and I are having way more fun with our computers again. Baloo does not work well when ~ is on NFS, or so it seems. Cheers MH
On Sunday 11 May 2014 15:13:20, Mathias Homann wrote:
Baloo does not work well when ~ is on nfs, or so it seems.
Did you file a bug report about it? Such setups are less likely to get tested by the developers. -- Luca Beltrame - KDE Forums team KDE Science supporter GPG key ID: 6E1A4E79
participants (11)
- Anton Aylward
- C
- Carlos E. R.
- Jos Poortvliet
- Luca Beltrame
- mararm
- Mathias Homann
- Rick Chung
- Vojtěch Zeisek
- yahoo-pier_andreit
- šumski