"balooctl config list includeFolders" returns just my home directory.
"balooctl config list excludeFolders" is empty.
"balooctl config hidden" is also empty.
Well, I could exclude a lot of subdirectories from being indexed, but
unfortunately my directory structure is organized by projects, themes and other
criteria, not by "worth indexing or not". While this might be doable, it
would be quite cumbersome to maintain.
And I'd prefer to have everything in the index, because when I search
for some keyword XYZ and don't find anything, I can be sure that there is
absolutely nothing there containing XYZ and I'm done. Otherwise, I'd think
"maybe it's just omitted from the index" and have to search by other means
(e.g. grep -r).
Speaking of grep:
> time grep -rl something $HOME >out.txt
real 6m59.015s
user 2m49.994s
sys 1m14.764s
So grep can search through my entire home directory in 7 minutes. Yes, I know
that indexing takes a lot more than just searching for some bytes - but such a
huge difference?
Is it possible to find out what baloo is doing and where it spends most of its
CPU time?
I tried "balooctl monitor": It says "Indexing: /some/files" 40 times, then there
is a pause of one and a half minutes, then come the next 40 files, and so on.
At the moment it is going through some HTML files. I find it hard to believe
that extracting the keywords can take 1.5 minutes for just 40 files, so I'd
rather guess that inserting the keywords into the database causes the delay.
Maybe a missing index?
Anything I can do to identify the bottleneck?
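
One way to put numbers on those pauses might be to timestamp the monitor
output. This is only a minimal sketch: "stamp_gaps" is a hypothetical helper
name, and it assumes that "balooctl monitor" prints one line per file and
flushes each line when writing to a pipe. It shows where the gaps fall, not
what causes them.

```shell
# Prefix every line read from stdin with the number of seconds
# that passed since the previous line arrived.
stamp_gaps() {
    prev=$(date +%s)
    while IFS= read -r line; do
        now=$(date +%s)
        printf '%4ds  %s\n' "$((now - prev))" "$line"
        prev=$now
    done
}

# Usage (assumption: one "Indexing: ..." line per file):
#   balooctl monitor | stamp_gaps
```

A large number in front of a line would mean the long wait happened just
before that file was reported.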