[opensuse] Nepomuk indexing misses .doc, .xls files
I've recently enabled Nepomuk and find that it doesn't index files with Microsoft Office extensions. For example, I have a folder containing 66 files, all with the string "agenda" in the filename. 64 of these files have a .doc extension and are completely ignored by Nepomuk when I search in Dolphin for files containing "agenda" in the filename. The two files with a .odt extension are found.

I'm as keen a FOSS proselytiser as the next person, but to exclude these files seems to be taking purity too far. ;-)

I have checked the exclusions dialog, but there is nothing to suggest that .doc or .xls should be excluded.

I'd also like my Python scripts to be indexed, but they are also invisible, despite their containing folder being in the search path.

Bob

--
Bob Williams
System: Linux 3.11.10-7-desktop
Distro: openSUSE 13.1 (x86_64) with KDE Development Platform: 4.12.2
Uptime: 12:00pm up 2 days 20:00, 6 users, load average: 1.15, 0.58, 0.40

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
Don't know about the .doc or .xls files, but did you *enable* the source code checkbox?

--
(paka)Patrick Shanahan  Plainfield, Indiana, USA  @ptilopteri
http://en.opensuse.org  openSUSE Community Member  facebook/ptilopteri
http://wahoo.no-ip.org  Photo Album: http://wahoo.no-ip.org/gallery2
Registered Linux User #207535 @ http://linuxcounter.net
On 15/02/14 14:19, Patrick Shanahan wrote:
Don't know about the .doc or .xls files but did you *enable* the source code checkbox?
Yes.

--
Bob Williams
Hi Bob,

IIRC, for MS Office files you also need to have the catdoc package installed. Maybe you can try with that? It is available in the KDE:Extra repo.

Cheers,
Hrvoje
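Before waiting for a reindex, it may be worth confirming that catdoc is actually installed and on the PATH. The following is only a minimal shell sketch: `have_tool` is a helper name invented here, and the suggested `zypper install` line assumes the repository that carries catdoc (KDE:Extra, per the message above) has already been added.

```shell
#!/bin/sh
# have_tool is a hypothetical helper for this sketch:
# it succeeds if the named program can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool catdoc; then
    echo "catdoc found: $(command -v catdoc)"
else
    # Assumes the repository providing catdoc is already configured.
    echo "catdoc not found; try: sudo zypper install catdoc"
fi
```

The same check can be reused for other helpers mentioned in this thread by passing a different program name to `have_tool`.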
Hi sumski,

Many thanks. I've now installed catdoc. I'll give Nepomuk a chance to reindex and then test again.

Bob
Bob, you don't have to wait. Just open a shell, navigate to the file, and type

    nepomukindexer <filename>

However, don't hold your breath, because one of the Nepomuk devs says it's not working yet: http://vhanda.in/blog/2013/05/we-need-more-indexers/

They used to use Strigi for that, but it had a lot of problems, so now they have been writing their own indexers, and apparently .doc is not high on their list.
Hi John,

Actually, catdoc seems to be performing as advertised. I had to log out and log in again before seeing the improved search results.

Dolphin Find now shows .doc, .xls and .pdf.

Regards,
Bob
In my case (openSUSE 12.3), I had to upgrade to KDE 4.12 to get all the proper pieces to work, because mine was missing nepomukofficeextractor.

After upgrading to 4.12, new (or touched) .doc files would get picked up, but it still did not go back and index all the previously existing documents. When I manually sent them through

    nepomukindexer <filename>

it seemed to pick most of them up. There are still a few from very early versions of Word that it seems not to know how to handle.
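For a large collection of existing documents, the manual step described above could be scripted. This is only a sketch under assumptions: it relies on `nepomukindexer` accepting one filename per call, as shown in this thread, and the names `reindex_office_files` and `INDEXER` are made up for illustration.

```shell
#!/bin/sh
# INDEXER defaults to the command named in this thread; it can be
# overridden (for example with "echo") to preview what would be run.
INDEXER="${INDEXER:-nepomukindexer}"

# reindex_office_files is a hypothetical helper: it passes every
# .doc and .xls file under the given directory to the indexer,
# one file per invocation.
reindex_office_files() {
    dir="${1:-.}"
    if ! command -v "$INDEXER" >/dev/null 2>&1; then
        echo "indexer not found: $INDEXER" >&2
        return 1
    fi
    find "$dir" -type f \( -iname '*.doc' -o -iname '*.xls' \) \
        -exec "$INDEXER" {} \;
}

# Usage (uncomment and adjust the path):
# reindex_office_files "$HOME/Documents"
```

Running it once with `INDEXER=echo` lists the files that would be processed without touching the index.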
On 2/15/2014 10:31 AM, šumski wrote:
Hi Bob, IIRC, for MO files, one needs also have catdoc package installed, maybe you can try with that? It is available in KDE:Extra repo.
Cheers, Hrvoje
I don't believe this is sufficient. Installing catdoc did not have any effect for my collection of MS Word docs.

Unrelated: your PGP signature shows as invalid.
On Saturday 15 of February 2014 11:32:24 John Andersen wrote:
> I don't believe this is sufficient. Installing catdoc did not have any effect for my collection of MSWord Docs.

For Office 2003 and older, catdoc is a prerequisite. For 2007 and newer, it is not required. Maybe those files need to be re-indexed by hand...

> ----unrelated: And your pgp signature shows invalid

Invalid where and how?
Cheers, Hrvoje
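The rule of thumb in this reply (catdoc for Office 2003 and older, not for 2007 and newer) can be approximated by file extension. The sketch below assumes that .doc and .xls mark the older binary formats and that anything else, such as .docx, does not rely on catdoc. `needs_catdoc` and the sample filenames are hypothetical.

```shell
#!/bin/sh
# needs_catdoc is a hypothetical helper: it succeeds when the file
# extension suggests a legacy (Office 2003 or older) binary format.
needs_catdoc() {
    # Lowercase the name so that .DOC and .doc are treated alike.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$name" in
        *.doc|*.xls) return 0 ;;  # assumed legacy binary formats
        *)           return 1 ;;  # .docx, .xlsx, .odt, and anything else
    esac
}

# Sample filenames, invented for illustration only.
for f in agenda-2003.doc budget.XLS minutes.docx notes.odt; do
    if needs_catdoc "$f"; then
        echo "$f: legacy format, catdoc expected"
    else
        echo "$f: catdoc not expected to matter"
    fi
done
```

An extension check is only a guess at the real format, so files that still fail to appear in search results may need to be inspected individually.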
participants (4)
- Bob Williams
- John Andersen
- Patrick Shanahan
- šumski