[opensuse] htdig and other search engines
Hi,

Can anyone recommend a web-based search tool that I can use to index and search our document server? At the moment I'm using htdig, but the fact that it was last updated in 2004 and that it can't do partial-word searches is making me reconsider.

It must be able to parse MS Office documents, ODF, PDF, txt and HTML.

Hans

E-Mail disclaimer: http://www.sunspace.co.za/emaildisclaimer.htm
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
Hi Hans,

On Mon, 27 August 2007 16:46:09, Hans van der Merwe wrote:
Hi, can anyone recommend a web-based search tool that I can use to index and search our document server? At the moment I'm using htdig, but the fact that it was last updated in 2004 and that it can't do partial-word searches is making me reconsider. It must be able to parse MS Office documents, ODF, PDF, txt and HTML. [...]
Maybe swish++ is what you are looking for.

Description: Simple Document Indexing System for Humans: C++ version

SWISH++ is a Unix-based file indexing and searching engine (typically used to index and search files on web sites). It was based on SWISH-E, although SWISH++ is a complete rewrite.

SWISH++ features:
* Lightning-fast indexing
* Indexes META elements, ALT, and other attributes
* Selectively not index text within HTML or XHTML elements
* Intelligently index mail and news files
* Index Unix manual page files
* Apply filters to files on-the-fly prior to indexing
* Index non-text files such as Microsoft Office documents
* Modular indexing architecture
* Index new files incrementally
* Index remote web sites
* Handles large collections of files
* Lightning-fast searching
* Optional word stemming (suffix stripping)
* Ability to run as a search server
* Easy-to-parse results format

Homepage: http://homepage.mac.com/pauljlucas/software/swish/

regards,
thomas
On 8/27/07, email.listen@googlemail.com <email.listen@googlemail.com> wrote:
Maybe swish++ is what you are looking for.
Or mnoGoSearch. We use it to index roughly 100 web sites (some of them pretty big). It can parse all of these document types (we actually even index Flash files).

Regards,
Gael
participants (3)
- email.listen@googlemail.com
- Gaël Lams
- Hans van der Merwe