[opensuse-project] openSUSE wiki search
The openSUSE wiki search needs improvement.

Today I tried:
http://en.opensuse.org/Special:Search?search=community+week&go.x=0&go.y=0&go=Go
and a perfectly good existing page whose title differs only slightly from the search term:
http://en.opensuse.org/CommunityWeek
gave me trouble.

At the same time, Google search did not have that problem. I moved CommunityWeek to Community Week, so our wiki search works fine for that page now, but we either have to develop a better search ('inventing hot water' comes to mind), or incorporate Google search and be done with it.

What do you think?

PS. This is probably a decision that needs some approval beyond the wiki maintainers, so it is posted here.

--
Regards, Rajko
--
To unsubscribe, e-mail: opensuse-project+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-project+help@opensuse.org
Rajko M. wrote:
> The openSUSE wiki search needs improvement.
> Today I tried:
> http://en.opensuse.org/Special:Search?search=community+week&go.x=0&go.y=0&go=Go
> and a perfectly good existing page whose title differs only slightly from the search term:
> http://en.opensuse.org/CommunityWeek
> gave me trouble.
> At the same time, Google search did not have that problem. I moved CommunityWeek to Community Week, so our wiki search works fine for that page now, but we either have to develop a better search ('inventing hot water' comes to mind), or incorporate Google search and be done with it.
> What do you think?
Well, yes, MediaWiki's native search capabilities are.. err.. "not very good". I'm not sure why that is, because in theory, using MySQL's full-text indexing should make it both easy and effective, but they probably don't use that feature for database portability reasons, or maybe because it would be way too slow on Wikipedia (the only MediaWiki installation the MediaWiki authors care about).

There are several options here:
- implement a better search module directly in MediaWiki (e.g. something totally MySQL-specific, we don't have a problem with that), assuming that it is possible
- use an alternative indexing and search engine such as Apache Solr
- register and use a Google Custom Search Engine, or search with "site:en.opensuse.org keywords" in regular Google

The 3rd option is already implemented: "sgt-d" has registered a Google CSE for openSUSE, but he only added en.opensuse.org to the crawling index. Unfortunately, I don't know how to get in touch with him; if someone does, please let me know. You can use it here:
http://s.opensu.se/
or here:
http://www.google.com/coop/cse?cx=008552698452931774792%3Azfwo6gcl7gi

The 2nd option is probably the one that would yield the highest quality of search hits, by a large margin (Google is pretty good, but Apache Solr is stellar and would allow openSUSE branding, faceted search, etc...), but it also involves quite some (software) development. We could do a bit of research; I guess that someone has already written an at least half-way complete implementation of that as a MediaWiki plugin. It's not just about finding such an extension, but also about testing it on another instance and making a quality assessment.

And to finish in reverse order :) I'm not sure about the 1st option (hack MediaWiki's search implementation). While I don't see any reason why we couldn't improve it dramatically in a few hours of coding, if it were that simple, someone would have done it already. So there might be a good reason for not having better native search quality in MediaWiki.
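To make the reported mismatch concrete, here is a minimal Python sketch of the kind of title normalization a better search would need so that "community week" also finds "CommunityWeek". It is only an illustration under stated assumptions: the helper names `normalize_title` and `titles_matching` are hypothetical, the title list is made up for the example, and nothing here reflects how MediaWiki's search is actually implemented.

```python
import re


def normalize_title(title):
    """Split CamelCase into words and lowercase the result (hypothetical helper)."""
    # Insert a space wherever a lowercase letter or digit is followed by an uppercase letter.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", title)
    # Lowercase and collapse any repeated whitespace.
    return " ".join(spaced.lower().split())


def titles_matching(query, titles):
    """Return the titles whose normalized form equals the normalized query."""
    wanted = normalize_title(query)
    return [t for t in titles if normalize_title(t) == wanted]


# Illustrative page titles, not taken from the real wiki index.
titles = ["CommunityWeek", "Community Week", "Build Service"]
print(titles_matching("community week", titles))
```

With this normalization both spellings of the page title compare equal to the query, which is the behaviour the wiki search was missing before the page was moved.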
> PS. This is probably a decision that needs some approval beyond the wiki maintainers, so it is posted here.
Feasibility is yet another topic, and a much more complex one. The wiki is hosted on a cluster at Novell IC&T and can only be accessed by some Novell employees, so any change needs time from one of the really busy people at Novell who are involved in openSUSE as well as from an admin in Provo (assuming that the wiki is hosted there).

Then again, there is no reason to keep a better solution from being sent upstream (to MediaWiki) first. The steps would be as follows:
- take the Google CSE option, or implement Apache Solr, or hack MediaWiki
- put up a local instance to test, without the real data (the dump is probably way too large to be synced easily, not to mention that it would expose things in the database one isn't supposed to see)
- test, test, test
- once it's stable and verified, submit upstream to the MediaWiki project (e.g. as an extension, depending on the implementation)
- convince IC&T to install that extension (possibly the hardest part, I don't know)
- find people at Novell who have time to install it

Yes, I agree, that doesn't sound too good.

The solution that involves the least effort (at the cost of branding and relying on Google) is having a properly configured Google CSE, and hacking the search form in our MediaWiki installation to submit searches there instead of using its own capabilities (easy, just a few lines to change). But for that, we need to get ahold of sgt-d, or register another CSE with the required credentials shared among several people, not just a one-man show.

That being said, it would be even better to have an all-encompassing search that indexes and shows results from the wiki, the mailing-list archives, the forums, and possibly even from package search engines (webpin/software portal, OBS search, Packman, ...). That could be achieved with Google CSE too. It just needs admin access to the CSE instance ;)

cheers
--
  -o) Pascal Bleser <pascal.bleser@opensuse.org>
  /\\ http://opensuse.org -- I took the green pill
 _\_v FOSDEM::7+8 Feb 2009, Brussels, http://fosdem.org
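The idea of submitting wiki searches to Google instead can be sketched as a URL translation. This is a rough Python illustration, not the actual change to the MediaWiki search form: the function name `wiki_search_to_google` is hypothetical, and it targets the regular Google search with a `site:` restriction (as mentioned in the thread) rather than the CSE, since how the CSE accepts queries is not stated here.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def wiki_search_to_google(wiki_url, site="en.opensuse.org"):
    """Translate a wiki Special:Search URL into a site-restricted Google query URL."""
    # Pull the 'search' parameter out of the wiki URL; '+' is decoded to a space.
    params = parse_qs(urlparse(wiki_url).query)
    keywords = params.get("search", [""])[0]
    # Build the "site:en.opensuse.org keywords" query described in the thread.
    query = f"site:{site} {keywords}"
    return "https://www.google.com/search?" + urlencode({"q": query})


wiki_url = ("http://en.opensuse.org/Special:Search"
            "?search=community+week&go.x=0&go.y=0&go=Go")
print(wiki_search_to_google(wiki_url))
```

Extending the search to the mailing-list archives or forums would amount to widening the `site` restriction, which is what a shared, properly configured CSE would handle on Google's side.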
participants (2)
- Pascal Bleser
- Rajko M.