[opensuse-wiki] Wiki Google Custom Search
Hello,
As Carlos pointed out, I had accidentally hijacked an earlier thread, so I am starting this new one up. Here is the original message:
I have implemented Google Custom Search on enstage.opensuse.org, and I am personally very happy with the results. To try it out, you can visit http://enstage.opensuse.org/Portal:GoogleSearch and try searching on anything you want. It is about a thousand times better than what we have, and with the better relevance ranking, namespace searching is no longer an issue.
While the current search is advertisement supported, I am nearly positive that we can get ad support removed. It seems that Google is pretty willing to do this for non-profits.
What does everyone think? If you don't have access to the staging site, I will be more than happy to send you a screen shot of the search page. If everyone is on board, I will ask Thomas to edit the theme to make it the default search, and we'll send it live.
Thanks, Matt
By the way, you need access to the staging site to see this (i.e. you need to be on the Novell network). I can provide screen shots if requested.
On 2010-07-28 00:32, Matthew Ehle wrote:
I have implemented Google Custom Search on enstage.opensuse.org, and I am personally very happy with the results. To try it out, you can visit http://enstage.opensuse.org/Portal:GoogleSearch and try searching on anything you want. It is about a thousand times better than what we have, and with the better relevance ranking, namespace searching is no longer an issue.
It times out without loading. Is there some problem?
-- Cheers / Saludos, Carlos E. R. (from 11.2 x86_64 "Emerald" GM (Elessar))
-- To unsubscribe, e-mail: opensuse-wiki+unsubscribe@opensuse.org For additional commands, e-mail: opensuse-wiki+help@opensuse.org
"Carlos E. R." <robin.listas@telefonica.net> 7/27/2010 4:36 PM >>>
It times out without loading. Is there some problem?
It sounds like you're not on the Novell network. The staging site is firewalled, so you have to either be on a Novell site or VPN'ed in to be able to access staging sites. I know, it's pretty annoying. I'm going to try to do something about it, but for now, we'll just have to work around it.
On 2010-07-28 00:55, Matthew Ehle wrote:
"Carlos E. R." <> 7/27/2010 4:36 PM >>>
It times out without loading. Is there some problem?
It sounds like you're not on the Novell network. The staging site is firewalled, so you have to either be on a Novell site or VPN'ed in to be able to access staging sites. I know, it's pretty annoying. I'm going to try to do something about it, but for now, we'll just have to work around it.
Well... then, you shouldn't have announced it on a public list :-(
-- Cheers / Saludos, Carlos E. R. (from 11.2 x86_64 "Emerald" GM (Elessar))
On 7/27/2010 6:55 PM, Matthew Ehle wrote:
"Carlos E. R." <robin.listas@telefonica.net> 7/27/2010 4:36 PM >>>
It times out without loading. Is there some problem?
It sounds like you're not on the Novell network. The staging site is firewalled, so you have to either be on a Novell site or VPN'ed in to be able to access staging sites. I know, it's pretty annoying. I'm going to try to do something about it, but for now, we'll just have to work around it.
OK. Then please type "lxc" into the new search. Does a copy of old-en.opensuse.org/LXC come up? I can't check this myself.
I'm not saying I have any reason to believe that page has been transferred to anywhere in the new wiki. It may not exist, or it may simply be unfindable. It was probably supposed to be my job to move it, which I have not done. If so, well, sorry: I already have a more than full-time job, from which I had to steal way too much time just to get that doc written (and to be sure everything was sound, accurate, safe, tested umpteen times, missing scripts written and supplied in an rpm package created and maintained in an OBS repo, etc., followed by subsequent updates, organizational improvements, clarification and fluff removal, etc.). And now, not very long after the effort to create it, it gets removed, eaten, down a black hole. Effort wasted.
After going through that process of being burned, can you blame me for being loath to invest even more time in this wiki? I will spend a lot of time and effort on documentation and reference, for free, because I know how important it is. But not if it's going to be wasted this way.
You may say, "Well yes, it was supposed to be your job to move it, and if you can't be bothered then why should we?" The problem is that I did not reorganize the site and break all the links to content I didn't write.
All this is doing is hurting openSUSE on several levels and from several directions at once.
Content from authors who are no longer actively maintaining what they wrote is simply gone. Maybe some of it is obsolete by now, maybe not, and maybe there is no such thing as obsolete knowledge; all loss is bad. I often specifically need reference for older systems, and current docs are of no use.
Content from still-active authors may or may not be moved, because people like me may simply abandon a black hole that eats content. Or some may move their existing docs but be a lot less willing to invest new time and effort. You can't say "We promise not to break things again" after already having done so once.
With so much stuff missing or unfindable, users are left with a paucity of help compared to what they can get elsewhere. So new users, or even old ones, have an incentive to leave. That's openSUSE users, not just wiki users.
And prospective not-yet users reading the countless existing external articles on countless topics see "Here's the Ubuntu example/package/info, here's the RedHat example/package/info, here's the openSUSE [nothing, broken link]" repeated all over the place. They just see: hmm, openSUSE really allows all their links to be broken? Not exactly the kind of outfit I want to deal with.
IT guys who aren't really Linux guys, reading manuals in back rooms: "This is stupid, the manual says go here and it doesn't even exist. What garbage... Just the excuse I needed to convince the customer to scrap this box, and I'll put a new Windows box in."
What a disaster... Web site admin 101: do not break links to reference material.
-- bkw
On Tuesday 27 July 2010 21:41:47 Brian K. White wrote:
On 7/27/2010 6:55 PM, Matthew Ehle wrote:
"Carlos E. R." <robin.listas@telefonica.net> 7/27/2010 4:36 PM >>>
It times out without loading. Is there some problem?
It sounds like you're not on the Novell network. The staging site is firewalled, so you have to either be on a Novell site or VPN'ed in to be able to access staging sites. I know, it's pretty annoying. I'm going to try to do something about it, but for now, we'll just have to work around it.
OK. Then please type in "lxc" into the new search. Does a copy of old-en.opensuse.org/LXC come up? I can't check this myself.
http://old-en.opensuse.org/LXC had not been transferred. It is too big to be transferred with its whole history, so I transferred only the last revision. The article http://en.opensuse.org/LXC should be an introduction, like other articles about software. Don't forget the infobox template http://en.opensuse.org/Template:Infobox so that people can see summary information about it and how to download it. The installation, setup and troubleshooting parts go to http://en.opensuse.org/SDB:LXC, which should be linked from the main article. ...
What a disaster... Web site admin 101: do not break links to reference material.
It is not anymore :) ... but you have to split it into two parts: presentation, and installation and configuration. It would be nice if others would come, just as you did, and tell us what is broken. IMO, 80% of the problems are like the one with LXC: easy to fix if someone cares.
-- Regards, Rajko
On 7/27/2010 11:45 PM, Rajko M. wrote:
On Tuesday 27 July 2010 21:41:47 Brian K. White wrote:
OK. Then please type in "lxc" into the new search. Does a copy of old-en.opensuse.org/LXC come up? I can't check this myself.
http://old-en.opensuse.org/LXC was not transferred. It is too big to be transferred with whole history, so I transferred only last revision.
Hey thanks Rajko that was nice of you.
-- bkw
On Jul 27, 10 16:55:44 -0600, Matthew Ehle wrote:
On 2010-07-28 00:32, Matthew Ehle wrote:
I have implemented Google Custom Search on enstage.opensuse.org, and I am personally very happy with the results. To try it out, you can visit http://enstage.opensuse.org/Portal:GoogleSearch and try searching on anything you want.
Yes! This search engine just works! I'll use it from now on, and see if there are any edges that need polishing. Great work!
thanks, JW-
-- Juergen Weigert, jw@suse.de, SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nuernberg)
"Why would it be stupid to assume that a file can continue to be accessed by the same name in the future?" Brion Vibber bwmo#15842#c12
Yes! This search engine just works! I'll use it from now on, and see if there are any edges that need polishing.
Thank you for doing that. In fact, I think I already see one that I can work on. It appears that when you get towards the end of the useful search results, the search results start getting into the history and diff pages. I don't know if that's really something we want, but it should be easy to change the CSE settings to fix that.
Also, everyone, please keep in mind that this does not have to be the end solution. If nothing else, this can be a stop-gap on our way to getting Lucene (the Wikipedia search engine) or Sphinx in place. Or we can simply leave it, if everyone is happy with that. Either way, it is, IMHO, unquestionably superior to our current search.
-Matt
Hello, on Wednesday, 28 July 2010, Matthew Ehle wrote:
It appears that when you get towards the end of the useful search results, the search results start getting into the history and diff pages. I don't know if that's really something we want, but it should be easy to change the CSE settings to fix that.
Don't fix the search results, fix the indexing ;-)
I'd propose to create a robots.txt with
Disallow: /index.php
This should keep the page history, view source etc. out of search engines. Articles will still be listed because they don't have index.php in their URL.
You may also want to add things like
Disallow: /Special:Search
Disallow: /Special:Random
and all their translated counterparts - http://de.wikipedia.org/robots.txt has a nice list ;-)
Regards, Christian Boltz
-- The whole process taken together is called "branding": you take a glowing iron with the new form of expression and press it firmly onto the company - and the company reacts like a head of cattle. :-) [Ratti]
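The robots.txt proposal above can be sanity-checked before deployment. What follows is only a minimal sketch using Python's standard urllib.robotparser: the "User-agent: *" line is an assumption (the message lists only the Disallow directives), and the history URL is an illustrative example of an index.php-style address, not one taken from the thread.

```python
from urllib.robotparser import RobotFileParser

# Rules as proposed in the thread. The "User-agent: *" line is an assumption:
# the message only lists the Disallow directives.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php
Disallow: /Special:Search
Disallow: /Special:Random
"""


def build_parser(robots_text):
    """Return a parser loaded with the given robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_crawlable(parser, url, agent="*"):
    """True if a crawler identifying as `agent` may fetch `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # Illustrative URLs: one article page, one history view, one special page.
    for url in (
        "http://en.opensuse.org/LXC",
        "http://en.opensuse.org/index.php?title=LXC&action=history",
        "http://en.opensuse.org/Special:Search",
    ):
        status = "allowed" if is_crawlable(parser, url) else "blocked"
        print(status, url)
```

The check relies on prefix matching: history, diff and view-source pages all go through /index.php, while article pages use short URLs and stay crawlable.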
I will look at doing this. For now, the custom search engine will at least not display undesirable results, but this can also be beneficial for searches made from outside the wiki (i.e. www.google.com). -Matt
On 29.07.2010 11:45, Christian Boltz wrote:
Don't fix the search results, fix the indexing ;-)
I'd propose to create a robots.txt with
Disallow: /index.php
This should keep the page history, view source etc. out of search engines. Articles will still be listed because they don't have index.php in their URL.
You may also want to add things like
Disallow: /Special:Search
Disallow: /Special:Random
and all their translated counterparts - http://de.wikipedia.org/robots.txt has a nice list ;-)
Hi, I think excluding /index.php is a good way to go. I added a robots.txt to the wiki sources.
Matthew: To make this file available we need to change the Apache rewrite conditions. I have already changed that on staging.
Greetings
-- Thomas Schmidt (tschmidt [at] suse.de) SUSE Linux Products GmbH :: Research & Development :: Tools "Don't Panic", Douglas Adams (1952 - 11.05.2001)
Hello,
on Tuesday, 3 August 2010, Thomas Schmidt wrote:
On 29.07.2010 11:45, Christian Boltz wrote:
on Wednesday, 28 July 2010, Matthew Ehle wrote: [robots.txt]
Hi, I think excluding /index.php is a good way to go. I added a robots.txt to the wiki sources.
Matthew: To make this file available we need to change the apache rewrite conditions, I already changed that on staging.
I'm just cleaning up some of my old mails and wanted to check the robots.txt. Unfortunately I only get "Object not found" for http://en.opensuse.org/robots.txt :-(
Looks like another victim of the wiki update - can you please restore it?
Regards, Christian Boltz
-- what is Office? Is that software I need if I work in an office (e.g. patience game)? [Stephan Kulow in opensuse-factory]
On 30.10.2010 01:27, Christian Boltz wrote:
I'm just cleaning up some of my old mails and wanted to check the robots.txt.
Unfortunately I only get "Object not found" for http://en.opensuse.org/robots.txt :-(
Looks like another victim of the wiki update - can you please restore it?
I re-added a simple robots.txt file. Matthew, could you please deploy it?
Greetings
-- Thomas Schmidt (tschmidt [at] suse.de)
On 28.07.2010 00:32, Matthew Ehle wrote:
I have implemented Google Custom Search on enstage.opensuse.org, and I am personally very happy with the results.
It's not very fresh (Google updates once a month!), it does not support MediaWiki features (namespaces, page weighting etc.), and it searches the resulting web pages, not the wiki pages. Wikipedia has been through this already and they switched back. Maybe we can use Google beside the normal search?
Henne
-- Henne Vogelsang, openSUSE. Everybody has a plan, until they get hit. - Mike Tyson
It's not very fresh (google updates once a month!) it does not support mediawiki features (namespaces, page weighting etc) and it searches the resulting web pages not the wiki pages.
We could set up a sitemap and use on-demand indexing, which can get at least parts of the site indexed faster. As for the other features... is anyone really going to miss them, since the search results would be actually relevant and comprehensive? ; )
Wikipedia has been through this already and they switched back. Maybe we can use google beside the normal search?
This is a reasonable suggestion. After all, OSS is all about choice, right? Thomas, would you be able to work two search boxes into the Bento theme and still make it look good?
Well... then, you shouldn't have announced it on a public list :-(
Public or not, it is still the best way to get that information out to the people who have an interest and are able to look at it. It's better to have a few people try it out than none at all. However, I should have made it more clear that only the Novell employees can see the staging site.
I have updated the production site, so it now includes the Google extension. To try it out, just go to http://en.opensuse.org/Portal:GoogleSearch and try out your favorite search term.
As Henne mentioned, a major difference here is that it will not simply take you to a wiki page if you type in something obvious (e.g. Firefox). Depending on how you look at it, this is either a good or a bad thing. If you really are looking for the main Firefox page, the default search takes you right there, whereas the Google search only places it at the top of the search results. However, if you are looking for some other article dealing with Firefox, the Google search is vastly superior.
Also, I have submitted a site map to Google, and they are currently reindexing the wiki for the updated content. This should take all the dead links left over from the legacy wiki out of the Google index, which should give better results both in the Google extension and for users searching on google.com.
Last, but not least, I will also be trying out Lucene search on the staging site. I'm just waiting for an upgrade on the staging server so I can install Java 6. Once that is done, I'll start playing with it. The biggest trick will be to get it working with multiple databases, but it could be a very good option if I can get that figured out.
In the meantime, I would really like to see the Google search option worked into the theme somehow. We could replace the default search form, put another one beside it, create a link to the above-mentioned page, whatever. We just need to make it obvious, so that people can use it.
-Matt
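For readers unfamiliar with site maps: the file submitted to Google is an XML list of page URLs in the sitemaps.org format. The thread does not say how the wiki's site map was produced, so the following is only a rough sketch of what such a generator might look like. The function name build_sitemap and the hard-coded page titles are hypothetical; a real job would read the titles from the wiki database.

```python
import xml.etree.ElementTree as ET
from datetime import date
from urllib.parse import quote

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, titles, lastmod):
    """Return sitemap XML (as a string) listing one <url> entry per page title."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for title in titles:
        # MediaWiki page URLs use underscores in place of spaces.
        path = quote(title.replace(" ", "_"), safe="/:")
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = f"{base_url}/{path}"
        ET.SubElement(entry, "lastmod").text = lastmod.isoformat()
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical input: two page titles that appear in this thread.
    print(build_sitemap("http://en.opensuse.org", ["LXC", "SDB:LXC"], date.today()))
```

Regenerating the file on a schedule lets the search engine learn about new pages and notice that old URLs have disappeared.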
On Jul 29, 10 11:57:37 -0600, Matthew Ehle wrote:
I have updated the production site, so it now includes the Google extension. To try it out, just go to http://en.opensuse.org/Portal:GoogleSearch and try out your favorite search term.
As Henne mentioned, a major difference here is that it will not simply take you to a wiki page if you type in something obvious (e.g. Firefox). Depending on how you look at it, this is either a good or a bad thing.
From a usability point of view, an extra click is often regarded as a degradation of the user experience. It could be argued that the click is useless, as the default search engine can do the same without it.
In this case I'd say that the extra click is not useless, as it shows that our wiki has more than one page with this result. This is a good thing for the user experience.
If you really are looking for the main Firefox page, the default search takes you right there, where the Google search only places it at the top of the search results. However, if you are looking for some other article dealing with Firefox, the Google search is vastly superior.
The main Google search used to implement this direct jump function with its "I'm Feeling Lucky" button. Do we also have this in the site search?
Can we influence the results of the Google search? We may want to move results from the Portal and Main namespaces to the top, to honor the higher quality of the pages found there.
cheers, JW-
On 30/07/10 12:08, Juergen Weigert wrote:
On Jul 29, 10 11:57:37 -0600, Matthew Ehle wrote: [...] Can we influence the results of the google search? We may want to move results from Portal and Main namespaces to the top, to honor the higher quality of the pages found there.
Is it possible in Google Custom Search to differentiate even further between matches, especially
0. (not only) between namespaces, but also between matches
1. exactly on the name
2. in the title/name
3. in the text
4. in combinations of (0 and 1); (0 and 2); ...
For example: if someone gives "sax2" or "SaX2" to the search engine, can they be directed
- only to http://en.opensuse.org/Archive:SaX2
- to a list with [[Archive:SaX2]] first
instead of
- to a list of all the articles (probably outdated, at least for 11.3) that still have SaX2/sax2 anywhere in the body/source of the article...
- ...and only in three (rated most important?) namespaces, like the wiki search with the settings it now uses as default: http://en.opensuse.org/index.php?ns0=1&ns102=1&search=SaX2&title=Special%3ASearch&fulltext=Advanced+search&fulltext=Advanced+searchy
Can pages with redirects be treated differently, for example be excluded if the name does not exactly match the searched term/phrase (disregarding upper/lower case)?
See also:
http://lists.opensuse.org/archive/opensuse-wiki/2010-07/msg00221.html
http://en.opensuse.org/User:Jnweiger/Wiki_search#Suggestions
http://forums.opensuse.org/english/community/opensuse-wiki-discussions/44150...
Regards pistazienfresser
By the way:
A) I am not able to access: http://enstage.opensuse.org/Portal:GoogleSearch
B) The normal/external Google search on en.opensuse.org for "SaX2" http://www.google.com/search?hl=en&q=site%3Aen.opensuse.org+SaX2&aq=f&aqi=&aql=&oq=&gs_rfai= still gives me a list with the best match as the second entry, but mostly containing dead links:
* http://www.google.com/url?sa=t&source=web&cd=1&ved=0CBIQFjAA&url=http%3A%2F%2Fen.opensuse.org%2FYaST%2FModules%2FMouse_Model&ei=c7FSTP6sD-KJOPSPpZ4O&usg=AFQjCNHMiUTfCa-SIdK_0liuAOBdzVDTXQ a dead link as 1st match
* http://en.opensuse.org/Talk:Bugs/SaX2 a dead link as 3rd match
* http://en.opensuse.org/Bugs/SaX2?lang= a dead link as 5th match
* http://en.opensuse.org/Category:SDB:SaX2 a dead link as 6th match
* http://en.opensuse.org/Talk:SaX2_Bugs a dead link as 7th match
* http://en.opensuse.org/Patterns/Definition_Language/X11 a dead link as 8th match
C) On 30/07/10 12:08, Juergen Weigert wrote:
higher quality
I do not think that articles/pages in the namespaces "Main" and "Portal" should be, can be, or necessarily are of higher quality. In my opinion they should just be more appropriate for a 'normal user'/'consumer' of the openSUSE project (including the openSUSE distribution), so they should:
- be an entrance to the wiki
- be more basic than specialized
- include definitions/explanations that may not interest, or may even annoy, a programmer
- include links to more specialized topics (in the other namespaces and in other sub-projects of the openSUSE project)
-- openSUSE profile: https://users.opensuse.org/show/pistazienfresser
"pistazienfresser (see profile)" <pistazienfresser@gmx.de> 7/30/2010 5:36 AM >>>
On 30/07/10 12:08, Juergen Weigert wrote:
On Jul 29, 10 11:57:37 -0600, Matthew Ehle wrote: [...]
Can we influence the results of the google search? We may want to move results from Portal and Main namespaces to the top, to honor the higher quality of the pages found there.
The short answer to your questions is... unfortunately not. The problem with using Google search is that you have to live by the Google algorithms. However, IMHO, this is still not such a bad thing. If the Portal articles are really of higher quality, users will use them more, link to them more, and those namespaces will naturally bubble to the top.
Also, Henne seems to have come across a pretty good idea for how to handle the whole issue of Google not taking you to an article directly. I think the idea would be that if the MediaWiki search doesn't take you right there, the Google search will be prominently displayed as another option. This way, the user has a choice of search options.
By the way:
A) I am not able to access: http://enstage.opensuse.org/Portal:GoogleSearch
You have to be inside the Novell network to access the staging site at all. This is done for a couple of reasons, but I won't go into it here. In any case, that page was just nuked, and it can now be found at http://enstage.opensuse.org/MediaWiki:GoogleSearch, assuming you can get to the site.
B) the normal/external google search on en.opensuse.org for "SaX2" http://www.google.com/search?hl=en&q=site%3Aen.opensuse.org+SaX2&aq=f&aqi=&aql=&oq=&gs_rfai= still gives me a list...
*... with the best match as the second in the list but mostly containing dead links:
I submitted a site map to Google yesterday, and they are currently re-indexing the wiki. A lot of dead links are already out of the index, and this will continue to get better over time. In addition, I have created a cron job to rebuild the site map daily, so new articles won't take forever to show up.
C)
On 30/07/10 12:08, Juergen Weigert wrote: higher quality
I do not think that articles/pages in the namespaces "Main" and "Portal" should be, can be, or necessarily are of higher quality.
In my opinion they should just be more appropriate for a 'normal user'/'consumer' of the openSUSE project (including the openSUSE distribution), so they should:
- be an entrance to the wiki
- be more basic than specialized
- include definitions/explanations that may not interest, or may even annoy, a programmer
- include links to more specialized topics (in the other namespaces and in other sub-projects of the openSUSE project)
Agreed. Google got to where it is by catering to the consumer more than the publisher ; )
Hey, On 29.07.2010 19:57, Matthew Ehle wrote:
In the meantime, I would really like to see the Google search option worked into the theme somehow. We could replace the default search form, put another one beside it, create a link to the above-mentioned page, whatever. We just need to make it obvious, so that people can use it.
I have reworked the search page that is shown if you can't find something a bit and included the google search
http://en.opensuse.org/index.php?search=something&ns0=1&ns102=1&title=Special%3ASearch&fulltext=Search&fulltext=Search
But this is only for the meantime until we get Lucene. Lucene is what wikipedia uses and it imho has the right combination of the features of the MySQL search and Google.
Henne
-- Henne Vogelsang, openSUSE. Everybody has a plan, until they get hit. - Mike Tyson
Hi, On 30.07.2010 13:54, Henne Vogelsang wrote:
On 29.07.2010 19:57, Matthew Ehle wrote:
In the meantime, I would really like to see the Google search option worked into the theme somehow.
BTW can you please use another pagename for google search engine? Portal: does not really fit with our namespace policy. Can you please use Mediawiki: or Special:?
TIA
Henne
-- Henne Vogelsang, openSUSE. Everybody has a plan, until they get hit. - Mike Tyson
On 30/07/10 13:59, Henne Vogelsang wrote:
Hi,
On 30.07.2010 13:54, Henne Vogelsang wrote:
On 29.07.2010 19:57, Matthew Ehle wrote:
In the meantime, I would really like to see the Google search option worked into the theme somehow.
Yes. In place of, under, or beneath the other search engine.
BTW can you please use another pagename for google search engine? Portal: does not really fit with our namespace policy.[...]
http://en.opensuse.org/Portal:GoogleSearch or [[Portal:GoogleSearch]]
Why not?
[1] http://en.opensuse.org/Help:Namespace contains no description or definition of the namespace "Portal".
And [2] http://en.opensuse.org/Portal:Wiki/Concept#Navigation = http://en.opensuse.org/index.php?title=Portal:Wiki/Concept&oldid=19860 contains this definition:
"Navigation
Navigation happens through means common to any wiki, but there are also portals. Portals are entry points for a specific topic, similar to the main page. They provide an overview over a topic and guide readers to the content they seek which is either another portal or an individual article."
As a search (that really finds something)
- is (the main tool) for navigation,
- is the entry point of most importance,
- does provide an overview over any special topic and
- guides readers to the content,
it is the best example of a page fitting the only definition of the namespace "Portal" in the openSUSE wiki.
Another possible place for it is the namespace Main, as a (working) search in the openSUSE wiki is the most important thing "for people who are new to openSUSE and maybe to Linux in general" (source: see above [2]).
Greetings
pistazienfresser
-- - openSUSE profile: https://users.opensuse.org/show/pistazienfresser
I have reworked the search page that is shown if you can't find something a bit and included the google search
But this is only for the meantime until we get Lucene. Lucene is what wikipedia uses and it imho has the right combination of the features of the MySQL search and Google.
That is a good start. Is there a way that we can move that higher up or make it more prominent somehow? Also, can we do the same even if we get a list of results? Even when Mediawiki finds pages, they are often very limited or poorly ranked. I like where you are going though, as it combines the strengths of the default search with that of the Google Search. As for Lucene, just remember that patience is a virtue ; ) Don't worry, I plan on getting it installed and tested on stage as soon as the server gets an upgrade. There are a few technical hurdles to overcome, but I'm confident we can make it work.
BTW can you please use another pagename for google search engine? Portal: does not really fit with our namespace policy. Can you please use Mediawiki: or Special:?
My first thought was to use Special namespace, but that would have to be programmed in the wiki software. Portal seemed to make the most sense after that, but I didn't think to use the MediaWiki namespace. Therefore, it is now under both Portal:GoogleSearch and MediaWiki:GoogleSearch. Let me know when you have made the change in the search results page, and I'll nuke Portal:GoogleSearch. -Matt
On Friday 30 July 2010 12:10:01 Matthew Ehle wrote:
MediaWiki:GoogleSearch
The MediaWiki namespace is really not meant to put normal pages there.
-- Regards, Rajko
"Rajko M." <rmatov101@charter.net> 7/30/2010 12:32 PM >>> On Friday 30 July 2010 12:10:01 Matthew Ehle wrote:
MediaWiki:GoogleSearch
The MediaWiki namespace is really not meant to put normal pages there.
The point is that this is not a normal page. It makes sense to put it there, since it is a wiki function and will be automatically restricted to the Administrators group.
http://en.wiktionary.org/w/index.php?title=Special%3ASearch&redirs=1&search=linux&fulltext=Search&searchengineselect=mediawiki&ns0=1
In the en.wiktionary you get offered a choice between
- MediaWiki
- Google
- Wikiwix
- Bing
- Yahoo
for the search on en.wiktionary.org if your input to the default search box does not lead you directly to an exactly matching article.
Have a lot of fun
pistazienfresser
-- - openSUSE profile: https://users.opensuse.org/show/pistazienfresser
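Offering a choice of search engines, as discussed in this thread, mostly comes down to building a different URL for the same query. The following is only a minimal sketch of that idea, not how Wiktionary or the openSUSE wiki actually implement it. The URL shapes are copied from the links quoted in this thread; the function names and the `ENGINES` table are hypothetical and exist only for illustration.

```python
from urllib.parse import urlencode

# Host taken from the URLs quoted in this thread.
WIKI_HOST = "en.opensuse.org"


def mediawiki_search_url(query: str) -> str:
    """Full-text search on the wiki's own Special:Search page."""
    params = {"title": "Special:Search", "search": query, "fulltext": "Search"}
    return f"http://{WIKI_HOST}/index.php?{urlencode(params)}"


def google_site_search_url(query: str) -> str:
    """Google search restricted to the wiki via the site: operator."""
    params = {"hl": "en", "q": f"site:{WIKI_HOST} {query}"}
    return f"http://www.google.com/search?{urlencode(params)}"


# Hypothetical dispatch table; the engine names are illustrative only.
ENGINES = {
    "mediawiki": mediawiki_search_url,
    "google": google_site_search_url,
}


def search_url(engine: str, query: str) -> str:
    """Return the search URL for the chosen engine (KeyError if unknown)."""
    return ENGINES[engine](query)


if __name__ == "__main__":
    for name in ENGINES:
        print(name, search_url(name, "SaX2"))
```

A search page could then show one link per entry in `ENGINES` whenever the default search does not land directly on an article.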
Hey, On 07/31/2010 11:50 AM, pistazienfresser (see profile) wrote:
In the en.wiktionary you get offered to choose between
They use Lucene :)
Henne
-- Henne Vogelsang, openSUSE. Everybody has a plan, until they get hit. - Mike Tyson
On 02/08/10 12:11, Henne Vogelsang wrote:
Hey,
On 07/31/2010 11:50 AM, pistazienfresser (see profile) wrote:
In the en.wiktionary you get offered to choose between
They use Lucene :)
Oh sorry Henne, I was not aware of that, of what Lucene includes, or of how cool your idea to use Lucene really is. Thanks
pistazienfresser
-- - openSUSE profile: https://users.opensuse.org/show/pistazienfresser
Hey, On 08/02/2010 12:41 PM, pistazienfresser (see profile) wrote:
On 02/08/10 12:11, Henne Vogelsang wrote:
On 07/31/2010 11:50 AM, pistazienfresser (see profile) wrote:
In the en.wiktionary you get offered to choose between
They use Lucene :)
Oh sorry Henne, I was not aware of that, of what Lucene includes, or of how cool your idea to use Lucene really is.
It's not my idea; others had it before me, as Matthew already explained. I'm just pushing for it now. The coolest thing about it is that Wikipedia uses it, so it gets continuously improved in ways that matter for wikis, and it combines the strengths of the default MySQL search and a real search engine. Unlike Google or any other 3rd party tool that "just" parses web pages in the context of the whole WWW...
Henne
-- Henne Vogelsang, openSUSE. Everybody has a plan, until they get hit. - Mike Tyson
Hi! On Mon, 2010-08-02 at 12:51 +0200, Henne Vogelsang wrote:
Hey,
It's not my idea; others had it before me, as Matthew already explained. I'm just pushing for it now. The coolest thing about it is that Wikipedia uses it, so it gets continuously improved in ways that matter for wikis, and it combines the strengths of the default MySQL search and a real search engine. Unlike Google or any other 3rd party tool that "just" parses web pages in the context of the whole WWW...
It would be nice to explore the possibility of modifying the search box [with this Lucene as backend] so that it looks like Banshee's filter box, with various options such as "Documentation", "Community Contributions", etc. See for example here [http://imagebin.ca/view/PGG81G.html]
Btw, I would just like to mention that I like the new wiki instance a lot. Breaking up contents into namespaces has made it easier to browse the contents of the wiki without resorting to search every time. Some improvement in the search capabilities of the wiki, and more content filled in according to the present organisation, will make it just perfect.
Bye.
-- Atri
Hey, On 07/30/2010 07:10 PM, Matthew Ehle wrote:
I have reworked the search page that is shown if you can't find something a bit and included the google search
But this is only for the meantime until we get Lucene. Lucene is what wikipedia uses and it imho has the right combination of the features of the MySQL search and Google.
That is a good start. Is there a way that we can move that higher up or make it more prominent somehow?
I moved it around. Have a look again.
Also, can we do the same even if we get a list of results?
Not without patching the search special page and that is something I would like to avoid. Better invest that time into Lucene.
As for Lucene, just remember that patience is a virtue ; ) Don't worry, I plan on getting it installed and tested on stage as soon as the server gets an upgrade. There are a few technical hurdles to overcome, but I'm confident we can make it work.
Okay cool :)
BTW can you please use another pagename for google search engine? Portal: does not really fit with our namespace policy. Can you please use Mediawiki: or Special:?
My first thought was to use Special namespace, but that would have to be programmed in the wiki software. Portal seemed to make the most sense after that, but I didn't think to use the MediaWiki namespace. Therefore, it is now under both Portal:GoogleSearch and MediaWiki:GoogleSearch. Let me know when you have made the change in the search results page, and I'll nuke Portal:GoogleSearch.
Done.
Henne
-- Henne Vogelsang, openSUSE. Everybody has a plan, until they get hit. - Mike Tyson
participants (11)
- Atri
- Brian K. White
- Carlos E. R.
- Carlos E. R.
- Christian Boltz
- Henne Vogelsang
- Juergen Weigert
- Matthew Ehle
- pistazienfresser (see profile)
- Rajko M.
- Thomas Schmidt