[opensuse-wiki] List of regularly accessed old wiki articles?
Hi,
since I stumble across non-migrated articles from time to time and I plan to migrate some of those soon, I wondered if there is a list of old articles that were often accessed _since_ the migration to the new wiki?
Is there also a list of frequently used search terms? Especially those that return no article?
My main goal is to migrate or write the articles that the majority needs, rather than those that I need or miss ;)
--
Bye, Stephan Barth
Novell Technical Services, Worldwide Support Services Linux
SUSE LINUX GmbH, GF: Felix Imendörffer, HRB 21284 (AG Nürnberg)
Maxfeldstr. 5, D-90409 Nuremberg
--
To unsubscribe, e-mail: opensuse-wiki+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-wiki+help@opensuse.org
Hey, On 12/17/10 5:24 PM, Stephan Barth wrote:
since I stumble across non-migrated articles from time to time and I plan to migrate some of those soon, I wondered if there is a list of old articles that were often accessed _since_ the migration to the new wiki?
Not that I know of, no. But I bet Matthew can send you the Apache logs. Matthew?
Henne
--
http://www.opensuse.org
On Monday, December 20, 2010 06:29:42 am Henne Vogelsang wrote:
Hey,
On 12/17/10 5:24 PM, Stephan Barth wrote:
since I stumble across non-migrated articles from time to time and I plan to migrate some of those soon, I wondered if there is a list of old articles that were often accessed _since_ the migration to the new wiki?
Not that I know of, no. But I bet Matthew can send you the Apache logs. Matthew?
Henne
Are those the same logs that I asked for some time ago, when I wanted to count page popularity and sort it in a few ways, such as how it changes over time? (Recently popular pages need more attention than historically popular ones.)
A single count of all visits is fine for Wikipedia, but for the openSUSE wiki, where a large part of the content is time-sensitive, it is largely useless.
--
Regards, Rajko
Hello, on Tuesday, 21 December 2010, Rajko M. wrote:
On Monday, December 20, 2010 06:29:42 am Henne Vogelsang wrote:
On 12/17/10 5:24 PM, Stephan Barth wrote:
since I stumble across non-migrated articles from time to time and I plan to migrate some of those soon, I wondered if there is a list of old articles that were often accessed _since_ the migration to the new wiki?
Are those the same logs that I asked for some time ago, when I wanted to count page popularity and sort it in a few ways, such as how it changes over time? (Recently popular pages need more attention than historically popular ones.)
A single count of all visits is fine for Wikipedia, but for the openSUSE wiki, where a large part of the content is time-sensitive, it is largely useless.
The access_log probably contains too much uninteresting stuff ;-) and too much private data (IPs, referrers, pages visited by the same person/IP, etc.)
IMHO the easier way would be to create a cronjob that writes the content of the "page" table (as a list of INSERTs or tab-separated text) to a file every day. Please include the date in the filename ("pagestats-20101221") so that historical data is not overwritten.
Please use the following query for this:
    SELECT *, date(now()) AS date FROM page;
(reason for including the date: it allows historical data to be kept in a (local) database)
Then we "only" need a tool that calculates the diff in page_counter for two given dates - that's not too difficult to write ;-) A self-JOIN might be enough, something like (untested!)
    SELECT d1.page_namespace, d1.page_title, # or d1.*,
           d2.page_counter - d1.page_counter AS counter_diff
    FROM page d1
    LEFT JOIN page d2 ON d1.page_id = d2.page_id AND d2.date = "2010-12-20"
    WHERE d1.date = "2010-12-01"
    ORDER BY counter_diff DESC
The most interesting columns are probably page_namespace, page_title and the diff of page_counter; however, the other columns (page_touched, page_is_redirect) might also be useful (for things like "often accessed old pages" or something like that).
If someone (Matthew?) can make the daily export/dump (of all old and new wikis, please! [1]) available, I can probably write the query interface (and host it if needed).
Regards,
Christian Boltz
[1] Those access statistics are useful for every wiki, not only for the migration from the old wiki.
--
I would like to scan some users who [send] their mail through a mail server (sendmail preferred, postfix also possible). A photocopier is enough for that. Pants down, sit the user on it, and press "Copy"! [> Ralf Thomas and Sandy Drobic in suse-linux]
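The snapshot diff described in the message above can be tried out locally. The following is only a minimal sketch in Python, assuming the daily dumps have been loaded into one table with a date column. The table name page_snapshot, the column name snapshot_date, the example page titles and all counter values are invented for illustration, and SQLite stands in for the MySQL database that MediaWiki uses.

```python
import sqlite3


def load_snapshots(conn, rows):
    """Create the hypothetical page_snapshot table and fill it with rows."""
    conn.execute(
        "CREATE TABLE page_snapshot ("
        " page_id INTEGER, page_namespace INTEGER, page_title TEXT,"
        " page_counter INTEGER, snapshot_date TEXT)"
    )
    conn.executemany("INSERT INTO page_snapshot VALUES (?, ?, ?, ?, ?)", rows)


def counter_diff(conn, start_date, end_date):
    """Return (namespace, title, diff) per page, largest increase first."""
    query = """
        SELECT d1.page_namespace, d1.page_title,
               d2.page_counter - d1.page_counter AS counter_diff
        FROM page_snapshot d1
        JOIN page_snapshot d2 ON d1.page_id = d2.page_id
        WHERE d1.snapshot_date = ? AND d2.snapshot_date = ?
        ORDER BY counter_diff DESC
    """
    return conn.execute(query, (start_date, end_date)).fetchall()


# Invented sample data: two pages, each captured on two dates.
conn = sqlite3.connect(":memory:")
load_snapshots(conn, [
    (1, 0, "Example_A", 1000, "2010-12-01"),
    (2, 0, "Example_B", 5000, "2010-12-01"),
    (1, 0, "Example_A", 1300, "2010-12-20"),
    (2, 0, "Example_B", 5100, "2010-12-20"),
])
result = counter_diff(conn, "2010-12-01", "2010-12-20")
for namespace, title, diff in result:
    print(namespace, title, diff)
```

This sketch uses an inner join, so pages that exist in only one of the two snapshots (created or deleted in between) are left out; a LEFT JOIN would keep them with an empty diff.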
On 21/12/10 19:21, Christian Boltz wrote:
Hello,
on Tuesday, 21 December 2010, Rajko M. wrote:
On Monday, December 20, 2010 06:29:42 am Henne Vogelsang wrote:
On 12/17/10 5:24 PM, Stephan Barth wrote:
since I stumble across non-migrated articles from time to time and I plan to migrate some of those soon, I wondered if there is a list of old articles that were often accessed _since_ the migration to the new wiki?
Are those the same logs that I asked for some time ago, when I wanted to count page popularity and sort it in a few ways, such as how it changes over time? (Recently popular pages need more attention than historically popular ones.)
A single count of all visits is fine for Wikipedia, but for the openSUSE wiki, where a large part of the content is time-sensitive, it is largely useless.
The access_log probably contains too much uninteresting stuff ;-) and too much private data (IPs, referrers, pages visited by the same person/IP, etc.)
IMHO the easier way would be to create a cronjob that writes the content of the "page" table (as a list of INSERTs or tab-separated text) to a file every day. Please include the date in the filename ("pagestats-20101221") so that historical data is not overwritten.
Please use the following query for this:
    SELECT *, date(now()) AS date FROM page;
(reason for including the date: it allows historical data to be kept in a (local) database)
Then we "only" need a tool that calculates the diff in page_counter for two given dates - that's not too difficult to write ;-) A self-JOIN might be enough, something like (untested!)
    SELECT d1.page_namespace, d1.page_title, # or d1.*,
           d2.page_counter - d1.page_counter AS counter_diff
    FROM page d1
    LEFT JOIN page d2 ON d1.page_id = d2.page_id AND d2.date = "2010-12-20"
    WHERE d1.date = "2010-12-01"
    ORDER BY counter_diff DESC
The most interesting columns are probably page_namespace, page_title and the diff of page_counter, however the other columns (page_touched, page_is_redirect) might also be useful (for things like "often accessed old pages" or something like that).
If someone (Matthew?) can make the daily export/dump (of all old and new wikis, please! [1]) available, I can probably write the query interface (and host it if needed).
Regards,
Christian Boltz
[1] Those access statistics are useful for every wiki, not only for the migration from the old wiki.
There is a small note on every page of the openSUSE wikis (slightly larger on the original wiki) with the number of accesses, e.g. http://old-en.opensuse.org/KNetworkManager "This page has been accessed 299,292 times."
Maybe there is a history for the corresponding data and the difference between a pair of data from different times could be computed?
Just an idea,
Regards
pistazienfresser
--
- openSUSE profile: https://users.opensuse.org/show/pistazienfresser
Hello, on Thursday, 23 December 2010, pistazienfresser (see profile) wrote:
On 21/12/10 19:21, Christian Boltz wrote:
SELECT *, date(now()) AS date FROM page;
Then we "only" need a tool that calculates the diff in page_counter for two given dates - that's not too difficult to write ;-) A self-JOIN might be enough, something like (untested!)
There is a small note on every page of the openSUSE wikis (slightly larger on the original wiki) with the number of accesses "This page has been accessed 299,292 times."
Yes, that's exactly what is stored in the page_counter column.
Maybe there is a history for the corresponding data and the difference between a pair of data from different times could be computed?
AFAIK MediaWiki doesn't store historic data for the page_counter. That's why I asked for daily dumps of the page table ;-)
Regards,
Christian Boltz
--
Not that I'm free of bouts of paranoia ;), but if that happens to you, play the lottery right away; with that kind of luck you're sure to hit six numbers plus the bonus number four weeks in a row. [Maik Holtkamp in suse-linux]
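A daily dump of this kind could look roughly like the following Python sketch, which writes the page table as tab-separated text to a file named after the date, as suggested earlier in the thread. This is an illustration under assumptions, not the actual setup: the function name dump_page_table is hypothetical, only a few columns are exported, the sample row is invented, and an in-memory SQLite table stands in for the wiki's MySQL database.

```python
import csv
import datetime
import os
import sqlite3
import tempfile


def dump_page_table(conn, out_dir, today):
    """Write the page table to out_dir/pagestats-YYYYMMDD as tab-separated text."""
    path = os.path.join(out_dir, "pagestats-" + today.strftime("%Y%m%d"))
    cursor = conn.execute(
        "SELECT page_id, page_namespace, page_title, page_counter FROM page"
    )
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        # Header row: the selected column names plus the snapshot date column.
        writer.writerow([col[0] for col in cursor.description] + ["date"])
        for row in cursor:
            writer.writerow(list(row) + [today.isoformat()])
    return path


# Stand-in database with one invented row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page (page_id INTEGER, page_namespace INTEGER,"
    " page_title TEXT, page_counter INTEGER)"
)
conn.execute("INSERT INTO page VALUES (1, 0, 'Example_A', 1300)")

out_dir = tempfile.mkdtemp()
dump_path = dump_page_table(conn, out_dir, datetime.date(2010, 12, 21))
with open(dump_path, encoding="utf-8") as handle:
    dump_lines = handle.read().splitlines()
print(os.path.basename(dump_path))
```

Run once a day from cron with the current date, this would leave one file per day, so earlier counters are never overwritten and any two days can be compared later.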
participants (5)
- Christian Boltz
- Henne Vogelsang
- pistazienfresser (see profile)
- Rajko M.
- Stephan Barth