Re: [opensuse-web] Re: [opensuse-wiki] Wiki Upgrade Problems

On Friday, 2 December 2011, Matthew Ehle wrote:
Hi Matthew, please don't roll back the content of the English wiki,
as it doesn't have UTF-8 characters in the titles, and there have
been quite a few changes this week. We are starting the openSUSE
board elections today, and use the wiki as a platform for the
candidates, for example.

I have verified that it doesn't contain UTF8 in the titles and is safe
to continue. It will NOT be rolled back, and I have removed the lock
on it.

An interesting thing I noticed is that the English wiki has a
different collation on the page table than the other wikis. It is a
new wiki, unlike the others, which have been upgraded from much
earlier versions. That may be a clue as to the root cause and how we
can fix it.

Which collation is used a) for the English (and new German) wiki and
b) for all other wikis? Which collation was used before doing the
upgrade on the now broken wikis?

BTW: The MySQL default charset might also be involved - I seem to
remember that old MediaWiki versions just used whatever was the default.
At least I have a wiki with similar problems where the column uses the
default MySQL charset - but fortunately it only affects
Special:Listfiles (
if you are interested). In my case, the database contains utf-8, but the
column is marked as iso-8859-15.

I had a short look at the ru wiki - I don't understand anything there
;-) but the page titles look like double-encoded utf-8 to me. Write some
of them to a text file and try recode utf-8..$previous_charset $file

<scary idea>
If I understood you right, the problem only affects the page _titles_.
It looks like the page title is stored in the "page" table - and not in
too many other tables (I found it in some logging and cache tables,
which aren't too relevant IMHO).

Can you try to just roll back the page titles in the page table?

Run the following query on the _old_ database to get a list of the
correct page titles as UPDATE statements:

-- QUOTE() escapes quotes and backslashes inside the titles
SELECT CONCAT('UPDATE page SET page_title=', QUOTE(page_title),
              ' WHERE page_id=', page_id, ';') FROM page;

Check that the result is valid utf-8 (or use recode to fix it), make
sure your MySQL connection uses utf-8 and then apply the resulting
UPDATE queries to the new database.

WARNING: this is completely untested and wrapped in a "<scary idea>" tag
for a reason. It might work, but I can't promise anything...
</scary idea>
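
For the "check that the result is valid utf-8" step in the scary idea
above, a small Python sketch like the following could be used. It is as
untested against a real dump as the idea itself; the function name and the
sample data are invented for illustration.

```python
from typing import List


def invalid_utf8_lines(data: bytes) -> List[int]:
    """Return the 1-based numbers of lines that are not valid UTF-8.

    Splitting on the newline byte is safe because 0x0A never occurs
    inside a multi-byte UTF-8 sequence.
    """
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Hypothetical dump: line 1 is proper UTF-8, line 2 holds a stray latin-1 byte.
sample = (
    "UPDATE page SET page_title='Заглавная' WHERE page_id=1;".encode("utf-8")
    + b"\n"
    + b"UPDATE page SET page_title='M\xe4rchen' WHERE page_id=2;\n"
)
bad_lines = invalid_utf8_lines(sample)
```

Reading the file in binary mode (open(path, "rb").read()) and passing the
bytes to the function avoids any implicit charset conversion by Python.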

The other wiki that is new, of course, is the German wiki. It has the
same collation on the page table as the English wiki, and it is
different than the old German wiki and all the other wikis.

Guess which UTF8 wiki isn't broken?

Hmmm... ;-)

For the other wikis too, I hope a patch comes up so we can fix
the page titles without losing the new edits. I think there have
also been some changes in the German wiki at least.

It looks like the German wiki is actually fine, for the reasons
mentioned above. I'll remove the lock on it soon. I'm working hard
on saving the others. I think the Russian wiki may be a lost cause,
but I haven't lost hope yet.

See above - and I don't see a reason why the ru wiki should be "more
lost" than other language wikis ;-)


Christian Boltz
Believe me, the scrap rate among ATA/cheap SATA drives is enormous;
most people just don't notice it. ;)
PS. We trade in this kind of thing, among others, and the return rate
is (very) high.
So you're a scrap dealer? ;-)
[> Mirko Richter and Thomas Hertweck in suse-linux]

