Hello, Am Sonntag, 22. März 2020, 21:31:54 CET schrieb Lars Vogdt:
* Some wiki databases have 3 different collations and 2 different engines defined now. Note: interestingly not all of them. Example: [!!] 3 different collations for database wiki_it [!!] 2 different engines for database wiki_it
InnoDB should work for all wiki tables.
[!!] There are 2 storage engines. Be careful. Affected wiki databases: wiki_es, wiki_it, wiki_nl, wiki_pt, wiki_old_de, wiki_zh, wiki_en and: webforums
The forums have some MEMORY tables, for example for user sessions. (I'm not sure if this is a good idea when using a galera cluster - IIRC I've read somewhere that MEMORY tables don't get synced.)
* Some (äh: 8944!) tables also have wrong types set for some fields.
* Also very important: a lot of tables don't have primary keys. This slows down the synchronization between the nodes. I will see if there are some guidelines for the mediawiki databases to add indexes to these tables -
I just checked the tables you told me on IRC. Some of them have multi- column unique keys - but that's not really helpful as primary key. Unfortunately adding a single column primary key isn't possible for some (most?) tables IMHO. The good thing with MediaWiki is that there's quite some documentation, for example https://www.mediawiki.org/wiki/Manual:Change_tag_table and for this specific table, it seems newer Mediawiki versions have a primary key added. Yeah, I know that I should finally do the upgrade :-/ But let's look at each of the tables: CREATE TABLE `change_tag` ( `ct_rc_id` int(11) DEFAULT NULL, `ct_log_id` int(11) DEFAULT NULL, `ct_rev_id` int(11) DEFAULT NULL, `ct_tag` varbinary(255) NOT NULL, `ct_params` blob DEFAULT NULL, UNIQUE KEY `change_tag_rc_tag` (`ct_rc_id`,`ct_tag`), UNIQUE KEY `change_tag_log_tag` (`ct_log_id`,`ct_tag`), UNIQUE KEY `change_tag_rev_tag` (`ct_rev_id`,`ct_tag`), KEY `change_tag_tag_id` (`ct_tag`,`ct_rc_id`,`ct_rev_id`,`ct_log_id`) ) ENGINE=InnoDB DEFAULT CHARSET=binary ROW_FORMAT=DYNAMIC Empty table - based on the field names, I'd guess that it could track n:m relations. https://www.mediawiki.org/wiki/Manual:Change_tag_table shows that newer versions have an additional column as primary key. CREATE TABLE `hit_counter_extension` ( `hc_id` int(10) unsigned NOT NULL ) ENGINE=MEMORY DEFAULT CHARSET=latin1 MAX_ROWS=25000 ROW_FORMAT=DYNAMIC Empty table (from the extension that feeds the "This page has been accessed $number times." footer). CREATE TABLE `l10n_cache` ( `lc_lang` varbinary(32) NOT NULL, `lc_key` varchar(255) NOT NULL, `lc_value` mediumblob NOT NULL, KEY `lc_lang_key` (`lc_lang`,`lc_key`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC lc_lang_key could possibly be UNIQUE instead of KEY, but there's no single column that would survive UNIQUE. https://www.mediawiki.org/wiki/Manual:L10n_cache_table looks like there's still no primary key. CREATE TABLE `oldimage` ( `oi_name` varbinary(255) NOT NULL DEFAULT '', `oi_archive_name` varbinary(255) NOT NULL DEFAULT '', `oi_size` int(10) unsigned NOT NULL DEFAULT 0, `oi_width` int(11) NOT NULL DEFAULT 0, `oi_height` int(11) NOT NULL DEFAULT 0, `oi_bits` int(11) NOT NULL DEFAULT 0, `oi_description` varbinary(767) NOT NULL, `oi_user` int(10) unsigned NOT NULL DEFAULT 0, `oi_user_text` varbinary(255) NOT NULL, `oi_timestamp` binary(14) NOT NULL DEFAULT '\0\0\0\0\0\0\0\0\0\0\0\0\0\0', `oi_metadata` mediumblob NOT NULL, `oi_media_type` enum('UNKNOWN','BITMAP','DRAWING','AUDIO','VIDEO','MULTIMEDIA','OFFICE','TEXT','EXECUTABLE','ARCHIVE') DEFAULT NULL, `oi_major_mime` enum('unknown','application','audio','image','text','video','message','model','multipart','chemical') DEFAULT NULL, `oi_minor_mime` varbinary(100) NOT NULL DEFAULT 'unknown', `oi_deleted` tinyint(3) unsigned NOT NULL DEFAULT 0, `oi_sha1` varbinary(32) NOT NULL DEFAULT '', KEY `oi_usertext_timestamp` (`oi_user_text`,`oi_timestamp`), KEY `oi_name_timestamp` (`oi_name`,`oi_timestamp`), KEY `oi_name_archive_name` (`oi_name`,`oi_archive_name`(14)), KEY `oi_sha1` (`oi_sha1`) ) ENGINE=InnoDB DEFAULT CHARSET=binary ROW_FORMAT=DYNAMIC More or less the same here - and I'd even say that introducing UNIQUE in this table could create problems in rare situations (for example two users deleting or replacing the same picture in the same second). If you ignore that, oi_archive_name might in theory be a candidate for UNIQUE, but I don't recommend it (besides that - a varbinary as primary key might not help much with the performance) Sidenote: this table changes rarely (only when a picture gets replaced or deleted). https://www.mediawiki.org/wiki/Manual:Oldimage_table - no sign of an added primary key. CREATE TABLE `querycache` ( `qc_type` varbinary(32) NOT NULL, `qc_value` int(10) unsigned NOT NULL DEFAULT 0, `qc_namespace` int(11) NOT NULL DEFAULT 0, `qc_title` varbinary(255) NOT NULL DEFAULT '', KEY `qc_type` (`qc_type`,`qc_value`) ) ENGINE=InnoDB DEFAULT CHARSET=binary ROW_FORMAT=DYNAMIC Empty table. https://www.mediawiki.org/wiki/Manual:Querycache_table - also no primary key added CREATE TABLE `user_properties` ( `up_user` int(11) NOT NULL, `up_property` varbinary(255) DEFAULT NULL, `up_value` varchar(152) NOT NULL, UNIQUE KEY `user_properties_user_property` (`up_user`,`up_property`), KEY `user_properties_property` (`up_property`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC Another case where you can't add a single-column UNIQUE. This table contains user settings, which probably doesn't get too many writes. https://www.mediawiki.org/wiki/Manual:User_properties_table still shows the same table layout. To sum it up - I'm sorry that I can't give you a more positive answer.
* On the opposite side, there are ~20 unused indexes. I will remove them next week.
Note that they might come back when running upgrade scripts ;-) Luckily superfluous indexes are not a serious performance problem (admittedly, I'm mostly familiar with standalone servers, not sure if they cause slowdown in a cluster) Missing indexes are more noticable. I remember a case where adding an index speeded up a query from some minutes (2 GB full table scan!) to 0.001 seconds (even answering the query from the new index) ;-) Regards, Christian Boltz -- Unterdessen hat sich die Filmindustrie was ganz, ganz tolles ausgedacht: Ultraviolet (Man hätte einen beknackteren Namen finden können, aber das wäre Arbeit gewesen) ist DRM-as-a-service, 100% Buzzword Compliant und in der Cloud. [http://blog.koehntopp.de/archives/3068-Diesmal-also-eBooks-und-Filme.html] -- To unsubscribe, e-mail: heroes+unsubscribe@opensuse.org To contact the owner, e-mail: heroes+owner@opensuse.org