Topic on Project:Support desk

Database engine and collation

10
Jonathan3 (talkcontribs)

On my installation:

What it is What I think it should be
DB engine Mixture of MyISAM, InnoDB All InnoDB
DB collation Mixture of latin1_swedish_ci, utf_general_ci All binary
Column collation Mostly blank, a few are latin1_bin Blank
There is no consistent correlation between the Engine and the Collation.

My questions are:

  1. What changes, if any, should I make to my database(s)?
  2. ...and how? :-) Is it just a matter of backing up then changing the relevant settings (which would be simple using HeidiSQL)?

The reason I'm asking now is:

  • My Cargo tables are all InnoDB and latin1_swedish_ci so I get an "Illegal mix of collations" error if I use some foreign letters. For the time being I am avoiding those characters.
  • The advice here - Topic:Uurup80n0yzgmw88 - included that the database should be "either utf8 or binary". But phab:T194125 seems to suggest that binary is or will be required, so I'm asking now in the hope of getting the latest advice.

Thank you in advance.

Jonathan3 (talkcontribs)
Bawolff (talkcontribs)

innodb is a lot better than myisam. Myisam will work fine with mediawiki, but innodb is a lot more reliable.

Generally mediawiki tries not to care about db collation. Binary is the one that causes the least fuss so is a good reccomendation.

Collation, charset and engine can be changed using the ALTER command https://mariadb.com/kb/en/library/alter-table/

Jonathan3 (talkcontribs)

I imagine HeidiSQL uses ALTER when the fields are changed. Cheers. As I don’t have any funny characters at the minute, do you expect it’ll be straightforward to go ahead and make these changes? Can you think of any possible side effects, or other steps I would need to take?

Bawolff (talkcontribs)

alter statements can take a while to run, your wiki might not work well thats happening. If there are misencoded non-ascii characters it can sometimes cause problems (but as you say you have no funny characters). When switching to a multibyte charset sometimes you can have issues with keylength, but switching to binary shouldnt be an issue.

So basically no. However as always its a good idea to make a backuo before doing anything complicated to db just in case.

Jonathan3 (talkcontribs)

That's great. When I first installed MediaWiki in 2006 it wouldn't install because of a key length error, and I think I just shortened the length of whatever the key field was by two-thirds or whatever it was... in hindsight that probably wasn't the right answer :-)

Jonathan3 (talkcontribs)

I'll add (partly) for my own reference that tables.sql says: "Only the 'searchindex' table requires MyISAM due to the requirement for fulltext index support, which is missing from InnoDB."

Jonathan3 (talkcontribs)

Two last questions, I hope: (@Bawolff)

  1. When I alter the table collation, will that automatically alter the columns which have their own collation setting?
  2. I don't think character set is defined anywhere, so I imagine some default is used, and therefore I won't need to change anything here - sound right?
Bawolff (talkcontribs)

I'm not sure, but i imagine that if a column has an explicit character set, it won't be changed just by changing the table, but if its just using the default then it would, but i don't actually know.

If character set is not defined, it would usually use the server default. I think that's usually latin1. MediaWiki will normally override that when creating tables via $wgDBTableOptions depending on what settings were set in the installer

Jonathan3 (talkcontribs)

I’ll mess around on a duplicate of the database and report back! Thank you.

Reply to "Database engine and collation"