Thread:Project:Support desk/How to recreate the search index/reply (8)

If I remember correctly, then changing the collation with COLLATE... does not only change the collation name, which MySQL uses to store stuff in the column. But changing the collation also makes MySQL convert content, which already is present in the column, from the old to the new collation. So basically: If content is displayed in one way before, then it should display the same way after changing the collation.

Apart from doing a COLLATE query to manipulate columns, you can also use a CONVERT query. The difference is (if I remember correctly), that CONVERT changes the actual content without changing the charset name of the column.

On your PROD system, doing a CONVERT is not what you want. There the content is actually displaying correctly. Note that MediaWiki expects the content to be UTF-8-encoded. Since MediaWiki displays the content correctly right now, it obviously is of the right charset already. Changing the collation from latin1_swedish_ci to utf8_general_ci should correct the wrong collation, while the content should still display correctly. Make sure you have a working backup before you try anything like that!

For your DEV system, ® is displayed as Â®. According to http://www.i18nqa.com/debug/utf8-debug.html the problem is being caused by UTF-8 bytes being interpreted as latin1. (Other letters should be broken the same way.) So I believe that for the DEV system this can be "fixed" by converting the actual content of the columns.