[RESOLVED] Cannot input UTF-8 characters (Japanese) when editing wiki pages

Genesishana (talkcontribs)

I have a fresh installation of MediaWiki 1.24.2 and I have difficulties inputting Japanese characters (and probably UTF-8 characters in general). When I am editing a wiki page using the editor, I start typing some Japanese characters and then click 'Save Page'. But then the characters become "???" after the edited page loads. Attempting to edit the page again shows the same "???" kind of characters.

I initially suspected it had something to do with how my database was configured, since some of the character_set variables were set to 'latin1', but I have since modified my database's /etc/my.cnf.d/server.cnf file to use utf8 (as shown below).
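
For readers following along, a server.cnf change of this kind usually looks something like the fragment below. This is only a sketch under assumptions, not the poster's actual file: the section names and option values are the standard MySQL/MariaDB ones, and the choice of utf8_general_ci simply mirrors the collation shown in the variable listing further down.

```ini
# /etc/my.cnf.d/server.cnf (illustrative sketch, not the original file)

[mysqld]
# Server-wide defaults; these apply to databases and tables created later
character-set-server = utf8
collation-server     = utf8_general_ci

[client]
# Default charset for command-line clients that read this section
default-character-set = utf8
```

Settings in this file persist across restarts, but they only take effect after the database service is restarted, and they do not convert tables that already exist.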

The default charsets of httpd and PHP are set to UTF-8 as well.

Just now, I set up VisualEditor (and the Parsoid service), and it seems I can input Japanese characters just fine within VisualEditor; after saving, the wiki page looks fine. When I edit using WikiEditor, the wiki page has the same garbled text after saving. I've also noticed that with VisualEditor I can create a link to a new page whose title consists of Japanese characters, but when I proceed to actually create the new page, I get garbled text with WikiEditor and a bad title error with VisualEditor.

What could be the problem with my MediaWiki's configuration? Any input is greatly appreciated.

Update: I've also realized that when using the 'edit source' tab, I can correctly preview UTF-8 edits using the 'preview' tab and the 'changes' tab without a problem. The encoding problem seems to occur only after I save the page. In the VisualEditor case, I can successfully save the page and it reloads correctly. However, pages whose titles contain UTF-8 characters behave strangely. For example, I create a link to a new wiki page using UTF-8 characters, then click its red link to create its content. For some reason, UTF-8 content won't save but ASCII characters do. It also looks as if two versions of the page were made: one with the correct UTF-8 title and another with a broken title.

Solved: I did a complete re-installation of the LAMP stack on the server, taking extra care that utf-8 is the default charset everywhere during configuration. I also upgraded PHP to 5.6.7 and MariaDB to 10.0.17. Now MediaWiki behaves well with UTF-8 characters in both titles and content of pages.

Environment Specifications

  • MediaWiki 1.24.2
  • PHP 5.4.16 (apache2handler)
  • MariaDB 5.5.42-MariaDB-wsrep
  • CentOS Linux release 7.1.1503 (Core)


mysql> show variables like "collation%";
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_general_ci |
| collation_server     | utf8_general_ci |
+----------------------+-----------------+
mysql> show variables like "char%";
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
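
The variables above describe server and connection defaults only. Tables created before those defaults were changed keep the charset they were created with, so it can be worth checking the wiki's tables directly. The queries below are a minimal sketch of such a check. The schema name 'my_wiki' is a placeholder, not something taken from this thread.

```sql
-- 'my_wiki' is a hypothetical schema name; substitute your wiki's database.

-- Default collation of each table in the wiki database
SELECT TABLE_NAME, TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'my_wiki';

-- Charset and collation of each text column (binary columns report NULL)
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'my_wiki'
  AND CHARACTER_SET_NAME IS NOT NULL;
```

If any rows still report latin1, that would be consistent with the "???" symptom described above, although the thread itself does not confirm this as the cause.
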
2401:4900:1908:9404:1034:59E5:BB04:CBE5 (talkcontribs)

How can I change the values of these variables permanently?

Ciencia Al Poder (talkcontribs)