Thread:Project:Support desk/Duplicate article name in Category page/reply (6)

It's the difference between a combining character sequence ('BENGALI LETTER YA' (U+09AF) followed by 'BENGALI SIGN NUKTA' (U+09BC)) and a precomposed character (just 'BENGALI LETTER YYA' (U+09DF)).

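MediaWiki expects all text to be in Unicode Normalization Form C (NFC). U+09DF is on Unicode's composition exclusion list, so under NFC the proper encoding is the sequence U+09AF U+09BC rather than the single precomposed character. Everything is supposed to be converted to this form on save. Some import tools may skip that step and import things incorrectly.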
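If you want to check which form a given title is in, here is a minimal sketch using Python's standard unicodedata module. It is not MediaWiki's own normalization code, and the helper names to_nfc and describe are just made up for illustration. It shows that the two spellings compare as different strings but become identical once both are put through NFC.

```python
import unicodedata

# The two spellings from the duplicate titles.
decomposed = "\u09AF\u09BC"   # BENGALI LETTER YA + BENGALI SIGN NUKTA
precomposed = "\u09DF"        # BENGALI LETTER YYA


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def describe(text: str) -> list[str]:
    """List each code point with its Unicode name (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# As raw strings the two spellings differ, which is why two
# "identical looking" titles can sit side by side in a category.
print(decomposed == precomposed)

# After NFC normalization both spellings are the same string.
print(to_nfc(decomposed) == to_nfc(precomposed))

# Show which code points NFC actually produces for the precomposed input.
print(describe(to_nfc(precomposed)))
```

Running describe() on a title copied from the category page should tell you which of the two spellings that page was stored with.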

You can fix this by running the maintenance script cleanupTitles.php. It's also possible to clean up specific titles via the API (e.g. commons:MediaWiki:Invisible_characters_unveiled.js), but I would recommend the maintenance script.