Extension talk:MultiReplace

Stripping accents and diacritics from a text
I was trying to do this with #replace but quickly found out that it has a limited nesting level, so I looked for alternatives and found this great extension. Here's the simple function to strip all diacritics from the Latin alphabet, perhaps it may be useful for someone else:
 *  

I inserted it in a template called StripAccents and it's working fine. Very few letters with diacritics were left out. Don't forget to include the upper case equivalents if necessary. —Capmo 03:24, 13 May 2009 (UTC)

Just to make it clearer: it's not impossible to do the same using the #replace function, nested like this:
 *  ),

but as soon as the maximum level of nested #replaces is reached, you'll have to continue the replaces using a second (a third, etc.) template. Then, to achieve the same result as the StripAccents template of the above example, you'd have to write something like  . —Capmo 05:51, 13 May 2009 (UTC)
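For readers who want to see the idea outside of wikitext, here is a minimal Python sketch of the same approach: one flat table of replacement pairs applied in a single call, with no nesting. It is not the StripAccents template itself; the names `GROUPS`, `ACCENT_TABLE` and `strip_accents` are hypothetical, and the table covers only a small sample of letters.

```python
# Hypothetical, deliberately incomplete table: plain text -> accented letters.
# Upper case equivalents would have to be added in the same way.
GROUPS = {
    "a": "áàâãäå",
    "c": "çčć",
    "e": "éèêëě",
    "i": "íìîï",
    "o": "óòôõö",
    "r": "ř",
    "u": "úùûüů",
    "ss": "ß",
}

# Flatten the groups into one lookup table keyed by code point.
ACCENT_TABLE = {ord(ch): plain for plain, chars in GROUPS.items() for ch in chars}


def strip_accents(text):
    """Replace every character listed in the table; leave all others alone."""
    return text.translate(ACCENT_TABLE)


print(strip_accents("Antonín Dvořák"))
print(strip_accents("Gegrüßet"))
```

Because the whole table is applied in one pass, adding more letters makes the table longer but never deeper, which is the property that nested #replace calls lack.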


 * Thanks! This might be very useful in many situations. But there's an easier way to do this with a regular expression:
 *  


 * &mdash;Matěj Grabovský 12:16, 13 May 2009 (UTC)


 * Great, Matěj! I tried doing this with RegExp but couldn't find the exact syntax, thanks for that! Do you have any idea which of the solutions is less memory- or processor-intensive? Capmo 18:40, 13 May 2009 (UTC)


 * Hi Matěj, me again. Sorry to report that something didn't work well when I tried to use your syntax: "Antonín Dvořák" became "Antonain Dvooeaak" and "Gegrüßet" displayed as "Gegrauaget". I had to revert to my previous syntax. It seems the RegExp is getting confused by all those Unicode characters... I read somewhere that RegExp requires Unicode parameters in hex format, but then it wouldn't be worth using RegExp at all. Any idea how to fix it in a simple way? Capmo 20:45, 13 May 2009 (UTC)


 * Crap, I'll take a look into it. &mdash;Matěj Grabovský 05:33, 14 May 2009 (UTC)


 * Hey Matěj, solution found! We need to use the option /u "which turns on the Unicode matching mode, instead of the default 8-bit matching mode". I already updated your example above with this option, ok! By the way, only now did I notice that you're the developer of this extension, thanks a lot for it! :) —Capmo 18:46, 16 May 2009 (UTC)


 * Hell yeah! That's it, thank you very much for finding it. &mdash;Matěj Grabovský 05:38, 19 May 2009 (UTC)
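The effect of 8-bit versus Unicode matching can be reproduced outside PHP. The sketch below uses Python's `re` module rather than PCRE, so it is only an analogy: a bytes pattern stands in for the default 8-bit mode and a string pattern stands in for the /u mode. The `RULES` list and both function names are hypothetical and much shorter than the real expression.

```python
import re

# Hypothetical sample rules: a character class and its plain replacement.
RULES = [("[áàâä]", "a"), ("[éèêë]", "e"), ("[íìîï]", "i")]


def strip_8bit(text):
    """Match on raw UTF-8 bytes: each class member is split into its bytes."""
    data = text.encode("utf-8")
    for pattern, plain in RULES:
        data = re.sub(pattern.encode("utf-8"), plain.encode("ascii"), data)
    return data.decode("utf-8", errors="replace")


def strip_unicode(text):
    """Match on whole characters, as a Unicode-aware engine does."""
    for pattern, plain in RULES:
        text = re.sub(pattern, plain, text)
    return text


print(strip_8bit("Antonín"))     # Antonain
print(strip_unicode("Antonín"))  # Antonin
```

In the byte version, "í" is the two bytes C3 AD. The first class contains the byte C3 (the lead byte of "á"), so C3 becomes "a", and the leftover AD is later matched by the "i" class. That yields the same "Antonain" reported above.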

Needless cache for replacing
Have you ever tried something like:  The result is aaaaaaa and not, as expected, bababab. I think this behavior is nonsensical: if someone wanted aaaaaaa as the result, they wouldn't specify a=b in the first place, because a is already a. I would consider this behavior a bug. --Danwe 12:49, 13 May 2009 (UTC)


 * Hum, that's interesting! Based on your example I see that the extension scans the whole text for the first replace argument, then it scans again all the text for the second argument, and so on. I would expect the opposite too: that the text would be scanned just once, and all replacements made during this process.
 * As an alternative, you can use an intermediary variable. For example,  produces the result you want. —Capmo 18:53, 13 May 2009 (UTC)


 * Well, I'll take a look into this, too. &mdash;Matěj Grabovský 05:33, 14 May 2009 (UTC)


 * The idea with the variable isn't bad but the risk is that the variable appears somewhere else in the string and then you have a problem. Or you have to use a very complex variable string which makes the whole function call longer and confusing. --Danwe 12:27, 14 May 2009 (UTC)
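The three behaviors discussed in this thread can be compared in a short Python sketch. It assumes the input was "abababa" with the pairs a=b and b=a (the original call is not shown here), and it is not the extension's actual code; the function names and the "#" placeholder are hypothetical.

```python
import re


def replace_sequential(text, pairs):
    """Apply each pair to the whole text in turn, as observed above."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text


def replace_single_pass(text, pairs):
    """Scan the text once; replaced text is never examined again."""
    mapping = dict(pairs)
    # Longest keys first, so a longer key wins over one of its prefixes.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


SWAP = [("a", "b"), ("b", "a")]
# Workaround with an intermediary placeholder ("#" is an arbitrary choice).
SWAP_VIA_PLACEHOLDER = [("a", "#"), ("b", "a"), ("#", "b")]

print(replace_sequential("abababa", SWAP))                  # aaaaaaa
print(replace_single_pass("abababa", SWAP))                 # bababab
print(replace_sequential("abababa", SWAP_VIA_PLACEHOLDER))  # bababab
print(replace_sequential("a#b", SWAP_VIA_PLACEHOLDER))      # bba
print(replace_single_pass("a#b", SWAP))                     # b#a
```

The fourth call shows the risk Danwe describes: once the placeholder already occurs in the input, the workaround corrupts it, while a single-pass replacement needs no placeholder at all.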

Error Message
I've noticed that when I run the PHP program  from the UNIX prompt, there is an incessant error message: PHP Warning: preg_match: Compilation failed: reference to non-existent subpattern at offset 33 in /**/**/**/extensions/MultiReplace.php on line 86 How can this be fixed? I run MediaWiki 1.16alpha (r50326), PHP 5.2.1 (cgi-fcgi) and MySQL 5.0.22. --Aquatiki 10:59, 22 May 2009 (UTC)