Extension talk:Scribunto/Victor's API proposal

A few nice-to-have things: --Tgr (talk) 09:04, 3 June 2012 (UTC)
 * an iterator for going through the characters of a UTF-8 string (you could do that via len+sub, but that is uglier and probably much less efficient)
 * higher-level Unicode functions: normalizations, character classes, sort keys
 * encoding conversion (this is useful when creating external links to search and similar services that expect non-UTF-8 input)
 * basic data structures (one thing I was missing immediately was a set for efficient whitelist/blacklist lookup) and better support for the built-in table structure (find, index, map etc.)
 * I've added an iterator to the specification. I'm not sure about Unicode normalization: I think we should just normalize all Scribunto output to NFC, which is the MediaWiki convention (or it is probably already done somewhere else, in the parser or OutputPage). Shipping Unicode character data is possible, but I am not sure we want to bundle it with the extension (it's quite large). Data structures may be implemented as user libraries; we may ship them with Scribunto if many users find them useful. vvvt 12:40, 3 June 2012 (UTC)
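
For illustration only, here is a minimal sketch of two of the items above in plain Lua 5.1: a character iterator and a set for whitelist lookup. The names uchars and Set are hypothetical and are not taken from the specification. The iterator assumes its input is already valid UTF-8 and does no validation.

```lua
-- Hypothetical helper: iterate over the characters of a UTF-8 string.
-- Each match is one non-continuation byte followed by any number of
-- continuation bytes (0x80-0xBF). Assumes the input is valid UTF-8.
local function uchars(s)
  return string.gmatch(s, "[^\128-\191][\128-\191]*")
end

-- Hypothetical helper: build a set from a list, so that membership
-- tests are a single table lookup instead of a linear scan.
local function Set(list)
  local set = {}
  for _, value in ipairs(list) do
    set[value] = true
  end
  return set
end

-- Usage: collect the characters of "héllo" (é written as its two bytes).
local chars = {}
for c in uchars("h\195\169llo") do
  chars[#chars + 1] = c
end

-- Usage: whitelist lookup.
local allowed = Set({ "b", "i", "span" })
local spanAllowed = allowed["span"] == true
local scriptAllowed = allowed["script"] == true
```

The pattern-based iterator avoids the repeated len+sub calls mentioned above, since each character is found in a single left-to-right pass over the bytes.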

ustring OOP
You could actually make ustring work with OO fairly easily: since all strings share a metatable whose __index is set to the string table by default, anything you add to that table will show up as a method on all strings. So, something like this:

string.ufind = ustring.find

would enable this:

someUnicodeString:ufind(...)


 * That is a nice approach we should consider; I am wary of possible side effects though. vvvt 12:38, 3 June 2012 (UTC)
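
To make the trade-off concrete, here is a minimal sketch in plain Lua 5.1. The ustring table below is a hypothetical stand-in with a single len function, not the proposed library, and ulen is an illustrative method name. It also assumes the script can write to the real string table, which a sandboxed environment may not allow.

```lua
-- Hypothetical stand-in for the proposed ustring library.
local ustring = {}

-- Count code points by counting every byte that is not a UTF-8
-- continuation byte (0x80-0xBF). Assumes valid UTF-8 input.
function ustring.len(s)
  local _, count = string.gsub(s, "[^\128-\191]", "")
  return count
end

-- Attach it to the shared string table, as suggested above.
string.ulen = ustring.len

local word = "h\195\169llo"        -- "héllo", with é as two bytes
local viaMethod = word:ulen()      -- code points, via the new method
local byteLength = #word           -- bytes, via the built-in operator
local onLiteral = ("abc"):ulen()   -- the method exists on every string

-- The side effect: the string table is shared, so the new field is
-- visible to all code running in the same Lua state.
local sharedTable = getmetatable("").__index == string
```

Because the string table is shared, any other code in the same Lua state sees string.ulen as well, and a later assignment to the same name would silently replace it for everyone. That is presumably the kind of side effect to weigh before adopting this approach.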