Extension talk:ParserFunctions/String functions/Archive

Note: There are plans for installing this extension on Wikimedia wikis, see bug 6455.

sub + len
Sounds good. How about adding a "pos" (first position of substring in string) to the set? Example, splitting a string at the first colon / space / slash can be interesting. -- Omniplex (w:t) 04:43, 16 May 2006 (UTC)


 * Unfortunately, all such functions have complexity of O(mn), where m and n are the sizes of the substring and string, respectively. I could impose a hard limit on the substring size in order to reduce the complexity, but such a limitation seems arbitrary and awkward.  Alternatively, I could limit searches to a single character, but this could severely hamper the function's usefulness in non-ASCII environments.  In short, I'm at a loss as to how to proceed.  Suggestions are welcome. --Algorithm 02:18, 17 May 2006 (UTC)


 * It's some decades ago when I had to know what O(mn) stands for, but if m is the length of the needle and n the length of the haystack you'd start at most n-m+1 comparisons. Actually comparing m bytes (ignoring potential UTF-8 optimizations) is the worst case, but unlikely, often you'd get "no match" much earlier. Reduced to ASCII and m > 2 the chance that you need more than two bytes are already near 1/(128*128), let's say below 1/10000. Overall I'd guess that 2*(n-m+1) is very pessimistic, about 2n. Is O(nm) possible, in theory? I see where you might need m at a certain position, but then the next position can't take m again, or can it? -- Omniplex (w:t) 02:43, 17 May 2006 (UTC)


 * The cases in which it matters are few, but they're still important, as they are exploitable. For example, you could try to find "11111111111111111111111111" inside (assuming no linebreaks):

11111111111111111111111110111111111111111111111111101111111111111111111111111011111111111111111111 11111011111111111111111111111110111111111111111111111111101111111111111111111111111011111111111111 11111111111011111111111111111111111110111111111111111111111111101111111111111111111111111011111111 11111111111111111011111111111111111111111110111111111111111111111111101111111111111111111111111011 11111111111111111111111011111111111111111111111110111111111111111111111111101111... (ad nauseum)
 * This process would eventually return false, but it would have to check far more than O(n) possibilities. --Algorithm 07:42, 17 May 2006 (UTC)


 * That example is skewed, you've reduced it to (in essence) two characters instead of my 128 for ASCII. To get below a probability 1/10000 you need 2**14 (= about 14 comparisons) instead of 128**2 (= about 2 comparisons) until no match is very likely (99.999%).
 * For your example I'd expect about 14n comparisons as already too pessimistic. Checking that wild theory with your example I get:
 * M = 26, N = 488, no match after 6303 comparisons. And 6303 is 12.91*488 near the expected 14n, less than mn = 26n. -- Omniplex (w:t) 00:59, 18 May 2006 (UTC)


 * You could get m(n-m+1) from checking "111111111111111111111111112" against a string of only ones, when you didn't expect the string to contain only ones. Any result worse than that is impossible, but that's pretty bad. --Brilliand 04:03, 6 February 2007 (UTC)

Thinking about that experimental result, maybe n*m/c is realistic, where c is the number of common characters. Uncommon characters are irrelevant, they only accelerate the no match conditions. Maybe c = 100 makes sense in practice, with that we'd get n*m/100. So if you limit m to 100 or less characters you should arrive at n comparisons as average case. -- Omniplex (w:t) 01:20, 18 May 2006 (UTC)

Hello, heard of the Boyer-Moore string search algorithm? The longer the key the faster the search. Unless you want to reinvent the wheel, Wikipedia is a good place to start. 59.112.41.191 17:56, 19 June 2006 (UTC)


 * Heard: yes, studied or implemented: no. Cute, thanks for the link. Firmly in O(n) territory for this. But probably #pos: uses an existing function, not this compare backwards and jump optimization. -- Omniplex (w:t) 16:19, 20 June 2006 (UTC)
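
The comparison counts argued over in this thread can be checked with a small PHP sketch. It is a naive left-to-right search written purely for illustration: the function name `countNaiveComparisons` is made up here, and nothing in this discussion says that #pos is implemented this way. It counts single-byte comparisons, so the m(n-m+1) worst case described by Brilliand can be reproduced directly.

```php
<?php
/**
 * Count the byte comparisons a naive substring search performs.
 * Illustrative only; this is not the extension's code.
 */
function countNaiveComparisons(string $haystack, string $needle): int
{
    $n = strlen($haystack);
    $m = strlen($needle);
    $comparisons = 0;

    // Try every start position where the needle still fits.
    for ($i = 0; $i + $m <= $n; $i++) {
        for ($j = 0; $j < $m; $j++) {
            $comparisons++;
            if ($haystack[$i + $j] !== $needle[$j]) {
                break; // mismatch: give up on this start position
            }
        }
        if ($j === $m) {
            break; // full match found, stop searching
        }
    }
    return $comparisons;
}

// Needle: 26 ones followed by "2" (m = 27); haystack: 100 ones (n = 100).
// Each of the n-m+1 = 74 start positions costs m = 27 comparisons.
echo countNaiveComparisons(str_repeat('1', 100), str_repeat('1', 26) . '2'), "\n"; // 1998
```

With a needle whose first byte rarely occurs in the haystack, the same function returns a count close to n, which is the average case Omniplex argues for.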

#urlencode
The new #pos: is nice, if it's enabled here together with #len: and #sub:.

For #urlencode: the difference from the new magic word urlencode: would be interesting, I know sh*t about PHP, it has apparently two similar functions, one styling itself as "raw", and that's not your #urlencode: (?)

For #urldecode: I've no clue where that could be ever useful, can they together by chance protect the infamous "{", "|", "}", and "=" in templates? -- Omniplex (w:t) 18:13, 21 May 2006 (UTC)


 * #urlencode should be absolutely identical to the new urlencode: function. They both use the same PHP function.  As for #urldecode, it will indeed protect those characters, but #urlencode won't be able to encode all of them.  Hence, if you want a "|" in the output, put %7C in its place and #urldecode it when it's no longer dangerous. (Note that this has its own issues; namely, you can't put raw "+"s or "%"s into encoded output and expect them to be unaffected by the decode.) --Algorithm 22:18, 21 May 2006 (UTC)


 * Ganglieri's (sp?) proposal on Talk:ParserFunctions was interesting, a function keeping all (top level) "|" as is. As if it accepts only one argument (ignoring "|"), actually accepting zero or more arguments returning the concatenation of the zero or more result strings separated by "|". Something that's better than my template:! kludge. Probably #urldecode: %7C isn't the solution, unless it works as Wiki table markup like . -- Omniplex (w:t) 16:44, 22 May 2006 (UTC)


 * FYI: 6219, maybe this also hits #urlencode:. -- Omniplex (w:t) 11:50, 6 June 2006 (UTC)
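
The behaviour discussed in this section can be tried out in plain PHP. The thread does not say which PHP function #urlencode wraps; the sketch below assumes it is `urlencode()` and shows `rawurlencode()` (presumably the "raw" variant Omniplex mentions) next to it for comparison. The helper names `protectForTemplate` and `restoreFromTemplate` are hypothetical.

```php
<?php
// Hypothetical helpers: hide template-significant characters, restore them later.
function protectForTemplate(string $text): string
{
    return urlencode($text);
}

function restoreFromTemplate(string $text): string
{
    return urldecode($text);
}

// urlencode() escapes "{", "|", "}" and "=" and turns spaces into "+".
echo urlencode('a|b c'), "\n";        // a%7Cb+c
// rawurlencode() escapes the same characters but writes spaces as "%20".
echo rawurlencode('a|b c'), "\n";     // a%7Cb%20c
echo urlencode('{}='), "\n";          // %7B%7D%3D

// The caveat from the reply above: a raw "+" or "%XX" in hand-written
// input is changed by the decode step.
echo urldecode('1+1%7C2'), "\n";      // 1 1|2
```

A round trip through both helpers returns the original text; the problem Algorithm describes only arises when the input to the decode step was written by hand rather than produced by the encode step.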

Character substitution/replace
Hi,

I thought about adding a very simple replace function that would allow substituting/replacing strings very easily (unlike "subst"). I thought about it and wrote the following function:

 function runReplace( &$parser, $inStr = '', $ReplaceFrom = '', $ReplaceTo = '' ) {
     $inStr = str_replace( $ReplaceFrom, $ReplaceTo, $inStr );
     return $inStr;
 }

I think that it's a good candidate for a generic function. I used it to convert Windows paths ("\\server\folder1\folder2\file name with space.doc") to a file:// URI: "file:\\server\folder1\folder2\file%20name%20with%20space.doc". URL encoding is not suitable because it also replaces the "\".

Thanks, Meir :->


 * I also think that it is a useful function with a simple code. --jsimlo 13:46, 18 July 2006 (UTC)

Wikipedia
Is there any plan on implementing this StringFunctions on Wikimedia projects? Borgx 08:13, 6 June 2006 (UTC)
 * I would really find some of them useful creating templates. --81.231.179.17 11:28, 23 July 2006 (UTC)


 * So would I. Especially since we're indexing articles by the second letter at sv-wikt. //Shell 16:10, 14 October 2006 (UTC)


 * I agree. 72.139.119.165 19:03, 17 October 2006 (UTC)


 * Please implement it in wikinews, too. It can be used for automatically generated teasers. --84.178.44.163 15:06, 3 December 2006 (UTC)


 * StringFunctions would be invaluable for increasing the complexity of template calls to decrease needed work by humans on Wikipedia. I can think of a concrete example: automatic template categorization. Under the categorization system, items are by default categorized by letter in a category based on their title. This is appropriate, but the system fails to recognize that titles starting with "The" should be categorized as though their titles were "Title, The". While this is irrelevant while ordinarily titling a page, since humans will add the category, it is relevant for categories embedded within a template, since the template is not even directly edited for a single article. StringFunctions would allow an article to be automatically sorted under such a proper title, by checking only the first 4 characters for the string "The " (note the space character, added so that articles with titles like "Thespian" are not mistreated). This functionality exists under StringFunctions. I hope it is implemented throughout Wikimedia as a useful tool. Nihiltres 05:02, 17 February 2007 (UTC)


 * Is anybody going to answer the above questions? When will these functions be implemented on Wikimedia projects?
 * See 6455. Hillgentleman | 書 |2007年03月25日( Sun ), 15:02:37 15:02, 25 March 2007 (UTC)
 * Just to chime in my support, I too would find changes to respect whitespaces very useful for the exact reason that Nihiltres mentions above... using something like  where title = "The Facts of Life" leads to alphabetization under " Facts of Life" instead of "Facts of Life".  71.137.18.65 05:15, 27 March 2007 (UTC)
 * Unfortunately, this is (AFAIK) not possible directly in StringFunctions. The parameters are being parsed and handled by the global wiki Parser, which is also responsible for the trimming. StringFunctions receive all parameters already trimmed, unless you use the <nowiki> workaround. --jsimlo(talk 07:13, 27 March 2007 (UTC)
 * Is it possible to use the nowiki workaround in the second parameter (the one specifying the string you're searching for) rather than the third parameter (the string you're putting in as a replacement)? Until then, I think I've figured out a work around... To change "The Facts of Life" to "Facts of Life":  It's a bit cumbersome, but I think it will work... 71.137.18.65 05:05, 28 March 2007 (UTC)
 * And what about  instead? --jsimlo(talk 20:59, 28 March 2007 (UTC)
 * This seems to be a simple problem using the #sub function and a couple of ParserFunctions: Nihiltres 07:07, 16 April 2007 (UTC)
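
The wikitext examples posted in this thread did not survive archiving. As a stand-in, here is a PHP sketch of the logic Nihiltres describes: check only the first four characters for "The " (with the trailing space) and, if they match, move the article to the end of the sort key. The function name `sortKey` is made up for this illustration, and this is not a reconstruction of the lost wikitext.

```php
<?php
/**
 * Build a category sort key, filing "The Foo" under "Foo, The".
 * Hypothetical helper for illustration only.
 */
function sortKey(string $title): string
{
    // Compare exactly the first 4 bytes, including the space,
    // so that "Thespian" is left alone.
    if (strncmp($title, 'The ', 4) === 0) {
        return substr($title, 4) . ', The';
    }
    return $title;
}

echo sortKey('The Facts of Life'), "\n"; // Facts of Life, The
echo sortKey('Thespian'), "\n";          // Thespian
```

In wikitext the same test would need #sub (to take the first four characters) plus a ParserFunctions conditional, subject to the parameter trimming jsimlo mentions above.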

Proposals
There is still no reaction on several requests for this extension. I guess it is mainly because there is no one to handle it. Correct me, if I am wrong.. ;) Otherwise, I would like to improve the extension and satisfy some of the reasonable requests here, if there is no one against it. Please, add comments and votes for adding the proposed functions below into the main code. Thank you. --jsimlo(talk 11:36, 10 September 2006 (UTC)

#explode
A tokenizer that would split a string into pieces and return one of them, exposing the php explode function? Usage like this:  would give you the second sub-name of the current sub-sub-article. E.g.  would result in. Code:

 function runExplode (&$parser, $inStr = '', $inDiv = '', $inPos = 0) {
     $tokens = explode ($inDiv, $inStr);
     if (!isset ($tokens[intval ($inPos)])) return "";
     return $tokens[intval ($inPos)];
 }

Votes:
 * Add per nom. --jsimlo(talk 11:36, 10 September 2006 (UTC)

#replace
Replaces each occurrence of a needle in the haystack, exposing the php str_replace function. Usage like this:  would replace all slashes in the current page name with colons. E.g.  would result in. Code:

 function runReplace ( &$parser, $inStr = '', $ReplaceFrom = '', $ReplaceTo = '' ) {
     return str_replace ($ReplaceFrom, $ReplaceTo, $inStr);
 }

Votes:
 * Add per nom. --jsimlo(talk 11:36, 10 September 2006 (UTC)

I would like to have the following chunk added to the replace function:

 if ($ReplaceFrom == '') {
     $ReplaceFrom = ' ';
 }

The system assumes that passing a blank character means that nothing will get passed, so with this the function can search for the space character. Unless someone can think of a better way of doing it. Denomales 03:09, 22 October 2006 (UTC)
 * Change Request
 * Fixed in ver. 1.6. --jsimlo(talk 08:48, 26 October 2006 (UTC)


 * A way to use $ReplaceTo = ' ', but preserving the possibility of using $ReplaceTo = '', would be useful too.--Patrick 13:15, 24 January 2007 (UTC)


 * I found a way: To use a space as to-string, put it in nowiki tags.--Patrick 13:28, 24 January 2007 (UTC)

There's a bug in replace that has it blow up if the string is more than 1000 characters long. I discovered it today when using #replace to surround an #ask --105.237.60.59 18:02, 30 June 2014 (UTC)
 * That's not a bug, these are safety features. They can be adjusted with $wgStringFunctionsLimitSearch and $wgStringFunctionsLimitReplace. --Theaitetos (talk) 18:11, 30 June 2014 (UTC)


 * Thank you, yes, I understand. I set:
 * $wgPFStringLengthLimit = 80000;
 * $wgStringFunctionsLimitSearch = 80000;
 * $wgStringFunctionsLimitReplace = 80000;
 * In LocalSettings.php but the error remained - the same error 'Error: String exceeds 1,000 character limit', but the '1000' might be hard-coded. To be sure, I also ran update.php and 'Data repair and upgrade' and 'Database installation and upgrade' from the SMW Admin Functions, but it still remained. 197.155.4.118 09:21, 10 July 2014 (UTC)


 * Agreed, that is strange behavior then. In that case I guess you have to file a bug. :-( --Theaitetos (talk) 12:39, 10 July 2014 (UTC)

#pad
Pads the input string to a specified width, aligning it to left, right or center. Usage like this:  would result in. Code:

 function runStrPad (&$parser, $inStr = '', $inLen = 0, $inWith = ' ', $inDirection = 'left') {
     switch (strtolower ($inDirection)) {
         case 'left':
         default:
             $direction = STR_PAD_LEFT;
             break;
         case 'center':
             $direction = STR_PAD_BOTH;
             break;
         case 'right':
             $direction = STR_PAD_RIGHT;
             break;
     }
     return str_pad ($inStr, intval ($inLen), $inWith, $direction);
 }

Votes:
 * Add per nom. --jsimlo(talk 11:36, 10 September 2006 (UTC)
 * Remove. This function is extremely abusable in DoS attacks. Ex:  --67.171.232.100 00:08, 29 October 2006 (UTC)
 * Comment. What about limiting the length value instead, like it was limited with #pos function? --jsimlo(talk 09:54, 30 October 2006 (UTC)
 * Comment. Since MediaWiki 1.8 there is a padleft and padright core function which does quite the same, I think. See Help:Magic_words. --Majoran 05:17, 30 November 2006 (UTC)
 * Remove or limit See the page - pad is O(10^n) (remember, n is the number of digits, not the value itself) and replace is O(n^2). Suggest limiting the "from" and "to" values of replace and the "delimiter" of explode to 30 characters in length (as with pos - I would support anything from 10 to 30 characters, I'd vote against anything outside this range) and the length value of pad to 99. I'd also recommend limiting "value" or "string" for everything EXCEPT len and sub to some length limit in the range of .5K - 5K, preferably 1K. Even O(n) is a DOS attack if it is easy to make n be a 100K page templated from somewhere. And finally, even with all these limits, it would be easy to make a "replace" call that returned a 30K-long string - there needs to be a further limit targeted to the output of replace (and possibly urlencode?). For ease of programming (just check limits then expose the PHP, rather than building your own function), this could be that len(to)*len(value)/len(from) can't be over twice the limit on len(value) - that is, the "to" can only be twice as long as the "from" unless you're sure that your "value" string is a fraction of the length limit. The "len" in question is byte-length, ie 2 for each standard-codepage unicode char and 1 for each ascii. --Homunq
 * Okay, then I shall update the code ASAP. --jsimlo(talk 16:08, 24 January 2007 (UTC)
 * Done.. :))) --jsimlo(talk 13:31, 30 January 2007 (UTC)
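
The limits proposed in this vote can be sketched in PHP as follows. This is an illustration of the idea only, not the code jsimlo committed: the function names `padLimited` and `replaceOutputWithinLimit` are hypothetical, and the choice to clamp the requested width (rather than return an error) is an assumption. The default of 100 for the pad limit matches the `$wgStringFunctionsLimitPad` value quoted elsewhere on this page.

```php
<?php
/**
 * str_pad() wrapper that never pads beyond $limit characters.
 * Hypothetical sketch; clamping instead of erroring is an assumption.
 */
function padLimited(string $str, int $len, string $with = ' ',
                    string $direction = 'left', int $limit = 100): string
{
    if ($with === '') {
        return $str; // str_pad() rejects an empty pad string
    }
    $len = min($len, $limit); // cap the requested width

    switch (strtolower($direction)) {
        case 'center':
            $type = STR_PAD_BOTH;
            break;
        case 'right':
            $type = STR_PAD_RIGHT;
            break;
        default: // 'left' and anything unrecognised
            $type = STR_PAD_LEFT;
    }
    return str_pad($str, $len, $with, $type);
}

/**
 * Homunq's output bound for replace:
 * len(to) * len(value) / len(from) must not exceed twice the value limit.
 */
function replaceOutputWithinLimit(int $lenValue, int $lenFrom, int $lenTo, int $limit): bool
{
    if ($lenFrom <= 0) {
        return false; // an empty "from" string cannot be bounded this way
    }
    return $lenTo * $lenValue / $lenFrom <= 2 * $limit;
}

echo padLimited('7', 3, '0'), "\n";                 // 007
echo strlen(padLimited('x', 1000000, '*')), "\n";   // 100
```

The second function only checks lengths, so it can run before `str_replace()` is called, which is the "check limits, then expose the PHP" approach Homunq suggests.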

Do these functions work?
I have just one question.
 * Do these functions work?

I am asking because all of my tests with #sub: and #len: have failed. --70.49.162.137 20:55, 24 August 2006 (UTC)


 * Yes, if you have them installed. They are, however, not installed here: . --jsimlo 21:56, 24 August 2006 (UTC)


 * How do I install them? 70.49.117.62 23:28, 27 August 2006 (UTC)


 * My apologies, I had not realized that there is no word about the installation in the article. I have just added an Installation section. If any troubles should occur, leave a note here. --jsimlo 12:49, 28 August 2006 (UTC)

Update for 1.8.0
I added the following to stop the code from throwing errors after upgrading to MediaWiki 1.8.0, and after a quick test everything seems to be working well.--Raran 17:31, 11 October 2006 (UTC)

$wgHooks['LanguageGetMagic'][] = 'wfStringFunctionsLanguageGetMagic';

 function wfStringFunctionsLanguageGetMagic( &$magicWords, $langCode ) {
     switch ( $langCode ) {
         default:
             $magicWords['len']       = array( 0, 'len' );
             $magicWords['pos']       = array( 0, 'pos' );
             $magicWords['rpos']      = array( 0, 'rpos' );
             $magicWords['sub']       = array( 0, 'sub' );
             $magicWords['pad']       = array( 0, 'pad' );
             $magicWords['replace']   = array( 0, 'replace' );
             $magicWords['explode']   = array( 0, 'explode' );
             $magicWords['urlencode'] = array( 0, 'urlencode' );
             $magicWords['urldecode'] = array( 0, 'urldecode' );
     }
     return true;
 }

Documentation and Testing
As far as I can tell, replace and pad are not working, or at the very least the usage info is wrong. Also, several of these functions have NO docs. Do you think someone who is using them could, at least, edit the main page with some more complete information? Thanks! --Billwsmithjr 15:44, 9 November 2006 (UTC)

trim
Maybe a #trim function could be useful, to get rid of whitespaces. (Available Workaround:  ) --Majoran 04:57, 30 November 2006 (UTC)

Doesn't work on 1.6.8, requires modification to work
This extension adds function like len and so on, but parser.php looks for #len (maybe, it was changed in later versions?). So, the modification is following:  to. At least, works fine for me.


 * This is the same problem as with ParserFunctions for MediaWiki below 1.7. See section about Installation. Most of the stuff there applies here as well, only the names are changed. I guess those workarounds should be added to the StringFunctions as well, shouldn't they? --jsimlo(talk 21:27, 4 December 2006 (UTC)


 * We could do a simple trick like

 $prefix = version_compare( $wgVersion, '1.7', '<' ) ? '#' : '';
 $wgParser->setFunctionHook ( $prefix.'len', ...
 * to make it portable. I don't know in which version wiki switched from  to   style, so put 1.7 for example.
 * As for ParserFunction, it's generally a bad idea to use PHP5-specific stuff when one doesn't really need it :-\

PCREs
Also, I'd love to have preg_-family functions here for splitting and replacing tokens by PCREs. I can write those functions myself, just asking if there're any objections to have them in this extension.


 * I guess this will be a looong fight to it. PCREs are a bit slow. Look all over this talk page - it is all about whether something is exploitable by a DOS attack or not. --jsimlo(talk 21:27, 4 December 2006 (UTC)


 * They are not so slow :) At least, not slower than POSIX regexes. Perl itself has only regexes to manage strings. Anyway, if page is cached they're not executed; otherwise, db queries will take much more time than those functions. But I don't insist on adding them, of course.


 * Personally, I like the idea. I am just not very sure about other users. Our current priority is to provide reasonable extension for wiki(p|m)edia projects. PCRE might not be, what they want there. Maybe an option to disable it? Or maybe, I could write a new "PregFunctions" extension... :))) -- jsimlo(talk 10:10, 28 November 2007 (UTC)

#explode
I'd like to add to #explode the ability for the position argument to be negative, returning tokens from the end of the string. This is similar to several of the other functions using negative indexes. Currently there is no way to know how many tokens there will be, so you can't figure out what the index of the last token is otherwise; this at least lets you grab tokens at the end. Vash 18:46, 29 March 2007 (UTC)
 * Sounds reasonable and simple. Will add asap. --jsimlo(talk 08:12, 30 May 2007 (UTC)
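
The negative-position behaviour Vash requests could look roughly like this in PHP. It is a sketch under assumptions, not the code that was later added to the extension: the function name `explodeToken` is hypothetical, and returning an empty string for out-of-range positions simply mirrors the `runExplode` proposal above.

```php
<?php
/**
 * Return one token of $str split on $delim.
 * Negative positions count from the end (-1 is the last token).
 * Hypothetical sketch, not the extension's implementation.
 */
function explodeToken(string $str, string $delim, int $pos = 0): string
{
    if ($delim === '') {
        return ''; // explode() does not accept an empty delimiter
    }
    $tokens = explode($delim, $str);

    if ($pos < 0) {
        $pos += count($tokens); // -1 becomes the last index, and so on
    }
    // Positions that are still out of range yield an empty string.
    return $tokens[$pos] ?? '';
}

echo explodeToken('Main/Sub/Leaf', '/', 1), "\n";  // Sub
echo explodeToken('Main/Sub/Leaf', '/', -1), "\n"; // Leaf
```

Because the caller never needs the token count, this covers the "last token" case without a separate counting function.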

#replaceregexp
For advanced users it would be really helpful to have the full preg_replace functionality available. Under  I gave an example of what you could do with such a feature. I tried to be downward compatible by checking whether the argument has some delimiter symbols. For an official implementation it would be better to have a differently named function, though.
 * --Algorithmix 06:05, 12 September 2007 (UTC)

#substrcount (new)
Is it possible to implement substr_count? It seems simple (I will try on my wiki ;-) ). Are there any contraindications? (I'm new to PHP...). Basically, I would try to implement a dynamic table based on variable=value pairs separated by, say, semicolons. This function could be useful to count the number of pairs.--Briotti 15:12, 19 October 2007 (UTC)
 * I also miss this function. I already wrote one as a template, but it can only count strings with a length of 1. It should be possible to count spaces as well. My template can also use regex character classes, which is very useful for counting all numbers, for example. --Danwe 21:10, 23 April 2009 (UTC)
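
Briotti's use case can be sketched with PHP's `substr_count()`. The helper name `countPairs` is hypothetical, and the sketch assumes the list has no trailing delimiter, so the number of pairs is the number of delimiters plus one.

```php
<?php
/**
 * Count "variable=value" pairs in a delimiter-separated list.
 * Assumes no trailing delimiter; hypothetical helper for illustration.
 */
function countPairs(string $list, string $delim = ';'): int
{
    $list = trim($list);
    if ($list === '' || $delim === '') {
        return 0; // nothing to count (substr_count() rejects an empty needle)
    }
    return substr_count($list, $delim) + 1;
}

echo countPairs('a=1;b=2;c=3'), "\n"; // 3
```

A parser function built on this would need the same length limits as #pos and #replace, since `substr_count()` scans the whole input string.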

#strtolower, #strtoupper (new)
I sometimes need to do case conversion on MediaWiki variables and these functions would be useful. --Dmulter 16:15, 23 April 2009 (UTC)
 * Why don't you use the MW functions lc, lcfirst, uc and ucfirst ? --Danwe 21:05, 23 April 2009 (UTC)
 * Didn't see those, they'll work perfectly. -Dmulter 21:33, 1 May 2009 (UTC)

Bugzilla: add on wikimedia
The bug requesting these to be added for the big sites (wikipedia, etc.) is 6455. If you really want these functions, go (register and) vote for that bug. If you have an objection, that bug would also be an appropriate place to express it.--201.216.136.95 18:23, 10 January 2007 (UTC)


 * Is there a public Wiki with StringFunctions installed?--Hillgentleman | 書 |2007年02月26日( Mon ), 19:53:30


 * has StringFunctions and DynamicFunctions. Polonium 21:27, 13 March 2007 (UTC)
 * explode seems not to be functioning over there. I came up with a bad substitute:  convert the string into unicode, edit the monobook so that "%" are automatically converted into "|", then cut and paste.---Hillgentleman | 書 |2007年03月22日( Thu ), 10:06:56 10:06, 22 March 2007 (UTC)


 * Problems on mediawiki 1.10; I can’t get StringFunctions going on mediawiki 1.10. is this a general problem or just mine?? thx --Bartleby 15:50, 18 July 2007 (UTC)

Locating a pipe in text
I'm trying to write an #explode expression on my wiki that uses a pipe ( "|" ) as the delimiter. It seems that this is impossible, though, as StringFunctions for whatever reason does not consider the | template (which contains only "|" ) a pipe. I can't use | to look for a pipe with #replace, #pos, or any StringFunction I've tried. Any help would be greatly appreciated! -69.122.203.50 22:11, 7 April 2007 (UTC)
 * Nevermind, got it! I'm using &#124; now to find pipes. -69.122.203.50 22:14, 7 April 2007 (UTC)

Page name variations
I am trying to figure out how to get all page name variations in order to list redirects for Dynamic Page List. If I have a page called "cutscene", I can easily get prefix variations like "cutscenes", "cutscened", "cutscened", etc, but how would I get "cut scene", "cut scenes", "cutscening", etc? I'm thinking string functions would be how but I'm not much of a programmer... -Eep² 06:26, 10 August 2007 (UTC)


 * Hello? I also need to be able to get other variations like "cities" from "city" and a way to remove the last "s" and replace it with "ies" too. How would I do these things with StringFunctions? —Eep² 14:01, 20 August 2007 (UTC)


 * Yes, I read you. Unfortunately, I have no idea about how to implement such things. What you are talking about is based on some kind of a dictionary and grammar rules. Such things differ from language to language. Though, you can replace parts of strings and retrieve substrings, so you should be able to create "cut scene" from "cutscene". I am not sure about making "cities" from "city". How would you treat the word "stress"? -- jsimlo(talk 09:14, 23 August 2007 (UTC)


 * How would I get "cut scene" from "cutscene"? As for stress, that's easy: just add an "es" to make "stresses" or an "or" to make "stressor". A per-word(s) style is fine but I just need to know how to do it programmatically. For "cities", how would I remove/replace the last 3 letters and add (or replace them with) a "y"? —Eep² 14:03, 23 August 2007 (UTC)
 * The first works only for "city", the second for every word ending in -ies, the third shows how to get "cut scene" from "cutscene" (it works only with this word):
 * y
 * y
 * 
 * Arath 17:19, 25 August 2007 (UTC)


 * Thanks! —Eep² 22:36, 26 August 2007 (UTC)
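
Arath's wikitext examples were lost in archiving. The suffix swap Eep² asks about can be sketched in PHP as below; `replaceSuffix` is a hypothetical helper written for this illustration, not Arath's solution and not part of StringFunctions.

```php
<?php
/**
 * Replace $from at the end of $word with $to; leave other words unchanged.
 * Hypothetical helper for illustration only.
 */
function replaceSuffix(string $word, string $from, string $to): string
{
    if ($from === '') {
        return $word . $to; // nothing to strip, just append
    }
    $len = strlen($from);
    if (substr($word, -$len) === $from) {
        return substr($word, 0, -$len) . $to;
    }
    return $word;
}

echo replaceSuffix('cities', 'ies', 'y'), "\n"; // city
echo replaceSuffix('city', 'y', 'ies'), "\n";   // cities
echo replaceSuffix('stress', 's', 'ies'), "\n"; // stresies
```

The last call shows jsimlo's point: a blind suffix rule has no notion of grammar, so each rule has to be chosen per word or per word class.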

Documentation style and wording
Hi, sorry about reverting your entire work, but it seemed to me to be better this way because:
 * I highly disagree about replacing the word returns with ->. I believe the latter is much less readable and comprehensible for common users. Somehow, I think that wordy and lengthy documentation is better than packed short notation. Why? Because lengthy documentation is simple, light and clear for the brain to process. Short notations need to be parsed and translated while reading.. :))
 * Lengthy documentation is not simple, light, and clear--that's why it's lengthy, which implies more complex, heavy, and wordy (cluttered). —Eep² 18:47, 27 August 2007 (UTC)
 * When you say the same amount of information in a lengthy style, the result is light. When you say the same in shorter way, the result is heavy. What I wanted to say is, that I am against of using short notations instead of words. E.g. do not use ->, use results in or returns. -- jsimlo(talk 20:17, 27 August 2007 (UTC)


 * I find your version of #sub quite the same (in the sense of information), but more complicated (in the sense of order). I would prefer to follow GNU docs rules while writing these docs. Important description first; notes, borders, side-effects and deviations later.
 * GNU has doc rules too? What's next, GNU rules on how to brush your teeth? I prefer contextual documentation where any notes are directly relevant to that parameter instead of at the bottom, requiring reference back to the parameter in question. —Eep² 18:47, 27 August 2007 (UTC)
 * Yes. Every bigger project has some rules. And GNU projects are usually big and working ones. So, look, any usable docs should contain ALL relevant info that may become necessary. However, a lot of such info can be quite boring to most of the common users. Therefore, projects like GNU, Microsoft, Sun and many others have a style in which they first lightly describe the overall main purpose; then describe the common purpose of each parameter; and only then describe the notes and remarks that are not very common, but may come in handy when specific troubles arise. Then the reader may read until his questions are answered. He does not need to read through all the docs just because there are some side-effects that may happen somewhere. Btw: Browse through the docs of the MediaWiki software. You can find there what I am talking about. -- jsimlo(talk 20:17, 27 August 2007 (UTC)


 * There is no need to use <tt> for links to php.net site.
 * The monospace is for PHP functions, not links. —Eep² 18:47, 27 August 2007 (UTC)


 * I do not find your examples to be the common examples of usage. They are rather special cases of what can be done, if wanted. Each example introduces a hidden question of "What the author wanted to say with this example?" Keep that in mind when creating examples. Always have a clear and simple vision of what you are trying to say and then ask yourself whether the readers will be able to follow you. Confusing examples can make the docs hard to read and understand. -- jsimlo(talk 11:25, 27 August 2007 (UTC)
 * And all the Žmržlina examples are common? Come on... Besides, a simple description for the examples I provided can easily answer any "hidden questions". —Eep² 18:47, 27 August 2007 (UTC)
 * Well, in the sense of multibyte utf-8 characters, yes, I think they are. Why? Because the Ž is a multibyte character. So I assume that when the reader reads the example, he follows the clue of Ž being a multibyte character, which is nevertheless correctly counted as one single character. -- jsimlo(talk 20:17, 27 August 2007 (UTC)

Again, I did not mean to offend you and argue about petty things. I simply read your version and got confused. So I asked myself: Is it just me, or is it the text I am reading? Well. After a while I decided the latter would be the problem and I have reverted you. Then I realized and reproduced some of the good work you have done.

Since I am the current developer and maintainer of the extension, I believe I should be able to continue my work (updating the code and docs) whenever necessary. I can not do that when I get confused. But I also admit that my version is not the perfect one. -- jsimlo(talk 20:17, 27 August 2007 (UTC)

Delimiter-separated lists into links
How can I get a comma-separated list (like "climb, dive, grab edge, jump, pick up, pull, push, roll, run, shimmy, shoot, sidestep, swim, vault, walk") in a template field into separate article links ("climb, dive, grab edge,...")? —Eep² 08:33, 11 September 2007 (UTC)


 * I am not sure, whether this is currently possible with StringFunctions. The reason is the way the parser works and how it parses parameters and then the function results. You might need regular expressions to do that, or at least "explode and replace each piece" function for that, none of which are present in StringFunctions yet. -- jsimlo(talk 11:57, 11 September 2007 (UTC)


 * OK, well, how could that be done with an existing extension like, say, TemplateTable? Using StringFunctions, I tried  but it doesn't work (with or without the  | ). :/ —Eep² 13:54, 11 September 2007 (UTC)


 * Extension:LoopFunctions might provide a solution for you, but that just an idea. -- jsimlo(talk 10:05, 28 November 2007 (UTC)


 * with my substrcount function (which is showed later here), explode and loop extension you can easily do it (something like this):  {{#while: | {{#ifexpr: {{#var: i }} < {{#var: n }} | true }} | {{#vardefine:pp|{{#explode:{{#var:pagetypes}}|DELIM|{{#var: i }}}} }} ..................... and you get in pp parts of your list.User:Schthaxe 28 jan 2009
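
For comparison, the "explode and replace each piece" operation that jsimlo says StringFunctions lacks is short in plain PHP. The sketch below is not a parser function and not any extension's code; `listToLinks` is a hypothetical name, and it emits wikitext link markup as a plain string.

```php
<?php
/**
 * Turn "climb, dive, grab edge" into "[[climb]], [[dive]], [[grab edge]]".
 * Hypothetical helper for illustration only.
 */
function listToLinks(string $list, string $delim = ','): string
{
    if ($delim === '') {
        return $list; // explode() does not accept an empty delimiter
    }
    $links = [];
    foreach (explode($delim, $list) as $item) {
        $item = trim($item);
        if ($item !== '') {
            $links[] = '[[' . $item . ']]'; // wrap each entry as a wiki link
        }
    }
    return implode(', ', $links);
}

echo listToLinks('climb, dive, grab edge'), "\n"; // [[climb]], [[dive]], [[grab edge]]
```

Inside a template the difficulty is different: as jsimlo notes, the parser processes parameters and function results in its own order, which is why the loop-based workaround above needs #explode plus a counter.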

How is it possible to get this installed on all the Wikimedia projects?
I've missed the option to use Stringfunctions on the no:Wikipedia. What step is necessary to take to get this installed? Nsaa 23:55, 10 March 2008 (UTC)


 * See at the top.--Patrick 17:25, 23 March 2008 (UTC)

PHP errors from use of MW_PARSER_VERSION
Whenever I use this extension on my wiki (1.12.0) my php log fills with:

PHP Notice: Use of undefined constant MW_PARSER_VERSION - assumed 'MW_PARSER_VERSION' in /extensions/StringFunctions/StringFunctions.php on line 160

I've avoided this now by adding the following to my localsettings.php but I don't think that's a long term solution:

 define("MW_PARSER_VERSION", "1.12.0");

Brian of London 18:21, 27 June 2008 (UTC)
 * Changing line 159 seemed to help:

 else if ( defined( 'MW_PARSER_VERSION' ) && strcmp( MW_PARSER_VERSION, '1.6.1' ) > 0 )

It appears that MW_PARSER_VERSION was retired some time ago. -Jlerner 17:26, 29 August 2008 (UTC)


 * This seems very fishy keeping in mind that 1.13.1 < 1.6.1 on a string comparison as 1<6
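
The concern in the last comment can be confirmed with two lines of PHP. `strcmp()` compares byte by byte, whereas `version_compare()` compares version components numerically; the wrapper name `isNewerThan` is made up for this illustration and is not part of the extension.

```php
<?php
// strcmp() sees '1' < '6' at the third byte, so "1.13.1" sorts before "1.6.1".
var_dump(strcmp('1.13.1', '1.6.1') < 0);            // bool(true)

// version_compare() compares 13 against 6 as numbers.
var_dump(version_compare('1.13.1', '1.6.1', '>'));  // bool(true)

/** Hypothetical wrapper: is $version strictly newer than $baseline? */
function isNewerThan(string $version, string $baseline): bool
{
    return version_compare($version, $baseline, '>');
}
```

So a check written as `strcmp( $version, '1.6.1' ) > 0` gives the wrong answer for any release from 1.10 onwards, while the `version_compare()` form does not.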

"undefined constant MW_PARSER_VERSION"-error still present in StringFunctions/Mediawiki version 1.14 (both from subversion) as of 2008-11
Nothing seems to have happened in the last half year. Please, can a maintainer of StringFunctions in the subversion checkout please fix this error. It fills up logs, fills the results of maintenance function like rebuildall.php or importDump.php to the point where real errors are difficult to find, and is plain annoying, even if it should be harmless... Many thanks! --G.Hagedorn 16:43, 28 November 2008 (UTC)

Patch for MW_PARSER_VERSION warning
The following patch fixes the MW_PARSER_VERSION warning using the Parser::VERSION constant from MediaWiki's Parser class:

 --- StringFunctions.php.orig	2008-07-23 20:17:28.000000000 +0200
 +++ StringFunctions.php	2008-12-15 11:43:48.592706500 +0100
 @@ -157,7 +157,7 @@
         $prefix = preg_quote( $parser->mUniqPrefix );
         if( isset($parser->mMarkerSuffix) )
             $suffix = preg_quote( $parser->mMarkerSuffix );
 -       else if ( strcmp( MW_PARSER_VERSION, '1.6.1' ) > 0 )
 +       else if ( strcmp( Parser::VERSION, '1.6.1' ) > 0 )
             $suffix = 'QINU\x07';
         else
             $suffix = 'QINU';

Please integrate this patch. The patch I've seen in trunk (checking whether MW_PARSER_VERSION is defined) may not be appropriate.

--Jandd 11:31, 15 December 2008 (UTC)

Multibyte characters
What exactly does "Multibyte characters" mean in the description of $wgStringFunctionsLimitSearch and the other limit globals? Is the limit strictly 30 (or 60) English characters? I tried a length of  (50 characters) and it worked. Wyvernoid 06:05, 3 July 2008 (UTC)

#replace, space in search string ("needle")
I can't get a trailing space in the search string to work. The nowiki example works in the replace string:

but not in the search string:

--G.Hagedorn 06:57, 3 October 2008 (UTC)
 * Sure. Replacing something with nowiki and a space doesn't insert only the space; it also inserts the nowiki tag, which in my opinion is nonsense because you can't do anything with the string after that. For example, it's useless to some templates or parser functions. It would be better if the nowiki only told the function that it should treat the spaces as real spaces. For forced spaces we had better use  &#32;  --Danwe 14:51, 26 May 2009 (UTC)


 * I am also trying to replace a comma followed by a space with a pipe character. I can't seem to get this to work, trying to use nowiki tags, and also the html equivalents for spaces.  Is this possible? --Gkullberg 23:11, 17 February 2010 (UTC)

I think I have the same issue here:

I want this to basically remove the text "this is ", but it returns " text" and not "text". Thoughts? 71.255.199.130 19:21, 23 March 2010 (UTC)

make stringfunctions configurable from localsettings.php
Index: extensions/StringFunctions/StringFunctions.php
===================================================================
--- extensions/StringFunctions/StringFunctions.php     (revision 4951)
+++ extensions/StringFunctions/StringFunctions.php     (working copy)
@@ -119,9 +119,9 @@
       global $wgStringFunctionsLimitPad;
       $wgExtStringFunctions = new ExtStringFunctions ;
-      $wgStringFunctionsLimitSearch  =  30;
-      $wgStringFunctionsLimitReplace =  30;
-      $wgStringFunctionsLimitPad     = 100;
+      if (!isset($wgStringFunctionsLimitSearch))  $wgStringFunctionsLimitSearch  =  30;
+      if (!isset($wgStringFunctionsLimitReplace)) $wgStringFunctionsLimitReplace =  30;
+      if (!isset($wgStringFunctionsLimitPad))     $wgStringFunctionsLimitPad     = 100;
       $wgParser->setFunctionHook('len',     array(&$wgExtStringFunctions,'runLen'      ));
       $wgParser->setFunctionHook('pos',     array(&$wgExtStringFunctions,'runPos'      ));

-- Nef (talk) 14:45, 13 January 2009 (UTC)


 * Or just set the configuration variables after you include the extension (which is what you're supposed to do). —Emufarmers(T 03:25, 14 January 2009 (UTC)
 * From my testing in 1.13.2 you need that change regardless of whether you put the new values before or after the extension include. I believe that it's because the extension function initializers don't get called until after LocalSettings.php.  --Cmreigrut 19:51, 4 March 2009 (UTC)
 * Same problem here. I can't change the settings from LocalSettings.php without these changes, not even if I define them AFTER the extension include! So could you please add these lines? --Danwe 15:50, 3 October 2009 (UTC)
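The behaviour reported in the replies above is consistent with a deferred setup function that assigns defaults unconditionally after LocalSettings.php has already run. The following is only a Python sketch of that ordering, not the extension's actual code; the names `load_extension`, `setup_unconditional` and `setup_guarded` are hypothetical.

```python
settings = {}
deferred = []

def load_extension(settings, deferred):
    """Simulate including the extension: it only registers setup functions."""
    def setup_unconditional():
        settings["LimitSearch"] = 30          # always overwrites
    def setup_guarded():
        settings.setdefault("LimitPad", 100)  # keeps an existing value
    deferred.extend([setup_unconditional, setup_guarded])

load_extension(settings, deferred)

# The site configuration sets values after the include...
settings["LimitSearch"] = 60
settings["LimitPad"] = 200

# ...but the deferred setup functions run later still.
for setup in deferred:
    setup()

print(settings)  # {'LimitSearch': 30, 'LimitPad': 200}
```

The unconditional assignment discards the configured value, while the guarded one (the equivalent of the `isset` check in the patch) preserves it.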

about substr_count
Well, I need that function too, so I created code based on explode (this code doesn't stop at the exploded part, and returns the number of tokens instead).

If you are not an administrator of StringFunctions, note that you must include the declaration of this function in all other related parts of the extension. User:Schthaxe 28 jan 2009
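The poster's code is not preserved here, but the idea of counting occurrences by splitting can be sketched as follows. This is a hypothetical Python illustration, not the original PHP: splitting on the search term yields one more token than there are (non-overlapping) occurrences.

```python
def substr_count(haystack, needle):
    """Count non-overlapping occurrences of needle by splitting on it."""
    if not needle:
        return 0  # an empty search term cannot be split on
    return len(haystack.split(needle)) - 1

print(substr_count("123,456,789", ","))  # 2
print(substr_count("aaaa", "aa"))        # 2
print(substr_count("abc", "x"))          # 0
```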

array you really explodienced?
Well friends, don't you think you're missing something? Let's start with arrays!

Arrays.php

Arrays.i18n.php

arrayexplode function that you can add to stringfunction.php

and then you can use for example

and receive

5 123 456 789 qwe rty

Notice that my arrayexplode returns the count of parts of the exploded string, so you can immediately vardefine it.

then you receive

123! 456! 789! qwe! rty!

--Schthaxe 14:49, 10 February 2009 (UTC)

Replace "needle" in documentation?
The documentation for #pos, #rpos and #replace refers to "string" and "needle" arguments. I don't think "needle" makes much sense if another argument isn't called "haystack". How about replacing "needle" with something more literal, like "search term"? Also, the third argument for #replace is called "product", which seems odd as well. How about changing that to "replacement term"? Yaron Koren 13:21, 6 March 2009 (UTC)


 * You are right, the documentation is rather inconsistent and might make no sense to the common users. Believe it or not, it does make sense to us, the developers, since we are twisted a lot.. ;) How about you try and do your best: fix what you think could be more self-explaining? Thanks.. -- jsimlo(talk 15:41, 6 March 2009 (UTC)


 * Twisted indeed! Alright, I just did the replacement; I hope it's an improvement. Yaron Koren 18:41, 6 March 2009 (UTC)


 * Looks good to me. I'm surprised that back when it said "needle" for the search term that it didn't also say "haystack" for the string.  ;) --Lance E Sloan 20:36, 6 March 2009 (UTC)

Replace links
Sorry if this question is just n00b :) How would I replace all given category links? Something like: . Thanks. --Subfader 03:06, 3 May 2009 (UTC)
 * Looks like a case for regular expressions. Check out Extension:RegexParserFunctions. I already implemented something very similar, replacing links. For this template I use the regex:
 * This should help you. I think you will only need some small modifications for your case. --Danwe 14:58, 4 May 2009 (UTC)

SpecialPages does not display
I have just installed StringFunctions into my MediaWiki 1.14.0. When I try to display the Special pages, all I get is a blank screen.

Other extensions:


 * Semantic Form v1.6
 * Semantic MediaWiki, v1.4.2

 * MediaWiki 1.14.0
 * PHP 5.2.2 (apache)
 * MySQL 5.0.45

Anybody got any clues?

bobj


 * Can you check your Apache error_log file, or change to "display_errors = on" in your php.ini file, so you can see what the error being generated is? Yaron Koren 18:36, 26 May 2009 (UTC)

Parameter 1 to Language::getMagic expected to be a reference...
Hi, I'm using MediaWiki 1.15.1, and when this extension is installed I get the following PHP error:

Warning: Parameter 1 to Language::getMagic expected to be a reference, value given in C:\Program Files\Apache Software Foundation\Apache2.2\htdocs\MediaWiki\includes\StubObject.php on line 58

What is wrong?

68.210.46.16 13:48, 17 July 2009 (UTC)


 * I too am getting this error since I upped the version of PHP from 5.0.5 to 5.3.0 - does anyone know the cause? - anon


 * Same here. But of course we all know that this is not the right place to do troubleshooting. - anon
 * Does anyone know the reason for it? I got the same bug :( 83.26.175.165 13:08, 13 March 2010 (UTC)


 * I got the same thing. Trying to get UNC links to work on our internal Wiki. UNC links require ParserFunctions and StringFunctions. ParserFunctions installed just fine, but StringFunctions cause this error.   Anybody have any luck yet? 205.175.129.250 17:18, 17 March 2010 (UTC)


 * Same here, ParserFunctions installed fine, but StringFunctions did not. MC10 19:08, 7 April 2010 (UTC)

I had the same error, and here is the fix: Special:Code/MediaWiki/55429
 * Nothing there worked for me. I have MediaWiki 1.15.3, PHP 5.3.3-7+squeeze3 (cgi-fcgi) and what worked for me was removing the '&' from 'function getMagic( &$mw ) {' and then editing LocalSettings.php to change its timestamp so that cache was cleared. Jabowery 14:51, 10 November 2011 (UTC)

I had the same error with the datatable extension. The fix above works for me. Anders 9:50, 7 June 2012 (UTC)

#replace bug
The following code

should result into

...bbb

but actual result is

\.\.\.bbb

--Sergey Spatar 21:48, 17 August 2009 (UTC)

Replacing New Lines, Line Breaks
How can you replace new lines? \n doesn't work. Do you know a way? Or even a workaround?

re: Replacing New Lines, Line Breaks

 * Extension:StringFunctionsEscaped Solves this problem. --jdpond 20:34, 18 September 2009 (UTC)
 * You should define a new function or something else for this. What if you simply want to replace the string "\n" and not any line breaks? A new function is too much, I think. Perhaps a new parameter would work which allows things like \n. Or you could try to use one of the regex extensions like Extension:MultiReplace; perhaps they already allow you to replace new lines. --Danwe 21:56, 3 September 2009 (UTC)
 * This can be made to work on any wiki that has Extension:StringFunctions installed. On the wiki I administrate, I just created a Template:Anchor using the following code:   and it removes carriage returns just fine. You may have to use &amp;#10; instead in some circumstances (reference to line feed) or both. -- ShoeMaker   ( Contributions &bull; Message )   17:32, 4 March 2013 (UTC)
 * The -function of the DPL extension allows you to use regex replacements. --Theaitetos (talk) 19:52, 4 March 2013 (UTC)

#urlencode acts like PHP rawurlencode
Quite strange: if I put #urlencode in #tag it acts like rawurlencode.

Example:

If the nome parameter contains a string with blank-separated words (e.g. "I Sette Calici fatati"), it returns "I+Sette+Calici+fatati" instead of "I%20Sette%20Calici%20fatati". --GB 13:58, 9 November 2009 (UTC)
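For reference, the two outputs quoted above correspond to two different space-encoding conventions. In PHP, `urlencode()` encodes spaces as "+" while `rawurlencode()` encodes them as "%20". The Python sketch below shows the same distinction using the standard library's `quote_plus` and `quote`; it illustrates the encodings only and says nothing about how #tag interacts with the parser function.

```python
from urllib.parse import quote, quote_plus

text = "I Sette Calici fatati"

plus_encoded = quote_plus(text)  # form-style: spaces become "+"
percent_encoded = quote(text)    # raw style: spaces become "%20"

print(plus_encoded)     # I+Sette+Calici+fatati
print(percent_encoded)  # I%20Sette%20Calici%20fatati
```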

Cannot pass in variable reference to the functions
I suspect that this is an issue probably related to my lack of general MediaWiki knowledge rather than a problem with the StringFunctions extension. I'm trying to use these functions in conjunction with DynamicPageList to format the output of a page list, but I don't seem to be able to pass in any of the variable references available at that point.

I expected this to output the page title followed by the length of the title, but using a debugger it's possible to see that the unexpanded string '%TITLE%' is passed unprocessed into the len function instead of the actual document's title, so it always outputs '7' (the number of characters in '%TITLE%').
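The result of 7 is what one would expect if the parser function is evaluated before the placeholder is substituted. The Python sketch below models only that ordering; the format string, the title "Main Page" and the helper names are hypothetical, and it is not how DynamicPageList or the parser is actually implemented.

```python
import re

def run_parser_functions(text):
    """Replace each {{#len:...}} with the length of its literal argument."""
    return re.sub(r"\{\{#len:(.*?)\}\}", lambda m: str(len(m.group(1))), text)

def substitute_title(text, title):
    """Replace the %TITLE% placeholder with the page title."""
    return text.replace("%TITLE%", title)

fmt = "%TITLE% ({{#len:%TITLE%}})"
title = "Main Page"

# Parser function first, placeholder second: #len sees the literal "%TITLE%".
early = substitute_title(run_parser_functions(fmt), title)

# Placeholder first, parser function second: #len sees the real title.
late = run_parser_functions(substitute_title(fmt, title))

print(early)  # Main Page (7)
print(late)   # Main Page (9)
```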

Oddly enough though, this does output the title correctly.

Although this attempt to use the value of the #if function as an input to len does not.

And to finish, before anyone goes to too much trouble to work out how to do any of the above (which were just to demonstrate the issue as simply as possible) what I really want to do is this:

Which is to check for the presence of a specific category and if present add its name to the formatted output.

Does anyone have any idea where I'm going wrong here please?

Neutrino 14:47, 17 April 2010 (UTC)

hello
How do I use the explode function? Is it just like

But it doesn't give a result; it just displays what I typed. Shouldn't it give

Functions

but it's giving

So why is it not working? Are these functions actually installed on wikis?

Gman124 19:59, 21 May 2010 (UTC)

How to process &lt;nowiki>such a string&lt;/nowiki> ?
I don't understand the reason for stripping out text in &lt;nowiki> tag from the #len and #sub functions.

I precisely wanted to remove the &lt;nowiki> tags in a template parameter... See m:Help talk:Advanced templates : I think the only solution is to pass the parameter between &lt;nowiki> and remove these tags in the template...

Does anyone have a better idea on how to do this ? Thanks!

--Goulu 07:11, 1 June 2010 (UTC)

Replace to blank
There seems to be no way to replace to blanks: should return not. Any solution? --Subfader 15:07, 30 December 2010 (UTC)
 * Hi. I had the same problem, but as it was for display only I managed to get around it by substituting "&amp;nbsp;", as " " was not taken into account. -- Akarys 06:32, 17 November 2011 (UTC)
 * A rather dirty but working solution would be something like this:, Also, if you don't want the side effect of non-breaking-space " ", you can still use " " instead. --Danwe 12:54, 17 November 2011 (UTC)

Missing #pad: in Parser Functions
Extension:ParserFunctions does not implement the  function as of 2010-11-18. Maiden taiwan 15:35, 18 November 2010 (UTC)

urlencode, not #urlencode

#urlencode doesn't work:

But the same thing without the # does work

I'd fix it but I don't know your policies for style. 97.120.68.8 08:33, 26 January 2011 (UTC)


 * has been deleted from the extension, since MediaWiki's own core  does the job. Even if it were still a part of the StringFunctions Extension, it wouldn't work here, since the extension is not installed on this wiki (see Special:Version). --Theaitetos 20:01, 25 February 2011 (UTC)

urldecode
Does anyone know what happened to the urldecode function? I understand that urlencode was omitted since it's built-in by now:
 * &rarr;
 * &rarr;
 * But urldecode has simply vanished:


 * &rarr;
 * &rarr;
 * --Theaitetos 22:44, 24 February 2011 (UTC)


 * In order to turn on this function, you should add  to the file LocalSettings.php
 * --Karagota 14:32, 02 August 2012 (MSK)


 * It's been a while and I figured it out by now, but thanks anyway. ^^ --Theaitetos (talk) 10:53, 2 August 2012 (UTC)

Add text before and behind string item
Not sure if "implode" would fit, but it should work like this: returns: Foo, Bar, Baz.

Or category links: returns: Foo, Bar, Baz

Would be useful ;) --Subfader 20:43, 3 April 2011 (UTC)


 * Simply use, Extension:ArrayExtension or Extension:Regex Fun. I can't see real use in this function. --Danwe 17:18, 10 November 2011 (UTC)

Return values from StringFunctions or other parser functions are not treated as parsable wiki text. I think they are still htmlized (validated & tidied), but not wikified. So, you can't generate wiki links with them.. -- jsimlo(talk 18:35, 11 November 2011 (UTC)

Nesting a magic word within a string function?
I'm trying to execute the following function on every page that exists for an uploaded file:

So, if I uploaded a file called "File.pdf", on the wiki page for File.pdf, I'd expect to return "File". I'd expect the entire function to return something like: http://mywiki.com/mediawiki/images/4/4d/File.pdf

Instead, nothing at all is returned when PAGENAME is embedded within filepath. If I type the expression without, i.e.  , it works fine and returns: http://mywiki.com/mediawiki/images/4/4d/File.pdf

Is it just not possible to embed magic words in string functions, or am I missing some escape character or something?

Thanks in advance,
 * Hi, Dana. Unfortunately you cannot have a MagicWord inside a MagicWord. Sorries. --Jeffw &bull; (talk) 02:02, 21 January 2012 (UTC)
 * UPDATE: This can be done by simply writing the ".pdf" part is unnecessary.

Color all links
How do you make all the links in a template parameter a certain color other than blue, purple, or red? What combination of StringFunctions finds all the links and colors them? 98.220.232.207 16:11, 3 April 2012 (UTC)


 * Use inline CSS styling, or add the styles to your MediaWiki:Common.css Badon (talk) 23:11, 18 May 2012 (UTC)

First / last word
Imagine a page name is Portal:Astronomy/Events/2024_August.

How can I link to 2nd of  {{lcfirst:{{SUBPAGENAME}}  not to all page name, but to the last word in the subpage name (in this case, 2nd of May). --Diamondland (talk) 10:04, 15 May 2012 (UTC)


 * Could you please tell us more about your issue? What exactly are you trying to do? Where do you want to do this? What does it have to do with String Functions? In case you just want to link to a subpage from a page, check Help:Links. In case you want to extract the name of the last subpage from a given string with a string function, try something like this:  as this should give you 2012 May. --Theaitetos (talk) 00:37, 16 May 2012 (UTC)


 * I think he meant something like that:
 * Code: Link to a new subpage with last word of former subpage
 * Resulting link: Link to a new subpage with last word of former subpage
 * or with already fixed parameters
 * Code: _|-1}}|Link to a new subpage with last word of former link separated by _
 * Resulting link cannot be displayed here. This wiki seems to have a different version of stringfunctions. In my mediawiki @ work, this code does the trick fine. Here, the stringfunction isn't recognized. I tried different methods.
 * Otherwise we would need a proper example, like Theaitetos asked formerly. - Kelrycor (guest)

Wildcard problem in #pos and #rpos
I have a problem with the stringfunctions #rpos or #pos while using a DPL query in our MediaWiki @ work.

The goal is a list of pages with their links to the project teams, but naming the links with the organisation name for the project. The pages are named, for example: HQ\KLM2\Team or HQ\QMS\QMS1\Team. My goal is obtaining KLM2 or QMS1 in the example. The last subpage is always "Team" for the page to be linked to, and the searched word is always 4 chars, so my idea was using #rpos to get the position of "\Team" and take the 4 letters before it. The trick would work fine and would be flexible to any depth of the subpages - but...

#rpos or #pos isn't able to get me the position, because they seem to use the % as a wildcard, not parsing the page's name, for god's sake. But the DPL function uses % for the result variables. %PAGE% would contain for example HQ\QMS\QMS1\Team I searched and found no way telling the stringfunctions to handle the % not as a wildcard, ignoring it and making it possible for the parser to get the right string from the DPL function.

Code-Example (I shortened the format-phrase down to the problem):


 *   results in NULL/notfound
 *   results in 2
 * Magic Words work well, but cannot give me the correct string data:  

Does anyone know a trick? Is there any way to tell the stringfunctions NOT to consider % as a wildcard? Like \% in other computer languages? I would be glad if anyone has a good idea.


 * First of all, I wouldn't use #pos or #rpos. There are better ways to do this. Easiest in my opinion is to use the dpl command "replaceintitle" and put it in like this:  and then you can simply use the %TITLE% variable instead of the %PAGE% variable in your dpl output. If that doesn't work for you, try to use #sub: If the last part of the pagename is always "\Team", then you know that you have to omit the last 5 letters from the pagename, i.e.   and you should have what you want. --Theaitetos (talk) 15:46, 17 July 2012 (UTC)


 * Function #sub would really work, but I need to stack 2 #sub's to "erase" the "/team" and to get the 4 letters of the teamname. Replaceintitle is not usable for the different constellations of the subpages. Finally, I found a stringfunction that does the trick and gives me very easy the part back I need:
 * I replaced the format parameter with  ,\n* ²{#explode:%PAGE%|/|-2}², . It splits the subpages by the delimiter, and because I know the second last part is always the word I search, I can tell him with -2 to display it alone. This way it doesn't matters how many subpages there are, or how many letters the teamname has (at the moment all have 4, but that might change some day). My solution is tight and flexible - and avoids the problem that % is a wildcard ;)
 * But I really appreciated your advice and learnt something from it anyway. Thank you - Kelrycor (guest)
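The two approaches discussed in this thread (dropping a fixed-length suffix, and splitting on the delimiter and taking the second-to-last part) can be compared in a short Python sketch. The `explode` helper is a hypothetical stand-in for the wiki function, and returning an empty string for an out-of-range position is an assumption of this sketch.

```python
def explode(text, delimiter, position):
    """Split text on delimiter and return the part at position.

    Negative positions count from the end, as in the -2 example above.
    """
    parts = text.split(delimiter)
    try:
        return parts[position]
    except IndexError:
        return ""  # assumption: out-of-range positions yield nothing

page = "HQ/QMS/QMS1/Team"

# Splitting approach: works for any depth and any team-name length.
team = explode(page, "/", -2)

# Fixed-length approach: drop the last 5 characters ("/Team").
without_suffix = page[:-5]

print(team)            # QMS1
print(without_suffix)  # HQ/QMS/QMS1
```

The splitting approach needs only one step and does not depend on the team name being exactly 4 characters long.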

Examples, please!
The examples are very welcome, which makes their absence in #urlencode: and #urldecode: all the more painful. Please add some examples there. Please? --Thnidu (talk) 23:44, 5 December 2012 (UTC)
 * urlencode has been integrated into MediaWiki itself; it can therefore be found in the Magic Words help section.
 * urldecode works the other way around: It turns URL encoded strings into readable strings. A character-code-reference can be found here at w3schools.com.
 * --Theaitetos (talk) 19:41, 8 December 2012 (UTC)
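As a rough illustration of what URL decoding does, here is a Python sketch using the standard library. It shows the general percent-decoding behaviour only; the exact output of the wiki's urldecode function for a given input is not confirmed here.

```python
from urllib.parse import unquote, unquote_plus

# Percent-encoded characters are turned back into readable ones.
decoded_path = unquote("a%2Fb%3Fc")

# unquote_plus additionally treats "+" as an encoded space.
decoded_words = unquote_plus("I+Sette+Calici%20fatati")

print(decoded_path)   # a/b?c
print(decoded_words)  # I Sette Calici fatati
```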

help
I'm trying to see if a parameter starts with "17-". What am I doing wrong here?

Thank you! Kwamikagami (talk) 22:39, 6 March 2013 (UTC)


 * First, are you looking for  or  ? If the latter, then you definitely need to remove the two  . Also, I think   is more reliable than , so change it to
 * and let me know if that doesn't work. --Theaitetos (talk) 11:54, 7 March 2013 (UTC)


 * Turns out what I had worked, but maybe because of a caching problem I couldn't see it. I added in the quotes later in various combos in a needless attempt to fix it.
 * Actually, padleft works, but #sub does not. Anyway, at least it's functional with padleft even if the coding isn't optimal.
 * Thanks for your help! Kwamikagami (talk) 19:51, 7 March 2013 (UTC)

Break lines and explode
I'm trying to extract page titles from a Category tree with the explode function but it won't work; it doesn't recognise  as a separator. I know I can't use a space because that might conflict with page titles.
 * Can you link me to your wiki or give some more details on the matter? Maybe there is another function or a workaround, but right now I am not sure what exactly you have, that needs to be #explode-d. --Theaitetos (talk) 16:50, 6 April 2013 (UTC)