Extension:ParserFunctions/String functions/de

The StringFunctions extension provides an additional set of parser functions that operate on strings. Version 2.0 fixes problems with <nowiki> and removes the need to have the PHP mbstring extension installed on the server.

Functions
This module defines the functions len, pos, rpos, sub, pad, replace, explode, urlencode, and urldecode.

All of these functions run in $$\mathcal{O}(n)$$ time, which makes them safe against DoS attacks.


 * 1) To prevent abuse, some parameters are subject to limits. See the section on limits below.
 * 2) For functions whose case sensitivity is unwanted, a case-converting magic word can be applied to the string as a workaround.

#len:
The #len function returns the length of the given string. Example:


 * → 8


 * Trailing spaces are not counted. Example:  → 8
 * Characters are counted correctly even when they occupy multiple bytes in UTF-8. Example:  → 8.
 * Tags such as <nowiki> are not counted, along with their content, because they are hidden from the parser. Example:  → 4.
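
The counting rules above can be mirrored in a few lines of Python. This is only an illustrative sketch, not the extension's PHP code: the helper name `wiki_len` and the sample strings are assumptions, and tag handling is not modeled.

```python
def wiki_len(text: str) -> int:
    """Count characters (code points), ignoring trailing spaces."""
    return len(text.rstrip(" "))


sample = "Žmržlina"  # hypothetical sample: 8 characters, 10 bytes in UTF-8

print(wiki_len(sample))             # character count
print(len(sample.encode("utf-8")))  # byte count, shown for contrast
print(wiki_len("Icecream   "))      # trailing spaces are ignored
```

The point of the contrast is that a byte-based length would overcount any string with multibyte characters, whereas a character-based length does not.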

#pos:
The #pos function returns the position of a given search term within the string.



The start-position parameter is optional and tells the function where in the string to begin searching.

If the search term is found, the return value is a zero-based integer of the first position within the string. If the search term is not found, the function returns an empty string.


 * This function is case-sensitive.
 * The maximum allowed length of the search term is limited through the $wgStringFunctionsLimitSearch global setting.
 * This function is safe with UTF-8 multibyte characters. Example:  returns 4.
 * As with #len, <nowiki> and other tag extensions are treated as having a length of 1 for the purposes of character position. Example:   returns 1.
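
As a rough illustration of the return convention described above (a zero-based position, or an empty string when nothing is found), here is a minimal Python sketch. The function name `pos`, its `offset` parameter name, and the sample inputs are assumptions made for this example; the search-term limit and tag handling are not modeled.

```python
def pos(text: str, search: str, offset: int = 0) -> str:
    """Return the zero-based position of the first match as a string,
    or an empty string if the search term does not occur."""
    index = text.find(search, offset)
    return "" if index < 0 else str(index)


print(pos("Žmržlina", "lina"))     # position counted in characters, not bytes
print(repr(pos("Icecream", "x")))  # not found: empty string
print(pos("banana", "an", 2))      # search starts at position 2
```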

#rpos:
The #rpos function returns the last position of a given search term within the string. The syntax is:

If the search term is found, the return value is a zero-based integer of its last position within the string. If the search term is not found, the function returns -1.

Tip: When using this to search for the last delimiter, add 1 to the result to get the position just after the last delimiter. This also works when the delimiter is not found, because -1 + 1 is zero, which is the beginning of the given value.


 * This function is case-sensitive.
 * The maximum allowed length of the search term is limited through the $wgStringFunctionsLimitSearch global setting.
 * This function is safe with UTF-8 multibyte characters. Example:  returns 4.
 * As with #len, <nowiki> and other tag extensions are treated as having a length of 1 for the purposes of character position. Example:   returns 1.
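
The tip about adding 1 to the result can be sketched in Python, whose `str.rfind` happens to share the -1 convention. The names `rpos` and `after_last` and the sample strings are hypothetical and only serve this illustration.

```python
def rpos(text: str, search: str) -> int:
    """Return the zero-based position of the last match, or -1 if absent."""
    return text.rfind(search)


def after_last(text: str, delimiter: str) -> str:
    """Return everything after the last delimiter.
    If the delimiter is missing, -1 + 1 == 0 yields the whole string."""
    return text[rpos(text, delimiter) + 1:]


print(rpos("a/b/c.txt", "/"))
print(after_last("a/b/c.txt", "/"))
print(after_last("file", "/"))
```

Note that the shortcut relies on a single-character delimiter; for a longer delimiter the offset to add would be the delimiter's length, and the not-found case would need separate handling.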

#sub:
The #sub function returns a substring from the given string. The syntax is:

The start parameter, if positive (or zero), specifies a zero-based index of the first character to be returned. Examples:  returns cream;  returns Ice.

If the start parameter is negative, it specifies how many characters from the end should be returned. Example:  returns eam.

The length parameter, if present and positive, specifies the maximum length of the returned string. Example:  returns cre.

If the length parameter is negative, it specifies how many characters will be omitted from the end of the string. Example:  returns cr.


 * If the length parameter is zero, it is not used for truncation at all.
 * Example:  returns cream,   returns Ice
 * If start denotes a position beyond the truncation from the end by a negative length parameter, an empty string will be returned.
 * Example:  returns an empty string.
 * This function is safe with UTF-8 multibyte characters. Example:  returns žlina.
 * As with #len, <nowiki> and other tag extensions are treated as having a length of 1 for the purposes of character position. Example:  returns test.
 * If your string contains a colon ("like:this"), removing only the text that precedes the colon leaves a result that begins with a colon, which has the effect of putting the remaining text indented on a new line.
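
The interplay of positive, negative, and zero values for start and length is easier to follow in code. The following Python sketch mirrors the rules stated above; it is not the extension's implementation. The function name `sub` and the input "Icecream" are assumptions (the input is inferred from the outputs quoted in this section), and tag handling is not modeled.

```python
def sub(text: str, start: int = 0, length: int = 0) -> str:
    """Return a substring following the start/length rules described above."""
    n = len(text)
    # Negative start counts from the end of the string.
    begin = start if start >= 0 else max(n + start, 0)
    if length > 0:
        end = begin + length  # at most `length` characters
    elif length < 0:
        end = n + length      # omit characters from the end
    else:
        end = n               # zero: no truncation
    if begin >= end:
        return ""             # start lies beyond the truncation point
    return text[begin:end]


print(sub("Icecream", 3))      # cream
print(sub("Icecream", -3))     # eam
print(sub("Icecream", 3, 3))   # cre
print(sub("Icecream", 3, -3))  # cr
```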

#pad:
The #pad function returns the given string extended to a given width. The syntax is:

The length parameter specifies the desired length of the returned string.

The padstring parameter, if specified, is used to fill the missing space. It may be a single character, which will be used as many times as necessary, or a string, which will be concatenated as many times as necessary and then trimmed to the required length. Example:  returns xXxXxXxIce.

If the padstring is not specified, spaces are used for padding.

The direction parameter, if specified, can be one of these values:
 * left - the padding will be on the left side of the string. Example:  returns xxIce.
 * right - the padding will be on the right side of the string. Example:  returns Icexx.
 * center - the string will be centered in the returned string. Example:  returns xIcex.

If the direction is not specified, the padding will be on the left side of the string.

The return value is the given string extended to length characters, using the padstring to fill the missing part(s). If the given string is already longer than length, it is neither extended nor truncated.


 * The maximum allowed value for the length is limited through the $wgStringFunctionsLimitPad global setting.
 * This function is only partially safe with utf-8 multibyte characters. These characters will be treated appropriately if they appear in the original string, but will not be respected if they appear in the padding. Examples:
 * returns zzzzZmrzlina
 * returns zzzzŽmržlina
 * returns žžŽmržlina
 * Tags such as <nowiki> and other tag extensions are not permitted in the padding. If the padstring contains such a tag, it will be truncated.
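
The padding rules can be summarized in a short Python sketch. It is an illustration under stated assumptions, not the extension's code: the function name `pad`, the input "Ice", and the chosen lengths are inferred from the outputs quoted above, the split of an odd amount of center padding (extra character on the right) is an assumption, and neither the $wgStringFunctionsLimitPad limit nor the multibyte caveat is modeled.

```python
def pad(text: str, length: int, padstring: str = " ", direction: str = "left") -> str:
    """Extend text to `length` characters; never truncate it."""
    missing = length - len(text)
    if missing <= 0 or not padstring:
        return text  # already long enough, or nothing to pad with

    def fill(count: int) -> str:
        # Repeat the padstring as often as needed, then trim to size.
        repeats = count // len(padstring) + 1
        return (padstring * repeats)[:count]

    if direction == "right":
        return text + fill(missing)
    if direction == "center":
        left = missing // 2  # assumption: any extra character goes right
        return fill(left) + text + fill(missing - left)
    return fill(missing) + text  # default: pad on the left


print(pad("Ice", 10, "xX"))
print(pad("Ice", 5, "x", "right"))
print(pad("Ice", 5, "x", "center"))
```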

#replace:
The #replace function returns the given string with all occurrences of a search term replaced with a replacement term.

If the search term is unspecified or empty, a single space will be searched for.

If the replacement term is unspecified or empty, all occurrences of the search term will be removed from the string.


 * This function is case-sensitive.
 * The maximum allowed length of the search term is limited through the $wgStringFunctionsLimitSearch global setting.
 * The maximum allowed length of the replacement term is limited through the $wgStringFunctionsLimitReplace global setting.
 * Even if the replacement term is a space, an empty string is used. This is a side-effect of the MediaWiki parser. To use a space as the replacement term, put it in nowiki tags.
 * Example:  returns My little home page.
 * Note that this is the only acceptable use of nowiki in the replacement term, as otherwise nowiki could be used to bypass $wgStringFunctionsLimitReplace, injecting an arbitrarily large number of characters into the output. For this reason, all occurrences of <nowiki> or any other tag extension within the replacement term are replaced with spaces.
 * This function is safe with UTF-8 multibyte characters. Example:  returns Žmrzlina.

The syntax currently provides no switch to toggle case sensitivity. As a workaround, you can apply a case-converting magic word to the string before replacing. For example, to remove the word "Category:" from a string regardless of its case, you may type:
 * Case insensitive replace

The disadvantage is that the output becomes all lower case. If you want to keep the original casing after replacement, you have to use multiple levels of nesting (i.e. multiple replace calls) to achieve the same thing.
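
The lowercase workaround and its drawback can be shown with a small Python sketch. The function name `replace` and all sample strings are assumptions for illustration; the length limits and the nowiki rule for spaces, which stem from the MediaWiki parser, are not modeled.

```python
def replace(text: str, search: str = " ", replacement: str = "") -> str:
    """Replace every occurrence of `search`; an empty search term means a space."""
    if search == "":
        search = " "
    return text.replace(search, replacement)


title = "CATEGORY:Help pages"  # hypothetical input

# A case-sensitive search misses the differently cased prefix.
print(replace(title, "Category:"))

# Workaround: lowercase first, then search for the lowercase term.
# The prefix is removed, but the rest of the output is lowercased too.
print(replace(title.lower(), "category:"))

# Nested calls keep the casing by handling each spelling explicitly.
print(replace(replace(title, "Category:"), "CATEGORY:"))
```

The nested variant only covers the spellings that are listed, which is why it needs one call per expected casing.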

#explode:
The #explode function splits the given string into pieces and then returns one of the pieces. The syntax is:

The delimiter parameter specifies a string to be used to divide the string into pieces. This delimiter string is then not part of any piece, and when two delimiter strings are next to each other, they create an empty piece between them. If this parameter is not specified, a single space is used.

The position parameter specifies which piece is to be returned. Pieces are counted from 0. If this parameter is not specified, the first piece is used (piece with number 0). When a negative value is used as position, the pieces are counted from the end. In this case, piece number -1 means the last piece. Examples:
 * returns you.
 * returns Code.
 * returns Percentage.

The return value is the position-th piece. If there are fewer pieces than the position specifies, an empty string is returned.


 * This function is case-sensitive.
 * The maximum allowed length of the delimiter is limited through the $wgStringFunctionsLimitSearch global setting.
 * This function is safe with UTF-8 multibyte characters. Example:  returns lina.
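
The piece-counting rules (zero-based positions, negative positions counted from the end, empty pieces between adjacent delimiters) can be sketched in Python as follows. The function name `explode` and the sample input are hypothetical, and returning an empty string for an out-of-range negative position is an assumption, since the text above only describes the case of too few pieces in general terms.

```python
def explode(text: str, delimiter: str = " ", position: int = 0) -> str:
    """Split text at the delimiter and return the piece at `position`."""
    if delimiter == "":
        delimiter = " "
    pieces = text.split(delimiter)
    # Accept positions 0..len-1 and, counted from the end, -1..-len.
    if -len(pieces) <= position < len(pieces):
        return pieces[position]
    return ""


sample = "a,b,,c"  # hypothetical input with two adjacent delimiters

print(explode(sample, ","))            # first piece
print(repr(explode(sample, ",", 2)))   # empty piece between the two commas
print(explode(sample, ",", -1))        # last piece
print(repr(explode(sample, ",", 9)))   # fewer pieces than requested
```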

#urlencode: and #urldecode:
These two functions operate in tandem: #urlencode converts a string into a URL-safe syntax, and #urldecode converts such a string back. The syntax is:


 * These functions work by directly exposing PHP's urlencode and urldecode functions.
 * For anchors within a page use instead of  . The results of a call to  are compatible with intra-page references generated with  syntax, while  -generated values are not necessarily so.
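
To illustrate the round trip, the sketch below uses Python's `urllib.parse.quote_plus` and `unquote_plus`, which behave similarly to PHP's urlencode and urldecode in that spaces become plus signs and reserved characters become percent escapes. This is an analogue only: the two libraries do not agree on every character (the tilde is one known difference), and the sample string is an assumption.

```python
from urllib.parse import quote_plus, unquote_plus

original = "a b&c=d"  # hypothetical sample string

encoded = quote_plus(original)
decoded = unquote_plus(encoded)

print(encoded)              # URL-safe form
print(decoded == original)  # decoding restores the input
```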

Limits
This module defines three global settings:
 * $wgStringFunctionsLimitSearch
 * $wgStringFunctionsLimitReplace
 * $wgStringFunctionsLimitPad

These are used to limit some parameters of some functions to ensure the functions operate in O(n) time complexity, and are therefore safe against DoS attacks.

$wgStringFunctionsLimitSearch
This setting is used by #pos, #rpos, #replace, and #explode. All these functions search for a substring in a larger string while they operate, which can run in O(n*m) and therefore make the software more vulnerable to DoS attacks. By setting this value to a specific small number, the time complexity is decreased to O(n).

This setting limits the maximum allowed length of the string being searched for.

The default value is 30 multibyte characters.
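
To see why bounding the search term helps, assume a naive substring search (an assumption for illustration; this page does not say which algorithm is used). For a string of length $n$, a search term of length $m$ with $1 \le m \le n$, and a limit $L$ on $m$, the number of character comparisons is bounded by:

$$(n - m + 1)\cdot m \;\le\; n \cdot m \;\le\; L \cdot n \;=\; \mathcal{O}(n) \quad \text{for fixed } L.$$

With the default limit of 30, such a search would need at most $30n$ comparisons, which grows linearly with the length of the string.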

$wgStringFunctionsLimitReplace
This setting is used by #replace. This function replaces all occurrences of one string with another, which can be used to quickly generate very large amounts of data, and therefore makes the software more vulnerable to DoS attacks. This setting limits the maximum allowed length of the replacing string.

The default value is 30 multibyte characters.

$wgStringFunctionsLimitPad
This setting is used by #pad. This function creates a string of the specified length, which can be used to quickly generate very large amounts of data, and therefore makes the software more vulnerable to DoS attacks. This setting limits the maximum allowed length of the resulting padded string.

The default value is 100 multibyte characters.

Installation
This extension requires MediaWiki 1.7+ and PHP 5.

Instructions
Do the following to install these functions as an extension to MediaWiki.
 * 1) Copy the source code from SVN or, if you have shell access, check it out directly with: svn co http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/StringFunctions/
 * 2) Add the following to LocalSettings.php (near the bottom) in the root of your MediaWiki installation:

Source code
StringFunctions 2.0 has been tested on MediaWiki versions 1.7 and above.

History:
 * Nov 30, 2008 -- v2.0.3 -- Fix for parser changes in MediaWiki 1.14
 * Oct 27, 2008 -- v2.0.2 -- Internationalization added for extension description
 * Aug 27, 2008 -- v2.0.1 -- Regexp handling added for #len
 * May 11, 2008 -- v2.0 -- StringFunctions now maintained in SVN.
 * Dec 10, 2007 -- v2.0 -- Rewrote the functions to eliminate dependence on mbstring and to properly handle <nowiki> and other tags.
 * Aug 28, 2007 -- v1.10 -- Added negative positions to #explode.
 * Jan 30, 2007 -- v1.9 -- Added limits to #pos, #pad, #replace, #explode.
 * Oct 30, 2006 -- v1.8 -- Fixed for MediaWiki 1.8.
 * Oct 26, 2006 -- v1.6 -- Fixed handling of spaces in #rpos and #replace.
 * Oct 1, 2006 -- v1.5 -- Added #rpos, #pad, #replace, #explode.
 * May 18, 2006 -- v1.2 -- Renamed #toURL and #fromURL to match the MediaWiki function {{urlencode:}}.
 * May 18, 2006 -- v1.1 -- Added #pos.
 * May 15, 2006 -- v1.0 -- First stable version.

See also

 * DynamicFunctions
 * Extension:StringFunctionsEscaped - Functions that also allow you to use escaped characters (such as \n, \t, …)
 * MultiReplace - an excellent substitute for using nested #replace commands when you need to perform a sequence of replaces on a single text string.
 * ParserFunctions
 * Help for ParserFunctions
 * RegexParserFunctions
 * VariablesExtension