Extension:StackFunctions

What the StackFunctions Extension Is
The StackFunctions extension implements a programming language which is basically PostScript without graphics.

What It Is Useful For
This extension can be considered as an alternative to the combination of ParserFunctions with StringFunctions (and maybe other extensions). Advantages are:
 * When using conditional expressions with ParserFunctions, wikitext in the false-branch is parsed before evaluating the condition. If that text contains complex templates, such superfluous parsing may take much time. With StackFunctions, wikitext is parsed only if needed for the output.
 * To execute loops, you would need either inefficient auxiliary templates with a limited number of runs or LoopFunctions which also have limits. StackFunctions is able to execute loops with any number of iterations and any depth of nesting in a relatively efficient way.
 * If you have text which on the one hand you'd like to display between ..  and on the other hand to use as an argument to ParserFunctions, you need to build more or less complex structures using ExpandAfter. StackFunctions can handle this in a much easier way.
 * StackFunctions offer a Turing-complete programming language which can be used to implement any algorithm (with more or less effort in a more or less efficient fashion) without need for additional extensions.
 * StackFunctions easily handle complex data structures like arrays and dictionaries, which may also be nested to any level.
 * My personal experience is that StackFunctions generally tend to execute faster than a combination of other existing extensions, especially when the latter solution would need a large number of auxiliary templates which are evaluated many times.

How To Install It

 * Ensure your PHP has the multibyte (mbstring) functions enabled. If you cannot enable them, you can do without multibyte support: in that case, delete all occurrences of "mb_" from the extension source code.


 * Save the extension source code as a file extensions/StackFunctions/StackFunctions.php.


 * Apply the patch to Parser.php.

 * Add the following line to your LocalSettings.php:

require_once( "extensions/StackFunctions/StackFunctions.php" );

 * If you want to enable database querying, also add the following line to your LocalSettings.php:

$wfStackFunctionsEnableQuery = true;

How To Use It
Preliminary Remark: programming with a stack processor like this is a matter of taste. You might find it either extremely cool or totally unusable. You have been warned.

Syntax
You can use this extension either as a parser function extension or as a parser extension.

Parser Function Extension
As a parser function extension, the syntax is:

{{#sf: your_stackfunctions_code }}

When using this syntax, you must pay attention to some issues. The basic rule is that any {{ .. }} structures are parsed by the MediaWiki parser before any StackFunctions code is executed. There are a number of consequences:


 * Take care not to have any {{ or }} in your code. If you have consecutive braces, put a space in between.


 * You cannot use a literal | character inside {{#sf: .. }} because the parser would interpret it as a separator for an additional parameter passed to #sf:. Within a string constant, you can write its octal representation \174 instead.


 * If you put a template, magic constant or parser function in a string literal, it will first be evaluated and then the result passed to the StackFunctions code. Use parsetemplate/showtemplate if you want to evaluate it only while executing the StackFunctions code. In particular, you should do so within conditional branches so that the parser spends time only on those templates which are actually needed.


 * When using HTML tags within string literals, take into account that the MediaWiki parser will interpret them before invoking StackFunctions, which is probably not what you want. Use \074 and \076 instead of < and > to avoid this.
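
The precautions above can be sketched as follows. This is only an illustration of code written to be safe inside the parser function syntax; it assumes that % comments work as in standard PostScript, which this page does not confirm.

```postscript
% Octal escapes keep |, < and > away from the MediaWiki parser.
(left \174 right) show        % the string contains: left | right
(1 \074 2 and 3 \076 2) show  % the string contains: 1 < 2 and 3 > 2

% Nested procedures: keep a space between consecutive braces,
% so that the parser never sees two braces in a row.
true { { (nested) show } exec } if
```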

Parser Extension
As a parser extension, the syntax is:

<sf> your_stackfunctions_code </sf>

This syntax is the preferable one because:


 * You don't have to bother with the problems listed above for the parser function syntax.
 * In theory, this is likely to execute faster than the parser function syntax because the code is not parsed before it is passed to StackFunctions. (However, I have not yet written enough code to confirm this from observation.)

If you want to evaluate magic words, templates or parser functions, you can do this within the code using the members of statusdict and the parsetemplate/showtemplate operators.

The only thing that (currently) cannot be done with this syntax is the use of template parameters {{{1}}}, {{{2}}}, ...
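
As a first impression of what such code looks like, here is a minimal sketch meant to be placed between <sf> and </sf>. It uses only standard PostScript operators plus show; the variable names i and j are arbitrary, and % comments are assumed to work as in standard PostScript.

```postscript
% Print the squares of 1 to 5, separated by spaces.
1 1 5 {              % for pushes the loop counter on each pass
    dup mul show     % square it and write it to the page
    ( ) show
} for

% Nested loops with a condition: list all pairs i-j with i < j.
1 1 3 {
    /i exch def
    1 1 3 {
        /j exch def
        i j lt { i show (-) show j show ( ) show } if
    } for
} for
```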

About PostScript
StackFunctions are basically an implementation of PostScript without graphical operators, with a few modifications and with some MediaWiki-specific extensions as explained in detail below. I'm not going to explain PostScript here; you might refer to the following:
 * The PostScript Wikipedia article for a very basic introduction.
 * The PostScript Language Tutorial and Cookbook for a good tutorial.
 * The PostScript Language Reference, third edition for a complete reference.

Implemented PostScript Operators
Chapter 8.1 of the PostScript Language Reference, third edition gives a summary of PostScript operators by category. The following are implemented in StackFunctions:


 * Operand Stack Manipulation Operators: all.
 * Arithmetic and Math Operators: all except rrand.
 * Array Operators: all.
 * Dictionary Operators: all except maxlength, errordict, $error, globaldict.
 * String Operators: all except token.
 * Relational, Boolean, and Bitwise Operators: all.
 * Control Operators: all except stop, stopped, countexecstack, execstack, quit, start.
 * Type, Attribute, and Conversion Operators: all except executeonly, noaccess, readonly, rcheck, wcheck, cvrs.
 * Miscellaneous Operators: all except executive, echo, prompt.

In addition, the show operator has been implemented: it simply outputs its argument to the MediaWiki parser. In other words, the argument of show must be wikitext (not HTML), which may contain any kind of wiki features, including templates etc.
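
A short sketch of show in use, intended for the <sf> syntax. The page name Main Page is just an example, and % comments are assumed to work as in standard PostScript.

```postscript
% The argument is wikitext, so wiki markup is rendered as usual.
(This is '''bold''' and this is a link to [[Main Page]].) show

% Walk through an array and output each element.
[ (alpha) (beta) (gamma) ] { show ( ) show } forall

% Numbers can be shown directly.
6 7 mul show
```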

Differences from PostScript
There are a few things I implemented differently from PostScript because I believe that this way they better fit the needs of the MediaWiki developer:


 * The whole implementation supports multibyte character sets.
 * The show operator accepts any kind of argument (even though only strings and numbers provide useful results).
 * The string versions of get and put accept one-character strings instead of ASCII numbers. Otherwise it would be difficult to cope with multibyte characters.
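
The string behaviour described above might look like this in practice. This is a sketch based on the description only: it assumes that get on a string yields a one-character string, and that % comments work as in standard PostScript.

```postscript
% get is expected to yield a one-character string, not a character code.
(héllo) 1 get show

% The result can be compared with another one-character string.
(héllo) 1 get (é) eq { ( matches) show } if

% show accepts numbers as well as strings.
3.5 show
```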

Other things are not (yet) implemented because it would require some effort to implement them while I consider them less useful for the MediaWiki developer:


 * Radix numbers, such as 8#1777 16#FFFE 2#1000, are currently not supported.
 * Literal string objects can be specified as (..) only. Hexadecimal data, enclosed in < and >, and ASCII base-85 data, enclosed in <~ and ~>, are currently not supported.
 * Single escaped parentheses within strings are currently not supported (because that would make the parser more complex and probably slower). Use \050 and \051 if you need unbalanced parentheses within strings.
 * The executable attribute has been implemented for arrays only, and the readable/writable attributes have not been implemented at all. I think there is little point in such attributes for MediaWiki programming, and the only way to implement them would have been to represent any kind of data (including numbers and strings) as arrays in PHP. This would have made the StackFunctions code larger and slower.
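
The workarounds for the first three limitations can be sketched like this. That balanced parentheses may appear unescaped inside a string is an assumption taken from standard PostScript, as is the use of % comments.

```postscript
% Radix literals such as 16#FF are not available: write the decimal value.
255 show

% Unbalanced parentheses: use the octal escapes \050 and \051.
( A smiley :-\051 ) show

% Balanced parentheses are assumed to need no escaping.
(f(x)) show
```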

Finally, some differences are due to the nature of PHP which is different from what a PostScript engine needs:
 * The test for equality of composite objects checks whether the elements contain the same values, not whether they refer to the same object. I wouldn't know how to implement the latter in PHP.
 * Substrings are not part of string objects, but independent objects. This implies, for instance, that the copy operator for strings leaves on the stack a new string rather than a substring of the original string.
 * The operator serialnumber returns the IP address of the webserver as a string. If you can think of any more useful usage for this operator, please let me know.
 * The loop operators for, forall, loop and repeat perform a bind on their procedure argument. If other copies of the procedure exist on the stack or in some dictionary, they will reflect this. I cannot imagine a reasonable application where this behaviour would cause a problem.

For all these differences, comments and suggestions are welcome.

Almost all typechecks have been implemented as in PostScript. Furthermore, the concept that composite objects on the stack are references has been implemented just as in PostScript. For instance, operators like dup create a new reference to the same composite object rather than copying a value.
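
Both the reference semantics and the value-based equality test can be observed with a small sketch like the following, which follows the descriptions above. The % comments are assumed to work as in standard PostScript.

```postscript
% dup creates a second reference to the same array, so a put through
% one reference is visible through the other: this shows 99.
[ 1 2 3 ] dup 0 99 put
0 get show

% Two distinct arrays with the same contents: StackFunctions compares
% the elements, so this should take the first branch. In standard
% PostScript, eq on two distinct arrays yields false.
[ 1 2 ] [ 1 2 ] eq { ( equal) show } { ( different) show } ifelse
```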

Prologs

 * prolog : string prolog –
 * Execute StackFunctions code stored in the page indicated in string. The page should contain nothing but code, optionally enclosed in  ..  which are ignored. If prolog is executed several times with the same argument, only the first one is evaluated. This saves parsing time when a template containing StackFunctions code is used many times on the same page: then you can store definitions (for instance, macros and dictionaries) on a separate page which is parsed only once. Note that for reasons of performance, "same" argument means literal identity; two different arguments which refer to the same page are recognized as different.


 * Prolog pages are searched in the project namespace by default if no explicit namespace is specified.


 * As the mechanism to find a prolog page is the same as that to find a template, prolog pages are listed in "Templates used on this page:" when you edit a page.
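
A typical use might look like the sketch below. The page name SfCommon and the macro greet are hypothetical: the example assumes a page SfCommon in the project namespace that contains only the definition shown in the first comment.

```postscript
% Assumed content of the hypothetical prolog page SfCommon:
%   /greet { (Hello, ) show show (!) show } def

(SfCommon) prolog     % first call on this page: the prolog page is executed
(World) greet         % uses the macro defined there

(SfCommon) prolog     % same literal argument again: not evaluated a second time
(Wiki) greet
```

Because only literally identical arguments are recognized as the same, it is best to spell the page name the same way everywhere.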

Template Evaluation

 * parsetemplate : simple string parsetemplate string, or array string parsetemplate string, or dict string parsetemplate string
 * In the first form, simple denotes an argument of any type which is neither an array nor a dictionary. As a result, the string {{string|simple}} is passed to the parser for template substitution and the result pushed on the stack.
 * In the second form, the same is done with {{string|any0|..|anyn}} where any0 .. anyn are the elements of the array.
 * In the third form, the same is done with {{string|key0=val0|..|keyn=valn}} where key0 .. keyn are the keys and val0 .. valn the corresponding values in the dictionary.
 * In all three forms, the result is evaluated by the parser before execution of StackFunctions code continues. This means that you can examine the result to see what is substituted. Note that parsetemplate also works with parser functions.


 * showtemplate : simple string showtemplate –, or array string showtemplate –, or dict string showtemplate –
 * This works the same way as parsetemplate; the difference is that the result, instead of being pushed onto the stack, is written to the output.
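
The three forms might be used as in the following sketch. All template names (Greeting, FormatDate, UserCard, ExpensiveBox) and the variable verbose are hypothetical. The exact wikitext built from the arguments, and the behaviour of an empty array, are assumptions based on the descriptions above.

```postscript
% Simple form: one positional parameter; the result lands on the stack.
(World) (Greeting) parsetemplate show

% Array form: the elements become positional parameters.
[ (2024) (01) (15) ] (FormatDate) showtemplate

% Dictionary form: keys and values become named parameters.
<< /name (Alice) /role (admin) >> (UserCard) showtemplate

% Inside a conditional, only the branch that is taken calls the parser,
% so the template in the other branch costs no parsing time.
/verbose true def
verbose
    { [ ] (ExpensiveBox) showtemplate }   % empty array: assumed to mean no parameters
    { (details omitted) show }
ifelse
```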

Database Querying
Database querying gives you direct access to the database MediaWiki runs on. This might not be particularly useful on a basic MediaWiki installation. It becomes interesting when you want to display other data stored in the same database and accessible to the MediaWiki database user, or in connection with MediaWiki extensions which store data in additional tables.


 * query : dict query array true, or dict query false
 * Query the MediaWiki Database with a SQL statement. The dictionary may contain the following keys:


 * Hence, the only required key is /from. The result is returned as an array of rows, where each row is either an array, a dictionary (where keys are column names) or a simple type, depending on the value of /return. Note that in the latter case only the first column is considered, and it is converted to the requested type.
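
A minimal call could look like the sketch below. It uses only the required key /from; that its value is a table name given as a string, and that the table is called page, are assumptions. The code relies only on the documented result shape: an array followed by true, or false alone.

```postscript
<< /from (page) >> query
    { length show ( rows returned) show }   % array true: the array is on top after ifelse pops the boolean
    { (query returned false) show }         % false: nothing else was left on the stack
ifelse
```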

The System Dictionary
As in PostScript, the system dictionary contains the definitions of all built-in operators.

The Status Dictionary

 * For each magic word in MediaWiki, there is a key in the status dictionary. When executed, it leaves on the stack the current value of this magic word. For instance,

statusdict /pagename get exec


 * supplies the current page name. It would clearly be more elegant if the key were already associated with the current value itself, but I wouldn't know how to implement this efficiently.
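
Since the lookup always follows the same pattern, it can be wrapped in a small helper. The name magic below is hypothetical and not part of the extension; the sketch also assumes that % comments work as in standard PostScript.

```postscript
% Helper: take a key, look it up in statusdict and execute the entry.
/magic { statusdict exch get exec } def

(You are reading ) show
/pagename magic show
```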

Design Considerations

 * Remain as close as possible to the PostScript programming language. As that language has been developed and used for many years, it has achieved a high degree of conceptual soundness and completeness. My personal experience until now confirms that anything you need can be expressed with the given operators and underlying concepts.


 * A drawback is that some parts, first of all string handling, are rather cumbersome, while it would be easy to create something much simpler with PHP. But it is less easy to do this in a sound and complete way that is easy to understand, to document and to use. As a compromise, future versions of StackFunctions are likely to contain additional string functions for purposes such as concatenation and replacement.


 * Create an extension that executes fast. As StackFunctions is an interpreter which itself runs in an interpreted language, it is a priori slow. As my personal experience tells me that the usefulness of a MediaWiki installation heavily depends on the time it takes to display pages, I tried to write StackFunctions so that it executes as fast as possible; suggestions for further enhancement are particularly welcome. This implies relatively poor debugging support (for instance, when an error occurs, the stack is shown after the error has already occurred, so the arguments which caused the error are no longer shown).

Representation of Data
Booleans, numbers, strings and null are implemented with the corresponding PHP types. Arrays, dictionaries, marks, names and built-in operators are represented by arrays where the component "t" shows the type, the component "v" the value (an array or string), and the component "a" in some cases additional arguments.