Extension:StackFunctions/Reference

Implemented PostScript operators
Chapter 8.1 of the PostScript Language Reference, third edition gives a summary of PostScript operators by category. The following are implemented in StackFunctions:


 * Operand Stack Manipulation Operators: all.


 * Arithmetic and Math Operators: all except rrand.


 * Array Operators: all.


 * Dictionary Operators: all except maxlength, errordict, $error, globaldict.


 * String Operators: all.


 * Relational, Boolean, and Bitwise Operators: all.


 * Control Operators: all except stop, stopped, countexecstack, execstack, quit, start.


 * Type, Attribute, and Conversion Operators: all except executeonly, noaccess, readonly, rcheck, wcheck, cvrs.


 * Miscellaneous Operators: all except executive, echo, prompt.

In addition, the following operators are implemented:


 * show simply outputs its argument to the MediaWiki parser. In other words, the argument of show must be wikitext (not HTML), which may contain any kind of wiki feature, including templates.


 * = and ==, which behave very similarly to their Ghostscript counterparts.

Differences from PostScript
The implementation basically follows PostScript concepts. In particular, almost all typechecks have been implemented as in PostScript. Furthermore, the concept that composite objects on the stack are references has been implemented just as in PostScript. For instance, operators like dup create a new reference to the same composite object rather than copying a value.
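The reference semantics can be modelled in a few lines. This is only a sketch in Python, not the extension's PHP code; the list-based stack and the `dup` helper are hypothetical stand-ins chosen for illustration.

```python
# Hypothetical model of an operand stack; not the extension's actual code.
stack = []

def dup(stack):
    """Push a second reference to the top object; no copy is made."""
    stack.append(stack[-1])

stack.append([1, 2, 3])   # a composite object (an array)
dup(stack)

# Modify the array through the top reference, as a put on the
# duplicated array would do.
stack[-1][0] = 99

same_object = stack[0] is stack[1]   # both entries refer to one array
first_element_below = stack[0][0]    # the change is visible via the other reference
```

Because both stack entries refer to the same array, a change made through one of them is visible through the other.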

The following is a complete list of differences I'm currently aware of.

Unsupported features
Some things are not (yet) supported because it would require some effort to implement them while I consider them less useful for the MediaWiki developer:


 * Literal string objects can be specified as (..) or <..> only. ASCII base-85 data, enclosed in <~ and ~> is currently not supported.


 * Radix numbers, such as 8#1777 16#FFFE 2#1000, are not supported.


 * The executable attribute has been implemented for arrays, names and strings only, and the readable/writable attributes have not been implemented at all. I think there is little point in such attributes for MediaWiki programming, and the only way to implement them would have been to represent any kind of data (including numbers) as arrays in PHP. This would have made the StackFunctions code significantly larger and slower.

Differently implemented features
There are a few things I implemented differently from PostScript because I believe they fit the needs of the MediaWiki developer much better this way:


 * The whole implementation supports multibyte character sets.


 * The show operator accepts any kind of argument (even though only strings and numbers provide useful results).


 * The string versions of get and put accept one-character strings instead of ASCII numbers. Otherwise it would be difficult to cope with multibyte characters.


 * The realtime and usertime operators return floats instead of integers, thus providing higher precision. As most operators don't distinguish between integers and floats, the deviation from the PostScript standard is minimal.

Some differences are due to the nature of PHP which is different from what a PostScript engine needs:


 * The executable attribute is part of the object itself, except for strings. For instance, if you make an array executable using the cvx operator, any other reference to the array will become executable as well. To implement this differently, all PHP code which copies objects would need to be type-dependent, making the whole code much larger and slower. This does not apply to strings because strings and executable strings are stored differently (see Internals).


 * Test for equality of composite objects checks whether the elements contain the same values, not whether they refer to the same object. I wouldn't know how to implement the latter in PHP.


 * Substrings are not part of string objects, but independent objects. This implies, for instance, that the copy operator for strings leaves on the stack a new string rather than a substring of the original string. I don't know any way to efficiently implement the PostScript behaviour.


 * The operator serialnumber returns the static member ExtStackFunctions::$mSerialNumber which defaults to an empty string and can be set to anything useful in LocalSettings.php.


 * For efficiency, the loop operators for, forall, loop, repeat perform a bind on their procedure argument. If other copies of the procedure exist on the stack or in some dictionary, they will reflect this. I cannot imagine a reasonable application where this behaviour would cause a problem.

Additional exceptions
 * recursionoverflow : Thrown in case of infinite recursion, for example

/x { x } def x

Strings
String handling in PostScript is cumbersome, so there is a need to add some operators. However, there is a danger of adding a large collection of partially redundant operators which wouldn't ease programming, either. Therefore, my current strategy is to add an operator only when I'm very sure I really need it.


 * concat : string string concat string
 * Concatenate two strings.


 * dbkey2text : string dbkey2text string
 * Convert the DB key form of a title (with underscores) to its text representation (with spaces).


 * explode : separator string explode array
 * Wrapper for PHP's explode function.


 * getpagecontent : string getpagecontent string
 * Get the raw content of the page indicated in string. Throw an exception if the page does not exist.


 * id2namespace : int id2namespace string
 * Convert namespace id to canonical name.


 * implode : separator array implode string
 * Wrapper for PHP's implode function.


 * namespace2id : string namespace2id int
 * Convert canonical name to namespace id. This uses Namespace::getCanonicalIndex, hence the input string must be lowercase.


 * pcrematch : subject_string pattern_string pcrematch post match pre true (if found)
 * subject_string pattern_string pcrematch string false (if not found)
 * Search pattern_string in subject_string, interpreting pattern_string as a Perl Compatible Regular Expression. The return values are the same as for the PostScript operator search.


 * pcrereplace : subject_string pattern_string to_string pcrereplace –
 * Replace all occurrences of pattern_string with to_string in subject_string, interpreting pattern_string as a Perl Compatible Regular Expression.


 * replace : subject_string from_string to_string replace –
 * Replace all occurrences of from_string with to_string in subject_string.


 * text2dbkey : string text2dbkey string
 * Convert the text form of a title (with spaces) to its DB key representation (with underscores).


 * tolower : string tolower string
 * Convert string to lowercase.


 * toupper : string toupper string
 * Convert string to uppercase.


 * vprintf : string array vprintf –
 * Format the data in array according to the format string and show the result.


 * vsprintf : string format_string array vsprintf string
 * Format the data in array according to format_string and store the result in string.
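The return convention of pcrematch (the same as that of search) may be easier to see in a small model. The following is a sketch in Python using the `re` module, whose syntax is similar to, but not identical with, PCRE; the function name `pcrematch_model` is hypothetical, and the returned list merely lists the values in the order they would be pushed, with the last element on top of the stack.

```python
import re

def pcrematch_model(subject, pattern):
    """Return the values pcrematch would leave on the stack, bottom first."""
    m = re.search(pattern, subject)
    if m is None:
        # Not found: the original string and false.
        return [subject, False]
    pre = subject[:m.start()]    # part before the match
    post = subject[m.end():]     # part after the match
    return [post, m.group(0), pre, True]

found = pcrematch_model("abc123def", r"\d+")
not_found = pcrematch_model("abc", r"\d+")
```

In the found case the boolean ends up on top, so a following if or ifelse can branch on it before the three string parts are consumed.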

Serialization

 * serialize : any serialize string
 * Provide a string representation of the argument. The argument is first serialized with the PHP function serialize and compressed with a configurable function, then an HMAC is prepended, and the result is converted to a printable representation using base64_encode. The result is hence an authenticated, compressed image of the argument consisting entirely of printable characters. This is useful mainly to generate precompiled code.


 * unserialize : string unserialize any
 * Convert a result of the serialize operator back to its original object. If the HMAC is not valid, an invalidaccess exception occurs. This ensures that no PHP code from extraneous sources can be executed in your MediaWiki instance.

The following parameters for serialization can be customized in <tt>LocalSettings.php</tt>:

<tt>gzinflate</tt> has been chosen as a default because it seems to be slightly faster than <tt>gzuncompress</tt> or <tt>bzdecompress</tt>, but this might be different in your case.
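The serialize/unserialize pipeline (serialize, compress, prepend an HMAC, base64-encode) can be sketched as follows. This is a Python model under stated assumptions, not the extension's PHP code: JSON stands in for PHP's serialize, zlib for the configurable compression function, SHA-256 is an assumed hash algorithm, and `AUTH_KEY` is a hypothetical stand-in for ExtStackFunctions::$mAuthKey.

```python
import base64
import hashlib
import hmac
import json
import zlib

AUTH_KEY = b"example-key"                 # hypothetical; plays the role of $mAuthKey
MAC_LEN = hashlib.sha256().digest_size    # length of the prepended HMAC in bytes

def serialize_model(obj):
    """Serialize, compress, prepend an HMAC, then base64-encode."""
    payload = zlib.compress(json.dumps(obj).encode("utf-8"))
    mac = hmac.new(AUTH_KEY, payload, hashlib.sha256).digest()
    return base64.b64encode(mac + payload).decode("ascii")

def unserialize_model(text):
    """Reverse serialize_model; refuse data whose HMAC does not verify."""
    raw = base64.b64decode(text)
    mac, payload = raw[:MAC_LEN], raw[MAC_LEN:]
    expected = hmac.new(AUTH_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        # Corresponds to the invalidaccess exception described above.
        raise PermissionError("invalidaccess")
    return json.loads(zlib.decompress(payload).decode("utf-8"))

token = serialize_model({"a": [1, 2, 3]})
restored = unserialize_model(token)
```

The HMAC is checked before anything is decompressed or decoded, so data from an untrusted source is rejected without ever being interpreted.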

Prologs

 * prolog : string prolog –
 * Execute StackFunctions code stored in the page indicated in string. The page should contain exactly one pair of StackFunctions tags enclosing the code; anything outside these tags is ignored. If prolog is executed several times with the same argument, only the first call is evaluated. This saves parsing time when a template containing StackFunctions code is used many times on the same page: you can store definitions (for instance, macros and dictionaries) on a separate page which is executed only once. Note that, for reasons of performance, "same" argument means literal identity; two different arguments which refer to the same page are treated as different. Due to its nature, a prolog should not create any output; therefore any output created by a prolog is silently discarded.


 * You can set the parameter <tt>ExtStackFunctions::$mPrologNamespace</tt> to define a default namespace in which prolog pages are searched (if no explicit namespace is specified). It defaults to the project namespace. I recommend creating a custom namespace for prologs; see /Install.


 * Prolog pages may contain precompiled code instead of source code.
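The "execute only once per literal argument" rule can be illustrated with a small model. This is a Python sketch, not the extension's code; `prolog_model`, `executed` and `log` are hypothetical names, and the `run` callback stands in for fetching and executing the page.

```python
executed = set()   # literal arguments already seen
log = []           # records which pages were actually executed

def prolog_model(page_name, run):
    """Run the prolog unless the very same literal string was seen before."""
    if page_name in executed:
        return False            # skipped: already executed
    executed.add(page_name)
    run(page_name)
    return True

first = prolog_model("My prolog", log.append)
second = prolog_model("My prolog", log.append)   # same literal: skipped
third = prolog_model("My_prolog", log.append)    # same page, different literal: runs again
```

MediaWiki treats "My prolog" and "My_prolog" as the same title, but since the check compares the literal strings only, the page is executed twice in this model. Using one consistent spelling avoids that.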

Template evaluation

 * parsetemplate : simple string parsetemplate string
 * array string parsetemplate string
 * dict string parsetemplate string
 * In the first form, simple denotes an argument of any type which is neither an array nor a dictionary. A template call built from string and simple is passed to the parser for template substitution, and the result is pushed on the stack.
 * In the second form, the same is done with a template call built from string and any0 .. anyn, where any0 .. anyn are the elements of the array.
 * In the third form, the same is done with a template call built from string and the pairs key0, val0 .. keyn, valn, where key0 .. keyn are the keys and val0 .. valn the corresponding values in the dictionary.
 * In all three forms, the result is evaluated by the parser before execution of StackFunctions code continues. This means that you can examine the result to see what is substituted. Note that parsetemplate also works with parser functions instead of templates.


 * showtemplate : simple string showtemplate –
 * array string showtemplate –
 * dict string showtemplate –
 * This works the same way as parsetemplate; the difference is that the result, instead of being pushed onto the stack, is written to the output.
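To make the three argument forms concrete, here is a sketch of how a template call might be assembled from the template name and the argument. It is a Python model under assumptions: the function `template_call` is hypothetical, and the generated wikitext follows standard MediaWiki template syntax (positional parameters separated by |, named parameters written as key=value), which may differ in detail from what the extension actually passes to the parser.

```python
def template_call(name, arg):
    """Build template wikitext from a simple value, an array or a dictionary."""
    if isinstance(arg, dict):
        # Third form: named parameters.
        parts = [f"{key}={value}" for key, value in arg.items()]
    elif isinstance(arg, (list, tuple)):
        # Second form: positional parameters.
        parts = [str(value) for value in arg]
    else:
        # First form: a single positional parameter.
        parts = [str(arg)]
    return "{{" + "|".join([name] + parts) + "}}"

simple_form = template_call("Foo", 1)
array_form = template_call("Foo", ["a", "b"])
dict_form = template_call("Foo", {"x": 1, "y": "z"})
```

In this model parsetemplate would push the parsed result of such a string onto the stack, while showtemplate would write it to the output.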

Note that parsing of templates is a very complex process and therefore rather slow, especially when the templates contain other templates. One of the motivations for developing StackFunctions was to provide a faster alternative. Therefore, rather than evaluating a template, you might consider replacing it with StackFunctions code wherever feasible. For instance, magic words can be read directly from the status dictionary.

Cache

 * disablecache : – disablecache –
 * Disables the cache for this page, which means that the page will be recalculated on every access. Useful for pages whose contents are meant to depend on rapidly changing data like random numbers or the time of day.

Database querying
Database querying allows you to directly access the database MediaWiki runs on. This may not be particularly useful on a basic MediaWiki installation. It becomes interesting when you want to display other data stored in the same database and accessible to the MediaWiki database user, or in connection with MediaWiki extensions which store data in additional tables (like DataTable).

Note that this is likely to raise security concerns. Most data in the database are accessible anyway, but, for instance, the email addresses of registered users might not be meant to be accessible to everybody. Therefore, this feature is disabled by default. To enable it, set <tt>ExtStackFunctions::$mEnableQuery = true;</tt> in your <tt>LocalSettings.php</tt>.


 * query : dict query array true
 * or false
 * Query the MediaWiki Database with a SQL statement. The dictionary may contain the following keys:


 * Hence, the only required key is /from. The result is returned as an array of rows, where each row is either an array, a dictionary (where keys are column names) or a simple type, depending on the value of /return. Note that in the latter case only the first column is considered, and it is converted to the requested type.

The system dictionary
As in PostScript, the system dictionary contains the definitions of all built-in operators.

The status dictionary

 * For each magic word in MediaWiki, there is a key in the status dictionary containing the current value of this magic word. For instance,

statusdict /pagename get


 * supplies the current page name.


 * There are additional entries pageid and namespaceid containing the numeric IDs of the current page and its namespace. This is useful when querying the database for something related to the current page. However, note that the database query might supply IDs as integers or strings; as StackFunctions are strictly typed, a string will not be recognized as equal to the statusdict items pageid or namespaceid which are integers. The safest solution is to always convert IDs queried from the database to integers before doing any comparison.


 * Furthermore, there is a key args. In the parser function syntax, it refers to an array containing the additional parameters. In the tag syntax, it refers to a dictionary containing the parameters given as parameter=value in the opening tag.
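The pitfall with IDs coming back from the database as strings can be shown with a tiny example. This is a Python sketch with made-up values; Python's `==` happens to behave like the strict comparison described above when an integer is compared with a string, so it serves as a model here.

```python
page_id = 42        # hypothetical value of the statusdict item pageid (an integer)
row_value = "42"    # hypothetical ID returned by a database query as a string

naive_match = (row_value == page_id)        # string vs. integer: not equal
safe_match = (int(row_value) == page_id)    # convert first, then compare
```

Converting every queried ID to an integer before the comparison gives the expected result regardless of how the database driver delivers the value.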

Precompiled code
You can use the serialize operator to convert a procedure to a string representation. My tests show that converting the string back to the object is about three times faster than creating the object by parsing the source code (and it seems that compression contributes to this gain). Therefore, the following syntax is accepted as an alternative to source code in prologs:


 * %Z
 * output of the serialize operator


For instance, you can store the source code for a prolog on a subpage of the actual prolog page, transform the code into a procedure by putting braces around it, apply the serialize operator and display the result using the pstack operator. Then you copy the displayed result into the actual prolog page.

While the serialize operator works with any kind of argument, this syntax works only when the argument is a procedure.

Note that the result depends on the internal representation of data within StackFunctions as well as on your compression algorithm and <tt>ExtStackFunctions::$mAuthKey</tt>. This implies that the code is portable between MediaWiki installations only if they have the same PHP, MediaWiki and StackFunctions versions and the same <tt>ExtStackFunctions::$mAuthKey</tt>. You should recreate the serialized code when updating to future versions of MediaWiki or StackFunctions.

Exceptions
When an exception occurs, debug information vaguely similar to that of Ghostscript is shown, for instance:

StackFunctions error: <tt>typecheck</tt> in <tt>pop_num</tt>

Operand(s): <tt>(a)</tt>

Operand Stack: <tt>(objects) (test) (any)</tt>

Backtrace: <tt>sf_tag  execute  op_add  pop_num</tt>

Dictionary Stack: <tt>-dict:119-  -dict:0-</tt>

The meaning of the elements is as follows.


 * error : The error name as in PostScript.


 * in : The PHP function where the error occurred. This is not necessarily a function directly related to an operator; it can also be an auxiliary function. For instance, many arithmetic operators call <tt>pop_num</tt> to pop a number from the stack or raise an exception if there is no number.


 * Operand(s) : The operand(s) that immediately triggered the exception.


 * Operand Stack : The operand(s) still on the stack when the exception occurred. This never includes the operand(s) just mentioned that triggered the exception. Note that there can be operands involved which are not displayed at all; for instance, in the above example, the string <tt>(a)</tt> was the second argument to the add operator. The first argument is not in the list of operand(s) that immediately triggered the exception because it was OK, nor is it on the stack because it has already been consumed.


 * Backtrace : The backtrace of PHP function calls, starting from the first function belonging to StackFunctions. This backtrace definitely contains the operator.


 * Dictionary Stack : The dictionary stack at the time when the exception occurred.