Extension:Scribunto/Parser interface design

Design principles
Design principles for the parser interface:


 * It should appear to be native to Lua. It's desirable to map parser concepts from PHP, but not syntax details.
 * It should be flexible enough to allow for future developments. For example, greater integration with the current preprocessor, or close integration with Gabriel Wicke's proposed token-based parser.
 * It should encourage brief but readable code.
 * It should be efficient, or at least the interface should not preclude an efficient future implementation.

Some facts about Lua
Lua has a single data structure called a table. It is similar to a PHP array in that it is a hybrid of an integer-indexed array and a hashtable. It is similar to a JavaScript object in that a table can be used as an object that contains both methods and properties in the same namespace. It provides JavaScript-like syntactic sugar for accessing elements: foo.bar is equivalent to foo['bar'].
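
As a minimal sketch of the points above (the table contents and the describe function are made up for illustration), one table can serve as an array, a hashtable and an object at the same time:

```lua
-- One table with an integer-indexed part and a string-keyed part.
local t = { "first", "second", name = "example" }

-- Array part: Lua indexes start at 1, and # counts only this part.
local first = t[1]
local count = #t

-- Hash part: dot syntax is sugar for indexing with a string key.
local same = (t.name == t["name"])

-- Methods live in the same namespace as properties.
t.describe = function(self)
    return self.name .. " has " .. #self .. " items"
end
local description = t.describe(t)
```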

Named parameters or some approximation to them are commonly encouraged by experienced software developers. Lua has syntactical support for a particular implementation of named parameters: some_function{foo = bar} is equivalent to some_function({foo = bar}). This example calls some_function with a single table parameter. The table contains a single element with the key "foo", whose value is the value of the variable bar.
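
A short sketch of the call syntax described above. The function make_link and its parameter names target and label are hypothetical, chosen only to show the two equivalent spellings:

```lua
-- Hypothetical function that takes one table of named parameters.
local function make_link(params)
    local target = params.target
    local label = params.label or target  -- fall back when label is omitted
    return "[[" .. target .. "|" .. label .. "]]"
end

local bar = "Main Page"

-- The parentheses may be dropped when the only argument is a table constructor.
local a = make_link{ target = bar }
local b = make_link({ target = bar })
local c = make_link{ target = bar, label = "home" }
```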

Normal object method calls are written with a colon: obj:func(). The colon passes obj as an implicit first argument. Static method calls are written with a dot: obj.func().
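
The difference between the two call styles can be sketched as follows. Counter is a made-up class used only for illustration:

```lua
local Counter = {}
Counter.__index = Counter

-- Called with a dot: nothing is passed implicitly.
function Counter.new(start)
    return setmetatable({ value = start }, Counter)
end

-- Defined with a colon: the receiver arrives as the hidden parameter self.
function Counter:increment(step)
    self.value = self.value + (step or 1)
    return self.value
end

local c = Counter.new(10)
local viaColon = c:increment(5)     -- c is passed as self
local viaDot = c.increment(c, 5)    -- the same call with self spelled out
```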

Named arguments
Templates use a combination of named and numbered arguments. Parser functions have an internal interface which provides only positional arguments, and equals signs need to be interpreted by the parser function.

I propose to hide this implementation detail from Lua scripts, by providing template-like named and numbered arguments. Top-level Lua functions would receive a frame object with a getArgument method, equivalent to PPFrame::getArgument in PHP.
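
To make the proposal concrete, here is a rough sketch of how a script might use such a frame. Only the getArgument method name comes from the proposal; newMockFrame, greet and the argument names are hypothetical stand-ins, since the real frame object would be supplied from the PHP side:

```lua
-- Hypothetical mock of the frame object, backed by a plain table.
local function newMockFrame(args)
    local frame = {}
    function frame:getArgument(name)
        return args[name]
    end
    return frame
end

-- A top-level function as a script might define it.
local function greet(frame)
    local greeting = frame:getArgument(1) or "Hello"      -- numbered argument
    local name = frame:getArgument("name") or "world"     -- named argument
    return greeting .. ", " .. name .. "!"
end

local result = greet(newMockFrame({ "Hi", name = "Lua" }))
local defaults = greet(newMockFrame({}))
```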

Parent frame access
One of the reasons we believe Lua implementations of metatemplates will be faster than the existing wikitext is that we can avoid handling large numbers of arguments in the wikitext parser entirely. In wikitext, every triple brace and every pipe has a substantial cost. By allowing Lua access to template arguments in the template the script is invoked from, we eliminate the need for large "proxy" invocations.

For example, imagine [//en.wikipedia.org/w/index.php?title=Template:Cite_web&oldid=487981696&action=edit this template] converted to Lua. If parent frame access is not allowed, then the template will be essentially the same, except with {{Citation/core replaced with {{#invoke:Citation/core. If we do allow parent frame access, then this template can be extremely short, with the task of mapping input argument names moved to Lua.

In the current parser implementation, the set of template arguments is called a frame, and the set of template arguments available to the caller of the current template is called the parent frame. If template arguments are essentially the same as Lua script arguments, then we can borrow this terminology.

So I propose to provide a getParent method in the frame object that we pass to Lua.

To protect the consistency of the empty-frame expansion cache (Parser::$mTplExpandCache), I propose that we do not provide access to grandparent frames.
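
A sketch of how parent frame access might look from a script. The getParent and getArgument names come from the proposal; newMockFrame, cite and the argument names are hypothetical, and returning nil where a grandparent would be is only an assumption about how the restriction could surface:

```lua
-- Hypothetical mock frame. parentArgs is nil for a frame with no parent.
local function newMockFrame(args, parentArgs)
    local frame = {}
    function frame:getArgument(name)
        return args[name]
    end
    function frame:getParent()
        if parentArgs == nil then
            return nil
        end
        -- The parent is built with no parent of its own, so grandparent
        -- arguments cannot be reached through it.
        return newMockFrame(parentArgs, nil)
    end
    return frame
end

-- The invoking template passes nothing along; the script reads the
-- arguments of the template it was invoked from.
local function cite(frame)
    local parent = frame:getParent()
    return "[" .. parent:getArgument("url") .. " " .. parent:getArgument("title") .. "]"
end

local frame = newMockFrame({}, { title = "Example", url = "http://example.org" })
local citation = cite(frame)
local grandparent = frame:getParent():getParent()
```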

Index metamethod
The current preprocessor has a kind of "dead branch elimination". The input text is converted to a tree, then if a subtree is not referenced, it does not have to be "expanded" to plain text. In particular, if you call template1 and pass an invocation of template2 as one of its arguments, then if template1 never references that argument, template2 never needs to even be loaded from the database.

If we provide access to arguments as a simple table with text in the elements, then all arguments will need to be fully expanded prior to passing control to Lua.

It would be possible to provide a table with an "index" metamethod which expands the requested argument on demand. For example, frame.args.foo could provide access to the argument named "foo".
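
A minimal sketch of such a table, assuming a hypothetical expand function in place of the preprocessor (here it just upper-cases the text and counts its calls so that the laziness is visible):

```lua
-- Stand-in for the preprocessor's expansion step (an assumption).
local expansions = 0
local function expand(raw)
    expansions = expansions + 1
    return string.upper(raw)
end

-- Returns an empty proxy table whose __index expands arguments on demand.
local function newLazyArgs(rawArgs)
    local cache = {}  -- kept outside the proxy so __index keeps firing
    return setmetatable({}, {
        __index = function(_, key)
            local raw = rawArgs[key]
            if raw == nil then
                return nil
            end
            if cache[key] == nil then
                cache[key] = expand(raw)
            end
            return cache[key]
        end,
    })
end

local args = newLazyArgs({ foo = "a", bar = "b" })
local before = expansions   -- nothing has been expanded yet
local foo = args.foo        -- triggers the only expansion
local again = args.foo      -- served from the cache
local after = expansions
```

The argument "bar" is never expanded here, which is the on-demand behaviour the metamethod is meant to preserve.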

The major disadvantage to this is that iteration over frame.args is not possible using the normal Lua construct:

for k, v in pairs(frame.args) do  ... end

The loop body would not be executed at all, because the table itself would hold no elements. Lua 5.1 provides no way to hook into pairs, even to throw an error. But perhaps this can be overcome with clear documentation. Users would instead be instructed to use a special iterator factory:

for k, v in frame:argumentPairs() do  ... end
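
The pitfall and the workaround can be sketched together. The argumentPairs name comes from the proposal, but its body here is a hypothetical implementation that iterates over a plain backing table; a real one would expand each argument as it is visited:

```lua
local rawArgs = { "first", Surname1 = "Smith" }

local frame = {}
-- frame.args is an empty proxy, so pairs finds nothing to iterate over.
frame.args = setmetatable({}, {
    __index = function(_, key)
        return rawArgs[key]
    end,
})

-- Hypothetical iterator factory over the backing table.
function frame:argumentPairs()
    return pairs(rawArgs)
end

local plainCount = 0
for k, v in pairs(frame.args) do
    plainCount = plainCount + 1   -- never reached
end

local seen = {}
for k, v in frame:argumentPairs() do
    seen[k] = v
end
```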

I think the advantage in brevity outweighs this potential pitfall. It would even be possible to make a local alias:

local p = frame.args

if p.Surname1 then ... end