Requests for comment/New hook: ParserBeforePreprocess

Change abandoned

Proposal
A new hook, ParserBeforePreprocess, called before text is preprocessed (Parser.php, line ~2803):

function preprocessToDom( $text, $flags = 0 ) {
	// Proposed: let extensions rewrite the text before it is preprocessed.
	wfRunHooks( 'ParserBeforePreprocess', array( $this, &$text, $flags ) );
	$dom = $this->getPreprocessor()->preprocessToObj( $text, $flags );
	return $dom;
}
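As a rough sketch of how an extension might attach to the proposed hook, the handler below only records each text it is handed, which would make it easy to check whether template source passes through it. The class name `BeforePreprocessLog` is hypothetical, and the handler signature is an assumption derived from the argument list in the `wfRunHooks` call above; the hook itself is only proposed here and does not exist in core.

```php
<?php
// Hypothetical handler class for the proposed ParserBeforePreprocess hook.
class BeforePreprocessLog {
	// Every text passed to the hook, in call order.
	public static $texts = array();

	// Assumed signature, mirroring array( $this, &$text, $flags ) in the proposal.
	// $text is taken by reference so a real handler could rewrite it.
	public static function onParserBeforePreprocess( $parser, &$text, $flags ) {
		self::$texts[] = $text;
		// Returning true lets any other registered handlers run.
		return true;
	}
}

// Registration through the standard $wgHooks array, as in an extension setup file.
$wgHooks['ParserBeforePreprocess'][] = 'BeforePreprocessLog::onParserBeforePreprocess';
```

Because `preprocessToDom` is the entry point sketched in the proposal, a handler like this would see whatever text reaches that function, rather than only the texts that reach ParserBeforeInternalParse.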

Rationale
The existing hook ParserBeforeInternalParse is advertised as a way to implement custom preprocessors:

Replaces the normal processing of stripped wiki text with custom processing. Used primarily to support alternatives (rather than additions) to the core MediaWiki markup syntax.

But it does not fully meet that goal, because it is not called when template source is preprocessed. For example, on page:

the hook ParserBeforeInternalParse is called 3 times:


 * 1) On original page source.
 * 2) On result of template.
 * 3) On the message "This page was accessed x NaN times."

But it is not called on template source, so it cannot be used to implement custom preprocessing, at least not preprocessing that takes effect in both page and template sources.

I failed to find a hook which allows custom preprocessing. This is the reason for proposing this one.

Background
I want the preprocessor to recognize a tag and discard both the tag itself and the whitespace after it.

The primary purpose is better formatting of template code. Example:

*  Some introductory text continuation and, finally, finish.

Without the tag it must be formatted as:

*  Some introductory text  continuation  and, finally, finish.

This is a simple example. In more complex templates with nested parser functions (#if, #loop, etc.), good formatting becomes more important.

Obviously, such a tag cannot be implemented as an extension, because it affects not just the tag itself but also the text after the tag.

My original implementation, a patch to the parser (actually, the preprocessor), was rejected because touching the preprocessor is always dangerous. I am fine with that if there is another way to reach the goal. The only way left seems to be the proposed hook.
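To illustrate how the tag from the Background section might be handled through the proposed hook, here is a minimal sketch. The tag name `<nows/>`, the function names, and the regular expression are all hypothetical and chosen only for illustration; the handler signature is again an assumption taken from the proposed `wfRunHooks` call. The sketch is a plain text substitution and makes no attempt to respect `nowiki` sections or comments.

```php
<?php
// Hypothetical tag name, used only for this illustration.
// Matches <nows>, <nows/> or <nows /> plus all whitespace that follows,
// including newlines and indentation.
define( 'NOWS_TAG_PATTERN', '/<nows\s*\/?>\s*/i' );

// Remove every occurrence of the tag together with the whitespace after it.
function stripNowsTag( $text ) {
	return preg_replace( NOWS_TAG_PATTERN, '', $text );
}

// Assumed handler signature for the proposed ParserBeforePreprocess hook.
function onParserBeforePreprocessNows( $parser, &$text, $flags ) {
	$text = stripNowsTag( $text );
	// Returning true lets any other registered handlers run.
	return true;
}

// Registration through the standard $wgHooks array.
$wgHooks['ParserBeforePreprocess'][] = 'onParserBeforePreprocessNows';
```

With this in place, template source could be spread over several indented lines, each ending in the tag, and the handler would join them back into a single line before the preprocessor sees the text.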