Topic on Project:Support desk

How can I avoid modification of my extension's HTML attributes?

Tigelriegel (talkcontribs)

I have a tag extension which produces HTML containing a data-* attribute. It looks like the MediaWiki parser later parses and modifies the contents of this attribute (it adds some PRE, BR and P tags). So I base64-encode that attribute, and later in the code, where the attribute is needed by some JavaScript, I base64-decode it again. This workaround is working :-)


Now I found Manual:Tag extensions#How can I avoid modification of my extension's HTML output?, which is probably a better solution (no base64 encoding and decoding). However, when I wrap my HTML in an array( "..html...", "markerType" => 'nowiki' ) and have my extension return that, the parser nevertheless tampers with the attribute (adding some BR and P tags, but no PRE tags). So I am back to my base64 method for the moment, which is *not* elegant.


What am I doing wrong? What did I misunderstand from the manual entry?


Samwilson (talkcontribs)

Could you share the code you're trying to get working?

Tigelriegel (talkcontribs)

Not in the form of a working sample, as this is part of a larger code base which is currently being refactored by several people. But to give you a more precise idea:

When my hook returns

  return  "<a target='_blank' rel='noreferrer noopener' href='".$aHref."'><span style='display:none;'></span><img src='".$imgSrc."' data-texpara='".$arText."' data-texsrc='". $texSrc ."' onerror='window.lazyIn(event.target);' alt='Rendered TeX' style='width:800px; height:auto;'></img></a>";

then my browser receives a data-texsrc attribute containing P, BR and PRE tags which weren't there in $texSrc.

When I

  return  array( "<a target='_blank' rel='noreferrer noopener' href='".$aHref."'><span style='display:none;'></span><img src='".$imgSrc."' data-texpara='".$arText."' data-texsrc='". $texSrc ."' onerror='window.lazyIn(event.target);' alt='Rendered TeX' style='width:800px; height:auto;'></img></a>", "markerType" => 'nowiki');

I still see the BR and P tags added.

My current, inelegant solution is to base64-encode on the server and base64-decode on the client.
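
For reference, that workaround could look roughly like the sketch below. The sample `$texSrc` value and the trimmed-down `img` tag are assumptions for illustration, not the actual extension code.

```php
<?php
// Hypothetical TeX source; the blank line is the kind of content that
// seems to end up wrapped in P/BR tags.
$texSrc = "\\section{foo}\n\nLorem ipsum.";

// base64 output only uses A-Z, a-z, 0-9, '+', '/' and '=', so the attribute
// value contains no newlines, quotes or angle brackets.
$encoded = base64_encode( $texSrc );
$html = "<img data-texsrc='" . $encoded . "' alt='Rendered TeX'>";

echo $html, "\n";

// Round trip on the server, only to show that the encoding is lossless.
var_dump( base64_decode( $encoded, true ) === $texSrc );
```

On the client the value would then be decoded again, for example with `atob()`. Since `atob()` returns a binary string, non-ASCII TeX source would probably need an extra UTF-8 decoding step.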


Samwilson (talkcontribs)

You should probably construct your HTML using the Html class. For example, the following works in my testing and doesn't introduce any wayward HTML into the texsrc attribute:

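$wgHooks['ParserFirstCallInit'][] = function ( Parser $parser ) {
   // Register a <foobar> tag whose output is built with the Html class.
   $parser->setHook( 'foobar', function () {
      // Multi-line TeX source, including a blank line.
      $texSrc = "\section{foo}

      Lorem ipsum.";
      // Html::rawElement() escapes the attribute values; the inner HTML is passed through as-is.
      $html = Html::rawElement( 'a', [ 'data-texsrc' => $texSrc ], Html::element( 'img' ) );
      return $html;
   } );
};

Bawolff (talkcontribs)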

I second Samwilson's suggestion.

That said, can you post an example of what your parser tag actually returns? (e.g. change the return line to a var_dump, and pull the result out of the html source code).

I wonder if maybe data-texpara or data-texsrc has unescaped html in it that is getting out of the attribute. Using the Html class would fix that if that is the issue.
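
To illustrate that hypothesis, here is a minimal standalone sketch that runs without MediaWiki. The helper name `encodeAttr` and the sample TeX string are made up for the example. Replacing newlines and tabs with character references is an assumption about what keeps the parser from seeing blank lines inside the attribute; as far as I understand, the Html class does something similar internally.

```php
<?php
// Hypothetical helper: escape a value for use inside an HTML attribute.
function encodeAttr( string $value ): string {
    // Escape &, <, > and both kinds of quotes.
    $escaped = htmlspecialchars( $value, ENT_QUOTES, 'UTF-8' );
    // Assumption: also hide newlines and tabs behind character references,
    // so no literal line breaks are left in the attribute value.
    return strtr( $escaped, [ "\n" => '&#10;', "\r" => '&#13;', "\t" => '&#9;' ] );
}

// Sample TeX source with a quote, markup characters and a blank line.
$texSrc = "\\section{foo}\n\nIt's <b>bold</b> & done.";

// Unsafe: the apostrophe in "It's" ends the single-quoted attribute early.
$unsafe = "<img data-texsrc='" . $texSrc . "'>";

// Safer: the value can no longer break out of the attribute.
$safe = "<img data-texsrc='" . encodeAttr( $texSrc ) . "'>";

echo $unsafe, "\n", $safe, "\n";
```

Comparing the two printed lines shows how a raw quote or angle bracket in `$texSrc` ends up outside the attribute in the first variant but stays inside it in the second.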
