Talk:Parsoid/Todo

Misc issues

 * Try to emulate PHP parser in treating foo as foo (low priority)
 * search for 'listItem' in http://parsoid.wmflabs.org/_rt/Takeda%20clan.
 * SSS: This is a "syntax error" with mismatched ref tags in wikitext. The specific segment that crashes it is this:  . Note the error in <ref name="enc-shingen"/>.  This ref tag should not be closed. This is a similar bug to the previous one, where there are mismatched tags which are usually handled by the Tidy post-processor.  We need a strategy for this in general. Here is the smallest test case to reproduce this:  boo yahoo 
 * Two issues reported in http://www.mediawiki.org/wiki/User_talk:GWicke#Normalization_of_wiki_text_15976
 * The weather box in http://parsoid.wmflabs.org/Broken_Hill,_New_South_Wales is rendered incorrectly

Issue on http://parsoid.wmflabs.org/_rt/pt:Foo
Is it possible to have an https or protocol-relative link for reporting bugs on this page? The address https://parsoid.wmflabs.org/_rt/pt:Foo doesn't seem to work. Helder 15:38, 8 June 2012 (UTC)

Issue on http://parsoid.wmflabs.org/_rt/pt:HTML
The article [//pt.wikipedia.org/w/index.php?title=HTML&oldid=30591267 pt:HTML] uses the non-existent " " to exemplify the way HTML works, but the code There is no " " in HTML is converted back to something else: There is no " " in HTML Helder 12:04, 9 June 2012 (UTC)


 * Thanks Helder. This is a general bug with a hanging closing tag -- applies to any tag.  Parsoid discards them.  Ex: "  " is converted back to the empty string "".  -Subbu.


 * We are using a standard HTML5 tree builder, which handles stray closing tags by simply dropping them. We could hack up the tree builder to convert those stray tags back to text instead. The disadvantage would be losing the ability to use a standard library as part of the parser pipeline.
 * The particular example at [//pt.wikipedia.org/w/index.php?title=HTML&oldid=30591267 pt:HTML], however, can be handled by the sanitizer, as the imaginary tags are not part of the regular whitelist of allowed tags. The sanitizer can convert those tags back to text, just as the PHP sanitizer does. This is not yet implemented in JS, and might not be, as we would prefer to reuse the existing sanitizer code from a C port of Parsoid. -- Gabriel Wicke (GWicke) (talk) 11:20, 17 June 2012 (UTC)
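
For illustration, the sanitizer approach described above (turning tags that are not on the whitelist back into literal text) could look roughly like the sketch below. This is neither the PHP sanitizer nor Parsoid code. The whitelist `ALLOWED_TAGS`, the regular expression `TAG_RE` and the function name `escape_unknown_tags` are all hypothetical and chosen only for this example, and the whitelist is deliberately tiny.

```python
import html
import re

# Hypothetical whitelist, for illustration only; a real sanitizer
# whitelist is much longer.
ALLOWED_TAGS = {"b", "i", "span", "div", "ref"}

# Matches an opening, closing or self-closing tag and captures its name.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^<>]*>")


def escape_unknown_tags(text):
    """Keep whitelisted tags as markup; escape every other tag to plain text."""

    def replace(match):
        name = match.group(1).lower()
        if name in ALLOWED_TAGS:
            # Whitelisted tag: pass it through unchanged.
            return match.group(0)
        # Unknown tag: escape <, > and & so it survives as visible text.
        return html.escape(match.group(0), quote=False)

    return TAG_RE.sub(replace, text)


print(escape_unknown_tags('There is no "<foo>" in <b>HTML</b>'))
```

A sketch like this only covers tags that are missing from the whitelist. A stray closing tag for a whitelisted element would still reach the tree builder and be dropped there, which is the general case discussed in the previous bullet.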

Links, Lists and Headings
I tried out the Parsoid service, converting some fairly simple HTML that consisted of a div tag, a header tag, a nav tag, a span tag, a couple of links and a few headers. The parser did a pretty good job, except that on the conversion of HTML to wikitext it didn't convert the HTML links to wikitext, and the first list item ended up as a bare * with that li's content on the next line. Also, while HTML to wikitext header tag conversion worked well, wikitext to HTML header tag conversion didn't work. Thanks! -- Kangaroo  powah  03:27, 14 June 2012 (UTC)