Talk:Parsoid/Todo

On this page we track and report Parsoid parsing, round-tripping, and serialization issues. Problematic wikitext snippets can be added to Parsoid/Bug_test_cases for direct testing.

VE testing issues
[2:12:41 AM] Gabriel Wicke: something generates new <p>s in the nowiki section without any edits in that area: https://www.mediawiki.org/w/index.php?title=VisualEditor%3ATest&diff=552860&oldid=552858
[2:25:10 AM] Gabriel Wicke: I believe that those paragraphs are inserted on the VE side, since they seem to round-trip ok: http://parsoid.wmflabs.org/_rt/mw:Parsoid/Test/VE
Likely solved once the change to nowiki as span is deployed and/or nowiki is handled in the VE. -- Gabriel Wicke (GWicke) (talk) 13:37, 28 June 2012 (UTC)

Misc issues
* foo ** bar *** baz with master revision 33dc9abb0db364bb41ca0b06d368bde386719d6a. This is a problem with diffWords, which swallows newlines. diffChars works better, but it takes too long and uses too much memory. An alternative would be to use diffChars only on "small" lines.
 * Try to emulate PHP parser in treating foo as foo (low priority)
 * search for 'listItem' in http://parsoid.wmflabs.org/_rt/Takeda%20clan.
 * SSS: This is a "syntax error" with mismatched ref tags in wikitext. The specific segment that crashes it is this:  . Note the error in <ref name="enc-shingen"/>. This ref tag should not be closed. This bug is similar to the previous one: it involves mismatched tags, which are usually handled by the Tidy post-processor. We need a general strategy for this. Here is the smallest test case to reproduce this:  boo yahoo 
 * Two issues reported in Thread:User talk:GWicke/Normalization of wiki text
 * Preserve sort order in category links: Thread:User talk:GWicke/Normalization of wiki text/reply (6)
 * Thread:User talk:GWicke/Normalization of wiki text/reply (7) is expected behavior for a yet-unhandled extension tag
 * The weather box in http://parsoid.wmflabs.org/Broken_Hill,_New_South_Wales is rendered incorrectly
 * Parser: Text not wrapped in <p> tags. Look at the HTML output for http://parsoid.wmflabs.org/_rt/mw:Parsoid/Todo. In several sections, text after headings appears bare in certain contexts. I haven't yet reduced this to a small test case.
 * Diffing bug: Try roundtrip diff on a page with content
 * Roundtripping of html attributes -- needs fixing
 * Anything in particular? Attributes on plain HTML tags seem to work fine. -- Gabriel Wicke (GWicke) (talk) 13:34, 28 June 2012 (UTC)
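
The "diffChars on small lines" alternative from the first item above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's standard `difflib` as a stand-in for the diffWords/diffChars functions mentioned above, and the names `hybrid_diff`, `char_diff` and `SMALL_LINE_LIMIT` (with its value of 80) are hypothetical, not part of Parsoid. The idea is to diff line by line first, so that newlines are never swallowed, and to pay for a character-level diff only where the changed text is short.

```python
import difflib

SMALL_LINE_LIMIT = 80  # hypothetical threshold for what counts as a "small" change


def char_diff(old, new):
    """Character-level diff of two strings as (tag, old_text, new_text) tuples."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def hybrid_diff(old_text, new_text, limit=SMALL_LINE_LIMIT):
    """Diff line by line, refining to characters only where the change is small."""
    # keepends=True keeps each newline attached to its line, so newlines
    # take part in the comparison instead of being dropped.
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old_chunk = "".join(old_lines[i1:i2])
        new_chunk = "".join(new_lines[j1:j2])
        if max(len(old_chunk), len(new_chunk)) <= limit:
            # Small change: a character-level diff is affordable here.
            changes.extend(char_diff(old_chunk, new_chunk))
        else:
            # Large change: report the whole lines instead.
            changes.append((tag, old_chunk, new_chunk))
    return changes
```

With this sketch, an extra blank line inserted between two list items is reported as a one-character insertion of a newline, which is exactly the kind of change a word-based diff tends to lose.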

Issue on http://parsoid.wmflabs.org/_rt/pt:Foo
Is it possible to have an https or protocol-relative link for reporting bugs on this page? The address https://parsoid.wmflabs.org/_rt/pt:Foo doesn't seem to work. Helder 15:38, 8 June 2012 (UTC)


 * It certainly is possible, but not really our top priority right now. There is no authentication info involved, and all the content is public. -- Gabriel Wicke (GWicke) (talk) 22:17, 20 June 2012 (UTC)

Issue on http://parsoid.wmflabs.org/_rt/pt:HTML
The article [//pt.wikipedia.org/w/index.php?title=HTML&oldid=30591267 pt:HTML] uses the non-existent " " to exemplify the way HTML works, but the code There is no " " in HTML is converted back to something else: There is no " should produce something like   rather than marking the entire paragraph as template-generated.
 * This is pretty much what we intend to do: see Parsoid/HTML5_DOM_with_microdata. -- Gabriel Wicke (GWicke) (talk) 10:03, 20 June 2012 (UTC)


 * Add round-tripping of category links and the like, right now these are lost
 * Fix the newline-at-the-end-of-an-li-or-before-a-ul behavior such that
 * the parser doesn't output newlines before each </li> and before each <ul>-within-a-<li>
 * the serializer doesn't depend on these newlines to output correct wikitext
 * (newline handling in general is slated to be revamped but I wanted to document this case specifically because VE works around it)
 * Feature request: the first <p> inside an <li> should be ignored, whereas every subsequent <p> should be treated as if it had stx=html (the latter is already done). This means that   should be serialized to
 * This is because the parser doesn't wrap the text in a list item in a paragraph (i.e. the text is directly in the list item), whereas VE's linear model does wrap it in a paragraph because listItem nodes can't contain text directly. The HTML->linmod converter can deal with adding the paragraphs quite easily and cleanly, but removing these paragraphs in the linmod->HTML converter with the conditions being this specific is a pain (we currently do do this as a workaround, but it's ugly). So we can tolerate input that doesn't have wrapped first paragraphs, but Parsoid doesn't tolerate input that does have wrapped first paragraphs; it would make our lives easier if it did.
 * -> Listed in the Parsoid/Todo. Will also be needed for table cells. -- Gabriel Wicke (GWicke) (talk) 10:03, 20 June 2012 (UTC)
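
One way to picture the feature request above is as a normalization pass run before serialization: a paragraph that is the first content of a list item is unwrapped, so that wrapped and unwrapped input look the same to the serializer. The sketch below is an illustration only, written in Python with the standard `xml.etree.ElementTree` module. It is not how Parsoid or VE implement this, the function name `unwrap_first_paragraphs` is hypothetical, and it assumes well-formed XML-style input.

```python
import xml.etree.ElementTree as ET


def unwrap_first_paragraphs(root):
    """Unwrap a <p> that is the first content of an <li>, modifying root in place."""
    # Collect the list items up front so the tree is not changed while iterating.
    for li in list(root.iter("li")):
        children = list(li)
        if not children or children[0].tag != "p":
            continue
        if (li.text or "").strip():
            continue  # bare text precedes the <p>, so the <p> is not first
        p = children[0]
        li.remove(p)
        # Hoist the paragraph's text and child elements into the list item.
        li.text = p.text
        hoisted = list(p)
        for index, child in enumerate(hoisted):
            li.insert(index, child)
        # Text that followed </p> now follows the last hoisted node.
        if p.tail:
            if hoisted:
                hoisted[-1].tail = (hoisted[-1].tail or "") + p.tail
            else:
                li.text = (li.text or "") + p.tail
    return root
```

Only the first paragraph is touched. Any later <p> in the same list item stays in place, matching the request that subsequent paragraphs keep being treated as explicit HTML.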

Issue with indented tables
Compare the results of the following on http://parsoid.wmflabs.org/_rtform/: {| class="wikitable" ! Wiki code ! Expected result :
 * {| border="1"
 * {| border="1"


 * a
 * b
 * c
 * d
 * }
 * }
 * }

This kind of "indented table" is used in some articles ([//pt.wikipedia.org/w/index.php?title=Sequ%C3%AAncia_principal&oldid=30969308&uselang=en#Dados_da_sequ.C3.AAncia_principal example] / round-trip). Helder 13:25, 28 June 2012 (UTC)


 * --Fixed in c5f99614 Ssastry (talk) 17:03, 30 July 2012 (UTC)

JSON in the rendered HTML
For some reason this test shows

in the HTML version... Helder 03:06, 8 July 2012 (UTC)

My suggestions
Active in its sector since 2006: to serve our customers better and make their work easier, we have brought together in a single place almost everything they look for relating to advertising. What are these?

Rentals in Denizli, signage, logo design, ready-made websites, advertising agencies, graphic design, wall stickers, decorative mirrors, beach flags, spider stands, yellow pages, Armine scarves, modest clothing, cat food, pet shop, flower delivery, business directory, Atasehir rent a car, Opel independent service,