Parsoid/Deployments

Planned deployments, linked from Deployments. For a list of past deployments, look for 'parsoid' in the Server Admin Log.

See Parsoid to learn how to deploy a new version of Parsoid.

Friday, Jan 30, 2015 around 2:35 pm PST: ✅
The Jan 15th deploy, in which Parsoid started using sitematrix info for configuring wikis, was missing special handling for some wikis (commonswiki being one of them). This caused timeouts, which repeatedly exercised an existing memory leak. The result was a slow buildup of leaked memory on the cluster and a higher-than-normal CPU load. This special Friday deploy fixed the config issues.

Specifically, the following two patches were deployed:
 * Some special wikis should use the default proxy
 * Strip TLS from sitematrix url if we're using the default proxy
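
As a rough illustration of the second patch, the sketch below downgrades an https API URL to http when requests go through the default proxy. It is not Parsoid's code (Parsoid is a Node.js service); the function name `api_url_for_proxy` and the `use_default_proxy` flag are hypothetical names chosen for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def api_url_for_proxy(url: str, use_default_proxy: bool) -> str:
    """Return the API URL to request, dropping TLS when the default proxy is used.

    Hypothetical helper: assumes the proxy expects plain-http upstream URLs.
    """
    parts = urlsplit(url)
    if use_default_proxy and parts.scheme == "https":
        # Only the scheme changes; host, path and query are kept as-is.
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


print(api_url_for_proxy("https://commons.wikimedia.org/w/api.php", True))
```

URLs that are already plain http, or that are fetched without the proxy, pass through unchanged.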

Wednesday, Jan 28, 2015 around 1pm PST: ✅

 * : Correctly handle templates that generate part-attribute and part-content of a DOM node.
 * : Preserve blank template parameters
 * : Cleanup of behavior switch production
 * Updates to wikitext serializer to simplify and enable more robust wikitext escaping
 * ,, : Magic link fixes (wt2html and html2wt nowiki handling)

Thursday, Jan 15, 2015 around 1pm PST: ✅

 * ,, : Default WMF wikis served by Parsoid fetched from sitematrix API call
 * : Positional params with = in extlink are serialized as named parameters

On Jan 14th at 1:20 pm PST, we reverted Parsoid to the older deployed version after dirty diffs were seen during post-deploy testing. It turned out that the dirty diffs weren't related to the Parsoid deploy, but now that those issues have been fixed, we'll revisit the Parsoid deployment on Thursday.

Monday, Jan 12, 2015 around 1pm PST: ✅

 * Include location of titles in timeout logs
 * Tweaks to Parsoid's cite port to generate identical ref ids as native cite implementation

Wednesday, Jan 7, 2015 around 1pm PST: ✅

 * : Improved handling of extremely large lists -- fixes the load issues seen in production on Jan 3rd
 * Removed hardcoded HTTP 500 response for urwiki:نام_مقامات_اے (deployed on Jan 3rd to prevent this page from overloading the cluster)
 * : Fix edge case tokenizing table lines

Monday, Jan 5, 2015 around 2pm PST: ✅

Wikitext -> HTML
 * : data-parsoid stripped from template content
 * : Context-aware parsing of definition list colon
 * : Parse extension parameters as plain text
 * : Stray is parsed to meta
 * Marginal improvement parsing templates in definition lists

HTML -> Wikitext
 * , : Several improvements and fixes to nowiki protection for quotes
 * Other improvements and bug fixes to nowiki protection in headings, lists, tables.
 * : Insert an extra newline after new content and existing headings

Other (API, logging, etc)
 * Add logging for html2wt API endpoints
 * Fix robots.txt route
 * Send SIGKILL to kill a timed out worker
 * : API v2 parsing and serialization routes

Thursday, Dec 11, 2014 around 3pm PST ✅

 * Disable caching for the v2 entry point (currently only used by RESTbase)
 * Don't allow quotes in generic_attribute_name
 * Use localized main page name
 * Add mw:Error + error info to data-mw for missing imgs

Wednesday, Dec 3, 2014 around 2pm PST ✅

 * : Fix crash while expanding templates on older wikis
 * : Pass the revid when expanding templates
 * : Infer extension name from typeOf if data-mw not present
 * : Fix failures from v2 endpoint
 * Fix config clobbering via v2 API

Wednesday, Nov 26, 2014 around 2pm PST ✅

 * Subpage link fixes
 * Add wikidata API URL
 * Add wikispecies API URL

Monday, Nov 17, 2014 around 1pm PST ✅

 * : Expose the list of languages from ParsoidConfig
 * : Set xmlns namespace URI on elements that reference it
 * : Fix a.args crasher
 * Fix  endpoint (this is not a published or stable API).
 * Add new language code.
 * : Logging improvements.
 * : Return 404 when page does not exist.

Thursday, Nov 13, 2014 around 1pm PST ✅

 * Properly escape question marks in page titles in
 * Use correct relative link prefix for some Category links
 * : Point base href to wiki base.

Monday, Nov 10, 2014 around 1pm PST: ✅

 * Enable cpu timeouts for sufficiently new versions of node
 * Document the expected behaviour for the timeouts
 * Tweaked serializer error messages

Thursday, Nov 6, 2014 around 2pm PST: ✅

 * Update prfun to v1.0.2
 * More efficient promise use for timeouts
 * Add support for foundationwiki

Wednesday, Nov 5, 2014 around 1pm PST: ✅

 * : Generate HTML5-compliant cite id/about attr values
 * : Add normalized parameter names to templates
 * Caching related bug fix in some extension/template reuse scenarios
 * Logging: Log process events to LogStash
 * Logging: Only send warning and more severe events to LogStash (to reduce load on LogStash for now)
 * Logging: Downgrade some old error events to warnings

Monday, Nov 3, 2014
No deploys today. The cluster upgrade (Ubuntu and node 0.8.x -> 0.10.x) is being tested and monitored on a single node. The full upgrade will follow if memory / load continues to be stable.

Wednesday, Oct 29, 2014 around 1pm PST: 4e21bdb6f to be deployed
Reverted deploy after running into stuck processes on a few nodes.
 * New parser tests for lang/category/wiki links
 * Request and cpu timeouts in the API
 * Log process events to LogStash
 * : Generate HTML5-compliant cite id/about attr values

Monday, Oct 27, 2014 around 1:30pm PST: ✅

 * Tweaks to logging
 * Improvements to paragraph wrapping to skip over link and other rendering transparent tags
 * More robust handling of s split across top-level pages and templates

Wednesday, Oct 22, 2014 around 1pm PST: ✅

 * Send logging events to LogStash

Monday, Oct 20, 2014 around 1pm PST: ✅

 * : Leave link tags out of p-wrappers wherever possible
 * : Tweaks to p-wrapping around formatting tags
 * : Comply with the 'body' API parameter

Thursday, Oct 9, 2014 around 1pm PST: ✅

 * Support sourceswiki (multilingual http://wikisource.org).
 * : Fix inserting category links from non-parser function extensions.
 * Remove backward compatibility support for mw:WikiLink/* types
 * Code cleanup patches: More use of promises in our API

Monday, Oct 6, 2014 around 1pm PST: ✅

 * Code cleanup patches:
 * : Move data-mw away from manual json attr loading
 * Start using promises API
 * : Add categories added directly from extensions and action=parse
 * : Set prop 'wikitext' when calling action=expandtemplates
 * Improved logging for failed API requests
 * Reduce cache request timeout to 10 sec for only-if-cached scenario (from 60 sec)

Monday, Sep 29, 2014 around 1pm PST: ✅

 * Upgrade domino lib to 1.0.18
 * : Make lang-links sol-transparent
 * : Fix the test for DU.isGeneratedFigure
 * : Do not strip empty  nodes if they have html attrs
 * : Load extension CSS modules
 * Fixes to tokenizer to parse tables one row at a time
 * Fixes to tokenizer to release backtracking memory asap
 * Leave sol-transparent tags out of p-wrappers where possible
 * : Fix paragraph-wrapping to match PHP parser + Tidy combo

Monday, Sep 22, 2014 around 1:25pm PST: ✅

 * : Bug fixes serializing modified wikilinks
 * bug 70867: Fix production crashers on some Wiktionary pages

Monday, Sep 15, 2014 around 1:30pm PST: ✅

 * Empty auto-inserted nodes that are transclusion markers should not be deleted.
 * : Delete empty li and tr nodes found in transclusion content.
 * Indicate the revision in a content-revision-id header
 * : Fix paragraph wrapping to not include transclusion markers where unnecessary.
 * Edge case fix of DSR computation for fostered nodes

Monday, Sep 8, 2014 around 1pm PST: ✅

 * Additional CSS classes added to Parsoid HTML elt
 * : Treat | as a magic word
 * : Fix bug parsing certain extensions found in Flow content

Wednesday, Sep 3, 2014 around 1pm PST: ✅

 * Upgrade of html5 libraries
 * : Suppress --!> as a comment closing tag in browsers
 * Allow pipe and exclamation point in attribute values
 * Improved serialization for absolute links
 * Handle local interwiki links

Monday, Aug 25, 2014 around 1pm PST: ✅

 * : Represent tags as invisible meta tags
 * : Better handle multiple empty attribute values
 * : Pass title to action=parse requests for extensions
 * : Handle empty template call more gracefully
 * : Add title attributes to wikilinks
 * : Additional fixes for template expansion failures found in production logs
 * : Fix regression: support nested tags once more
 * Fixes to index page
 * A bunch of code cleanup

Wednesday, Aug 20, 2014 around 1pm PST: 13c31fc8 (deployment abandoned)
Not deploying today since we found some regressions -- probably harmless, but they require more investigation. We'll fix them and deploy on Monday.
 * : Represent tags as invisible meta tags
 * : Better handle multiple empty attribute values
 * : Pass title to action=parse requests for extensions
 * : Handle empty template call more gracefully
 * : Add title attributes to wikilinks
 * : Additional fixes for template expansion failures found in production logs
 * Fixes to index page
 * A bunch of code cleanup

Wednesday, Jul 23, 2014 around 1:30 pm PST: ✅

 * : Don't add empty content blocks in transclusion data-mw object.

Monday, Jul 21, 2014 around 1pm PST: ✅

 * : Set page title based on DISPLAYTITLE usage
 * : Update pipe trick support to match PHP parser behavior
 * : Bug fix in parsing of interlanguage links

Wednesday, Jul 16, 2014 around 1:20pm PST: ✅

 * : Fixes for backtracking in the tokenizer that improve handling of pathological parsing scenarios.

Monday, Jul 14, 2014 around 1pm PST: ✅

 * Support for HTML tag
 * Support for "extra language links" (added to core as part of )
 * : Fix for post-edit corruption in some invalid wikilink scenarios

Wednesday, Jul 9, 2014 around 1pm PST: ✅

 * Fixes to nowiki escaping in indent-pre content
 * : Fixes to namespace handling in relative titles
 * : Fixes to trailing newline migration DOM pass + fixes to handling of table-like wikitext outside tables
 * : Fix serialization of new interwiki-like links to local-wiki

Monday, Jul 7, 2014 around 1pm PST: ✅

 * : Fixes excess whitespace in some infoboxes.

Wednesday, Jul 2, 2014 around 1pm PST: ✅

 * : Fix citation numbering issue
 * Enable parsoid on wikimania 2015 wiki

Monday, Jun 30, 2014 around 2pm PST: ✅

 * Additional CSS ResourceLoader modules added to Parsoid header
 * : Fixes to handling of lists/headings that follow  tags
 * , : Correctly handle table-like wikitext (|, |-, etc.) outside wiki-tables

Wednesday, Jun 25, 2014 around 1:30 pm PST: ✅

 * : Recognize  and   as html elements
 * Allow comments and spaces before table lines
 * : Empty line with comment eats trailing nl token
 * Small tweaks to indent-pre handling in the serializer

Monday, Jun 23, 2014 around 1pm PST: ✅

 * : Fixed bug in nowiki-escaping of magic words
 * Additional tweaks and improvements to selective serializer

Wednesday, Jun 18, 2014 PST: ✅

 * : Suppress <nowiki>s for table WT strings outside tables
 * : Add nowiki protection around quotes adjacent to I/B tags
 * : Strip unsupported table tags during serialization
 * Cleanup, fixes and improvement to serializer around handling of leading white-space on new lines.

Monday, Jun 16, 2014: deploy cancelled
Deployment cancelled to investigate issues found in testing.

Wednesday, Jun 11, 2014 around 1pm PST: ✅

 * : DOM support for DISPLAYTITLE magic word.
 * : Invalid links in HTML are serialized to MediaWiki:Badtitletext message.
 * : Escape nowiki when combined with other wiki markup.
 * Some serialization tweaks.
 * More CSS tweaks for Parsoid HTML.

Monday, Jun 9, 2014 around 1 pm PST: ✅

 * : MediaWiki and Parsoid CSS styles linked to Parsoid HTML.
 * : Allow links with angle brackets after anchor.
 * : Accept !! in table data.
 * Parsoid enabled on outreach wiki and wikimania2014 wiki.

Wednesday, Jun 4, 2014 around 1pm PST: ✅

 * : Fix nowiki escaping bug in template args during serialization.
 * Fix to tokenizer to better handle table / indent-pre interactions.

Monday, Jun 2, 2014 around 1pm PST: ✅

 * Bug fix in upright handling for images.
 * : Fix crashers handling pre-like strings.
 * : Fix out-of-stack crashers on some wiki pages.
 * Additional selser tweaks.
 * Additional performance tweaks.

Wednesday, May 28, 2014 around 1:15pm PST: ✅

 * Bunch of tweaks to the selective serializer.
 * Performance fixes to enable parsing humongous pages.
 * Bug fix to the XML serializer around handling of HTML tags.
 * : New empty ExtLinks shouldn't be converted to interwiki.

Wednesday, May 21, 2014 around 1pm PST: ✅

 * : Replace space with underscore in namespace links
 * : Bug fix in nowiki-ing of ";" chars
 * Improved serialization of new language links.
 * Support protocol-relative urls.

Monday, May 19, 2014 around 1pm PST: ✅

 * : Deal with <nowiki/> escaping around url and other magic links (RFC, PMID, ISBN).
 * : Accept  and record info in data-parsoid.
 * Some edge-case improvements to template parsing (see commit summary of https://gerrit.wikimedia.org/r/#/c/133506/).

Monday, May 12, 2014 around 1pm PST: ✅

 * Bug fix nowiki-escaping transclusion args.
 * Improvements to accuracy of DSR information -- eliminates some template wrapping errors.
 * Eliminate crashers when attempting parse of deleted revisions.
 * Edge case tweaks to serializer.

Wednesday, May 7, 2014 around 1pm PST: ✅

 * : Fix for production crashers (edge case).
 * : Fixes bad nesting of formatting and figure elements.
 * Other minor fixes in the tokenizer.

Monday, May 5, 2014 around 1pm PST: ✅

 * : Additional fixes to template encapsulation code based on production crashers.
 * Handling of empty redirects (edge case bug).
 * : Upgrade tokenizer (pegjs) from 0.7 to 0.8 -- required lots of tweaking and fixing of tokenizer.
 * Simple upgrades of other libraries (See https://gerrit.wikimedia.org/r/#/c/130992/)

Thursday, May 1, 2014 around 9:20 am PST: ✅

 * A whole bunch of performance tweaks.
 * : Last set of fixes to template encapsulation code.
 * Use handlebars for ParsoidService views.

Monday, April 28, 2014 around 1pm PST: ✅

 * : Bugfix merging nested template ranges (caused by fostered content in tables) + other fixes.
 * Logging: Suppress stack traces for warnings
 * Several link handling fixes
 * : Handle unescaped single quotes in urls
 * : Correctly handle multiple # chars in links
 * : Serializer: handle full stops in link target
 * Serializer: Underscores not converted to spaces for interwiki links
 * Several other fixes (see https://gerrit.wikimedia.org/r/#/c/126853/ for more)

Wednesday, April 23, 2014 around 1pm PST: ✅

 * Fix oldid logging with error/fatal log entries.
 * : Fix bug merging overlapping template ranges (caused by fostered content in tables)

Monday, April 21, 2014 around 1:45pm PST: ✅

 * : Accept comments in eofl position
 * Support comments before table lines
 * Improved handling of "bogus" image options

Wednesday, April 16, 2014 around 1:30pm PST: ✅

 * : Serialize links with wikitext chars correctly (ex: foo '' bar)
 * : Match fixed PHP behavior for framed images with a height specification
 * : Multiple commits to fix crashers found in RT testing
 * Accept entities in ref attributes
 * Improvements to wrapping of fostered transclusions

Monday, April 14, 2014 around 1pm PST: ✅

 * : Handle multiple colons in titles in subpage-supporting namespaces
 * , : Improvements to serialization of interwiki links
 * : Fix parsing and serialization of invalid wikilinks
 * : Fix some edge case template encapsulation scenarios

Wednesday, April 2, 2014 around 1pm PST: ✅

 * : Improved serialization of empty i/b nodes.
 * : Fix serialization of headings, etc. after categories.
 * : Accept multiple comments in start-of-line context (headings, etc.).
 * : Accept multi-line comments after headings.
 * Accept comments in template targets.
 * Sanitizer fix for handling protocols like news: and javascript: (no security issue, validation happens elsewhere too)

Monday, Mar 31, 2014 around 1pm PST: deploy canceled
Canceled deployment to investigate issues caught in testing.

Monday, Mar 24, 2014 around 1pm PST: ✅

 * Don't generate NaN dimensions after edits.
 * Fixed bug in detecting unresolvable tpl targets.

Wednesday, Mar 19, 2014 around 1pm PST ✅

 * Improved connection timeout handling
 * Handle non-string extension attribute values
 * Allow scaling of Vector images

Monday, Mar 17, 2014 around 1pm PST ✅

 * Support for manual thumbnail option (thumb=) on images.
 * Roundtrip empty image attributes.
 * : Improvements to RT-ing of fostered content.

Thursday, Mar 13, 2014 around 4pm PST ✅

 * Redeployed.

Wednesday, Mar 12, 2014 around 1pm PST ✅

 * Parse and roundtrip invalid image options
 * Fix image up-scaling for 'format unspecified' images
 * A bunch of code cleanup.

Because of a bug in the deployment system, the deployment did not happen and Parsoid remained stuck at 98936e7a according to http://parsoid-lb.eqiad.wikimedia.org/_version.

Monday, Mar 10, 2014 around 1pm PST ✅

 * New logging framework deployed with improved error reporting to production logs.
 * Eat > and [ in table / tr attribute names -- improves parsing / serialization of pages with broken wikitext.

Because of a bug in the deployment system, the deployment did not happen and Parsoid remained stuck at 98936e7a according to http://parsoid-lb.eqiad.wikimedia.org/_version.

Monday, Mar 3, 2014 around 1pm PST ✅

 * Treat all block tags identically in pre-handler
 * DSR computation: Properly handle tags nested in.
 * New tags are now serialized on their own line.
 * More liberal parsing of broken table and table-row attributes in wikitext.
 * Fixed regression dealing with fostered text nodes from tables.

Wednesday, Feb 26, 2014 around 1pm PST ✅

 * Emit | chars outside tables as | text
 * Handle multiple conflicting image options properly
 * Handle templated image options in inline images
 * Bug fixes in pre-handling and DSR output.

Monday, Feb 24, 2014 around 1 pm PST ✅

 * Enabled CORS on all API endpoints.
 * : Support trailing 'pxpx' in image size options.
 * : Correctly handle duplicate options in image wikitext.
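
To make the 'pxpx' item above concrete, here is a minimal sketch of a size-option parser that tolerates a doubled `px` suffix. It is an illustration only, not Parsoid's tokenizer; the regular expression and the name `parse_size` are assumptions made for this example.

```python
import re
from typing import Optional, Tuple

# Width, optional "x" + height, then "px" with an optional second "px".
SIZE_RE = re.compile(r"^(\d*)(?:x(\d+))?\s*px(?:px)?$")


def parse_size(option: str) -> Optional[Tuple[Optional[int], Optional[int]]]:
    """Parse an image size option into (width, height), or None if it is not one."""
    m = SIZE_RE.match(option.strip())
    # Reject strings like "px" that carry neither a width nor a height.
    if m is None or (not m.group(1) and not m.group(2)):
        return None
    width = int(m.group(1)) if m.group(1) else None
    height = int(m.group(2)) if m.group(2) else None
    return width, height


print(parse_size("100pxpx"))
```

Under these assumptions, `100px` and `100pxpx` yield the same result, while a non-size option such as `thumb` is rejected.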

Wednesday, Feb 19, 2014 around 1 pm PST ✅

 * Additional fixes for link trail / template interaction.
 * Support link trails for interwiki links.
 * Allow template attributes for image attributes.
 * Support image options that have the "|" char in them.
 * template on nlwiki pages handled properly: parses as expected and serialized properly.

Wednesday, Feb 12, 2014 around 1 pm PST ✅

 * HTML PRE tsr calculation fixes
 * Several clean-up and refactor patches
 * Error logging clean-up
 * Further fixes for link trail / template interaction

Monday, Feb 10, 2014 around 3 pm PST ✅

 * Emit non-piped links for edited redirects
 * Handle linktrails/prefixes correctly for templated links
 * Correctly render p-tags in blockquotes

Thursday, Feb 6, 2014 around 12:30 am PST ✅
Deployed after failed code update of Feb 3, 2014 was fixed.


 * Add Wikiversity to site list in ParsoidConfig

Monday, Feb 3, 2014 @ 11:30 am PST ✅
This was the first deploy from our new repository /mediawiki/services/parsoid/deploy. This deploy includes all fixes over the last 6 weeks (from December 16th, 2013).

This deployment saw the following code improvements go out:
 * Fixes to GC issues that led to memory leaks in node 0.10
 * First pass over long-standing image handling cleanup.
 * First steps implementing a logging subsystem in Parsoid.
 * Code quality fixes to improve robustness of code.

Besides these changes, these specific bugs were fixed.

Images
 * Wikitext tables inside image captions accepted
 * ,, , Use edited image attributes over original values.

Links
 * Interwiki links pointing to current wiki parsed as plain links
 * Update to linktrail/prefix regexp code
 * Serialization of new link redirects fixed
 * handled correctly
 * [[Foo]] handled correctly
 * Trailing extlink-like text in wikilink handled correctly ..[Foo]

Refs & extensions
 * Accept unclosed <references> tag
 * Multiple <references /> tags handled properly
 * Non-standard WS in extension tags accepted

Misc Tokenizer
 * Stray table-end tags ignored in some contexts
 * Recognition of ISBNs containing an X

Wikitext escaping fixes
 * Url parsing fix during nowiki escaping
 * Fixes for nowiki escaping of ext-tag like text
 * Fixes to wikitext escaping of link text

Misc edit/serializer fixes
 * Fixed serialization of edited magic words
 * Table end tags always serialized on new lines
 * Whitespace edits properly recognized

Misc
 * Improvements to handling of fostered table content
 * Parsoid binding to specific IP or interface
 * Parsoid now handles OBJECT element

Thursday, December 26, 2013 @ 20:45 UTC

 * Pushed updated Parsoid config to fix broken support for wikis with "-" in their prefix (ex: nds-nl and others).

Thursday, December 19, 2013 @ 00:10 UTC

 * Pushed updated Parsoid config to add support for tyv and min wikipedias.

Monday, December 16, 2013 @ 13:00-14:00 PST ✅

 * Fix for production crashers.
 * Fix for indent-pre parsing in the presence of block tags.
 * Support for per-wiki API proxies.

Wednesday, December 11, 2013 @ 13:00-14:00 PST ✅

 * Fix for broken HTML-pre serialization that lost newlines after opening tag in some cases.

Tuesday, December 10, 2013 around 10 am PST

 * Reverted Parsoid cluster to node 0.8 after discovering memory leaks in production

Monday, December 9, 2013 @ 15:00-16:00 PST

 * Upgraded Parsoid cluster to node 0.10 after running it in round-trip testing without issues since last week (and for months locally)
 * Configured Parsoid to use api.svc.eqiad.wmnet directly rather than going through the Varnishes (51273)

Monday, December 9, 2013 @ 13:00-14:00 PST ✅

 * Fixes to HTML and Indent-Pre handling
 * Serialization improvements
 * Additional tweaks to the DOMDiff algorithm
 * Tweaks to newline separator handling to minimize dirty diffs
 * URL link parenthesis heuristic
 * Performance: Added API proxy configuration to bypass caching layers in front of MediaWiki API (Config change deployed, but proxy not yet enabled)
 * Changed default thumbnail size to 220px (matching WMF site defaults, bug 50523)
 * Add Wiktionary as /enwiktionary/, /dewiktionary/ etc (bug 58212)

Wednesday, December 4, 2013 @ 13:00-14:00 PST ✅

 * Fix for crasher that was filling up production log
 * Enable gzip compression support
 * Handle page names starting with a slash
 * Serialize new headings with spaces around '=' char
 * Initial support for time/data/mark HTML5 elts
 * ISBN links now assigned mw:ExtLink type to conform to Parsoid DOM Spec
 * A bunch of fixes to the selective serializer
 * Improvements to DOMDiff algorithm
 * Bug-fixes in nowiki escaping before/after linktrails/prefixes
 * Parse attributes in a case-insensitive manner
 * A bunch of other assorted fixes

Wednesday, November 20, 2013 @ 13:00-14:00 PST ✅

 * Correctly serialize magic words added on client
 * Various DSR fixes (suppress spurious warnings, fix errors)
 * Fix for bug parsing indent-pres following an html-pre

Monday, November 18, 2013 @ 13:00-14:00 PST ✅

 * Fix for (eliminates whitespace diffs on frwiki on template edits)
 * Fix for (incorrect use of TSR while detecting stray closing tags)
 * Fix for serialization of new categories
 * API fixes
 * Improved error handling

Wednesday, November 13, 2013 @ 13:00–14:00 PST: ✅

 * Improvements to image option parsing, DOM diffing and Wikitext escaping

Thursday, November 7, 2013: ✅

 * DOM spec clean-up (delayed deploy for VE compat)
 * A lot of fixes and performance improvements