Parsoid/Deployments/2021

From mediawiki.org

Dec 14-16: Yes V0.15.0-a13 as part of 1.38.0-wmf.13

  • Revert "Disable translate annotations + revert 2.3.0 to 2.4.0 version bump"
  • T228616: Fix serializing links when a namespace conflicts with a local interwiki

Dec 7-9: Yes V0.15.0-a12 as part of 1.38.0-wmf.12

Non-translate patches:

  • T287216: Add ContentMetadataCollector interface
  • WikiLinkHandler: use the value offset for the fragment TSR
  • Remove the return value from DOMNormalizer::normalize()
  • When splitting nodes in PWrap, clone the NodeData
  • T214651: Implement diffHandler for Cite extension
  • T263203: Handle stripped characters in free external links in html2html
  • T292022: Fix serializing links using local interwiki plus language link

Translate patches (all disabled except for HTML -> WT, to provide compatibility for HTML version 2.4.0):

  • Disable translate annotations + revert 2.3.0 to 2.4.0 version bump
  • Fix "undefined index" on annotation nesting removal
  • T295233: Avoid crashers on bad annotation nesting
  • T296107: Treat translation unit marker comments as not-SOL transparent
  • T296169: Don't generate nested annotated ranges in HTML output
  • Only accept </> for tvar annotation when the page has annotations
  • Only accept </> when tvar is annotation tag
  • T295233: Hack: Add partial support for older tvar syntax to prevent crashers
  • T295406: Hack: Drop annotation tokens in template context
  • T295406: Don't break template continuity when moving annotation range metas
  • T295330: Fix DSR computation when end tag is pulled out of range
  • T295233, T295236: Fix global state of annotation id and DOMPostProcessor pass order
  • T295243: Fix regression in DOMNormalizer
  • Bump content version from 2.3.0 -> 2.4.0
  • T261181: Add support for annotation tags in Parsoid

November 16-18: Yes V0.15.0-a10 as part of 1.38.0-wmf.9

  • T295104: Don't record serialization metrics for empty html
  • Add RemexPipeline
  • Get rid of the foster comment hack
  • T214648, T294450: DOM diff galleries
  • T214651: Add an experimental method for extensions to diff nodes
  • Remove check for duplicate data IDs
  • Have Remex tell us when it clones attributes instead of detecting that

November 2-4: Yes V0.15.0-a7 as part of 1.38.0-wmf.7

  • Deduplicate subtree recursion in DOMDiff.php
  • Dump data attributes without cloning
  • Rename isTopLevel to atJsonRoot in DOMDiff.php
  • T235295: Replace DOMCompat::attributes() with DOMUtils::attributes()

Oct 27: tagged v0.15.0-a6 for vendor

  • T226428: DOMRangeBuilder cleanup and data flow improvements
  • T288640: WrapSections: Don't get tripped by <section> tags found in content
  • Require RemexHtml 3.0.0
  • T283560: Bump some dependencies to match upstream
  • Drop composer v1
  • Bump composer for dependabot alert

October 26-28: Yes V0.15.0-a5 as part of 1.38.0-wmf.6

  • T292923: Deal with more malformed transclusion parts
  • T291692: Account for php zero string, "0"
  • Unbreak ContentUtils::dumpDOM for DocumentFragments
  • Remove deprecated PegTokenizer::tokenizesAsURL
  • Remove temporary tsrDelta
  • Add a namespace for HTML5TreeBuilder
  • Set the bag property on child documents
  • WrapSections cleanup
  • WrapSections OOP state class
  • Update wikimedia/zest-css to 2.0.2
  • Don't call stashDataAttribs() for text tokens
  • Split out TemplateHandler::encapTokens() to its own class

October 19-23: Yes V0.15.0-a4 as part of 1.38.0-wmf.5

  • Merge TempData booleans into a bit field
  • T226428: Add a class for template ranges
  • Replace $dp->tmp->tplRanges with SplObjectStorage
  • Use OOP in WrapTemplates
  • Fix call to deprecated method ParserOutput::getProperties()
  • Sanitizer: Replace RFC 3454 by RFC 8264 for cleanUrl
  • Sanitizer: Use \u{xxxx} syntax in cleanUrl
  • T293308: massageLoadedDataParsoid: Ignore null source ranges
  • Followup to 6bddf56e: Resync Grammar.pegphp and Grammar.php
  • Declare TempData->tagId
  • Add TempData class for DataParsoid::$tmp
  • Lazy-initialise the DataParsoid->tmp property
  • Have massageLoadedDataParsoid return a DataParsoid object
  • Clone DataParsoid property by property
  • Remove unused property brokenHTMLTag
  • Make DataParsoid be a real class
  • Add NodeData class to replace the stdClass objects that DataBag stores

October 12-14: Yes V0.15.0-a3 as part of 1.38.0-wmf.4

  • T292115: Generate timing metrics per KB of HTML (output or input)
  • Fix typo in data-parsoid property use in Linter
  • Log slow wt2html operations in ParsoidHandler
  • Allow composer 2.1
  • Don't try to deep clone DOM nodes
  • T251624: Deduplicate file info requests
  • Refactor WTSUtils::origSrcValidInEditedContent to pass SerializerState

October 5-7: Yes V0.15.0-a2 as part of 1.38.0-wmf.3

  • T292250: Log errors from malformed data-mw->parts
  • T261181: Add annotation tags to SiteConfig.php
  • Improve test for pipeline expansion of complex attributes
  • T291741: Account for clients leaving off the template params array in data-mw

September 28-30: Yes V0.15.0-a1 as part of 1.38.0-wmf.2

  • Make extension-output p-wrapper stripping more robust
  • T291452: Raise a client error if data-mw->parts is not an array
  • Remove duplicate EOFTk stripping from ATM attribute processing
  • T291234: Set an appropriate locale in maintenance scripts
  • 804cd7b6 followup: Fix breakages
  • ParagraphWrapper: fix slow array_shift() loop
  • Followup to 1051b80f: Simplify statsd metric name for extension tracking
  • html2wt: Don't update sol state in appendSep
  • Remove in-actionable / stale warning log messages
  • DOMUtils: Get rid of isElt, isText, isComment helpers
  • Use instanceof Element instead of DOMUtils::assertElt in conditionals

September 21-23: Yes V0.14.0-a19 as part of 1.38.0-wmf.1

  • Improve mw-empty-elt detection in Cleanup DOM handler
  • T291234: Replace preg_replace_callback() with strtolower()
  • Faster dedupeHeadingIds
  • T291234: Verify that strtolower() works for all byte values
  • ParagraphWrapper: fix property grouping
  • ParagraphWrapper: fix bug from array_merge() patch
  • Remove emit(Start|End)Tag helpers
  • AttributeTransformManager: Don't duplicate tokens unnecessarily
  • Don't defeat the empty-argument expansion cache
  • Reimplement TokenHandler profiling and tracing using a proxy
  • Simplify skipOnAny usage in token handlers
  • T290938: Loosen constraints on wrapper unmodified,
  • HTML5TreeBuilder: use associative arrays for attributes
  • Bug fix followup to 05cccaa5
  • Use a class for the return value of TokenHandler subclass methods
  • Make TokenHandler not override PipelineStage
  • T289358: Drop config vars if they fail to JSON stringify
  • In TemplateHandler use string functions instead of regexes

September 14-16: Yes V0.14.0-a18 as part of 1.37.0-wmf.23

  • Optimize access of attribute name/value
  • T290697: Allow PHP 8.0 polyfills
  • Fix safesubst matching
  • Use CharacterData::nodeValue instead of CharacterData::data
  • T282031: Suppress end format newline if no params
  • T290044, T271566: Force block imagemaps
  • Learnings from [[User:Wyang/basic-Chinese-words]]
  • T221488: Add "decoding=async" attribute to img tags
  • Hand-coded matcher for plain sequences of urltext
  • Track uses of extensions in Talk namespaces via statsd
  • Fix inappropriate usages of array_merge()
  • Fix TokenHandler::process() O(N^2) performance

August 31-Sept 2: Yes V0.14.0-a17 as part of 1.37.0-wmf.21

  • Migrate out valid follow contents after processing refs
  • Reserialize processed refs if content differs
  • Cite: Rename functions pushing/popping embedded content flags
  • T289331: Don't process ref-in-ref as embedded, unless content differs
  • Move content differ check up higher
  • Only call ReferencesData::add when adding
  • html2wt: Tweak handling of excess nls around rendering transparent nodes
  • T264027: Stop stripping trailing <nowiki />s

August 24-26: Yes V0.14.0-a16 as part of 1.37.0-wmf.20

  • T289107: Suppress recursion protection when doing a full table parse
  • T288715: Add class="extiw" to interwiki link <a> tags
  • T272186: Add noresize class on imagemaps
  • T287156: Replace Content::preSaveTransform call with ContentTransformer::preSaveTransform
  • Update wikipeg in package.json

August 9-12: Yes V0.14.0-a15 as part of 1.37.0-wmf.18

  • Update RemexHtml namespace
  • Update WikiPEG namespace

August 8: Yes V0.14.0-a14 to beta

  • T287972: Update Dodo to 0.3.0, Remex to 2.3.2, and Zest to 2.0.1
  • T287972: Update wikimedia/langconv to 0.4.2
  • T287972: Bump versions of wikimedia/alea and wikimedia/wikipeg
  • T287163: Avoid using deprecated ParserOptions::getUser
  • Reinstall service-runner to bump loose dependencies

August 3-August 5: Yes V0.14.0-a13 as part of 1.37.0-wmf.17

Contains all the changes in -a11, -a12, and in addition:

  • Don't use non-standard Document::saveHTML() method
  • The ::querySelectorAll() and ::getElementsBy* helpers don't always return array
  • T254804: Copy some language used in the core sanitizer
  • T254804: Remove unused methods from TestUtils.js
  • Allow Node::getAttribute() to return `null`
  • Introduce DOMCompat::nodeName($node)
    • Move nodeNameCheck to linting
  • html2wt: Simplify logic to make separators indent-pre safe
    • And followup: Fix regexp that looks for indent-pre whitespace
  • Bump content version from 2.2.0 -> 2.3.0
  • Sync citeParserTests.txt with Cite extension

July 30: Yes V0.14.0-a12 to beta

As with -a11, this version of Parsoid is being released to mediawiki-vendor to verify that CI and other issues are fixed in beta before the train rolls for production, and to unblock rt-testing. It is expected that 1.37.0-wmf.17 will have a follow-up build, as v0.14.0-a12 was deployed to beta without rt-testing (but has since been rt-tested).

This version contains all the changes from -a11 and in addition:

  • T287611: Fixes for Dodo issues w/ CI and DiscussionTools:
    • Use Parsoid's version of idle-dom and dodo when testing in integrated mode
    • Only set up DOM aliases once
  • Minor cleanup in ExtensionHandler.php
  • T162399: Export ResourceLoader modules & JS config vars in meta tags in <head>
  • T275444: Add baseconfig for banwiki
  • Documentation updates:
    • Include CODE_OF_CONDUCT.md and docs/ in our generated Doxygen documentation
    • Add documentation about information representation in Parsoid output
  • Get rid of unneeded li-hack handler which is a Tidy-era relic

July 29: N V0.14.0-a11 to beta *REVERTED*

This version of Parsoid is being released to mediawiki-vendor to verify that phab:T287419 and similar issues are fixed in beta before the train rolls for production. It is expected that 1.37.0-wmf.17 will have a follow-up build, as v0.14.0-a11 was deployed to beta without rt-testing (but has since been rt-tested).

  • Fixes for Dodo issues w/ CI and DiscussionTools:
    • T287419: Upgrade wikimedia/dodo to 0.2.0
    • Add DocumentType and ProcessingInstruction to our DOM alias list
    • Be DOM-agnostic in DOMCompat/TokenList
    • T287611: Don't strictly enforce type hints in DOMCompat methods
  • T287463: Wrap next siblings in fixUpMisnestedTagDSR
  • html2wt: Centralize wikitext escaping to one place
  • html2wt: Use consistent casing for escapeWikitext
  • Add docs/*.md to the automatically-generated documentation
  • Rename mw:html:version to mw:htmlVersion
  • PHPUtils::jsonEncode: Borrow some code from FormatJson::encode in core
  • T286840: Add leniency for active formatting elements in AddMediaInfo

This version was reverted after causing issues with DiscussionTools CI (fixed with 708618) and Parsoid CI (two issues under investigation). The plan is to investigate and fix the two Parsoid CI issues, then tag an -a12 to beta tomorrow.

July 27-29: Yes Deployed v0.14.0-a10 as part of 1.37.0-wmf.16

  • T286839: P-wrap: Fix failing invariant by fixing undoIndentPre handling
  • P-wrap: Minor code simplification
  • T286786: Fix backtracking in solRegexp
  • Remove optional match
  • P-wrap: Minor code consistency tweaks

July 26: N V0.14.0-a9 REVERTED and UNPUBLISHED

V0.14.0-a9 was the same as -a10 with the addition of patches to prepare Parsoid for a shift to using the Dodo DOM library. These patches caused core CI to break, and reverting Parsoid's deployment on mediawiki-vendor did not solve the problem because of a bug in the mediawiki-core-php72-phan-docker job (phab:T287419). The v0.14.0-a9 tag was deleted from Gerrit and removed from Packagist to prevent its installation by the buggy mediawiki-core-php72-phan-docker job.

July 20-22: Yes Deployed v0.14.0-a8 as part of 1.37.0-wmf.15

  • T286786: Strip the double underscores from the extension bswRegexp
  • T286401: Fix DSR for unstripped stray closing tags & add b/c handling in selser
  • Ultra rare edge case: Fix bad check in html2wt
  • T276512: Tweak heuristic to trim excess newlines from a separator string
  • html2wt: Rename awkwardly named function ( isForceSOL -> forceSOL )
  • html2wt: Fix incompatible min/max nl constraints early
  • html2wt: Minor code simplification and assorted minor cleanups
  • SerializerState: Minor cleanup to mimic code in Separators.php

July 13-15: Yes V0.14.0-a7 as part of 1.37.0-wmf.14

  • T277760: Stop adding newlines in manglePreprocessorResponse
  • Lower the level of some noisy logs
  • Attribute exceeding limit to the right resource
  • T280381, T211946, T221238: Stop throwing on arbitrary resource limits,

June 29-July 1: Yes V0.14.0-a6 as part of 1.37.0-wmf.12

  • html2wt: Simplify mergeSeparatorConstraints
  • T280381, T211946, T239841: Enforce wikitext limits like in the legacy parser
  • T283273: Replace freenode references with libera references
  • T283961: Prevent inline breaks in language variant text

June 8-10: Yes V0.14.0-a5 as part of 1.37.0-wmf.9

  • Followup on bdf4029: Add missing update to DDHandler

May 25-27: Yes V0.14.0-a4 as part of 1.37.0-wmf.7

  • T247143: Extension: require MW 1.37+ and remove support for Revision objects
  • Minor cleanups: Remove dead code, use foreach, simplify exprs
  • Merge encapsulateTemplate and encapTokens
  • Pass media structure to figure handler

May 18-20: Yes V0.14.0-a3 as part of 1.37.0-wmf.6

  • Account for all trailing newlines when wrapping text nodes
  • Upgrade to mediawiki/mediawiki-codesniffer 36
  • Only select on mw:Image when adding info
  • Magic word pipe isn't going to match table_start_tag

May 4-6: Yes V0.14.0-a2 as part of 1.37.0-wmf.4

  • T279682: Handle optional spaces after table_attributes for table_row_tag
  • T279963: Fix interplay between recoverTrimmedWhitespace and DisplaySpace
  • T278565: Don't p-wrap <aside> tags in extension HTML
  • Always provide string to preg_match subject param

April 27-29: Yes V0.14.0-a1 as part of 1.37.0-wmf.3

  • Bump wikimedia/zest-css version
  • T279682: Don't break on pipe in linkdescs if we're in an ext tag
  • T279803: Fixing asserting an about id
  • T264028: Disable single line context when serializing nowikis
  • T280050: UnpackDOMFragments: Improve DSR fixup for misnested A tags
  • T279867: Fix collecting attributish content in table fixups
  • html2wt: Use trimmed whitespace recovery heuristics only if needed
  • T280449: Add some logging around failing preg_match when serializing
  • T280672: Account for nested transclusions in table fixups

Apr 13-15: Yes V0.13.0-a32 as part of 1.36.0-wmf.39

  • T279451: Use a protected key to distinguish comments internal to Parsoid
  • Remove option to tunnelFosteredContent
  • T279184: Fix undefined DSR notice
  • T279182: Handle comments that decode to valid json
  • T279223: Handle empty text nodes in Selective Serializer
  • Only call encapTokens if we're wrapping

Apr 6-8: Yes V0.13.0-a31 as part of 1.36.0-wmf.38

  • Process jsconfigvars from core parser output
  • Minor bug fix in handling of {{{arg}}} in TemplateHandler
  • T277800: Add some logging around failing preg_replace when serializing

Mar 30-Apr 2: Yes V0.13.0-a30 as part of 1.36.0-wmf.37

  • T269749, T277415: ListHandler: when in EOL state, close lists always
  • DOMPostProcessor: Extract function to update <body> classes
  • DOMPostProcessor: Extract function to export style modules in <head>
  • T276620: html2wt: Improve heuristics enabling reuse of separators from source
  • T274521, T30980: Be more permissive for extension tag names
  • T278074: Handle wikilinks misnested in media links
  • T278074: Log an error if media structure is messed up
  • Allow use of newest version of wikimedia/remex-html

Mar 23-25: Yes V0.13.0-a29 as part of 1.36.0-wmf.36

  • phab:T275918: French spacing: don't require non-space before French spacing
  • phab:T223797: Strip newlines from Category sortkeys
  • ListHandler: Close holes in tracing code
  • WrapTemplates: Extract functions to improve code comprehensibility

Mar 17-18: Yes Backport of v0.13.0-a28 to 1.36.0-wmf.35

  • Fix roundtripping interwiki links with complex targets that have colons (follow up to patch in -a27 for phab:T276649 to fix regression).
  • Separators: Code cleanup and documentation fixes

Mar 16-18: Yes Deployed v0.13.0-a27 as part of 1.36.0-wmf.35

  • phab:T199070: More permissive regexp for nested extension start tags
  • No need to protect opening angle bracket in extension tag
  • Stop allowing spaces before extension closing tag name
  • Add some explanatory comments for ref in ref
  • phab:T276649: Subpages on interwiki / language links are invalid
  • phab:T276388: Check for multiples doesn't apply to follows

Mar 2-4: Yes Deployed v0.13.0-a26 as part of 1.36.0-wmf.33

  • T248369: Follow on patch to wikilink in extlink for video and audio content
  • T248369: Adding linter case for media in extlink
  • T275503: WTUtils::isFirstEncapsulationWrapperNode expects a node
  • TplWrap: Fix edge case bug that expanded template scope unnecessarily
  • T240642: WrapSections: Don't crash if we have incomplete DSR information

Feb 23-25: Yes Deployed v0.13.0-a25 as part of 1.36.0-wmf.32

  • T215999: Lint duplicated media width options; lint bogus media width options
  • T255007: Don't apply French spacing in raw text elements
  • T272232: Modify UTF-8 regex to use builtin PCRE validation
  • T242068: Add lint for Parsoid wikilinks in extlinks with italic or bold
  • T265720: TableFixups: One more mishandled scenario with newlines
  • Minor robustness fix in WikitextEscapeHandlers
  • WrapTemplates: Get rid of unused property in template ranges

No new deploy with 1.36.0-wmf.31

Feb 16: Yes Deployed v0.13.0-a24 as part of 1.36.0-wmf.30

  • TableFixups: Minor tweaks
  • Don't apply border class to thumbs

Feb 16: Yes Deployed v0.13.0-a23 as part of 1.36.0-wmf.30

  • Template Wrapping: don't expand range unnecessarily
  • phab:T270373: Use prefixed text for content of links up the path
  • Separate arguments to getPipeline
  • Get rid of parseToplevelDoc
  • Add $frame to ParserPipeline and remove from pipeline stages
  • Refactor sanitization in a normalizeKey function
  • phab:T267974: Contract multiple underbars in a row in refnames to a single underbar
  • Get rid of rtTestMode (used for pre-production testing only)

Jan 18 - 22: No deploy

No deploy because the week was shortened by a WMF holiday.

Jan 11 - 15: Yes Deployed v0.13.0-a22 as part of 1.36.0-wmf.26

  • T270180: Handle selser edge case for first content-node of <body> (follow up to T262448 patch included in -a18)
  • T267974: Fix for Parsoid Cite refname whitespace handling
  • T237538: Disentangle Disambiguator extension from Parsoid
  • T260082, T271357: More papering over in References.php (follow up to T259676 patch included in -a6)
  • T265094: Handle newlines in wikilinks for selser as well (follow up to T265094 patch included in -a17)
  • Other: Disable rt-testing mode, clean up most old code from Parsoid/JS, tweak rt test configuration

Jan 5 - 7: Yes Deployed v0.13.0-a21 as part of 1.36.0-wmf.25

  • T251641: Emit span tags instead of figure-inline
  • Bump output content version to 2.2.0
  • T51538: Add parameters to various cite errors
  • T270307: Allow Parsoid extension modules to be unregistered
  • Tokenizer: Don't eat leading spaces from template values
  • T269719: PHP 8.0 compatibility, Remove PHPUtils::coalesce