Parsoid/Roadmap/2014 15

High level goals

 * Q1 Jul - Sep 2014: Parsoid HTML for page view (finish up visual diffs, set up RT testing, use for PDF rendering, mobile?), native gallery implementation, start experimentation with content widgets
 * Q2 Oct - Dec 2014: Parsoid HTML for page view (beta for all page views?), Wikilint/LintTrap, stable elt ids, HTML templating, start experimentation with content widgets
 * Q3 Jan - Mar 2015: Sister projects/extension (other extensions), Wikilint/LintTrap, stable elt ids, HTML templating
 * Q4 Apr - Jun 2015: Support for HTML-only wikis

Interdependencies (with other projects):

 * Rashomon / Content API is needed for html page views, stable element ids, efficient template updates.
 * VE, Flow, Mobile depend on Parsoid output
 * i18n consultation for language variant support
 * Content translation group on stable ids
 * Collaboration / interaction with editors community for wikilint/linttrap and possibly content widgets
 * Wikidata for content widgets

Constraint:

 * ~30-40% time (a somewhat arbitrary number) on maintenance and non-roadmap tasks (bug fixing, deploys, GSoC, presentations, etc.)

Start using Parsoid HTML for page views

 * Benefit: make page views for logged-in users as fast as those for anons
 * Requires more QA on rendering accuracy (visual diffing)
 * Service team has high-level goal of building the infrastructure for this (Rashomon, API with redlinks etc)
 * PDF rendering from Parsoid HTML?
 * enforce nesting, tag, linting

Stable element ids in HTML / switching between wikitext & HTML

 * Supports content translation, authorship maps, possibly efficient diffing, document part retrieval
 * Important for save performance (see bug 64171)
 * Challenge: ids need to be preserved across wikitext edits; we have ideas on this that should also improve performance when switching from wikitext to HTML

Content widgets

 * Research new ways to mark up / integrate specific classes of content
 * Provide alternatives for data tables (football, discographies etc)

HTML templating

 * Goals: performance, possibly client-side rendering / preview
 * Can build on knockoff
 * Need to investigate if this can be represented transparently using existing transclusion syntax
 * Service team interested too

HTML-only wiki support

 * possibly in collaboration with Services, Platform, Features (HTML diff UI?)
 * requires everything that's needed for Parsoid-HTML page views & HTML-based templating

"Sister projects" / extensions

 * section support / LST
 * language variants
 * Native gallery port
 * One other extension relevant to a sister project?

Notes: A laundry list of tasks
This is a laundry list of tasks, not all of which appear in the earlier sections. It can be fleshed out further and used to estimate how much time and resources we will spend on these tasks. It need not be part of the final roadmap, but it should live somewhere so that we have an overview of everything that needs to get done. It could also be folded into the previous sections, if need be.

Functionality

 * Support for language variants
 * Support for wikitext
 * Scope range of transclusions. Two notions to deal with:
    * Well-formedness: Unbalanced tags, partial DOMs, etc. This concern basically sets up the scope of tree-builder fixup. ( tag, for ex.)
    * Content-model constraints: Even if well-formed, you cannot transclude A-links inside another A-link. Basically, the overriding concern is: what is required to simply "drop-in" the DOM output of a transclusion into a DOM-tree? One way is to enforce constraints on what a template can produce in all possible expansions for all possible inputs ("static typing" and automatic type coercion). (See Parsoid/DOM_notes)
 * LintTrap/WikiLint: GSoC project
 * Support for authorship maps
    * Requires stable element ids
 * Editing support:
    * Support for switching between HTML/Wikitext in the editor. A naive implementation is not too difficult to support, but will likely not perform well. To be investigated.
    * Support for HTML editing of transclusion parameters (in progress).
    * Possibly support content widgets for common tasks (for which a combination of templates is currently used; infoboxes, football tables, discographies, etc.)
 * Support for any common but unsupported extensions including porting
    * Native gallery port
    * LST
    * Other extensions in non-wikipedia projects (wikisource, etc.)
 * Support for HTML wikis
    * HTML-based templating
    * Content widgets
    * HTML diffs
    * Abuse Filter
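
The two notions listed under "Scope range of transclusions" (well-formedness and content-model constraints) can be illustrated with a small check on a transclusion's HTML output. This is only a sketch in Python using the standard html.parser module, not Parsoid code: FragmentChecker and check_fragment are hypothetical names, the tag-balance check is deliberately strict (it flags end tags that HTML would allow to be implied), and nested links are the only content-model rule it covers.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are not tracked on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class FragmentChecker(HTMLParser):
    """Hypothetical helper: records balance and nested-link problems."""

    def __init__(self):
        super().__init__()
        self.stack = []      # elements that are currently open
        self.problems = []   # human-readable findings, in document order

    def handle_starttag(self, tag, attrs):
        # Content-model constraint: an <a> may not appear inside another <a>.
        if tag == "a" and "a" in self.stack:
            self.problems.append("nested <a> inside <a>")
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        # Well-formedness: an end tag must match the innermost open element.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.problems.append("unexpected </%s>" % tag)


def check_fragment(html):
    """Return a list of problems that block a simple drop-in of the fragment."""
    checker = FragmentChecker()
    checker.feed(html)
    checker.close()
    # Anything still open at the end makes the fragment a partial DOM.
    checker.problems.extend("unclosed <%s>" % t for t in checker.stack)
    return checker.problems
```

A fragment such as '<p>See <a href="/wiki/X">X</a></p>' passes with no findings, while a fragment that opens a table without closing it, or that emits a link inside a link, is reported and would need tree-builder fixup or a wider transclusion scope before it could be dropped into the page DOM.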

Testing

 * Parser tests
 * Selser testing is still pretty painful. As selser gets more refined, and as our accuracy in general improves, it is getting harder to trust both "green" and "red" results from parser test runs. We may need to consider more controlled edit generation, where we can construct an oracle that gives us authoritative edited wikitext to compare selser output against.
 * Porting PHP preprocessor and eliminating our native full expansion pipeline.
 * RT-testing
 * Fix our mysql-based rt testing or move over to cassandra.
 * Upgrades to selser testing. (in progress)
 * Automated diffs against PHP rendering to detect problems with rendering (for HTML page views).
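
The idea of controlled edit generation with an oracle, mentioned under "Parser tests", can be sketched with a toy model. Everything here is an assumption made for illustration: a page is just a list of paragraph source strings, an edit replaces the text of exactly one paragraph, and toy_selser and toy_full_serialize are hypothetical stand-ins rather than Parsoid's serializers. Because the edit is constructed rather than random, the expected wikitext is known by construction.

```python
import random


def make_page(n):
    """Build n paragraphs of source text with deliberately odd spacing."""
    # The odd spacing stands in for source quirks that a full reserialization
    # would normalize but that selective serialization should leave alone.
    return ["paragraph  %d   with odd   spacing" % i for i in range(n)]


def controlled_edit(page, rng):
    """Edit one paragraph; return (edited page, oracle wikitext)."""
    idx = rng.randrange(len(page))
    new_text = "edited paragraph %d" % idx
    # Each block is (original source, new text or None if untouched).
    edited = [(src, new_text if i == idx else None)
              for i, src in enumerate(page)]
    # Oracle: the original source with only the edited block replaced.
    oracle = "\n\n".join(new_text if i == idx else src
                         for i, src in enumerate(page))
    return edited, oracle


def toy_selser(edited):
    """Stand-in selective serializer: reuses source for untouched blocks."""
    return "\n\n".join(src if new is None else new for src, new in edited)


def toy_full_serialize(edited):
    """Stand-in full serializer: normalizes whitespace in every block."""
    return "\n\n".join(" ".join((src if new is None else new).split())
                       for src, new in edited)


def count_mismatches(serialize, trials=20, seed=0):
    """Run controlled edits and count outputs that differ from the oracle."""
    rng = random.Random(seed)
    page = make_page(5)
    bad = 0
    for _ in range(trials):
        edited, oracle = controlled_edit(page, rng)
        if serialize(edited) != oracle:
            bad += 1
    return bad
```

In this toy setup, count_mismatches(toy_selser) reports no mismatches, while count_mismatches(toy_full_serialize) flags every trial because the untouched paragraphs lose their original spacing. A real harness would need edit classes for which the corresponding wikitext change is equally unambiguous.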

Performance (ongoing)

 * More efficient re-rendering of pages after edits / template changes.
 * Ongoing identification of bottlenecks.

Maintenance (ongoing)

 * Regular deployments and monitoring.
 * Hooking up our logging infrastructure with logstash/bunyan.
 * Ongoing bug fixes.
 * node upgrades (from 0.8 to 0.10 and onwards).
 * code cleanup and rewrite as we upgrade node versions.

Mentoring / documentation / talks (ongoing)

 * GSoC, OPW, others
 * Maintaining our documentation
    * We should probably maintain a docs/ repo that outlines strategies or, preferably, add broad outlines of algorithms at the top of files.