Parsoid/Roadmap/2014 15

Roadmap

 * Jul - Sep 2014: Parsoid HTML for page view (finish up visual diffs, set up RT testing, use for PDF rendering, mobile), Native gallery, start experimentation with content widgets (?)
 * Oct - Dec 2014: Parsoid HTML for page view (beta for all page views?), Wikilint/LintTrap, stable element ids, HTML templating
 * Jan - Mar 2015: Sister projects/extension (other extensions), Wikilint/LintTrap, stable element ids, HTML templating
 * Apr - Jun 2015: Support for HTML-only wikis?

Interdependencies (with other projects):

 * Rashomon / Content API is needed for HTML page views, stable element ids, and efficient template updates.
 * VE, Flow, Mobile depend on Parsoid output
 * i18n consultation for language variant support
 * Content translation group on stable ids
 * Collaboration / interaction with editors community for wikilint/linttrap and possibly content widgets
 * Wikidata for content widgets

Constraint

 * ~30-40% of time goes to maintenance (bug fixing, deploys, GSoC, etc.)

Start using Parsoid HTML for page views

 * Benefit: make page views for logged-in users as fast as those for anons
 * Requires more QA on rendering accuracy (visual diffing)
 * Service team has high-level goal of building the infrastructure for this (Rashomon, API with redlinks etc)
 * PDF rendering from Parsoid HTML?
 * enforce nesting, tag, linting

Stable element ids in HTML

 * Supports content translation, authorship maps, possibly efficient diffing, document part retrieval
 * Challenge: need to be preserved across wikitext edits.
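
One way to picture the preservation challenge is to treat each revision as a sequence of blocks and carry ids over wherever a block is unchanged or edited in place. The following is a minimal sketch under that assumption, not Parsoid's design: `carry_over_ids`, the block granularity, and the id format are all hypothetical, and the matching uses Python's standard `difflib`.

```python
import difflib


def carry_over_ids(old_blocks, old_ids, new_blocks, fresh_ids):
    """Assign an id to every block in new_blocks.

    Hypothetical helper: reuses the old id for blocks that are unchanged
    or edited in place, and draws from fresh_ids (an iterator) otherwise.
    """
    matcher = difflib.SequenceMatcher(a=old_blocks, b=new_blocks, autojunk=False)
    new_ids = [None] * len(new_blocks)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        same_length = (i2 - i1) == (j2 - j1)
        # "equal" spans always have the same length; "replace" spans keep
        # their ids only when blocks map one-to-one onto each other.
        if tag == "equal" or (tag == "replace" and same_length):
            for offset in range(i2 - i1):
                new_ids[j1 + offset] = old_ids[i1 + offset]
    # Anything still unassigned (inserted or ambiguously replaced) gets a new id.
    return [id_ if id_ is not None else next(fresh_ids) for id_ in new_ids]


old_blocks = ["Intro paragraph.", "Second paragraph.", "Third paragraph."]
old_ids = ["e1", "e2", "e3"]
new_blocks = [
    "Intro paragraph.",
    "Second paragraph, edited.",
    "Third paragraph.",
    "Appended paragraph.",
]
new_ids = carry_over_ids(old_blocks, old_ids, new_blocks, iter(["e4"]))
```

The hard cases are exactly the ones this sketch gives up on: when a block is split, merged, or moved, the diff no longer maps blocks one-to-one and the ids are simply reissued.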

Content widgets

 * Research new ways to mark up / integrate specific classes of content
 * Provide alternatives for data tables (football, discographies etc)

HTML templating

 * Goals: performance, possibly client-side rendering / preview
 * Can build on knockoff
 * Need to investigate if this can be represented transparently using existing transclusion syntax
 * Service team interested too

HTML-only wiki support

 * possibly in collaboration with Services, Platform, Features (HTML diff UI?)
 * requires everything that's needed for Parsoid-HTML page views & HTML-based templating

"Sister projects" / extensions

 * section support / LST
 * language variants
 * Native gallery port
 * One other extension relevant to a sister project?

Functionality

 * Support for language variants
 * Support for wikitext
 * Scope range of transclusions. Two notions to deal with:
   * Well-formedness: unbalanced tags, partial DOMs, etc. This concern sets the scope of tree-builder fixup. ( tag, for ex.)
   * Content-model constraints: even if the output is well-formed, you cannot transclude A-links inside another A-link. The overriding concern is: what is required to simply "drop in" the DOM output of a transclusion into a DOM tree? One way is to enforce constraints on what a template can produce in all possible expansions for all possible inputs ("static typing" and automatic type coercion). (See Parsoid/DOM_notes)
 * LintTrap/WikiLint: GSoC project
 * Support for authorship maps
   * Requires stable element ids
 * Editing support:
   * Support for switching between HTML/Wikitext in the editor. A naive implementation is not too difficult to support, but is unlikely to perform well. To be investigated.
   * Support for HTML editing of transclusion parameters (in progress).
   * Possibly support content widgets for common tasks (for which a combination of templates is currently used: infoboxes, football tables, discographies, etc.)
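
The content-model concern in the transclusion item above (no A-links inside another A-link) can be illustrated with a small check that runs on the raw token stream, before any tree builder has a chance to silently rewrite the nesting. This is only a sketch using Python's standard `html.parser`; `NestedLinkChecker` and `has_nested_links` are hypothetical names, not part of Parsoid.

```python
from html.parser import HTMLParser


class NestedLinkChecker(HTMLParser):
    """Records every <a> start tag that appears while another <a> is open."""

    def __init__(self):
        super().__init__()
        self.open_links = 0
        self.violations = []  # (line, column) of each nested <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            if self.open_links > 0:
                self.violations.append(self.getpos())
            self.open_links += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.open_links > 0:
            self.open_links -= 1


def has_nested_links(html_fragment):
    """Return True if the fragment contains an <a> inside another <a>."""
    checker = NestedLinkChecker()
    checker.feed(html_fragment)
    checker.close()
    return bool(checker.violations)


# Hypothetical scenario: a template's output is dropped into link content.
template_output = '<a href="/wiki/Bar">Bar</a>'
host = '<a href="/wiki/Foo">Foo ' + template_output + "</a>"
nested = has_nested_links(host)
standalone_ok = not has_nested_links(template_output)
```

A check like this only detects the problem for one concrete expansion. The "static typing" idea goes further, constraining what a template may produce for every possible input.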

Testing

 * Parser tests
   * Selser testing is still pretty painful. As selser gets more refined, and as our accuracy in general improves, it is getting harder and harder to trust both "green" and "red" results from parser test runs. We may need to consider more controlled edit generation, where we can construct an oracle that gives us authoritative edited wikitext to compare selser output against.
   * Porting the PHP preprocessor and eliminating our native full expansion pipeline.
 * RT-testing
   * Fix our MySQL-based RT testing or move over to Cassandra.
   * Upgrades to selser testing. (in progress)
   * Automated diffs against PHP rendering to detect problems with rendering (for HTML page views).

Performance (ongoing)

 * More efficient re-rendering of pages after edits / template changes.
 * Ongoing identification of bottlenecks.

Maintenance (ongoing)

 * Regular deployments and monitoring.
 * Hooking up our logging infrastructure with logstash/bunyan.
 * Ongoing bug fixes.
 * Node upgrades (from 0.8 to 0.10 and onwards).
 * Code cleanup and rewrites as we upgrade Node versions.

Mentoring / documentation / talks (ongoing)

 * GSoC, OPW, others
 * Maintaining our documentation
 * We should probably maintain a docs/ repo that outlines strategies or, preferably, add broad outlines of the algorithms at the top of the relevant files.