User:Daniel Kinzler (WMDE)/Frontend Architecture

This document describes how MediaWiki's user interface should function, if Daniel was king. It is intended to provide constraints and guiding principles for feature development.

The front-end architecture defines the flow of information and control between the client device and the application servers (or, more generally, the data storage layer). In practice, this mostly means defining when, where and how HTML is generated, manipulated, and assembled.

Goals
''Note: this is a straw-man vision. It's here for the sake of discussion.''

Provide a consistent, modern user interface across platforms.

1) A (mostly) data driven single page web application that heavily relies on client side JS and uses a modern framework for state management and template rendering. This should be the default view for both desktop and mobile clients. [Alternative: provide the APIs needed for 3rd parties to build such an experience, and use the same APIs to drive our native mobile apps].
 * My implicit assumption: custom internet facing APIs will be needed for distribution channels other than classic (desktop) and mobile web UAs. ABaso (WMF) (talk) 16:28, 28 February 2018 (UTC)
 * I don't like single-page applications for things as simple (in visual/UX terms) as wikis. They tend to be heavy, slow, and break down badly when the connection is not optimal. I would go as far as to say that we would probably benefit from having separate frontends for separate functions (mobile, desktop, app, etc). GLavagetto (WMF) (talk) 16:50, 28 February 2018 (UTC)
 * Agree we can expect multiple client implementations, though a JS-assembled client-side page should be more robust against intermittent connections than a multi-page CRUD app if it's properly written to handle when things fail. If the 'smarts' are oriented around service workers, then the two styles also theoretically combine well. --Brion Vibber (WMF) (talk) 17:49, 28 February 2018 (UTC)

2) A (mostly) static view for use by client software without the necessary level of JS support for the single page app. This may still have some optional bits of "old school" JS. The static view is served as a full HTML page, rendered server side from the same templates that the single page experience uses on the client side.

Both views share the same URLs, and the appropriate view is selected by detecting client capabilities. This could perhaps be done by initially serving the static view, which then gets replaced with the single page app if possible.
 * In practical terms, this seems to suggest that minimally there will be two edge cache variants for a given page - one HTML and one HTML inside of a JSON payload. (I'm glossing over user-specific variants for the moment, as that's separable and discussed elsewhere in this document). ABaso (WMF) (talk) 16:28, 28 February 2018 (UTC)
 * This would mean we will cache an initial page served to any new request, which will be the same, rendered server-side. The javascript code in that basic html version should then determine if the SPA approach would be taken. I do see such an approach as problematic in terms of caching though, as we would surely see a multiplication of cached objects and this should surely be taken into consideration when designing such an approach. GLavagetto (WMF) (talk) 16:50, 28 February 2018 (UTC)
 * What if the edge cache always contains precomposed pages, while the client page & service worker JS bypasses the cached pages if/when possible? We should be able to have the user see a "/wiki/Foobar" URL without that being what actually gets fetched by catching the navigation attempt on-page, or the request in the service worker, and returning the lightweight page harness plus client-side fetching and composing. --Brion Vibber (WMF) (talk) 17:49, 28 February 2018 (UTC)
 * (By bypass cache, I mean 'fetch the separate components, which themselves will be edge-cached separately'.) --Brion Vibber (WMF) (talk) 18:54, 28 February 2018 (UTC)
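
As a straw-man illustration of "serve the static view first, then upgrade if possible", the selection step could look roughly like the sketch below. The capability names (`hasFetch`, `hasPromise`) and the rule that both are required are assumptions made for the example; the document does not say which features the single page app would actually need.

```typescript
// Hypothetical capability report; which features matter is an assumption.
interface ClientCapabilities {
  hasFetch: boolean;
  hasPromise: boolean;
}

type View = "single-page-app" | "static";

// Probe a global-like object (for example `window` in a browser).
// Taking it as a parameter keeps the function testable outside a browser.
function detectCapabilities(globalScope: Record<string, unknown>): ClientCapabilities {
  return {
    hasFetch: typeof globalScope["fetch"] === "function",
    hasPromise: typeof globalScope["Promise"] === "function",
  };
}

// The static view is the safe default; the single page app is chosen
// only when every required capability is present.
function selectView(caps: ClientCapabilities): View {
  const required = [caps.hasFetch, caps.hasPromise];
  return required.every(Boolean) ? "single-page-app" : "static";
}
```

The inline JS shipped with the static page would call something like `selectView(detectCapabilities(...))` and only load the single page app when the result says so. A client that never runs the script simply stays on the static view.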

Desires
Straw-man list of desired capabilities:
 * JS-UI: We want to allow data driven UIs, though it is acknowledged that some HTML will always be generated on the server (PARSER-API) and some HTML will be generated on the server for a while (NO-DATA, FRAGMENTS). Eventually, all functionality shall be available via an API.
 * REST: Client code should not need to know how or where a service is implemented. We offer a comprehensive REST API. The web interface (JS client) should be using the same RESTful (in a broad sense) APIs as the native clients (apps).
 * PARSER-API: We want to offer access to the rendered page content (for all revisions and slots, in all target languages) via the REST API.
 * ONE-UI: server-side rendering shall use the same templates as client-side rendering (needs L10N-X2, TEMPLATES). The web interface (JS client) should be the same for desktop and mobile devices. We expect the line between mobile and desktop to blur and finally disappear over the next 5 years.
 * Granted, mobility hardware for web consumption will advance. This said, I perceive lightweight markup and lightweight initial bootstrapping static assets to continue to be important for the foreseeable future. I see in ASSET a consideration for modernizing ResourceLoader. I support that, and from various discussions it seems to be desired and achievable. ABaso (WMF) (talk) 16:28, 28 February 2018 (UTC)


 * MULTI-DC: We want full Multi-Data-Center support. This means all information that is needed to decide whether a request needs to be routed to the master DC needs to be in the request. For the application servers, this mostly means "don't write to the database in GET requests".
 * JS-IFACE: We want to expose narrow, stable interfaces for client-side customization (gadgets)
 * AGNOSTIC: It should become possible to implement an API module and an associated special page purely in JS. It should remain possible to implement an API module and an associated special page purely in PHP.
 * To clarify, is this suggesting that extension authors can choose which server side language to use from a choice of JS (presumably, Node.js) or PHP (or whatever supported languages), provided they follow a particular specification? ABaso (WMF) (talk) 16:28, 28 February 2018 (UTC)
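
To make the MULTI-DC item above concrete: if GET requests never write to the database, a router can decide where to send a request by looking at the HTTP method alone. The sketch below is an assumption-laden illustration; the type names and the `"nearest"` label are invented for the example, and a real deployment would likely need more signals than the method.

```typescript
// Minimal request description; only what the routing decision looks at.
interface IncomingRequest {
  method: string;
}

// "master" is the data center that accepts writes; "nearest" is any replica DC.
type DataCenter = "master" | "nearest";

// GET, HEAD and OPTIONS are among the methods HTTP defines as safe (read-only).
const SAFE_METHODS = new Set(["GET", "HEAD", "OPTIONS"]);

// Route safe requests to the nearest DC and everything else to the master DC.
// This only works if application code never writes on safe requests.
function chooseDataCenter(req: IncomingRequest): DataCenter {
  return SAFE_METHODS.has(req.method.toUpperCase()) ? "nearest" : "master";
}
```

The point of the sketch is that the decision uses nothing but the request itself, which is what MULTI-DC asks for.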


 * REACTIVE: We do not want to serve different content to different kinds of clients, beyond the split between the single-page app and static pages. The content we serve should adapt to the device client-side.
 * In other words, is this presuming the client side ServiceWorker tech will apply transforms for classic (desktop) and mobile web? ABaso (WMF) (talk) 16:28, 28 February 2018 (UTC)


 * MULTILANG: We want the ability to serve renderings/presentations in different target languages. URLs for different renderings (languages) of content should be different, to allow explicit linking to a specific rendering, and to simplify caching.
 * L10N-JS: we want to be able to use MediaWiki's L10N mechanisms (messages, parameter substitution and formatting, plural handling, etc) in client-side JS.
 * SOME-SOA: Through the use of API routing, dependency injection, server side template rendering and page composition at the edge, eventually allow APIs, HTML output and internal services to be implemented in PHP, JS, or some other language.
 * The freedom this service oriented architecture (SOA) provides has to be balanced against the overhead of crossing the language barrier, against the requirements of the installation environment, as well as the complexity of managing the deployed application. That is, we want to have this freedom, but use it wisely. While it should be possible to use standalone services to back most of the service interfaces defined in MW core, this should only be done if it's actually worthwhile, considering all overhead in operations and maintenance.

Assumptions
These are straw-man assumptions, presented here to be challenged!
 * LAMP: The basic version of MediaWiki has to be installable on a plain LAMP stack without root access (shared hosting use case).
 * NO-V8: We cannot rely on being able to run PHP and JS code in the same process / without the need for communication via the network stack (no v8js in PHP). This implies that we can't call out from PHP to JS code for template rendering, nor can we call PHP code from JS for localization.
 * ESI: we can use Edge-Side-Includes (ESI) for page composition on the WMF cluster. We want to use this to compose static pages from HTML fragments (see FRAGMENTS) to satisfy NO-JS. To also satisfy LAMP, we need an alternative composition mechanism (or ESI emulation) in index.php.
 * ESI is a mess, not properly implemented anywhere in caching software, and a nightmare for debugging. Been there, done that. I'm sure I express the sentiment of others in the operations team: if we need to create page compositions server-side from fragments, that should be done in a page composition service (which could also be internal to the frontend service/MediaWiki), not via the use of technology from the nineties that everyone has basically abandoned. GLavagetto (WMF) (talk) 16:55, 28 February 2018 (UTC)
 * I recall concern about ESI as the appropriate means of composition. Is there a reason to not use a more powerful technology independent of the edge caching architecture itself? If I understand correctly, the reason why ESI is considered for composition is to deal with variants principally related to authenticated user preferences (i.e., RL/gadgetry/origin-side variants). But it's conceivable you might want variance based on more sophisticated conditioning, in which case a more flexible technology may be sensible. I'm not saying ESI isn't an option, but was curious about the emphasis specifically on ESI and any corresponding architectural and maintenance risk. In a way this goes back to my question on the REACTIVE piece. ABaso (WMF) (talk) 16:28, 28 February 2018 (UTC)
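
For discussion, here is a minimal sketch of what "ESI emulation" or a separate composition service would have to do at its core: replace include tags in a page skeleton with fragment HTML. The `<esi:include src="..."/>` tag is part of ESI; everything else (the `FragmentSource` type, the fragment URLs, synchronous lookup, no nested includes, no error handling) is a simplifying assumption. The document suggests doing this in index.php; TypeScript is used here only for illustration.

```typescript
// Looks up the HTML for a fragment URL. Hypothetical: in a real setup this
// would call the FRAGMENTS endpoint, most likely asynchronously.
type FragmentSource = (src: string) => string;

// Matches self-closing include tags such as <esi:include src="/x"/>.
const ESI_INCLUDE = /<esi:include\s+src="([^"]*)"\s*\/>/g;

// Replace every include tag in the skeleton with the fragment it names.
// Includes inside returned fragments are NOT expanded (single pass only).
function composePage(skeleton: string, getFragment: FragmentSource): string {
  return skeleton.replace(ESI_INCLUDE, (_tag: string, src: string) => getFragment(src));
}
```

Usage, with made-up fragment URLs:

```typescript
const fragments = new Map<string, string>([
  ["/fragment/skin/header", "<header>My wiki</header>"],
  ["/fragment/content/Foobar", "<p>Hello</p>"],
]);
const page = composePage(
  '<body><esi:include src="/fragment/skin/header"/><esi:include src="/fragment/content/Foobar"/></body>',
  (src) => fragments.get(src) ?? ""
);
```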


 * NODE: For optional functionality, we can execute JS on the server. In particular, optional API actions may be implemented in JS. However, calls between JS and PHP are expensive and should be avoided (because of NO-V8). Also, no critical core functionality may require JS execution on the server (because of LAMP).
 * NO-DATA: For the foreseeable future, some output (particularly of Special pages) may only be available in HTML, since it will take time for all extensions to be converted to a fully data driven approach. That is, some functionality will not be available via a JSON based REST interface, but instead as HTML (FRAGMENTS).
 * Is it possible to send the core HTML payload, bundled inside a JSON payload? I ask mainly for the simplicity of clients. Although clients may adapt their rendering strategy based on a registered property of the response's backing special page, should they have to know these details in order to support the default workflow? ABaso (WMF) (talk) 16:28, 28 February 2018 (UTC)


 * PREFER-JS: JS-enabled environments are preferred. Not all user workflows have to be supported on non-JS clients. However, see NO-JS.
 * NO-JS: For clients with no (sufficient) JS support, we still need to support a) consumption of all content, including meta-data, b) editing of text-based (but not non-text) page content, c) basic (but not all) curation. Among other things, this means we need to be able to serve complete static HTML pages from index.php.
 * Curation in this context I believe refers to simplistic talk page and page revision workflow interactivity from a noscript/JS-impaired context. Do I understand correctly? Also, I think I heard someone say that search ought to be supported (I think the basis of this is for hi-sec browser configurations where the wiki is accessed directly instead of through an external discovery (e.g., search engine) service). ABaso (WMF) (talk) 16:28, 28 February 2018 (UTC)


 * SOME-JS: We can require either server side JS support (NODE) or client side JS support. We do not have to provide full functionality in a situation where neither the server nor the client can execute JS.
 * NO-CLIENT-PARSE: We won't render actual Content (like wikitext) on the client. We want a single source of truth for rendering wikitext (and any other content type) as HTML.

Needs
Straw-man list of needed high level components to fulfill the above goals and desires under the given assumptions:
 * COMPOSE: Page composition at the network edge (see ESI), backed by an API for serving HTML fragments (see FRAGMENTS), needed to satisfy NO-JS while also supporting JS-UI.
 * This should be used at least to combine the parts of the skin with the page content.
 * Usage of this mechanism is optional (to satisfy LAMP). The index.php entry point still needs to deliver a fully composed page by default, so MW is usable without an ESI capable caching layer.
 * This requires a dependency tracking and purging engine (see PURGE).
 * It would perhaps be useful to support additional massaging/hydration, beyond what ESI supports. This would allow us to do localization here, as well as adapt for client devices.
 * ESI requires that the caching layer be able to predict what kind of content it will be getting by looking at the request. This means e.g. that Special pages that serve non-HTML output would not be possible or would have to be specially registered.
 * FRAGMENTS: Fragments of HTML (page content, skin parts, special page forms, etc) can be requested from a dedicated endpoint (may be part of or separate from the REST API). Needed by COMPOSE, which is needed by NO-JS. Things that expose HTML fragments are mainly: the skin, ContentHandlers, input forms for Special pages and action handlers, results (listings) for Special pages and action handlers, dynamic content for some page types (file pages, category pages).
 * Must at least serve all bits needed for the skin and rendered page content (including special pages, action handlers, etc)
 * Could also serve bits of composite page content (e.g. infoboxes)
 * Could also be used as a template rendering API, if it accepts complex data structures as input
 * TEMPLATES: ONE-UI requires the ability to render declarative templates on the client as well as on the server, in PHP and JS. This probably means the template engine has to be implemented in PHP and in JS (implied by NO-V8 and REST and STATIC).
 * Needs L10N-JS for templates to be fully data driven.
 * Alternatively, templates could use pre-formatted data: this would probably require an API mode that returns neither abstract JSON nor rendered HTML, but some kind of intermediate form (which may be annotated HTML). This could be interpreted as being a "view-model" in the sense of an MVVM architecture.
 * L10N-X2: To satisfy L10N-JS and TEMPLATES, we probably need a full JS port of the relevant l10n formatting code. NO-V8 implies that we'd otherwise need to call out to a PHP based L10N API. This is probably too slow, but should still be investigated.
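
To give a feel for the scope of L10N-X2, the sketch below shows the two smallest pieces a JS port would need: positional parameter substitution and plural selection. It is a toy under stated assumptions. The message shape (`PluralMessage`, forms keyed by plural category) is invented for the example and is not MediaWiki's message syntax, and the English plural rule is hard-coded. In runtimes that support it, `Intl.PluralRules` could supply the rule per language instead.

```typescript
// Maps a number to a plural category name such as "one" or "other".
type PluralRule = (n: number) => string;

// Hypothetical message shape: one template per plural category.
// "other" acts as the fallback when a category has no template.
interface PluralMessage {
  forms: Record<string, string>;
}

// English has two categories: "one" for exactly 1, "other" for the rest.
const englishPlural: PluralRule = (n) => (n === 1 ? "one" : "other");

// Replace $1, $2, ... with the matching parameter. Placeholders without
// a matching parameter are left untouched.
function substitute(template: string, params: Array<string | number>): string {
  return template.replace(/\$(\d+)/g, (match: string, index: string) => {
    const value = params[Number(index) - 1];
    return value === undefined ? match : String(value);
  });
}

// Pick the template for the count's plural category, then fill in parameters.
function formatPlural(
  msg: PluralMessage,
  count: number,
  rule: PluralRule,
  params: Array<string | number>
): string {
  const template = msg.forms[rule(count)] ?? msg.forms["other"] ?? "";
  return substitute(template, params);
}
```

Even this toy version shows why the port is non-trivial: the real mechanism also covers grammar, gender, number formatting and language fallback, and the PHP and JS implementations would have to agree on all of it for ONE-UI to hold.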


 * PURGE: Unified dependency tracking and purging mechanism to enable FRAGMENTS and REST.
 * Using Kafka as a bus and some graph database for storing the dependency graph.
 * What about LAMP? Are the event-driven characteristics fulfilled in basic LAMP? My sense is the cross-project and deeply nested template needs of the Wikimedia projects are different than basic LAMP. But can dependency tracking and purging be supported only in the sophisticated installation without requiring it in LAMP or making the stock PHP MediaWiki installation dramatically more complex? ABaso (WMF) (talk) 16:28, 28 February 2018 (UTC)


 * Used to re-generate HTML snippets, JSON data, and other derived artifacts.
 * This makes it easy to introduce new kinds of artifacts or change granularity, without having to implement a tracking and updating mechanism for each use case.
 * MVC: JS framework that maps between a data model and the DOM, and manages API calls to the backend (MVC/MVVM). Needed for JS-UI, and should be backed by REST.
 * This kind of framework is designed for a fully data driven environment, with all rendering done in JS. However, we will still have HTML snippets coming from the backend, at the very least for rendered page content. These snippets need to be integrated into the DOM, and they may need massaging/hydration.
 * This requires TEMPLATES and L10N-JS.
 * NO-DATA and NO-CLIENT-PARSE mean this needs FRAGMENTS.
 * ROUTE: A common REST API interface for functionality implemented in PHP, JS, or whatever other language.
 * Clients should not be aware of where and how an API is implemented (as per REST)
 * Routing should be possible in the CDN layer / load balancer.
 * Routing should also be possible inside MW core, so it is available without a CDN layer.
 * The new API should map to the existing action API for most if not all cases, so we don't have to re-implement all API functionality.
 * ASSET: An endpoint for asset delivery (next generation ResourceLoader) is needed for JS-UI / MVC.
 * For JS and CSS resources, associated icons
 * For localization resources (message bundles)
 * Should make aggressive use of caching on all levels.
 * Not for embedded media (probably); that should have its own endpoint (httpd, nginx, swift...)
 * I'd add here that asset compilation would be an important step to modernize at the same point. ABaso (WMF) (talk) 16:28, 28 February 2018 (UTC)
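
As a rough illustration of the ROUTE item above, the same routing table could be evaluated either in the CDN layer or inside MW core, so that clients never see which backend serves a path. The sketch uses longest-prefix matching. The `Route` shape, the path prefixes and the backend names are all made up for the example.

```typescript
// One routing rule: requests whose path starts with `prefix` go to `backend`.
// Backend names are hypothetical labels, not real service names.
interface Route {
  prefix: string;
  backend: string;
}

// Return the backend of the longest matching prefix, or undefined if no
// rule matches. Longest-prefix matching lets a specific rule override a
// general one without depending on the order of the table.
function resolveBackend(routes: Route[], path: string): string | undefined {
  let best: Route | undefined;
  for (const route of routes) {
    const matches = path.startsWith(route.prefix);
    if (matches && (best === undefined || route.prefix.length > best.prefix.length)) {
      best = route;
    }
  }
  return best?.backend;
}
```

With a table such as `[{ prefix: "/rest/", backend: "php-core" }, { prefix: "/rest/v1/math/", backend: "node-service" }]`, a request for `/rest/v1/math/render` would go to the JS service, while every other `/rest/` path would stay in PHP. Moving an endpoint between implementations then only changes the table, not the clients.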