Old comments

Hello! Unsure if this is the right place to ask - but hopefully Flow will have good APIs from the start baked in, so that they can be used by gadgets / tools / bots in awesome ways that Talk pages can never be? I assume we'll have OAuth by then, so integration with that would make a lot of the 'spam' / 'permission' issues go away, and will let us truly build awesome tools. So question is, 'are there APIs planned for everything from the beginning, rather than being something tacked on as an afterthought at the end'? Thanks! :) Yuvipanda (talk) 21:39, 17 July 2013 (UTC)

Full support for interacting with Flow via the API is absolutely necessary; there should be no excuse for making it so that bots cannot interact with Flow discussions in all ways that users can via the web UI. And yes, I mean full support, not the "limited fallback support for people who can't use VisualEditor" that I've heard mentioned a few times. If that means the bot has to work with VisualEditor's flavor of HTML, so be it, but then make sure that's well documented (if it isn't already) and clearly linked from the appropriate pages. BJorsch (WMF) (talk) 14:08, 19 July 2013 (UTC)
@BJorsch (WMF): No, this is not sufficient, and I would put it stronger: Flow's human-facing UI must only use the APIs it creates. There should be no magic hooks that aren't exposed as API calls. If we can't be honest and architecturally-correct in developing our software, we have failed. Jdforrester (WMF) (talk) 17:28, 19 July 2013 (UTC)
I'm not sure what you're saying isn't sufficient. But I'll point out that Flow's human-facing UI can't only "use the APIs it creates", because using the MediaWiki API from a web UI requires JavaScript and AJAX and I hope you aren't planning on leaving users without JavaScript out of Flow entirely. For example, I'd guess that viewing a Flow page would serve HTML containing the discussions and some subset of "action" links/buttons to provide reasonable functionality to non-JS clients, rather than serving a stub page without any content except some JS that queries the API to get all the discussions. On the other hand, it would be possible for Flow to add "APIs" that are so special-purposed to the web UI that any other client would be effectively "screen"-scraping it. That wouldn't be good enough either, were such a thing to be done. BJorsch (WMF) (talk) 15:20, 22 July 2013 (UTC)
@BJorsch (WMF): I'm confused. How does writing our code to be architecturally-isolated, like all code should be, require AJAX? I've written dozens of MVC-pattern systems in non-Web technologies; am I missing something? Also, you seem to think that I'm involved in Flow, which isn't true. :-) Jdforrester (WMF) (talk) 15:28, 22 July 2013 (UTC)
Sanity check: I'm talking about mw:API here. Are you talking about a different sort of API? BJorsch (WMF) (talk) 12:24, 23 July 2013 (UTC)
@BJorsch (WMF): No. :-) If I'd meant "the Web API", I'd have said as such. Jdforrester (WMF) (talk) 15:18, 23 July 2013 (UTC)

2013-03-04 better Flow API

There are various benefits we get from a REST API, the biggest being that it is much easier to manage cache control headers on individual resources. The current API responses join together disparate bits of data into a response that we cannot reasonably tell Varnish to cache; invalidation is just too difficult. A RESTful resource-based API breaks things up into smaller pieces and multiple requests, but modern protocols like SPDY and the coming HTTP/2.0 specifically handle this concept of making multiple smaller requests rather than a single large request. The reason for nodejs is partially tied to those smaller requests: MediaWiki has a fairly large startup time. If we were to issue one request that returns a list of topics on a page and then 5 requests, one for each of the new topics, then MediaWiki would potentially pay the startup cost 6 times, whereas nodejs has almost no request initialization delay.
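
As a rough illustration of the caching point above, here is a minimal Python sketch. The URL shapes (`/board/...`, `/topic/...`) and the `ToyCache` class are hypothetical stand-ins, not Flow's or Varnish's actual interfaces. It only shows that when each topic has its own URL, an edit to one topic leaves the other cached entries valid, whereas a combined response has to be purged on any change.

```python
# Hypothetical sketch: URL shapes and class names are illustrative only.

class ToyCache:
    """Toy stand-in for an HTTP cache (e.g. Varnish), keyed by URL."""

    def __init__(self):
        self.entries = {}

    def put(self, url, body):
        self.entries[url] = body

    def purge(self, url):
        self.entries.pop(url, None)


def fill_combined(cache, board, topics):
    # One large response that embeds every topic on the board.
    cache.put(f"/board/{board}", {"topics": dict(topics)})


def fill_per_resource(cache, board, topics):
    # One small response listing topic ids, plus one response per topic.
    cache.put(f"/board/{board}/topics", sorted(topics))
    for topic_id, body in topics.items():
        cache.put(f"/topic/{topic_id}", body)


def edit_topic_combined(cache, board, topic_id):
    # Any topic edit makes the whole combined response stale.
    cache.purge(f"/board/{board}")


def edit_topic_per_resource(cache, board, topic_id):
    # Only the edited topic's entry is stale; the list of ids is unchanged.
    cache.purge(f"/topic/{topic_id}")


topics = {f"t{i}": f"content {i}" for i in range(1, 6)}

combined = ToyCache()
fill_combined(combined, "Sandbox", topics)
edit_topic_combined(combined, "Sandbox", "t3")

split = ToyCache()
fill_per_resource(split, "Sandbox", topics)
edit_topic_per_resource(split, "Sandbox", "t3")

print(len(combined.entries), "cached entries left (combined)")
print(len(split.entries), "cached entries left (per resource)")
```

In this toy model the combined layout loses its only cached entry after a single edit, while the per-resource layout keeps everything except the edited topic.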

The other half of this is related to the SOA RFC (approved with overwhelming support at the architecture summit). ... Part of the implication of a service-oriented approach is that a service returns data suitable for a computer to interact with, rather than HTML which is more suitable for human interaction. It's unwritten but expected that a service should return structured metadata rather than user-facing HTML. Any kind of user interface, be it in a native mobile app, or the MediaWiki extension rendering a page, is built on top of the API responses.
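
To make the "structured data first, UI on top" idea concrete, here is a small Python sketch. The field names (`id`, `author`, `content`) and the function names are assumptions made up for illustration; they are not Flow's schema or API. The service function returns plain data, and two different clients build their own presentation from it.

```python
# Hypothetical sketch: the field names are illustrative, not Flow's schema.
from html import escape


def get_post(post_id):
    # "Service" layer: returns structured data, no markup.
    return {"id": post_id, "author": "Example", "content": "Hello <world>"}


def render_html(post):
    # One possible client: a web UI that turns the data into HTML.
    return (
        f'<div class="post" id="post-{escape(post["id"])}">'
        f'<b>{escape(post["author"])}</b>: {escape(post["content"])}</div>'
    )


def render_text(post):
    # Another client (e.g. a bot or native app) using the same data.
    return f'{post["author"]}: {post["content"]}'


post = get_post("abc123")
print(render_html(post))
print(render_text(post))
```

Because the HTML is produced by a client rather than by the service, a bot or mobile app never has to scrape markup to get at the post data.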

-- Erik Bernhardson e-mail

Does nodejs really have no startup time? Or is it that Parsoid etc were written to be continuously-running daemons while MediaWiki's PHP currently isn't, and similar results could have been realized by writing a service daemon in PHP while making better use of the existing codebase? Anomie (talk) 13:57, 7 March 2014 (UTC)
Actually nodejs has worse startup time than PHP :) That's part of why I called it request initialization for nodejs. It's specifically about the daemon nature, where much of the application can be loaded and booted ahead of time rather than per request. Part of the implication there is that the application needs to be mostly stateless, such that multiple requests can run through the same library routines. Much of the work done passing IContextSource around rather than globals is probably a big help in this direction. I've heard of a couple of attempts to daemonize PHP, but not much concrete, so this is fairly unexplored territory. EBernhardson (WMF) (talk) 20:37, 7 March 2014 (UTC)
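
The difference between per-request startup and a long-running daemon can be sketched in a few lines of Python. Everything here is hypothetical: the `App` class and its boot counter simply stand in for "loading the application", and the six requests mirror the one-list-plus-five-topics example earlier in this thread.

```python
# Hypothetical sketch: the boot step stands in for loading an application.

class App:
    boots = 0  # counts how many times the application was initialized

    def __init__(self):
        App.boots += 1
        self.routes = {"topic": lambda topic_id: f"topic {topic_id}"}

    def handle(self, kind, ident):
        # Stateless: the result depends only on the request arguments.
        return self.routes[kind](ident)


def serve_per_request(requests):
    # Per-request model: boot the application for every request.
    return [App().handle(kind, ident) for kind, ident in requests]


def serve_as_daemon(requests):
    # Daemon model: boot once, then reuse the loaded application.
    app = App()
    return [app.handle(kind, ident) for kind, ident in requests]


requests = [("topic", n) for n in range(6)]

App.boots = 0
per_request_out = serve_per_request(requests)
per_request_boots = App.boots

App.boots = 0
daemon_out = serve_as_daemon(requests)
daemon_boots = App.boots

print(per_request_boots, daemon_boots)
```

Both models return the same responses; only the number of boots differs. Reusing one `App` instance is safe here only because `handle` keeps no per-request state, which is the statelessness requirement described above.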

2013-03-04 conversation with Gabriel Wicke

This is follow-on from Flow providing a better API for mobile apps.

Parsoid HTML doesn't match PHP parser HTML

Image HTML is different (bug 61786), video HTML is different and links are broken (bug 61769), interlanguage links are losing the :colon prefix when saved (bug 61725), etc.

Regarding images, Parsoid HTML is intentionally cleaner and better. gwicke wants it to render like the PHP parser's output so that eventually Parsoid can show HTML output for visual fidelity comparisons. Flow should work with the VE team to get the CSS necessary for this moved out of VE.

BadImageList handling is integrated into the PHP wikitext parser. Parser.php's replaceInternalLinks2() calls wfIsBadImage() (bug 61772).

Parsoid and link metadata

Parsoid will soon provide redlink info and will output more metadata about links, categories, etc. This overlaps with the parsing of links and templates that Flow has to do for WhatLinksHere, Special:LinkSearch, and Category support; see Flow/Link table spec. Flow should take advantage of the information Parsoid can provide.

Parsoid will update this information in its cache as articles change, and maybe regenerate HTML, e.g. if an image changes size. Flow could take advantage of this for some content, but maybe not all – e.g. templates in Flow header should update, but not (?) templates in Flow posts.

Parsoid caches its HTML in Varnish; Rashomon will be a permanent store of it.

Leveraging Rashomon storage

Flow could store the HTML of its "items" in Rashomon instead of ExternalStore. A Flow board is the wrong level to cache at; we would store either topics or individual posts, which relates to whether the Flow API is at the level of operations on topics or at the post level. Leaning towards the former.

Rashomon knows about items via a page-like name, but a post reference like Talk:Sandbox?topic_postId=rqa8q1vz478mmrtu&workflow=rp7to9rygxadbohw doesn't qualify. The Flow team has talked about providing direct access to items at Special:Flow/post/<uuid>, Special:Flow/topic/<uuid>, Special:Flow/header/<uuid>, but the semantics of Special pages mean these would not be treated like pages without some work.
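
As a sketch of what a "page-like name" for a post reference might look like, the Python below rewrites the query-string form into the Special:Flow/... form mentioned above. This mapping is purely an assumption for illustration: it guesses that `topic_postId` identifies a single post and `workflow` identifies the topic, and it says nothing about how Rashomon or Flow would actually resolve such names.

```python
# Hypothetical sketch: this mapping is an assumption for illustration.
from urllib.parse import parse_qs


def page_like_name(reference):
    """Turn 'Title?topic_postId=...&workflow=...' into a path-style name."""
    _title, _, query = reference.partition("?")
    params = parse_qs(query)
    if "topic_postId" in params:
        # Assumption: topic_postId identifies a single post.
        return f"Special:Flow/post/{params['topic_postId'][0]}"
    if "workflow" in params:
        # Assumption: workflow identifies the topic.
        return f"Special:Flow/topic/{params['workflow'][0]}"
    raise ValueError("no Flow item id in reference")


ref = "Talk:Sandbox?topic_postId=rqa8q1vz478mmrtu&workflow=rp7to9rygxadbohw"
print(page_like_name(ref))
```

A path-style name carries the item type and UUID in the title itself, so it can be used as a storage key without parsing query parameters.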

Why not allow access to Flow topics in a dedicated namespace? That's what LiquidThreads does. The Flow team considered it early on, but hasn't pursued it.

  • Flow needs a templating solution soon for Shahyar's front-end rework, see Requests for comment/HTML templating library.

  • In order for Flow's API to return more sane information as well as, or instead of, full client HTML (with reply/edit/hide/permalink links, and textareas for new posts), we really want server-side templating as well.

An implementation of a subset of KnockoutJS is a strong contender, see Requests for comment/HTML templating library/Knockout - Tassembly. The subset of Knockout already works with Knockout.js on the client. Matt Walker is working on a PHP implementation.

Will this be ready in a month for us to use for Shahyar's Flow front-end rework?

There isn't really an alternative with both JS and PHP implementations that is likely to get WMF support.

gwicke suggests: rather than templating in PHP, try using data attributes (?) and have CSS pull out the information.

Later talk to mwalker

The PHP implementation of our KnockoutJS subset (knockoff? dropout? :) ) isn't ready, but he still thinks using KnockoutJS for Flow front-end templating would be the best choice.

Next steps

  • Flow API: talk again in a week
  • Shahyar may try KnockoutJS in Flow front-end rework.

Notes from recent meeting

Here are some raw notes from this week's meeting: etherpad:FlowArchitecture