Parsoid/Ops needs for July 2013

Overview
We are planning to make VisualEditor the default editor on all (or at least most) Wikipedias by July 1st, 2013. This will cause most edits to be made with VisualEditor, which will place a significant load on Parsoid. We need to make sure that Parsoid and the other affected subsystems can handle this load.

Impact
(For the current architecture of our Parsoid deployment, see Parsoid.)

Every time a user clicks the Edit button, VisualEditor loads the HTML for the page from Parsoid. This is a GET request going through the Varnish caches, so it may be served from cache. If and when the user actually saves their edit, VisualEditor POSTs the edited page to Parsoid and gets wikitext back, which it then saves to the database as a regular edit. These POST requests are obviously not cacheable. The load on the Parsoid backends should therefore be one POST request and one GET request per edit: there is one GET request per edit attempt, but these are served from Varnish and only invalidated when the page is actually edited.
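The caching behavior described above can be sketched as a toy model (illustrative names only; the real Varnish setup and invalidation are more involved):

```python
class ToyVarnish:
    """Toy model of the Varnish layer in front of the Parsoid backends."""

    def __init__(self):
        self.cache = {}          # title -> rendered HTML
        self.backend_gets = 0    # GET requests that actually reach a backend
        self.backend_posts = 0   # POST requests (never cacheable)

    def get_html(self, title):
        # Edit attempt: VisualEditor fetches the page HTML via GET.
        if title not in self.cache:
            self.backend_gets += 1                      # cache miss hits Parsoid
            self.cache[title] = "<html for %s>" % title
        return self.cache[title]

    def post_save(self, title, html):
        # Save: the POST goes straight to Parsoid, and the edit
        # invalidates the cached HTML for the page.
        self.backend_posts += 1
        self.cache.pop(title, None)
        return "wikitext for %s" % title


v = ToyVarnish()
for _ in range(5):               # five edit attempts, only the last one saved
    v.get_html("Barack Obama")
v.post_save("Barack Obama", "...")
print(v.backend_gets, v.backend_posts)   # -> 1 1
```

However many edit attempts there are, the backends see one GET and one POST per completed edit, matching the load estimate above.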

Each such request may take a lot of time and resources on the Parsoid backends. Each request also causes Parsoid to request the wikitext source of the page from api.php, and may trigger additional API requests for templates, extensions or images on the page that cannot be reused from the previous revision. In general, however, we can reuse most template expansions from the previous version's cached DOM.
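The reuse described above can be sketched as a lookup keyed on the transclusion and its arguments (hypothetical names; Parsoid's real DOM-based reuse logic is more sophisticated):

```python
def expand_templates(transclusions, prev_expansions, expand_via_api):
    """Reuse expansions from the previous revision's DOM where the
    transclusion (template name + arguments) is unchanged; otherwise
    fall back to an API round trip."""
    result, api_calls = {}, 0
    for key in transclusions:            # key = (template_name, frozen_args)
        if key in prev_expansions:
            result[key] = prev_expansions[key]   # reuse, no API request
        else:
            result[key] = expand_via_api(key)
            api_calls += 1
    return result, api_calls


# One transclusion unchanged since the previous revision, one new:
prev = {("cite", ("url=a",)): "<span>a</span>"}
new_page = [("cite", ("url=a",)), ("cite", ("url=b",))]
expanded, calls = expand_templates(new_page, prev,
                                   lambda k: "<span>%s</span>" % (k,))
print(calls)   # -> 1: only the changed transclusion causes an API request
```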

Updates after template edits are more problematic. During the recent Wikidata-related bot runs, up to 9 million pages were scheduled for re-rendering per day. Even at a more conservative 5 million pages per day, this results in roughly 58 requests per second averaged over the day. Without more precise dependency information per template-generated fragment, all templates need to be re-expanded in these requests, which means that linksUpdate jobs will generate the bulk of our API requests. We might have to limit the rate of re-renders by delaying them for a day or longer, so that several re-renders of the same page are collapsed into one. See Parsoid/Minimal performance strategy for July release for the detailed plan for the July release.
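The load arithmetic above can be checked directly (a back-of-the-envelope sketch; the collapse factor at the end is a hypothetical illustration, not a measured number):

```python
SECONDS_PER_DAY = 24 * 60 * 60             # 86,400

peak_pages = 9_000_000                     # observed Wikidata bot-run peak
conservative_pages = 5_000_000

print(peak_pages / SECONDS_PER_DAY)          # ~104 re-renders/s at the observed peak
print(conservative_pages / SECONDS_PER_DAY)  # ~57.9 re-renders/s averaged over the day

# Delaying re-renders lets multiple updates to the same page collapse into one.
# If, hypothetically, delaying eliminated half of the scheduled re-renders as
# duplicates, the effective rate would drop accordingly:
collapse_factor = 0.5
print(conservative_pages * (1 - collapse_factor) / SECONDS_PER_DAY)  # ~29/s
```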

Benchmarks
Based on his current numbers, Gabriel believes that two Parsoid backends can sustain the throughput of an average day's edits (estimated at 140k edits; TODO: verify), but we would need more than that to sustain peak rates. We do not have any data on actual peak edit rates; for serious planning, the peak edit rate over a 1- or 5-minute window needs to be determined. A very rough first estimate without peak edit rate data is 10 backends (servers identical to wtp1004).

The last measurement for en:Barack Obama on a single backend yielded 0.4 requests per second at a concurrency of 10 (26 seconds per request). For planning purposes, we assume that all parsed pages are as complex as Obama.
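Combining the per-backend throughput with the estimated daily edit volume gives a rough backend count (a sketch under the all-Obama worst-case assumption; the peak-to-average factor is a hypothetical placeholder until real peak data exists, and POST serialization load would come on top):

```python
SECONDS_PER_DAY = 86_400

edits_per_day = 140_000                # estimated average day (TODO above: verify)
per_backend_throughput = 0.4           # Obama-complexity req/s per backend

avg_rate = edits_per_day / SECONDS_PER_DAY          # average parse rate, req/s
backends_for_avg = avg_rate / per_backend_throughput
print(round(avg_rate, 2), round(backends_for_avg, 1))   # -> 1.62 4.1

# ~4 backends cover the worst-case average (the two-backend figure above
# presumably reflects real, mostly simpler pages). Peaks are unknown; a
# hypothetical 2.5x peak-to-average factor would roughly match the
# "10 backends" first estimate:
peak_factor = 2.5
print(round(backends_for_avg * peak_factor, 1))         # -> 10.1
```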

We have not yet collected very precise data on the API request volume. What we have so far:
 * complex pages can result in hundreds of API requests (one for the page source, plus requests for templates and extensions)
 * the Parsoid round-trip setup already performs the API requests corresponding to about 70k pages in 24 hours, and has not caused issues in the API cluster
 * the throughput for all wikis is probably roughly 2-3x this rate
 * the more relevant peak API request rate is a function of the peak edit rate, which is to be determined