Parsoid/Ops needs for July 2013

Overview
We are planning to make VisualEditor the default editor on all (or at least most) Wikipedias by July 1st, 2013. Most edits will then be made through VisualEditor, which will put significant load on Parsoid. We need to make sure that Parsoid and the other affected subsystems can handle this load.

Impact
(For the architecture of our Parsoid deployment, see Parsoid.)

Every time a user clicks the Edit button, VisualEditor loads the HTML for the page from Parsoid. This is a GET request going through the Varnish caches, so it may be served from cache. If and when the user actually saves their edit, VisualEditor POSTs the edited page to Parsoid and gets wikitext back, which it then saves to the database as a regular edit. These POST requests are obviously not cacheable. The load on the Parsoid backends should therefore be roughly one POST request and one GET request per edit: every edit attempt triggers a GET, but those are served from Varnish and only invalidated when the page is actually edited, so only about one GET per edit reaches the backends.
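A minimal sketch of the two request types from VisualEditor's perspective (the host and URL layout here are invented for illustration and do not match the deployed endpoints):

 import requests
 
 # Hypothetical Parsoid service address; the real deployment differs.
 PARSOID = "http://parsoid.svc.example.internal"
 
 def load_html(wiki, title):
     # Edit click: GET the rendered HTML. This goes through the Varnish
     # caches, so it may never reach a Parsoid backend at all.
     r = requests.get("%s/%s/%s" % (PARSOID, wiki, title))
     r.raise_for_status()
     return r.text
 
 def save_edit(wiki, title, edited_html):
     # Save: POST the edited HTML back and receive wikitext, which the
     # caller stores as a regular edit. POSTs always hit a backend.
     r = requests.post("%s/%s/%s" % (PARSOID, wiki, title),
                       data={"html": edited_html})
     r.raise_for_status()
     return r.text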

Each such request may take considerable time and resources on the Parsoid backends. Each request also causes Parsoid to request the wikitext source of the page from api.php, and may trigger additional API requests if the page uses templates. The Parsoid team is looking into storing the generated HTML for each revision to reduce the need for API fetches on POST requests. An increase in Parsoid request volume will therefore also cause an increase in API request volume.
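One way to picture the per-revision HTML store mentioned above; every name here is invented for illustration, and the actual design is still being worked out:

 html_store = {}  # (wiki, title, rev_id) -> previously generated HTML
 
 def parse_from_api(wiki, title, rev_id):
     # Stand-in for a full parse: fetch the wikitext (and any templates)
     # from api.php, then render HTML. Each call costs API requests.
     raise NotImplementedError
 
 def get_html(wiki, title, rev_id):
     key = (wiki, title, rev_id)
     if key not in html_store:
         html_store[key] = parse_from_api(wiki, title, rev_id)
     return html_store[key]  # a hit here avoids the api.php round trips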

Benchmarks
Based on his current numbers, Gabriel believes that two Parsoid backends can sustain the throughput of an average day's edits (estimated at 140k edits, TODO: verify), but we'd need more than that to sustain peak rates. We don't have any data on actual peak edit rates; for serious planning, the peak edit rate over a 1- or 5-minute window needs to be determined. A very rough first estimate without peak edit rate data would be 10 backends (servers identical to wtp1004).
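A minimal sketch of how that peak rate could be extracted once we have a log of edit timestamps (one Unix time per edit across all wikis; the input format is an assumption):

 from collections import Counter
 
 def peak_rate(edit_timestamps, window_seconds):
     # Maximum number of edits falling into any fixed window of the
     # given length, converted to edits per second.
     buckets = Counter(int(t) // window_seconds for t in edit_timestamps)
     return max(buckets.values()) / window_seconds
 
 # peak_rate(timestamps, 60) and peak_rate(timestamps, 300) give the
 # 1- and 5-minute figures; sliding windows would be slightly stricter.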

The last measurement for en:Barack Obama on a single backend yielded 0.4 requests/sec at a concurrency of 10, i.e. roughly 26 seconds per request. For planning purposes, we assume that all pages parsed are as complex as Obama.
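Combining this with the average edit volume above gives a back-of-the-envelope backend count. Treating every request as Obama-sized is deliberately pessimistic (which is why this lands well above the two-backend figure for an average mix), and the peak-to-average factor is a placeholder until real peak data exists:

 edits_per_day = 140000        # average day's edits (TODO above: verify)
 backend_throughput = 0.4      # req/s per backend, Obama-sized pages
 requests_per_edit = 2         # one GET + one POST reaching the backends
 
 avg_rate = edits_per_day * requests_per_edit / 86400.0  # ~3.2 req/s
 backends_avg = avg_rate / backend_throughput            # ~8.1 backends
 
 peak_to_avg = 1.25            # placeholder until peak rates are measured
 backends_peak = backends_avg * peak_to_avg              # ~10 backends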

We have not yet collected very precise data on the API request volume. What we have so far:
 * complex pages can result in hundreds of API requests (one for the page source, plus further requests for templates and extensions)
 * the Parsoid round-trip setup already performs the API requests corresponding to about 70k pages in 24 hours, and has not caused issues in the API cluster
 * the throughput for all wikis is probably roughly 2-3x this rate (a rough extrapolation follows this list)
 * the more relevant peak API request rate is a function of the peak edit rate, which is to be determined
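A rough extrapolation from the numbers above (the multipliers are the guesses stated in the list, not measurements):

 roundtrip_pages_per_day = 70000  # pages the round-trip setup parses in 24h
 all_wiki_factor = 2.5            # "roughly 2-3x this rate", taking the middle
 
 # Average sustained parse rate across all wikis, in pages per second:
 avg_pages_per_sec = roundtrip_pages_per_day * all_wiki_factor / 86400.0
 # ~2.0 pages/s, i.e. about 2.5x the API load the cluster already absorbs
 # without issues. The peak rate, and the per-page fan-out of up to
 # hundreds of API requests, still need to be measured.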