Requests for comment/Text extraction

Wikimedia sites currently provide the API module action=query&prop=extracts, which can be used to obtain a text-only extract of page content.
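For illustration, a request to this module could be built as in the sketch below. The helper name build_extract_url and the choice of the English Wikipedia endpoint are assumptions made for this example; explaintext and exintro are optional parameters of prop=extracts that request plain text and limit the extract to the lead section.

```python
from urllib.parse import urlencode

# Assumed endpoint for the example; any wiki exposing prop=extracts would do.
DEFAULT_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_extract_url(titles, endpoint=DEFAULT_ENDPOINT):
    """Build a prop=extracts query URL for one or more page titles.

    Hypothetical helper: it only assembles the URL and performs no request.
    """
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,   # plain text instead of limited HTML
        "exintro": 1,       # only the content before the first section
        "titles": "|".join(titles),  # multiple titles are pipe-separated
        "format": "json",
    }
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_extract_url(["Earth", "Moon"]))
```

A query over several titles, as above, is exactly the case where every page in the batch may need its own extract.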

Extract storage
Currently, extracts are generated on demand and cached in memcached. This results in bad worst-case behaviour when many extracts are needed at once, for example for queries over several pages or for action=opensearch, which returns 10 results by default. Text extraction involves DOM manipulation and text processing (tens of milliseconds) and, in case of a cache miss, potentially a wikitext parse (which can easily take seconds or even tens of seconds). Such timing is far from optimal, so I propose to extract text during LinksUpdate and store it in page_props. This will allow efficient batch retrieval and 100% availability.
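The idea can be sketched roughly as follows. This is a toy Python model, not MediaWiki code: ExtractStore, on_links_update and get_batch are hypothetical names, a dict stands in for page_props, and the tag-stripping extract_text is a simplified stand-in for the real extraction logic. The point is only that the expensive work happens once at update time, so reads become plain batch lookups.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML fragment, ignoring all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def extract_text(html):
    """Simplified stand-in for text extraction: strip tags, collapse whitespace."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join("".join(collector.parts).split())


class ExtractStore:
    """Toy model of storing precomputed extracts (a dict stands in for page_props)."""

    def __init__(self):
        self._extracts = {}

    def on_links_update(self, page_id, parsed_html):
        # Runs when the page is saved or re-parsed: the extraction cost is
        # paid here, once, instead of on the read path.
        self._extracts[page_id] = extract_text(parsed_html)

    def get_batch(self, page_ids):
        # Read path: no parsing and no DOM work, only lookups.
        # Pages without a stored extract are omitted from the result.
        return {pid: self._extracts[pid] for pid in page_ids if pid in self._extracts}


if __name__ == "__main__":
    store = ExtractStore()
    store.on_links_update(1, "<p>Hello <b>world</b>.</p>")
    store.on_links_update(2, "<p>Second   page</p>")
    print(store.get_batch([1, 2, 3]))
```

With this arrangement a request for 10 results costs 10 lookups regardless of cache state, whereas the on-demand approach could trigger up to 10 parses in the worst case.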