Extension:TextExtracts

The TextExtracts extension provides an API that returns plain-text or limited-HTML extracts of page content.

Configuration settings

 * is an array of selectors, written as tag, tag.class, .class, or #id; matching elements are excluded from extraction.
 * For example,  removes indented text, often used for non-templated hatnotes that are not desired in summaries.
 * TextExtracts.php defines the defaults, one of which is the class "noexcerpt"; this class may be added to any template to exclude it from extracts.
 * defines whether TextExtracts should provide its extracts to Extension:OpenSearchXml. The default is "false".

API
This extension's query module, prop=extracts, returns article extracts (truncated article text). Two formats are available: cleaned-up HTML and plain text.

Parameters:
 * exchars: Length of extracts in characters.
 * exsentences: Number of sentences to return.
 * exintro: Return only the zeroth section (the content before the first section heading).
 * exsectionformat: Section heading format to use for plain-text extracts:
 ** wiki: e.g., == Wikitext ==
 ** plain: no special decoration
 ** raw: this extension's internal representation: <ASCII 2> <ASCII 2><ASCII 1>.


 * exlimit: Maximum number of extracts to return. Because extract generation can be slow, the limit is capped at 20 for intro-only extracts and 1 for whole-page extracts.
 * explaintext: Return plain-text extracts.
 * excontinue: When more results are available, use this parameter to continue.
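
The parameters above can be combined into a request URL. The following is a minimal Python sketch, not part of the extension: the helper name build_extracts_url and its keyword arguments are hypothetical, the endpoint is the one used in the example on this page, and the exlimit caps are applied client-side only to mirror the limits described above. It assumes the usual Action API convention that a boolean parameter such as exintro or explaintext is enabled simply by being present in the request.

```python
from urllib.parse import urlencode

# Assumed endpoint, taken from the example on this page.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"

# Caps described in the parameter list (exlimit).
MAX_LIMIT_INTRO = 20
MAX_LIMIT_WHOLE_PAGE = 1

SECTION_FORMATS = ("wiki", "plain", "raw")


def build_extracts_url(titles, chars=None, sentences=None, intro=False,
                       plaintext=False, section_format=None, limit=None,
                       endpoint=API_ENDPOINT):
    """Build a prop=extracts query URL (hypothetical helper)."""
    params = {
        "action": "query",
        "prop": "extracts",
        "format": "json",
        # Multiple titles are separated with a pipe character.
        "titles": "|".join(titles),
    }
    if chars is not None:
        params["exchars"] = chars
    if sentences is not None:
        params["exsentences"] = sentences
    # Boolean parameters are assumed to be switched on by presence alone.
    if intro:
        params["exintro"] = 1
    if plaintext:
        params["explaintext"] = 1
    if section_format is not None:
        if section_format not in SECTION_FORMATS:
            raise ValueError("unknown exsectionformat: %r" % section_format)
        params["exsectionformat"] = section_format
    if limit is not None:
        # Clamp locally to the documented cap for the kind of extract.
        cap = MAX_LIMIT_INTRO if intro else MAX_LIMIT_WHOLE_PAGE
        params["exlimit"] = min(limit, cap)
    return endpoint + "?" + urlencode(params)
```

For instance, build_extracts_url(["Earth"], chars=100) produces a URL with the same parameters as the example link on this page, with format=json instead of format=jsonfm.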

Example:

[//en.wikipedia.org/w/api.php?action=query&prop=extracts&exchars=100&titles=Earth&format=jsonfm api.php?action=query&prop=extracts&exchars=100&titles=Earth&format=json]
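
A response to a query like the one above can be reduced to a mapping from page title to extract. The sketch below is illustrative only: the function name extracts_by_title is hypothetical, and the sample response is hand-written to resemble the default format=json layout (pages keyed by page ID under query.pages, each carrying an extract field). The page ID and extract text in it are placeholders, not real API output.

```python
import json


def extracts_by_title(response_text):
    """Map page title -> extract from a prop=extracts JSON response.

    Assumes the default JSON layout, where pages are keyed by page ID
    under query.pages. Pages without an extract map to an empty string.
    """
    data = json.loads(response_text)
    pages = data.get("query", {}).get("pages", {})
    return {page["title"]: page.get("extract", "") for page in pages.values()}


# Hand-written sample with placeholder values, for illustration only.
SAMPLE_RESPONSE = """
{
  "query": {
    "pages": {
      "12345": {
        "pageid": 12345,
        "ns": 0,
        "title": "Earth",
        "extract": "Placeholder extract text..."
      }
    }
  }
}
"""

if __name__ == "__main__":
    for title, extract in extracts_by_title(SAMPLE_RESPONSE).items():
        print(title, "->", extract)
```

Using page.get("extract", "") keeps the helper from failing on pages for which no extract is present in the response, for example when exlimit truncates the result set and excontinue must be used to fetch the rest.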