Extension:TextExtracts/oc

The TextExtracts extension provides an API that retrieves plain-text or limited-HTML extracts of page content.

Configuration settings

 * One setting is an array of selectors, written as tag, tag.class, .class, or #id; matching elements are excluded from extraction.
 * For example, one such entry removes indented text, which is often used for non-templated hatnotes that are not wanted in summaries.
 * TextExtracts.php defines the defaults, one of which is the class "noexcerpt"; add this class to any template to exclude it from extracts.
 * Another setting defines whether TextExtracts should provide its extracts to the OpenSearchXml extension. The default is "false".

API
This extension's query module, prop=extracts, returns article extracts (truncated article text). Two formats are available: cleaned-up HTML and plain text.

Parameters:


 * exchars: Length of extracts in characters.
 * exsentences: Number of sentences to return.
 * exintro: Return only the zeroth section.
 * exsectionformat: Section heading format to use for plain-text extracts:
   * wiki: e.g., == Wikitext ==
   * plain: no special decoration
   * raw: this extension's internal representation, <ASCII 2> <ASCII 2><ASCII 1>.
 * exlimit: Maximum number of extracts to return. Because excerpt generation can be slow, the limit is capped at 20 for intro-only extracts and 1 for whole-page extracts.
 * explaintext: Return plain-text extracts.
 * excontinue: When more results are available, use this parameter to continue.
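
The parameters above can be combined into a single request. The following Python sketch assembles them into a dictionary and clamps exlimit to the caps quoted in the list (20 for intro-only extracts, 1 for whole-page extracts). The helper name extract_params and its arguments are hypothetical, and passing "1" for the flag parameters exintro and explaintext is an assumption about how the flags are switched on.

```python
def extract_params(titles, intro_only=True, plaintext=True, limit=20):
    """Assemble query parameters for prop=extracts (hypothetical helper)."""
    # Caps as stated in the parameter list: 20 for intro-only, 1 for whole-page.
    cap = 20 if intro_only else 1
    params = {
        "action": "query",
        "prop": "extracts",
        "format": "json",
        # Several titles are joined with "|" in one titles value.
        "titles": "|".join(titles),
        "exlimit": str(max(1, min(limit, cap))),
    }
    # Assumption: a flag parameter is enabled by sending it with any value.
    if intro_only:
        params["exintro"] = "1"
    if plaintext:
        params["explaintext"] = "1"
    return params
```

Clamping on the client side only avoids asking for more than the documented cap; the server still applies its own limit, and excontinue is needed to fetch the remaining extracts.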

Example: api.php?action=query&prop=extracts&exchars=100&titles=Earth&format=json
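
As a rough illustration, the example request can be built and its response read in Python. The base URL https://example.org/w/api.php, both function names, and the sample response are placeholders; the sketch assumes the usual JSON layout of a query response, with pages keyed by page ID under query.pages.

```python
from urllib.parse import urlencode


def build_extract_url(api_base, title, exchars=100):
    """Build the example request URL (api_base is a placeholder endpoint)."""
    query = urlencode({
        "action": "query",
        "prop": "extracts",
        "exchars": exchars,
        "titles": title,
        "format": "json",
    })
    return f"{api_base}?{query}"


def extracts_by_title(response):
    """Map page titles to extracts, assuming pages sit under query.pages."""
    pages = response.get("query", {}).get("pages", {})
    return {page["title"]: page.get("extract", "") for page in pages.values()}


url = build_extract_url("https://example.org/w/api.php", "Earth")

# Made-up response used only to show the assumed shape of the JSON.
sample = {"query": {"pages": {"1": {"pageid": 1, "title": "Earth",
                                    "extract": "placeholder extract text"}}}}
extracts = extracts_by_title(sample)
```

To run this against a real wiki, the URL could be fetched with urllib.request.urlopen and decoded with json.load before being passed to extracts_by_title.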