Extension:Quotation

Extension:Quote is a small extension that checks whether a quoted string appears on the referenced page. Its purpose is to make it simple to verify that a quote is actually in the page and, if it is not, to mark the quotes that do not check out.

Usage
There is both a tag function and a parser function, and the two behave similarly. <quote src="url">some text thats quoted</quote> "some text thats quoted"

Both forms give the same text wrapped in a cite element, in one of three forms:

 * <cite class="valid">some text thats quoted</cite>
 * <cite class="invalid">some text thats quoted</cite>
 * <cite>some text thats quoted</cite>

The first form has passed validation, the second has not, and the third could not complete validation for some reason.
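The mapping from validation outcome to markup can be sketched as follows. This is a minimal illustration in Python, not the extension's own code; the function name `render_cite` and the use of `None` for "validation could not complete" are assumptions made for the example.

```python
from html import escape


def render_cite(text, status=None):
    """Wrap quoted text in a cite element (hypothetical helper).

    status is "valid", "invalid", or None when validation
    could not be completed.
    """
    body = escape(text)  # avoid injecting markup through the quote itself
    if status in ("valid", "invalid"):
        return f'<cite class="{status}">{body}</cite>'
    # Unknown outcome: emit the element without a class.
    return f"<cite>{body}</cite>"
```

Calling `render_cite("some text thats quoted", "valid")` yields the first of the forms listed above, and omitting the status yields the third.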

Algorithm
Before the equivalence check, the quoted (marked) text is transformed into a single string: inserted bracketed text is removed, whitespace is squashed, and characters are converted to Unicode Normalization Form C (NFC). The src is then used to download the source text. Elements that could otherwise cause problems (like <script>) are stripped from the source together with their content, all remaining tags are stripped from what is left, whitespace is squashed, and characters are converted to NFC. If the processed quote is contained within the processed source text, the quote is marked as valid; if not, it is marked as invalid. This is done by setting a class on the containing cite element. If there is no src, the quote is neither valid nor invalid. Marking the quote as verified from a live site will also lead to the quote being marked as alive.
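The normalization and containment check can be sketched in Python as below. This is an illustration under stated assumptions, not the extension's implementation: the function names, the set of skipped elements (`script` and `style`), and the choice of square brackets for inserted text are all assumptions, and the download step is left out so that the source HTML is passed in directly.

```python
import re
import unicodedata
from html.parser import HTMLParser

# Elements dropped together with their content (assumed set).
SKIPPED_ELEMENTS = {"script", "style"}


class _TextExtractor(HTMLParser):
    """Collect text nodes, ignoring everything inside skipped elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_ELEMENTS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_ELEMENTS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def normalize(text):
    """Convert to NFC and squash runs of whitespace to single spaces."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def prepare_quote(quote):
    """Remove inserted bracketed text such as "[sic]", then normalize."""
    return normalize(re.sub(r"\[[^\]]*\]", " ", quote))


def prepare_source(html):
    """Strip skipped elements and all remaining tags, then normalize."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return normalize("".join(parser.chunks))


def check_quote(quote, source_html):
    """Return "valid", "invalid", or None when there is no source."""
    if source_html is None:
        return None
    if prepare_quote(quote) in prepare_source(source_html):
        return "valid"
    return "invalid"
```

Text nodes are joined without a separator so that inline markup inside a word does not introduce a space; a real implementation might instead insert whitespace at block-level boundaries.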

The result of the processing will be stored in memcached for later reuse, with a timeout long enough to cover continuous editing, and logged for later reference. When the page is cached, the result of the processing is cached with it, so no further processing takes place until the page is next rebuilt. When the page is rebuilt, the result is stored as a page property and becomes the default state for the next run. Normally a cached page can last from days to weeks or even months. If the site or page goes away, the logged information can still be used, but the quote is then marked as archived, according to its last known state.
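The reuse and fallback behaviour described above can be sketched as follows. This is a rough Python illustration only: an in-memory dictionary with a timeout stands in for memcached, a plain dict stands in for the log, and the names `ResultCache` and `resolve_status` are hypothetical. The `check` callable is assumed to return "valid", "invalid", or `None` when the source cannot be reached.

```python
import time


class ResultCache:
    """In-memory stand-in for memcached with a fixed timeout (assumption)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: behave as a cache miss
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)


def resolve_status(key, cache, log, check):
    """Return (status, archived) for one quote.

    A cached result is reused while it is fresh. Otherwise the quote
    is rechecked; if the source is gone, the last logged state is
    returned and flagged as archived.
    """
    cached = cache.get(key)
    if cached is not None:
        return cached
    status = check()
    if status is None:
        last_known = log.get(key)
        if last_known is None:
            return (None, False)  # never verified: neither valid nor invalid
        return (last_known, True)
    result = (status, False)
    log[key] = status  # keep the last known state for later reference
    cache.set(key, result)
    return result
```

The clock is injectable so the timeout can be exercised without waiting; a real deployment would rely on memcached's own expiry instead.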

A special page will be available for looking up the stored page properties, making it possible to find pages with invalid or outdated quotes.

The logged information could also include further details, such as the quote itself together with its surrounding context.

Stuff

 * Manual:Job queue/For developers
 * Manual:Purge
 * memcached