Extension:Proofread Page

Proofread Page

Release status: stable

Implementation: Page action, ContentHandler, Tag, API, Database
Description: The Proofread Page extension can render a book either as a column of OCR text beside a column of scanned images, or broken into its logical organization (such as chapters or poems) using transclusion.
Author(s): ThomasV (original author), Tpt (current maintainer)
Latest version: continuous updates
MediaWiki: 1.21+
PHP: 5.3+
Database changes: yes
License: GPL
Download
Example: s:Index:Wind in the Willows (1913).djvu
Namespace: Page, Index
Parameters: $wgProofreadPageNamespaceIds
Added rights: pagequality
Hooks used: SetupAfterCache, ParserFirstCallInit, BeforePageDisplay, GetLinkColours, ImageOpenShowImageInlineBefore, ArticleSaveComplete, ArticleDelete, ArticleUndelete, ArticlePurge, SpecialMovepageAfterMove, LoadExtensionSchemaUpdates, OutputPageParserOutput, wgQueryPages, GetPreferences, CustomEditor, CanonicalNamespaces, SkinTemplateNavigation, APIEditBeforeSave, UnitTestsList, InfoAction

The Proofread Page extension can render a book either as a column of OCR text beside a column of scanned images, or broken into its logical organization (such as chapters or poems) using transclusion.

The extension is intended to allow easy comparison of text to the original and allow rendering of a text in several ways without duplicating data. Since the pages are not in the main namespace, they are not included in the statistical count of text units.

The extension is installed on all Wikisource wikis. For the syntax, see oldwikisource:Wikisource:ProofreadPage.

Installation

Extension

  1. Install LabeledSectionTransclusion (not required but strongly recommended)
  2. Download the files from Git or download a snapshot (select your version of MediaWiki) and place the files under $IP/extensions/ProofreadPage. Warning: The current master branch of the Git repository is only compatible with MediaWiki 1.21 and above. To use ProofreadPage with MediaWiki 1.19 or 1.20, use the REL1_19 branch.
  3. Add to the end of LocalSettings.php:
    require_once( "$IP/extensions/ProofreadPage/ProofreadPage.php" );
  4. Add the required tables to the database; on the command line, enter:
    php maintenance/update.php
    (Note: Your designated database user needs to have CREATE rights on your MediaWiki database.)
  5. Installation can now be verified through Special:Version on your wiki

Thumbnailing

The extension links directly to image thumbnails which often don't exist. You must catch 404 errors and generate the missing thumbnails. You can do this with any one of these solutions:

  • Set an Apache RewriteRule in .htaccess to thumb.php for missing thumbnails:
        RewriteEngine On
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteRule ^/w/images/thumb/[0-9a-f]/[0-9a-f][0-9a-f]/([^/]+)/page([0-9]+)-?([0-9]+)px-.*$ /w/thumb.php?f=$1&p=$2&w=$3 [L,QSA]
    
  • or set the Apache 404 handler to Wikimedia's thumb-handler. This is a general-purpose 404 handler with Wikimedia-specific code, not simply a thumbnail generator.
        ErrorDocument 404 /w/extensions/upload-scripts/404.php
    
  • For MediaWiki >= 1.20, you can simply redirect to thumb_handler.php:
        RewriteEngine On
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteRule ^/w/images/thumb/[0-9a-f]/[0-9a-f][0-9a-f]/([^/]+)/page([0-9]+)-?([0-9]+)px-.*$ /w/thumb_handler.php [L,QSA]
    
  • or, in apache2.conf:
        ErrorDocument 404 /w/thumb_handler.php
    

WARNING: There is an .htaccess file in the images directory that may interfere with any .htaccess rules you install.
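To see what the first RewriteRule above extracts from a thumbnail URL, the same pattern can be tried out in Python. This is only an illustrative sketch: the example path and the `thumb_query` helper are made up for the demonstration, and Apache's own matching context (for instance how .htaccess treats the leading slash) is not modelled here.

```python
import re

# Same pattern as the RewriteRule above, written as a Python regex.
THUMB_RE = re.compile(
    r"^/w/images/thumb/[0-9a-f]/[0-9a-f][0-9a-f]/([^/]+)/page([0-9]+)-?([0-9]+)px-.*$"
)


def thumb_query(path):
    """Return the thumb.php URL the rule would rewrite to, or None if no match."""
    m = THUMB_RE.match(path)
    if m is None:
        return None
    name, page, width = m.groups()
    return "/w/thumb.php?f={}&p={}&w={}".format(name, page, width)


# Hypothetical thumbnail path: page 5 of a DjVu file, 800 pixels wide.
example = (
    "/w/images/thumb/a/ab/Alice_in_Wonderland.djvu/"
    "page5-800px-Alice_in_Wonderland.djvu.jpg"
)
print(thumb_query(example))
```

Paths that do not contain a `page<N>-<W>px-` component do not match the pattern, and `thumb_query` returns `None` for them.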

Namespaces

By default, ProofreadPage creates two custom namespaces, named "Page" and "Index" in English, with ids 250 and 252 respectively.

Their names are translated if your wiki uses another language. Full list.

You can customize their names or ids: create the namespaces by hand and set their ids in LocalSettings.php using the $wgProofreadPageNamespaceIds global, for example:

// Constant names must be quoted strings in define().
define( 'NS_PROOFREAD_PAGE', 250 );
define( 'NS_PROOFREAD_PAGE_TALK', 251 );
define( 'NS_PROOFREAD_INDEX', 252 );
define( 'NS_PROOFREAD_INDEX_TALK', 253 );
// Register the namespace names (use underscores instead of spaces).
$wgExtraNamespaces[NS_PROOFREAD_PAGE] = 'Page';
$wgExtraNamespaces[NS_PROOFREAD_PAGE_TALK] = 'Page_talk';
$wgExtraNamespaces[NS_PROOFREAD_INDEX] = 'Index';
$wgExtraNamespaces[NS_PROOFREAD_INDEX_TALK] = 'Index_talk';
// Tell ProofreadPage which namespace ids to use.
$wgProofreadPageNamespaceIds = array(
    'index' => NS_PROOFREAD_INDEX,
    'page' => NS_PROOFREAD_PAGE
);
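When choosing custom ids, it helps to remember the MediaWiki convention that a subject namespace has an even id and its talk namespace the next odd id. The following Python sketch (the `talk_id` helper is hypothetical, not part of the extension) checks a candidate set of ids against that convention before they go into LocalSettings.php.

```python
def talk_id(subject_id):
    """Return the talk namespace id paired with an even subject namespace id."""
    if subject_id % 2 != 0:
        raise ValueError("subject namespace ids must be even: %d" % subject_id)
    return subject_id + 1


# The default ids used by the extension.
NAMESPACE_IDS = {"page": 250, "index": 252}

for key, ns_id in sorted(NAMESPACE_IDS.items()):
    print(key, ns_id, talk_id(ns_id))
```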

Other useful extensions

ProofreadPage works considerably better alongside the following extensions:

The native DjVu handler must also be configured in order to use DjVu files.

Configuration

Configuration of index namespace

The configuration, stored in MediaWiki:Proofreadpage_index_data_config, is a JSON array of properties. The structure of one property is shown here; all parameters are optional, and the values shown are the defaults:

{
  "ID": { // ID of the metadata (first parameter of proofreadpage_index_attributes)
    "type": "string", // the property type; possible values: string, number, page (for compatibility reasons, stored values need not be of this type)
    "size": 1, // only for the string type: number of lines of the input field (third parameter of proofreadpage_index_attributes)
    "values": {"a": "A", "b": "B", "c": "C", "d": "D"}, // a map of value: label pairs listing the possible values (for compatibility reasons, stored values need not be one of these)
    "default": "", // the default value
    "header": false, // whether to add the property to the MediaWiki:Proofreadpage_header_template template
    "label": "ID", // the label shown in the form (second parameter of proofreadpage_index_attributes)
    "help": "", // a short help text
    "delimiter": [], // list of delimiters between the parts of a value, for example ["; ", " and "] for strings like "J. M. Dent; E. P. Dutton and A. D. Robert"
    "data": "" // the ProofreadPage metadata type that the property is equivalent to
  }
}
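The effect of the "delimiter" parameter can be illustrated with a few lines of Python. This is a sketch of the idea only, not the extension's actual implementation; `split_values` is a hypothetical helper.

```python
import re


def split_values(value, delimiters):
    """Split a stored value into its parts using any of the given delimiters."""
    if not delimiters:
        # An empty delimiter list (the default) leaves the value whole.
        return [value]
    pattern = "|".join(re.escape(d) for d in delimiters)
    return re.split(pattern, value)


print(split_values("J. M. Dent; E. P. Dutton and A. D. Robert", ["; ", " and "]))
# → ['J. M. Dent', 'E. P. Dutton', 'A. D. Robert']
```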

The data parameter accepts one of the following values: "type", "language", "title", "author", "translator", "illustrator", "editor", "school", "year", "publisher", "place" or "progress".
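A configuration can be sanity-checked before it is saved on the wiki. The Python sketch below checks only the "type" and "data" values listed above; the two properties in `CONFIG_JSON` are invented examples, and `check_config` is a hypothetical helper. Note that standard JSON has no comment syntax, so the sketch assumes the `//` comments in the listing above are annotations and uses a comment-free document.

```python
import json

ALLOWED_TYPES = {"string", "number", "page"}
ALLOWED_DATA = {
    "type", "language", "title", "author", "translator", "illustrator",
    "editor", "school", "year", "publisher", "place", "progress",
}

# Invented example configuration with two properties.
CONFIG_JSON = """
{
  "Author": {"type": "string", "size": 1, "label": "Author", "header": true, "data": "author"},
  "Year": {"type": "number", "label": "Year of publication", "data": "year"}
}
"""


def check_config(text):
    """Return a list of problems found in an index configuration document."""
    problems = []
    for prop_id, prop in json.loads(text).items():
        # Every parameter is optional; fall back to the documented defaults.
        if prop.get("type", "string") not in ALLOWED_TYPES:
            problems.append("%s: unknown type %r" % (prop_id, prop["type"]))
        data = prop.get("data", "")
        if data and data not in ALLOWED_DATA:
            problems.append("%s: unknown data value %r" % (prop_id, data))
    return problems


print(check_config(CONFIG_JSON))
```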

Creating your first page

  • Before following these steps ensure you have followed the instructions in Using DjVu with MediaWiki.
  • Create a page in the "Page" namespace (or its localized name if your wiki is not in English). For example, if your namespace is 'Page', create 'Page:Alice in Wonderland.djvu'
  • Create the corresponding file for this page File:Alice in Wonderland.djvu
  • Create the index page 'Index:Alice in Wonderland.djvu'
    • Insert the tag <pagelist/> in the Pages field to visualize the page list
  • To edit page 5 of the book navigate to 'Page:Alice_in_Wonderland.djvu/5' and click edit
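The naming scheme used in the steps above can be summarized in a short Python sketch. The helper names are hypothetical; the namespace names are the English defaults and would differ on a localized wiki.

```python
def index_title(file_name, namespace="Index"):
    """Title of the index page for a given file name."""
    return "{}:{}".format(namespace, file_name)


def page_title(file_name, page_number, namespace="Page"):
    """URL-style title of one page of the book (spaces become underscores)."""
    return "{}:{}/{}".format(namespace, file_name.replace(" ", "_"), page_number)


print(index_title("Alice in Wonderland.djvu"))
print(page_title("Alice in Wonderland.djvu", 5))
```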

OAI-PMH

Since gerrit:28904, the extension has an OAI-PMH API for index pages. This API is implemented in a new special page, Special:ProofreadIndexOai, using a basic OAI-PMH protocol with Simple Dublin Core (oai_dc) and Qualified Dublin Core (prp_qdc). The repository provides the data stored in index pages. Example in oldwikisource. It is based on MediaWiki:Proofreadpage_index_data_config (see the Configuration of index namespace section), relying mainly on the "data" and "type" configuration entries.
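Requests to an OAI-PMH repository are ordinary HTTP requests with a `verb` argument. The Python sketch below only builds such request URLs; the base URL is a made-up placeholder for your own wiki, and `oai_url` is a hypothetical helper. The `ListRecords` and `ListSets` verbs and the `metadataPrefix` and `set` arguments come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace with the special page on your own wiki.
BASE = "https://wiki.example.org/wiki/Special:ProofreadIndexOai"


def oai_url(base, verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    query.update(params)
    return base + "?" + urlencode(query)


# List all index records as Simple Dublin Core.
print(oai_url(BASE, "ListRecords", metadataPrefix="oai_dc"))
# List the sets the repository defines.
print(oai_url(BASE, "ListSets"))
```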

Sets based on MediaWiki categories can be configured in MediaWiki:Proofreadpage_index_oai_sets, which contains a JSON array like:

{
  "test": { //spec of the set ie its ID
    "name": "Test", //The set name
    "category": "tests_list", //The category to use, without the "Category:" prefix
    "description": "A test set." //Description of the set, optional
  }
}
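As a rough illustration of how the set configuration relates set specs to categories, the Python sketch below reads a comment-free version of the example above and maps each set spec to its full category title. `set_categories` is a hypothetical helper, not part of the extension.

```python
import json

# The example above, without the explanatory comments.
SETS_JSON = """
{
  "test": {
    "name": "Test",
    "category": "tests_list",
    "description": "A test set."
  }
}
"""


def set_categories(text):
    """Map each set spec to the full title of the category it is based on."""
    return {
        spec: "Category:" + entry["category"]
        for spec, entry in json.loads(text).items()
    }


print(set_categories(SETS_JSON))
```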

Usage

This extension introduces the following tags: <pages> and <pagelist>.
