Help:Extension:ProofreadPage/2013 draft

= Proofread =
Proofreading produces the works on Wikisource from page scans. Page scans are normally in DjVu or PDF format and are uploaded to Wikimedia Commons. Proofreading takes place in the Index and Page namespaces, and the finished text is then transcluded into the main namespace. The proofreading process is split into phases, which are indicated by each page's page status. Wikisource has a style guide and certain formatting conventions that should be used during proofreading to make sure that our texts look correct and function properly. The proofreading function is provided by the ProofreadPage extension.

Users new to proofreading can experiment with the concept and test their abilities with these simple introductory tests on the Distributed Proofreaders website.

The Proofread of the Month (PotM) is a good place to start for people who want to learn how proofreading works on Wikisource. This project runs a new work each month and invites all users to take part.

Help

 * Page scans
 * Index pages
 * Page numbers
 * Page status
 * Formatting conventions
 * Transclusion

= Proofread Page extension =
The Proofread Page extension can render a book either as a column of OCR text beside a column of scanned images, or broken into its logical organization (such as chapters or poems) using transclusion.

The extension is intended to allow easy comparison of text to the original and allow rendering of a text in several ways without duplicating data. Since the pages are not in the main namespace, they are not included in the statistical count of text units.

The extension is installed on all Wikisource wikis. However, for it to work, the editor's browser (and any extensions such as NoScript) must allow script processing. Your Special:Preferences page (section "Gadgets") lets you control certain features, such as whether the OCR button is enabled and whether the text and the scan appear side by side or one above the other by default.

Anybody can proofread and correct most pages at Wikisource. However, editors must log into an account in order to change the proofread status; edits from IP addresses cannot change it. When corrections and formatting are complete, mark the page as proofread; it is then ready for the main namespace. Leave the page as 'not proofread' until it is done, and mark it as problematic if appropriate.

Extension

 * 1) Install LabeledSectionTransclusion (not required but strongly recommended)
 * 2) Download the files from Git or download a snapshot (select your version of MediaWiki) and place the files under $IP/extensions/ProofreadPage. Warning: The current master branch of the Git repository is only compatible with MediaWiki 1.21 and above. To use ProofreadPage with MediaWiki 1.19 or 1.20, use the REL1_19 branch.
 * 3) Add to the end of LocalSettings.php:
 * 4) Add the required tables to the database; on the command line, enter:  (Note: Your designated database user needs to have CREATE rights on your MediaWiki database.)
 * 5) Installation can now be verified through Special:Version on your wiki

Thumbnailing
The extension links directly to image thumbnails, which often don't exist. You must catch 404 errors and generate the missing thumbnails. You can do this with any one of these solutions:
 * Set an Apache RewriteRule in .htaccess to thumb.php for missing thumbnails:
 * Or set the Apache 404 handler to Wikimedia's thumb-handler. This is a general-purpose 404 handler with Wikimedia-specific code, not simply a thumbnail generator.
 * For MediaWiki >= 1.20, you can simply redirect to thumb_handler.php:  Or in apache2.conf:

WARNING: There is an .htaccess file in the images directory that may interfere with any .htaccess rules you install.

Namespaces
By default, ProofreadPage creates two custom namespaces, named "Page" and "Index" in English, with IDs 250 and 252 respectively.

Their names are translated if your wiki uses another language.

You can customize their names or their IDs: create the namespaces by hand and set their IDs in LocalSettings.php using the $wgProofreadPageNamespaceIds global. You will do something like:

Configuration

 * In order to use the page quality system, it is necessary to create four categories. The names of these categories must be defined in s:Mediawiki:Proofreadpage_quality0_category to s:Mediawiki:Proofreadpage_quality4_category.
 * Ensure that you have installed Extension:ParserFunctions

Configuration of index namespace
The configuration is a JSON array of properties. Here is the structure of a property in the array (all the parameters are optional; the default values are shown): The data parameter can take one of these values: "type", "language", "title", "author", "translator", "illustrator", "editor", "school", "year", "publisher", "place" or "progress".
 * You need to create MediaWiki:Proofreadpage_index_template in order to display index pages. This page is a template that receives the entries of the edit form as parameters.
 * You need to create MediaWiki:Proofreadpage_index_data_config, which contains the configuration of the index form.
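
Before saving the index form configuration, it can help to check that every data value is one of those listed above. The following Python sketch is only an illustration, not part of the extension. It assumes that each property is a JSON object carrying an optional "data" key; the other parameters of a property are not modelled here, and the helper name invalid_data_values is hypothetical.

```python
import json

# The values the text above lists for the "data" parameter.
ALLOWED_DATA_VALUES = {
    "type", "language", "title", "author", "translator", "illustrator",
    "editor", "school", "year", "publisher", "place", "progress",
}


def invalid_data_values(config_text):
    """Return the "data" values in the configuration that are not allowed.

    Assumption: the configuration is a JSON array of objects, each with an
    optional "data" key. A missing or empty value is treated as unset.
    """
    properties = json.loads(config_text)
    bad = []
    for prop in properties:
        value = prop.get("data")
        if value and value not in ALLOWED_DATA_VALUES:
            bad.append(value)
    return bad


# Hypothetical sample: "isbn" is not in the allowed list.
sample = '[{"data": "title"}, {"data": "author"}, {"data": "isbn"}, {}]'
print(invalid_data_values(sample))  # ['isbn']
```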

Creating your first page

 * Before following these steps ensure you have followed the instructions in Using DjVu with MediaWiki.
 * Create a page in the "Page" namespace (or the internationalized name if you use a non-English wiki). For example, if your namespace is 'Page', create 'Page:Alice in Wonderland.djvu'
 * Create the corresponding file for this page: 'File:Alice in Wonderland.djvu'
 * Create the index page 'Index:Alice in Wonderland.djvu'
 * To edit page 5 of the book navigate to 'Page:Alice_in_Wonderland.djvu/5' and click edit
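
The steps above follow one naming pattern: the file, its index page and each of its pages share the file name, and a page is addressed by appending a slash and its number. The Python sketch below only illustrates that pattern. The function names are hypothetical, and the default namespace names assume an English wiki.

```python
def proofread_titles(file_name, page_number, page_ns="Page", index_ns="Index"):
    """Return the wiki titles involved in proofreading one scanned page.

    page_ns and index_ns default to the English namespace names; pass the
    internationalized names on a non-English wiki.
    """
    return {
        "file": f"File:{file_name}",
        "index": f"{index_ns}:{file_name}",
        "page": f"{page_ns}:{file_name}/{page_number}",
    }


def to_url_title(title):
    """Return the title as it appears in a URL (underscores for spaces)."""
    return title.replace(" ", "_")


titles = proofread_titles("Alice in Wonderland.djvu", 5)
print(titles["index"])               # Index:Alice in Wonderland.djvu
print(to_url_title(titles["page"]))  # Page:Alice_in_Wonderland.djvu/5
```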

OAI-PMH
Since 28904, the extension has an OAI-PMH API for index pages. This API is implemented as a new special page, Special:ProofreadIndexOai, using a basic OAI-PMH protocol with Simple Dublin Core (oai_dc) and Qualified Dublin Core (prp_qdc). The repository provides the data stored in index pages. [//wikisource.org/wiki/Special:ProofreadIndexOai?verb=ListRecords&metadataPrefix=prp_qdc Example in oldwikisource].
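
A request to this API is an ordinary URL with a verb and its arguments in the query string, as in the example link above. The Python sketch below builds such a URL without sending it. It assumes the https scheme (the link above is protocol-relative), and the helper name oai_request_url is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint taken from the example link; the https scheme is an assumption.
OAI_ENDPOINT = "https://wikisource.org/wiki/Special:ProofreadIndexOai"


def oai_request_url(verb, endpoint=OAI_ENDPOINT, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return endpoint + "?" + urlencode(query)


# List index records as Qualified Dublin Core, as in the example link.
url = oai_request_url("ListRecords", metadataPrefix="prp_qdc")
print(url)
```

Passing metadataPrefix="oai_dc" instead requests the Simple Dublin Core format.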

Sets based on MediaWiki categories can be configured in Mediawiki:Proofreadpage_index_oai_sets, which contains a JSON array like: