Accuracy review


 * Tool Labs instance: http://tools.wmflabs.org/arowf/
 * Public URL: Accuracy review


 * Phabricator report: T89416
 * Announcement: wikitech-l/2015-February/080766.html
 * Project repository: https://github.com/priyankamandikal/arowf
 * Project Blog: https://priyankamandikal.wordpress.com/wiki-accuracy-review/
 * Etherpad: https://etherpad.wikimedia.org/p/accuracyreview

Contact information

 * Email: info@arowf.org
 * Web Page: http://arowf.org

Synopsis
Create a mediawiki-utilities bot to find articles in given categories, category trees, and lists. For each such article, find passages with (1) facts and statistics which are likely to have become out of date and have not been updated in a given number of years, and optionally (2) phrases which are likely unclear. Record the location and the text of those passages in the page in question using templates, on a bookkeeping page with other page names as headings, and/or in a database local to the bot.

Use a customizable array of keywords and regular expressions, and measures of text comprehensibility (or optionally, the DELPH-IN LOGON parser) to find such passages for review. Use an algorithm at least as good as that in to pre-compute the age of each word in an article, to avoid the move and blanking issues described in e.g.,  before processing each article of interest.
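As a rough illustration of the keyword and regular expression approach, the following Python sketch flags sentences that look time-sensitive. The pattern list, the labels, and the function name `find_candidates` are assumptions made for this example, not part of the project code, and the sketch does not attempt the word-age computation described above.

```python
import re

# Hypothetical, customizable patterns: each pairs a label with a regex that
# tends to match time-sensitive facts or statistics.
PATTERNS = [
    ("relative time",
     re.compile(r"\b(?:currently|recently|as of|to date|at present)\b", re.IGNORECASE)),
    ("year", re.compile(r"\b(?:19|20)\d{2}\b")),
    ("statistic",
     re.compile(r"\b\d[\d,.]*\s?(?:%|percent|million|billion)", re.IGNORECASE)),
]

# Naive sentence splitter: break on whitespace that follows ., ! or ?
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")


def find_candidates(text):
    """Return (sentence_index, sentence, labels) for sentences matching any pattern."""
    candidates = []
    for index, sentence in enumerate(SENTENCE_SPLIT.split(text.strip())):
        labels = [label for label, pattern in PATTERNS if pattern.search(sentence)]
        if labels:
            candidates.append((index, sentence, labels))
    return candidates


if __name__ == "__main__":
    sample = ("The city was founded by settlers. "
              "As of 2010, the population was 1.2 million. It has a river.")
    for index, sentence, labels in find_candidates(sample):
        print(index, labels, sentence)
```

A real bot would likely combine matches like these with the age of the matched words, so that only passages untouched for the given number of years are queued for review.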

Present candidate suspect passages to one or more subscribed reviewers. Update the source template with the reviewer(s)' answers to the GIFT question, but keep the original text as part of the template, if any. When reviewers disagree, update the template, if any, to reflect that fact, and present the question to a third reviewer to break the tie.
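The agreement and tie-breaking rule above could be sketched as follows. This is a minimal sketch under assumptions: answers are represented as plain strings, and the function name `resolve_review` and the status labels are hypothetical rather than taken from the project.

```python
def resolve_review(first, second, tiebreak=None):
    """Combine reviewer answers for one review item.

    Returns (status, answer) where status is one of:
      "agreed"    - the first two reviewers gave the same answer
      "disputed"  - they disagree and no third answer settles it yet
      "tiebroken" - a third reviewer sided with one of the first two
    """
    if first == second:
        return ("agreed", first)
    # A third answer only breaks the tie if it matches one of the first two.
    if tiebreak is not None and tiebreak in (first, second):
        return ("tiebroken", tiebreak)
    return ("disputed", None)


if __name__ == "__main__":
    print(resolve_review("still accurate", "still accurate"))
    print(resolve_review("still accurate", "out of date"))
    print(resolve_review("still accurate", "out of date", "out of date"))
```

The "disputed" status is the point at which the template, if any, would be updated to record the disagreement and the item would be offered to a third reviewer.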

Tasks completed

 * Deploying and running Wikiwho code in PythonAnywhere
 * Reviewer reputation database design
 * Login system in Python Flask for registering and logging in reviewers
 * SMOG Readability Testing
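For reference, the SMOG grade mentioned in the last item can be approximated as below. This is a sketch, not the project's implementation: the syllable counter is a crude vowel-group heuristic, and the helper names are assumptions. The formula itself is the published one, 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291, which was designed for samples of about 30 sentences and is only indicative on shorter passages.

```python
import math
import re


def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels (not dictionary-accurate)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def smog_grade(text):
    """Estimate the SMOG grade of a text from its polysyllabic word count."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        raise ValueError("text contains no sentences")
    words = re.findall(r"[A-Za-z]+", text)
    # Polysyllables are words of three or more syllables.
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * (30 / len(sentences))) + 3.1291


if __name__ == "__main__":
    print(smog_grade("The cat sat. The dog ran."))
    print(smog_grade("Readability evaluation is complicated."))
```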

Next steps

 * Implement the Flesch–Kincaid readability test: This test is widely used to estimate the audience a text is suited to. On the Flesch reading-ease scale, a higher score indicates text that is easier to read and so accessible to readers with less schooling. The task is to implement this index on a sentence-by-sentence basis and also to normalize the scores.
 * Review item queue database design: The review item queue database will tentatively consist of: the title of the review item; the article name; the project, e.g. enwiki or dewiki; a link; a type tag for whether the link is a diff, permalink, ordinary article URL, or other; the pertinent excerpt text; and the source of the review item, e.g. manual entry, student editor, bias concern, or SMOG flagged.
 * Manual list-based input: The manual review item input will just be a web form with which you can add review items to the reviewers' queue once you have decided on the database structure for them. For example, you might enter a wiki article URL (probably in either permalink or diff form) and/or a text passage and/or notes on or a question about the passage.
 * Reviewer workflow: Implement the reviewer workflow as given in the flow diagram. Each review item is acted on by at least two reviewers, and a third reviewer breaks the tie in case of disagreement. The reputation scores of the reviewers are automatically updated based on the ratio of agreements to disagreements. Note that the reputation scores will not generally be displayed to the reviewers, but the review items that reduced a reviewer's score will be displayed to that reviewer.
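One possible shape for the tentative review item queue described in the list above is sketched here with SQLite. The table name, column names, and helper functions are assumptions chosen to mirror the listed fields, and the final design may differ.

```python
import sqlite3

# Hypothetical schema mirroring the tentative field list for review items.
SCHEMA = """
CREATE TABLE IF NOT EXISTS review_items (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,   -- title of the review item
    article   TEXT NOT NULL,   -- article name
    project   TEXT NOT NULL,   -- e.g. enwiki or dewiki
    link      TEXT,
    link_type TEXT CHECK (link_type IN ('diff', 'permalink', 'article', 'other')),
    excerpt   TEXT,            -- pertinent excerpt text
    source    TEXT             -- e.g. manual entry, SMOG flagged
);
"""


def open_queue(path=":memory:"):
    """Open (or create) the queue database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_item(conn, title, article, project, link=None, link_type=None,
             excerpt=None, source=None):
    """Insert one review item and return its row id."""
    cursor = conn.execute(
        "INSERT INTO review_items "
        "(title, article, project, link, link_type, excerpt, source) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (title, article, project, link, link_type, excerpt, source),
    )
    conn.commit()
    return cursor.lastrowid


if __name__ == "__main__":
    conn = open_queue()
    item_id = add_item(conn, "Check population figure", "Example article",
                       "enwiki", link_type="permalink", source="manual entry")
    print(conn.execute("SELECT * FROM review_items WHERE id = ?", (item_id,)).fetchone())
```

A manual input web form could call a helper like `add_item` directly, with the `source` column distinguishing manual entries from automatically flagged passages.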

Call for participation
There are additional detailed plans for testing, which Accuracy Review of Wikipedias Foundation staff will be happy to discuss with interested co-mentors because, depending on available resources, there could be ways to eliminate substantial duplication of effort. Co-mentor volunteers include Maribel Acosta and Fabian Flöck.

If you are interested in volunteering, please click [edit source] to add your name and preferred contact information here. If you prefer to make contact by email, write to info@arowf.org.


 * 1) Adriana Bachmann, [email address elided]
 * 2) Shrutika Gulati @Shrutika719 on Phabricator
 * 3) Priyanka M.P. @prnk28 on Phabricator
 * 4) @Minervaxox on Phabricator
 * 5) @gda94 on Phabricator
 * 6) Jnanaranjan_sahu @Jnanaranjan_sahu on Phabricator
 * 7) @wbm1058 on Phabricator
 * 8) @Vlkyrie on Phabricator