Accuracy review/docs


This application provides editors with a peer-review system to find and review inaccurate content in Wikipedia articles. Algorithms designed for the purpose flag suspect content, which editors then review through a three-way review mechanism. The reviewer bot finds articles in given categories, category trees, and lists. For each such article, the bot creates questions from the following sets:

  1. Passages with facts and statistics which are likely to have become out of date and have not been updated in a given number of years.
  2. Passages which are likely unclear.
  3. Student edits.
  4. Content from Wikipedia Backlog categories.

Question candidates from each set are ranked, taking into account the pageview count of the article, and the highest-ranking candidates are made into questions. These questions are then opened to reviewers for review and resolution.

A three-way peer-review mechanism ensures that questions are resolved by consensus. Two reviewers work on each question, and a third is brought in only if they disagree. The first reviewer provides a solution to the question posed. The second reviewer chooses to 'Endorse' the proposed solution as valid or 'Oppose' it as invalid. If the second reviewer opposes, the third reviewer decides whether to support the first or the second reviewer's viewpoint. Reviewer reputation scores are computed from how often a reviewer's reviews are accepted by their peers. Reviewers can optionally be shown the reviews that lowered their scores.
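The three-way mechanism can be sketched as a small decision function. This is only an illustration of the flow described above, not the application's actual code; the names `Verdict` and `resolve` and the status strings are hypothetical.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    """A reviewer's judgement on the first reviewer's answer."""
    ENDORSE = "endorse"
    OPPOSE = "oppose"


def resolve(second: Optional[Verdict], third: Optional[Verdict] = None) -> str:
    """Return the status of an answer given the verdicts recorded so far.

    `second` is the second reviewer's verdict; `third` is the third
    reviewer's verdict, which only matters when the second opposed.
    The status strings are illustrative assumptions.
    """
    if second is None:
        return "awaiting second reviewer"
    if second is Verdict.ENDORSE:
        # Both reviewers agree, so no third reviewer is needed.
        return "accepted"
    # The second reviewer opposed: a third reviewer breaks the tie.
    if third is None:
        return "awaiting third reviewer"
    # Endorsing sides with the first reviewer, opposing with the second.
    return "accepted" if third is Verdict.ENDORSE else "rejected"
```

For example, `resolve(Verdict.OPPOSE, Verdict.ENDORSE)` returns `"accepted"`, because the third reviewer sided with the first reviewer's answer.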

The app is up and running here!

Action endpoints

The application has six action end-points:

Technical Implementation Details

In the newly devised NoSQL approach, we use a purely file-based system for storing questions and answers. Each reviewer action (ask, answer, recommend) is stored in a separate file.

The files can be of 6 types:

  • q – question
  • a – answer
  • e – endorses original answer
  • o – opposes original answer
  • t – tiebreaker
  • d – diff

The system currently assigns a 9-digit code (ranging from 000000001 to 999999999) as the name of every question that is asked. These numbers aren’t assigned randomly and follow a round robin approach. The code is followed by ‘q’ to denote that the file contains a question, so the first question asked will have a name like ‘000000001q’.

When a reviewer wishes to answer a question, they are presented with the question and its answer, if one exists. If the question hasn’t been answered before, the reviewer needs to answer it. The answer file name has the same 9-digit code as the question, followed by an ‘a’.

If the reviewer is presented with a question and a corresponding answer, they can choose to endorse or oppose the answer, which creates an ‘e’ or ‘o’ file respectively. If the answer is endorsed, it is ready to be opened by a recommender, who implements the changes suggested in the answer file. If the original answer has been opposed, the question, the answer, and the opposer’s comments are presented to a third reviewer, who makes the final decision to endorse or oppose the original answer.
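The naming scheme above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's code: the helper names (`file_name`, `parse_name`, `next_step`) and the status strings are hypothetical, and the sketch assumes that the third reviewer's decision is recorded in the ‘t’ file.

```python
import re

# The six file types listed above, keyed by their one-letter suffix.
SUFFIXES = {
    "q": "question",
    "a": "answer",
    "e": "endorses original answer",
    "o": "opposes original answer",
    "t": "tiebreaker",
    "d": "diff",
}

# A file name is a 9-digit code followed by one type suffix.
_NAME_RE = re.compile(r"([0-9]{9})([qaeotd])")


def file_name(question_id: int, kind: str) -> str:
    """Build a name such as '000000001q' from a question number and a suffix."""
    if not 1 <= question_id <= 999_999_999:
        raise ValueError("question_id must be between 1 and 999999999")
    if kind not in SUFFIXES:
        raise ValueError(f"unknown file type: {kind!r}")
    return f"{question_id:09d}{kind}"


def parse_name(name: str) -> tuple[int, str]:
    """Split a file name into its question number and type suffix."""
    match = _NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid file name: {name!r}")
    return int(match.group(1)), match.group(2)


def next_step(names: list[str], question_id: int) -> str:
    """Infer what a question needs next from the files that exist for it."""
    kinds = {kind for qid, kind in map(parse_name, names) if qid == question_id}
    if "q" not in kinds:
        return "unknown question"
    if "a" not in kinds:
        return "needs answer"
    if "e" in kinds:
        return "ready for recommender"
    if "o" in kinds:
        # Assumption: the third reviewer's decision lives in the 't' file.
        return "decided by tiebreaker" if "t" in kinds else "needs third reviewer"
    return "needs endorse or oppose"
```

With only `000000001q` on disk, `next_step(["000000001q"], 1)` returns `"needs answer"`. Deriving the state from the set of files that exist, rather than storing it separately, keeps the file system as the only source of truth, which fits the purely file-based design described above.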