Article feedback/Extended review

Because of the lack of a standard, readily-available tool to create and store quality reviews of Wikipedia content, several groups and organizations have created their own ad-hoc tools for this purpose. This page describes a standard system for such organizations to conduct open quality review of Wikipedia content, and surface the results on the article page.

This system is primarily intended for Wikipedia, but could also be used on other Wikimedia projects.

System

 * What form does the system take?

A MediaWiki extension is the easiest implementation form for the review system: it will make integration with Wikipedia user accounts easier, and review data will be stored locally. The Article feedback tool provides an existing framework that could be extended to support a more detailed quality review process.

Authentication

 * How does a credentialed reviewer authenticate?
 * How can the authentication process be scaled / automated? (use case: mass outreach from organizations to their members)

E-mail is the safest assumption we can make on the partner organization's infrastructure.

Organization members would get an invitation by e-mail to create an account on Wikipedia; the token/link would lead them to a modified version of the sign-up page, where they would confirm their organization, real name, e-mail address, and possibly volunteer to list their credentials and/or field of expertise.

If the partner organization agrees to provide us with a structured document containing this information (e.g. a CSV file), a script could be run to generate these e-mail invitations.

If the partner organization prefers not to share this information with us, we would provide them with a modified script that would only identify the organization; their members would then enter the rest of the information themselves.
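
The two variants of the invitation script could be sketched roughly as follows. This is only an illustration under stated assumptions: the CSV column names (`real_name`, `email`), the `SIGNUP_URL` address and the `build_invitations` function are all hypothetical, and actually sending the e-mails is left out.

```python
import csv
import io
import secrets

# Hypothetical address of the modified sign-up page (an assumption, not a real endpoint).
SIGNUP_URL = "https://example.org/signup"


def build_invitations(csv_text, organization, organization_only=False):
    """Build one invitation record per member listed in the CSV text.

    Assumes the CSV has 'real_name' and 'email' columns.
    With organization_only=True, the sign-up form is pre-filled with the
    organization alone, and members enter the rest themselves.
    """
    invitations = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Unguessable single-use token that ties the link to this invitation.
        token = secrets.token_urlsafe(16)
        prefill = {"organization": organization}
        if not organization_only:
            prefill["real_name"] = row["real_name"]
            prefill["email"] = row["email"]
        invitations.append({
            "to": row["email"],
            "token": token,
            "link": f"{SIGNUP_URL}?token={token}",
            "prefill": prefill,
        })
    return invitations
```

In the second variant the partner organization would run the script itself, so the member list never leaves its infrastructure; only the organization name travels with the token.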

The reviewer should be able to attach their existing Wikipedia account if they have one, instead of creating a new one.

Review submission

 * Who decides who reviews what?
 * How is the review submitted?

A voluntary model, where people can review any page they want, is the simplest implementation, and the most likely to fit within the existing article feedback infrastructure. Restricting the scope of reviewable articles doesn't appear to be necessary: existing expert review systems show that reviewers usually stick to their field of expertise when their name is publicly associated with the review.

Because articles can grow fairly long, it would be better to let reviewers scroll through the article while they're reviewing it, while keeping the review fields always visible (some suggested a setup similar to the common fixed-position "feedback tab").

Review content

 * What is the content of the actual review?

Preliminary considerations
Analysis of the Article feedback experiment shows that Wikipedians have a consistent grasp of what criteria like "neutrality" and "well-sourced" mean, and rate them fairly consistently. The general public, however, which accounts for about 95% of the feedback provided, doesn't share this understanding and provides ratings that vary greatly.

For the same reason, readers rarely rely on numeric ratings like Likert scales, as suggested by UX research on the current Article feedback tool. They're more interested in reading well-built comments and reviews. They're also interested in information about who the reviewer is, so as to gauge the relevance of the comment/review for their personal situation.

Some criteria, like the well-sourcedness of an article, can be assessed by an aggregate of automated quantitative metrics (like the number of references relative to the length of the article) and human-generated qualitative feedback (like the appropriateness of the references, and their reliability).
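
As an illustration of the automated half of such a metric, the number of references relative to article length could be computed along these lines. This is a rough sketch: the function name `reference_density`, the "references per 1,000 words" unit and the regular expressions are assumptions made for the example, and real wikitext (templates, nested tags) would need a proper parser.

```python
import re

# Matches the opening of a <ref> tag, but not </ref> or <references/>.
REF_OPEN = re.compile(r"<ref\b", re.IGNORECASE)
# Matches a whole footnote: either self-closing or with a body.
REF_BLOCK = re.compile(
    r"<ref\b[^>]*?/>|<ref\b[^>]*>.*?</ref>",
    re.IGNORECASE | re.DOTALL,
)


def reference_density(wikitext):
    """Return the number of <ref> footnotes per 1,000 words of body text."""
    ref_count = len(REF_OPEN.findall(wikitext))
    # Remove footnotes so their contents don't inflate the word count.
    body = REF_BLOCK.sub(" ", wikitext)
    word_count = len(body.split())
    if word_count == 0:
        return 0.0
    return 1000.0 * ref_count / word_count
```

A number like this says nothing about whether the references are appropriate or reliable; that part would still come from human reviewers.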

Questions & issues
Based on these considerations, it seems better to move to a system where the reviewer is invited to answer a series of questions (some open-ended) to help readers and editors identify possible issues (and thus areas of improvement) with the article.

"Simple" readers and subject-matter "experts" (whether they're credentialed or not) have different uses for the article, and can provide different levels of feedback. A reader's main purpose may be to quickly find a specific piece of information, while an expert may want to check the quality of the whole article. Asking reviewers whether they consider themselves knowledgeable on the topic would make it possible to ask different questions, relevant to each profile.


 * Provide feedback
   * Do you consider yourself particularly knowledgeable on this topic? (yes/no)
     * (If not)
       * Did you find what you were looking for? (yes/no)
       * Report an issue with this article (form TBD)
       * Thank the authors (free text)
     * (If yes)
       * Would you recommend this article to a student, or for inclusion in a book? (yes/no)
         * General assessment that can act as a "binary flag" for other purposes, like selective inclusion in collections
         * (If yes)
           * Would you like to compliment the authors? (free text)
         * (If not)
           * You have identified issues with: (checkboxes)
             * Completeness: The article doesn't provide an exhaustive coverage of the topic.
             * Readability: The article contains bad English, inappropriate grammar, vocabulary, etc.
             * Organization: The article isn't well-structured, or doesn't flow well.
             * Neutrality: The article contains opinionated material, or undue weight is given to a subtopic.
             * Verifiability: The article contains too few or too many references, or they're inappropriate or unreliable.
             * Illustration: The article contains too few or too many illustrations, or they're inappropriate.
             * Exploration: The article contains too few or too many links, or they're inappropriate or broken.
             * Other
             * When an issue is checked, a free-text field is enabled for the reviewer to provide more information.
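
One reading of the branching in the list above can be sketched as a small function that returns the next questions to display. The question keys and function names are hypothetical, and the nesting (knowledge question first, then the recommendation question for self-assessed experts only) is an interpretation of the list, not a settled design.

```python
ISSUE_CATEGORIES = [
    "Completeness", "Readability", "Organization", "Neutrality",
    "Verifiability", "Illustration", "Exploration", "Other",
]


def next_questions(answers):
    """Return the keys of the questions to show, given the answers so far.

    'answers' maps hypothetical question keys to booleans (yes/no).
    """
    if "knowledgeable" not in answers:
        return ["knowledgeable"]
    if not answers["knowledgeable"]:
        # Reader profile: short, task-oriented questions.
        return ["found_what_you_wanted", "report_issue", "thank_authors"]
    if "would_recommend" not in answers:
        return ["would_recommend"]
    if answers["would_recommend"]:
        return ["compliment_authors"]
    # Expert who would not recommend the article: show the issue checkboxes.
    return ["issues"]


def free_text_fields(checked_issues):
    """Return the free-text fields enabled by the checked issue categories."""
    unknown = [name for name in checked_issues if name not in ISSUE_CATEGORIES]
    if unknown:
        raise ValueError(f"unknown issue categories: {unknown}")
    return [f"{name}_details" for name in checked_issues]
```

Keeping the flow in one function like this would make it easy to change the questions for each profile without touching the form itself.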

Some results can be aggregated; reviewers should also be able to "approve" or "agree with" an existing review if they share the opinion of the other reviewer, in order to avoid duplication.
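
A minimal sketch of such aggregation, assuming (hypothetically) that each review is stored as a dictionary with an `issues` list and an `endorsements` count, and that an endorsement counts as one more report of every issue in the endorsed review:

```python
from collections import Counter


def aggregate_issues(reviews):
    """Count how often each issue category was reported across reviews.

    Each endorsement of a review is treated as one additional report of
    the issues it lists (an assumption, not a decided policy).
    """
    counts = Counter()
    for review in reviews:
        weight = 1 + review.get("endorsements", 0)
        for issue in review["issues"]:
            counts[issue] += weight
    return counts
```

Whether an endorsement should weigh as much as an independent review is an open question; the weight is isolated in one line so that it can be changed.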

Things to think about:
 * Reviewers are invited to edit the page they're reviewing and fix the errors they notice.
 * [COI] Have you significantly edited the article yourself, or are you biased against its topic? (yes/no)

Review publication

 * Where is the review published?

Preliminary considerations
There are multiple reasons to integrate reviews with the existing talk page framework:
 * The talk page is the appropriate place to discuss improvements of the article; editors who watch the page will be notified of new reviews.
 * Few readers currently know of the talk page; by making it more discoverable, more readers may realize the information it contains is useful to assess an article's quality.
 * Reviewers are likely to appreciate feedback on their review and to have a venue to discuss further with editors; the talk page provides this opportunity.

However, there is also a risk that the talk page turns into a forum, or that the sheer number of useless or irrelevant comments overwhelms editors on the talk page. Processes will be necessary to assess the usefulness and relevance of a review/comment to the article's improvement.

Workflow
One way to deal with the flow of reviews/comments would be to let users "promote" a review/comment to the talk page if it's relevant. Readers should also have the ability to sort or filter the list of reviews/comments for a given article, for example by date (to show the latest review first), by reviewer (to show self-assessed "experts" first), by usefulness, etc.

Users should be able to "archive" reviews identifying issues that have been resolved. Similarly, off-topic or forum-like comments should be "archived" as not requiring follow-up.
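
The sorting and archiving described in this workflow could be modeled roughly as below. The `Review` fields and the `visible_reviews` function are hypothetical; in particular, using the number of endorsements as a stand-in for "usefulness" and hiding archived reviews from the default list are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Review:
    reviewer: str
    submitted: date
    self_assessed_expert: bool
    endorsements: int       # assumed proxy for usefulness
    archived: bool = False  # resolved or off-topic, no follow-up needed
    promoted: bool = False  # copied to the talk page by a user


def visible_reviews(reviews, order="date"):
    """Return the non-archived reviews in the requested order."""
    active = [r for r in reviews if not r.archived]
    if order == "date":
        # Latest review first.
        return sorted(active, key=lambda r: r.submitted, reverse=True)
    if order == "expert":
        # Self-assessed experts first, then latest first within each group.
        return sorted(
            active,
            key=lambda r: (not r.self_assessed_expert, -r.submitted.toordinal()),
        )
    if order == "usefulness":
        return sorted(active, key=lambda r: r.endorsements, reverse=True)
    raise ValueError(f"unknown order: {order}")
```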

Existing processes for treating inappropriate text (personal attacks, personally identifiable non-public information, libel, etc.) would continue to apply.

A visual indicator could also give an overview of the content of the reviews and thresholds of serious issues (or lack thereof).


 * Public list of one's reviews
 * API to access the entirety of the reviews and their specifics

API

 * standards / policies for integration of data from external review systems with ours

Quality indicators
"green flag"
 * threshold of recommendations vs low threshold of issues

"neutral flag"
 * not enough data

"red flag"
 * threshold of issues reported
 * unpatrolled edits
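
The three flags could be derived from review data along these lines. Everything numeric here is a placeholder: the `quality_flag` function, its parameters and the default thresholds are hypothetical, and the order of the checks (unpatrolled edits first, then lack of data, then issues, then recommendations) is only one possible interpretation of the lists above.

```python
def quality_flag(recommendations, issues, total_reviews, has_unpatrolled_edits,
                 min_reviews=5, recommend_share=0.7, issue_share=0.3):
    """Return 'green', 'neutral' or 'red' for an article.

    recommendations: reviews recommending the article
    issues: reviews reporting at least one issue
    All thresholds are placeholder values, not agreed numbers.
    """
    if has_unpatrolled_edits:
        return "red"
    if total_reviews < min_reviews:
        # Not enough data to say anything either way.
        return "neutral"
    if issues / total_reviews >= issue_share:
        return "red"
    if recommendations / total_reviews >= recommend_share:
        return "green"
    return "neutral"
```

For example, `quality_flag(9, 1, 10, False)` falls under the issue threshold and over the recommendation threshold, so it yields a green flag with these placeholder values.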