Page Curation

'''This document is a work in progress. Comments are appreciated but this is not a final draft.'''

This document describes the design of a new interface for triaging "New Pages". Feedback is welcome on the talk page.

This project is envisioned as having multiple phases. By necessity (for reasons of both resourcing and change control), the project has been split into sections.

The design as published is incomplete and has many holes. This document is intended as a starting point for discussion about improving the overall experience of patrolling New Pages.

Notes on Nomenclature
This document introduces a new term, "triaged". We believe that "triage" is a more descriptive term than "patrolled". Further, it does not evoke feelings of militarism or police work; rather, it evokes a doctor trying to save patients rather than keep them from treatment.

This document uses "New Pages" to refer to any page that has not been marked as "Patrolled". Edit count on the page is not taken into consideration.

Further, pages that are marked for speedy deletion will be simply referred to as "marked for deletion"; these pages are also technically marked as "patrolled".

For consistency of terminology, this document assumes three states of an article:


 * Unpatrolled - The article has not been marked patrolled.
 * Marked for Deletion - The article has had one or more speedy deletion criteria applied.
 * Patrolled - The article has been marked as "patrolled". Zero or more tags requesting improvement may have been applied.

Rationale

 * New Page Patrol is a complicated process that is poorly supported by the MediaWiki software itself.
 * No two patrollers seem to utilize the same process.
 * Users who perform New Page Patrol report high levels of frustration and burnout due to feeling overworked:
 * Because inexperienced patrollers aggressively over-template, requiring work to be rechecked;
 * Because inexperienced patrollers often don't identify or fix major problems with new articles, requiring work to be rechecked;
 * Because too few users choose to become patrollers;
 * Because education about the patrolling process is difficult;
 * Because optimizing a system for page patrol is a "Power User" job, requiring greater-than-average computer savvy as well as (oftentimes) downloading third-party software.

Hypotheses

 * Providing a native, easy-to-use interface for New Page Patrol will increase the number of users who choose to become patrollers, reducing workload
 * Providing a native, easy-to-use interface will help to establish better education about the process as is, resulting in lower "false positive" rates
 * Providing a native interface will allow future expansion and modification of the system to support different backend systems and logic screens
 * Providing a native, easy-to-use interface may prove to serve as an engagement point for mobile and tablet users, for whom editing is currently not feasible
 * Providing a native interface that utilizes positive messaging features will reduce new editor bite, thus promoting editor retention

Feature Requirements

 * Track the users who have triaged a page and the dates that they did so.
 * Provide a list view of New Pages.
 * This list must be filterable.
 * This list must easily show the state of a page, whether or not it has been triaged.
 * This list view must provide as much useful information as possible about an article.
 * This list view must allow for selection of multiple articles to be brought into the "zoom" view.
 * Provide a pageable, easy-to-use, and intuitive "zoom" interface that allows page examination and tagging in situ
 * This interface must provide meta-data about the article
 * This interface must show the article in the interface
 * This interface must be pageable without leaving the interface
 * Ideally, the interface's "paging queue" will be smart and modify itself according to behaviors of other patrollers and their work.
 * This helps to prevent a race condition wherein two patrollers work on the same article simultaneously, and generate edit conflicts.

Triage Principles
To combat the problem of inexperienced users marking pages as "patrolled", which requires their work to be double-checked by other, more experienced patrollers, the process of triaging a page to "patrolled" status will change. A page may become "fully triaged" (patrolled) when one of the following criteria has been met:


 * A user with the PAGEPATROLLER right has marked it as "Triaged" (see below); or
 * A community-defined number of users without the PAGEPATROLLER right have marked it as "Triaged". This number should be between 3 and 5.
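
The two criteria above can be sketched as a small check. This is only an illustration: the `TriageMark` record, its field names, and the `is_fully_triaged` function are hypothetical, and the default threshold of 3 is simply the low end of the 3 to 5 range suggested above. Storing the user and date with each mark also covers the tracking requirement listed under Feature Requirements.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TriageMark:
    """One user's "Triaged" mark on a page (hypothetical record shape)."""
    username: str
    has_pagepatroller: bool
    marked_at: datetime


def is_fully_triaged(marks, threshold=3):
    """Return True once the marks on a page satisfy either criterion."""
    if not 3 <= threshold <= 5:
        raise ValueError("threshold should be between 3 and 5")
    # Criterion 1: a single mark from a PAGEPATROLLER holder is enough.
    if any(mark.has_pagepatroller for mark in marks):
        return True
    # Criterion 2: enough distinct users without the right have marked it.
    distinct_users = {m.username for m in marks if not m.has_pagepatroller}
    return len(distinct_users) >= threshold


now = datetime.now(timezone.utc)
marks = [TriageMark(name, False, now) for name in ("Ann", "Ben")]
print(is_fully_triaged(marks))  # two ordinary users: not yet fully triaged
```

Counting distinct usernames (rather than raw marks) is an assumption made here so that one user cannot satisfy the threshold alone.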

PAGEPATROLLER User Right
A new userright, PAGEPATROLLER (name change possible), will be created. This is a userright that must be granted by an administrator (or possibly by others with the same userright). Users with this right are assumed to know what they are doing with regard to page triaging and will not require their work to be double-checked.

The "Triage Stack"
Currently, problems exist with users selectively editing from either the front of the queue or the back of the queue. This leaves too many pages languishing without attention in the "middle" of the queue. Further, when patrolling, edit conflicts and duplicative work are rampant (since most patrollers will be operating in nearly the same space).

A solution is proposed, therefore, of a "Randomized Stack" of pages that are keyed to a single user within a "Triaging Session". Upon choosing to start a session (by selecting one or more articles for triage), the user's "stack" is filled with anywhere from 5 to 10 random pages drawn from throughout the entire queue. These pages are "tagged" to the user in question and cannot show up in another person's stack (for the period that the session lasts, or until the page has been marked as partially triaged).
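
A minimal sketch of how such a stack might be filled, assuming an in-memory queue. The `TriageQueue` class and its method names are hypothetical and exist only to illustrate the claiming rule; a real implementation would need persistent storage and session expiry.

```python
import random


class TriageQueue:
    """In-memory stand-in for the queue of untriaged pages (illustrative only)."""

    def __init__(self, page_titles, rng=None):
        self.pages = list(page_titles)
        self.claims = {}  # page title -> username holding the claim
        self.rng = rng or random.Random()

    def start_session(self, username, size=5):
        """Fill a user's stack with random pages nobody else has claimed."""
        if not 5 <= size <= 10:
            raise ValueError("stack size should be between 5 and 10")
        unclaimed = [p for p in self.pages if p not in self.claims]
        # Sample from the whole queue, not just the front or the back.
        stack = self.rng.sample(unclaimed, min(size, len(unclaimed)))
        for page in stack:
            self.claims[page] = username
        return stack

    def release(self, page):
        """Free a page when the session ends or the page is partially triaged."""
        self.claims.pop(page, None)


queue = TriageQueue([f"Page {i}" for i in range(20)])
alice = queue.start_session("Alice", size=5)
bob = queue.start_session("Bob", size=5)
print(set(alice) & set(bob))  # empty set: the two stacks never overlap
```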

Front or Back Stack Flow Mechanics
When a user chooses to operate in the front or the back of the queue, the stack will automatically add the next item in the queue that has not already been claimed. In the case of a race condition (two or more users starting at the same time), the next N pages will be shuffled and dealt to each user.
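
One possible reading of the shuffle-and-deal rule, as a sketch. It assumes the queue is an ordered list with the front at index 0, that N equals the number of users waiting, and that each user is dealt one page; the function name `deal_next` and all of these choices are assumptions rather than part of the design.

```python
import random


def deal_next(queue, claimed, users, from_front=True, rng=None):
    """Give each waiting user one unclaimed page from the chosen end of the queue.

    queue   -- ordered list of page titles (index 0 is the front)
    claimed -- set of titles already claimed; updated in place
    users   -- usernames that asked for a page at the same moment
    """
    rng = rng or random.Random()
    ordered = queue if from_front else list(reversed(queue))
    # Take the next N unclaimed pages, where N is the number of waiting users.
    candidates = [p for p in ordered if p not in claimed][:len(users)]
    # Shuffle so that no user is systematically favoured in a race.
    rng.shuffle(candidates)
    dealt = dict(zip(users, candidates))
    claimed.update(dealt.values())
    return dealt


claimed = {"A"}
print(deal_next(["A", "B", "C", "D", "E"], claimed, ["user1", "user2"]))
```

If fewer unclaimed pages remain than there are waiting users, the extra users simply receive nothing in this sketch.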

Selected Article Stack Flow Mechanics
From the List View, users may define the contents of their stacks from the List interface by selecting articles to be held. Once these articles have been selected and submitted, they are considered "claimed". In the case of a race condition, articles that have already been claimed will be silently removed from the user's stack.
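
The silent-removal behaviour could look roughly like the following. The `claim_selected` function and the `claims` mapping (page title to claiming username) are hypothetical names used only for this sketch.

```python
def claim_selected(selected, claims, username):
    """Claim the selected pages, silently skipping ones held by another user.

    selected -- page titles the user ticked in the List View
    claims   -- dict of page title -> username; updated in place
    """
    stack = []
    for page in selected:
        # setdefault records the claim only if nobody holds the page yet.
        holder = claims.setdefault(page, username)
        if holder == username and page not in stack:
            stack.append(page)
    return stack


claims = {"B": "Bob"}
print(claim_selected(["A", "B", "C"], claims, "Alice"))  # ['A', 'C']
```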

User Experience: List View


The proposed List View interface explodes the current "unpatrolled" list into a more readable and scannable format.

Filter Mechanisms
Ideally, there will be multiple ways to filter the List Interface:


 * Show/Hide Triaged Pages
 * Show/Hide Bot pages
 * Show/Hide redirects
 * By Creator Username
 * By Namespace
 * By Category
 * By WikiProject (ohman, this would be awesome)
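
As a rough illustration of how these filters might combine, the sketch below treats every filter as an optional predicate and shows a page only if it passes all of them. The `PageEntry` and `ListFilter` types and their field names are assumptions made for this example; the WikiProject filter is left out because the document does not say how WikiProject membership would be determined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PageEntry:
    """Hypothetical shape of one row in the List View."""
    title: str
    creator: str
    namespace: str = "Main"
    categories: frozenset = frozenset()
    is_triaged: bool = False
    is_bot_page: bool = False
    is_redirect: bool = False


@dataclass
class ListFilter:
    """Show/hide toggles plus optional exact-match filters."""
    show_triaged: bool = True
    show_bot_pages: bool = True
    show_redirects: bool = True
    creator: Optional[str] = None
    namespace: Optional[str] = None
    category: Optional[str] = None

    def matches(self, page):
        if page.is_triaged and not self.show_triaged:
            return False
        if page.is_bot_page and not self.show_bot_pages:
            return False
        if page.is_redirect and not self.show_redirects:
            return False
        if self.creator is not None and page.creator != self.creator:
            return False
        if self.namespace is not None and page.namespace != self.namespace:
            return False
        if self.category is not None and self.category not in page.categories:
            return False
        return True


def apply_filter(pages, list_filter):
    """Return the pages that pass every active filter, in their original order."""
    return [page for page in pages if list_filter.matches(page)]
```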

Bulk Selection
Currently, many users who perform patrolling simply go down the list of new pages and open pages that they wish to patrol in new tabs. This is inefficient.

The List Interface allows the user to place a check next to each entry that they wish to add to their Triage Stack. The user then clicks the "Triage Checked" button and is immediately brought to the "Zoom Interface", centered on the first item in the list.

Individual Entries
Each entry within the List View contains the following elements:


 * A "Bulk Selection" checkbox
 * The Page title, along with its size and number of edits
 * A count of images and categories.
 * If there are no images or categories, this shall be called out in bold and red
 * If the page is an orphan, this too shall be called out boldly


 * The date the page was created
 * The user name of the page creator, his or her edit count, and when he or she started editing;
 * The summary message of the creation
 * A "Triage" button
 * A "Triaged" or "Not Triaged" indicator.
 * If a page has been fully triaged, a green checkmark will appear
 * If the page has not been fully triaged, a red alert mark will appear
 * The user names and edit counts of the individuals who have triaged the article will appear to the right of the icon.
 * If any one of these individuals has the PAGEPATROLLER right, this will be called out.

Clicking on the "Triage" button next to any page entry will bring the user to the "Zoom" interface, centered on the selected page. The user's stack will randomly auto-populate in both directions (nominally by one entry only).

The List Interface is envisioned to be infinitely scrolling. The "Triage Checked" controls will persist at the top and the bottom of the page, while scrolling within the page will continue to load entries until the entire queue has been shown.

User Experience: Zoom Interface
Currently, New Page Patrol requires that all actions taken on an article from the list interface happen on a separate page outside any specialized patrolling interface. In contrast, the "zoom" interface is a close-up, actionable interface for New Page Patrolling. It is heavily AJAX-dependent, so JavaScript is required.

Queue Direction
The default load of the queue "direction" (oldest-to-newest versus newest-to-oldest) is a complex topic because the needs of the queue direction change. A perennial problem with New Page Patrol has been that newest entries are reviewed first, while the oldest ones are left to languish (and cause the queue to grow longer). There has long been a need to recruit more experienced patrollers to focus on the back of the queue once they've become proficient at patrolling at the front of the queue.

The bulk of the most "scandalous" pages and revisions are found and dealt with at the front of the queue. These include vandalism, attack pages, and others that could create legal problems, such as unsourced negative BLPs. The review of these edits can be seen as a higher priority, and is more of an entry-level task than dealing with the residue at the back of the queue. Most new articles, including almost all bad-faith ones, are patrolled or tagged for speedy deletion while at the front of the queue, and the ones that are unusual or borderline are usually left to the more experienced patrollers at the back of the queue. When CorenSearchBot was working, the copyvio ones would be identified and resolved mid-queue.

As currently designed, the Zoom interface works from the rear of the queue, with the option for the user to switch directions at any time. However, this default may need to be changed: it would be unacceptable to make a change that resulted in hundreds of extra attack pages on Wikipedia at any one time, and the Zoom interface should not be restricted to experienced editors who are ready to work the back of the queue.

One option - a difficult one, resource-wise - would be to develop an automated system that would be able to detect certain patterns of text that are typically associated with attack pages (e.g., "SO AND SO IS GAY!") and then mark the revision as "Suspect". Suspect revisions would then form their own sub-queue, and would float to the top regardless of the direction that the user is currently patrolling.

A second option would be to change the default according to the userrights of the patroller - reviewers would default to the back of the queue and others including newer editors would default to the front.

A third option would be to make this a user preference that would default to the front of the queue, but with perhaps a bot message after a certain number of patrols, or a barnstar and review from an admin saying that an editor is now accurate enough that they might want to try the back of the queue.

Interface Layout
In the Zoom interface, the user is presented with a dynamic screen that consists of three primary elements and several secondary, context-sensitive elements:


 * Interface Filters and Meta Information - this section (at the top) includes controls that allow the user to change the filters surrounding the list of pages that have entered the queue as well as providing additional meta data that is of use.
 * Article Viewing and Tagging Pane - this pane is context aware and associated with the article that is being reviewed. This section has several sub-components:
 * Article Metadata - size, create date, incoming links, etc.
 * User Metadata - creator, information about that user, etc.
 * Article Viewer - displays the article itself
 * Patrolled Tagging Pane - provides an easy-to-use pane for tagging articles for improvement
 * Deletion Tagging Pane - provides an easy-to-use pane for tagging articles for deletion
 * Pagination Controls - Two sections, one at the top and one at the bottom, where the user can simply skip to the next or previous article in the stack (which are shown to the user)

Workflow
Currently, it is assumed that there are three possible actions a Patroller can take when viewing a page in the Zoom interface:


 * 1) Ignore the article - the User clicks "next" or "previous" and skips this article.  The article remains unchanged.
 * 2) Nominate the article for deletion - the User selects one or more of the common tags for deletion and then clicks the appropriate "Mark and Next" button.
 * 3) Mark as Patrolled - the User can select zero or more of the common tags to mark the article as needing improvement and then clicks the appropriate "Mark and Next" button.

Ignoring an Article
If the user opts to ignore an article, the currently viewed article will not change state. The article viewing pane will be replaced with the next or previous article, depending on which control was clicked.

Nominating an Article for Deletion
The user will select (via checkboxes) all appropriate "Deletion" tags. Some tags are multi-leveled (e.g., there are child tags for more specific cases). The system will be smart and only select the correct tags if they exist within the tree (done as flyouts), but the root level checkbox will remain.

The system will then insert the tags onto the page and mark the article as patrolled.

Attempting to nominate an article for deletion without selecting one or more tags will result in an error.


 * The system will automatically inform the creator of the article that the article has been nominated for deletion. It will do this by leaving a note on the user's talk page, which, for most editors who have enabled email, will also result in an email being sent.

Marking an Article as Patrolled
Marking an article as patrolled will do just that. Selecting additional tags for improvement is not required: some articles are fine just the way they are when they enter the system.

If a tag is selected, the proper template will be inserted upon clicking the "Mark and Next" button.
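
The two "Mark and Next" actions described above might be sketched as follows. Everything here is illustrative: the `Article` record, the function names, the tag strings, and the `notify` callback (standing in for leaving a talk page note) are assumptions, not part of any existing API.

```python
from dataclasses import dataclass, field


class NoDeletionTagsError(ValueError):
    """Raised when a deletion nomination is submitted with no tags selected."""


@dataclass
class Article:
    """Hypothetical minimal view of an article in the Zoom interface."""
    title: str
    creator: str
    tags: list = field(default_factory=list)
    patrolled: bool = False


def nominate_for_deletion(article, deletion_tags, notify):
    """Insert deletion tags, mark the article patrolled, and notify the creator."""
    if not deletion_tags:
        raise NoDeletionTagsError("select at least one deletion tag")
    article.tags.extend(deletion_tags)
    article.patrolled = True
    # notify(username, message) stands in for leaving a talk page note.
    notify(article.creator, f"'{article.title}' has been nominated for deletion.")


def mark_patrolled(article, improvement_tags=()):
    """Mark the article patrolled; improvement tags are optional."""
    article.tags.extend(improvement_tags)
    article.patrolled = True


notes = []
draft = Article("Example page", "ExampleUser")
nominate_for_deletion(draft, ["example-deletion-tag"], lambda u, m: notes.append((u, m)))
print(draft.patrolled, notes[0][0])  # True ExampleUser
```

Raising an error before any state changes keeps a rejected nomination from leaving the article half-tagged.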

Future Phases and Thoughts
I think we should aim for the following goals:


 * New users would be gradually taught to patrol correctly and could work with what they feel comfortable with, eventually graduating up to areas of additional difficulty
 * Includes automated systems to aid in patrolling
 * Includes a more crowd-sourced, moderation-queue like process
 * This will increase work-load overall, but probably decrease it per-user
 * Has multiple flags other than simply "patrolled" vs. "not patrolled"
 * Allows for the re-viewing and flagging of an article in situ
 * Could easily be used on tablets and mobiles
 * Gesture support would be awesome
 * New article reports for the 700 or so WikiProjects would attract more experienced editors who are interested in the various subject areas. If these reports were, in effect, special New Pages lists filtered by WikiProject, they would bring extra patrolling to the middle of the queue. See Meta:Research_talk:Patroller_work_load

For initial phases of the implementation, the tool can work within the existing template/tag system by automatically adding templates to the article.

Currently:


 * There is no way to tag an article for improvement without also marking it as patrolled. This needs to be added because there will be articles that a patroller is sure are unreferenced but is not sure whether to mark as patrolled.