Article Creation Workflow/Design

'''This document is a work in progress. Comments are appreciated but this is not a final draft.'''



This document describes the behavior of a modified workflow for page creation on Wikipedia. Feedback is welcome on the talk page.

This project is envisioned with multiple phases. By necessity of resourcing, the first phase will focus primarily on messaging rather than software-enforceable solutions. See the bottom of this document for future phase thinking.

Rationale

 * Page creation in Wikipedia suffers from a very low signal-to-noise ratio. New Page Patrollers report feeling overwhelmed and overworked, having to delete far too many pages due to the low quality of the new pages themselves (often poorly sourced, not-notable, spam, or copyright violations).
 * Users who wish to create new pages are usually confused. The process of creating a new page is poorly understood, poorly explained, and more often than not is user-hostile.
 * Lack of understanding of Wikipedia guidelines contributes to new editors being bitten as their articles are deleted.
 * Wikipedia guidelines tend to accumulate massive instruction creep, which has the predictable effect of people skipping over essential information.
 * The average time it takes an article to be speedily deleted is just two minutes.
 * This is not enough time for new users to make edits to correct a page.
 * There is no "safe harbor" except to create a draft in the User space.
 * Most new users are not aware that the User namespace exists, and are thus unable to create userspace drafts.
 * Unregistered users (AKA IPs or anons) are often extremely confused and "put off" by the fact that they are not allowed to create pages. There is currently no message that tells them how to create a page by registering.

See for some additional evidence regarding these issues.

Hypotheses

 * A more user-friendly system for page creation, including more usable messaging, will promote a higher signal-to-noise ratio within page creation, and reduce the workload of New Page Patrollers.
 * A more positive experience for first-time page creators will engender better long-term survivability among new editors, promoting retention by reducing negative experiences.

Existing Flows
Today, registered users create articles in one of the following ways:
 * 1) Search: They search for an article that doesn't exist.  From the search results page, they are presented with a redlink to create the page.  From this redlink, the user goes directly to the edit page without any introduction to Wikipedia guidelines for new articles.  Once they are on the edit page, they are presented with a dense list of bullet points, only one of which (currently) mentions guidelines for content.
 * 2) Redlink: They follow a redlink, which leads them directly to the edit page where they can create the article.  Again, they have no introduction to the Wikipedia guidelines for new articles.
 * 3) Direct url:  They enter the url of a page that does not exist.  In this case, the user is presented with a screen that provides a minimal level of direction.

In two out of the three cases, users directly enter the article creation page (i.e., edit page) without any introduction to Wikipedia guidelines. Given this flow, it is not surprising that well-intentioned newbies have their articles deleted. The existing flow does not properly educate these new users.

Proposed Flows
In the proposed flow, there are the following modifications:
 * 1) Landing page:  Instead of taking the user directly to the create/edit page, the user is taken to a landing page where they are clearly told what is going on (i.e., they are trying to locate a page that does not yet exist).  They are then given a set of options to create the page, along with bubbles that help them choose which option to take.
 * 2) One page article creation guidelines:  If a user decides to create the page themselves, they are presented with a page that informs them of the most important guidelines when creating an article.  The new user may proceed only after viewing this content (this page will be dismissible).

By clarifying the flow and introducing educational content, well-intentioned editors will presumably be better informed by the time they create the page. These additional steps will hopefully raise the signal-to-noise ratio of article creation.

As mentioned previously, the flow described above represents only the first step in improving the overall article creation process. Additional features are listed below.

Feature Requirements
(Note: all the copy, even in quotations, is placeholder)


 * Landing Page
 * All article creation flows (i.e., search, redlink, direct url) will send the user to the Landing Page
 * Landing page will have the following options:
 * Create the article myself (self create). Takes user to Educational Content page (see below)
 * Follow a step-by-step process (use the article wizard). Takes user to Article Wizard.  The Article Wizard will need to be rewritten, but in a later phase
 * Leave the area (I don't want to create an article)


 * One page article creation guidelines
 * If user selects "Self-create", they will be taken to an interstitial page that contains the basic content guidelines for creating a Wikipedia article. The goal of this page is to give the user an understanding of the basic guidelines for creating an article so that their article has a higher chance of surviving.  This page will have two options:
 * 1) I understand, continue
 * 2) Go back (goes back to the Landing Page)


 * Edit page
 * The top portion of the edit page will be changed to offer clearer and more concise information about new article guidelines. Non-essential links will be either removed or placed on the Landing Page.


 * Search Results
 * The Landing Page will replace the search results page when a matching term is not found (To be confirmed)


 * Direct url results
 * The current direct url page (page user gets when they enter the url for a page that doesn't exist, e.g., http://en.wikipedia.org/wiki/Fdasfdasfdsfdsafa) will be replaced by the Landing Page


 * Anonymous users
 * When an anonymous user performs any of the actions which would have triggered the Landing Page for a registered user (e.g., following a redlink), they will get a page letting them know that an account is required to perform the requested task, with links to either create an account or log in.
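
The routing rules in this section could be sketched roughly as follows. This is an illustrative Python sketch only, not MediaWiki code; the names `EntryPoint`, `destination`, and `LANDING_OPTIONS`, as well as the string labels for each page, are assumptions made for the example.

```python
from enum import Enum


class EntryPoint(Enum):
    """The three ways a user can reach a title that does not exist."""
    SEARCH = "search"
    REDLINK = "redlink"
    DIRECT_URL = "direct_url"


# Hypothetical labels for where each Landing Page option leads.
LANDING_OPTIONS = {
    "self_create": "educational_content",
    "article_wizard": "article_wizard",
    "leave": "previous_page",
}


def destination(entry_point: EntryPoint, is_registered: bool) -> str:
    """Return the page shown when a user reaches a non-existent title."""
    if not isinstance(entry_point, EntryPoint):
        raise ValueError(f"unknown entry point: {entry_point!r}")
    # Anonymous users are told that an account is required.
    if not is_registered:
        return "account_required"
    # Every entry point leads registered users to the same Landing Page.
    return "landing_page"
```

The entry point does not change the outcome in this sketch; it is kept as a parameter because the Analytics section proposes counting sessions per entry point.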

User Experience
Text on the landing page will be as plain and minimal as possible so as to prevent "Wall of Text Overshock" (which is one of many reasons why bad articles are created: users ignore the text blocks because they are opaque).

The user will be presented with a friendly message indicating that the article does not exist and will then be given three very obvious options:


 * Create the article myself (self create)
 * Follow a step-by-step process (use the article wizard)
 * Leave the area (I don't want to create an article)

Language for the three options is to be user-centric and actionable ("I want to...").

If the user clicks on the "Leave the area" button, they will be returned to the previous screen or the Wikipedia Main Page.

Otherwise, clicking on either of the "actionable" buttons will cause an interstitial display to open to the side, with an indicator directly pointing at the button clicked.

The clicked button shall change state (indicating its activity). The remaining buttons will dim, indicating inactivity. Mousing over the inactive buttons will cause them to brighten (indicating that they are still actionable).

The interstitial display shall contain additional information for the user about the action they are choosing as well as additional controls, depending upon context.

For the "Create the article myself" text, a series of short bullet points are to be included that will help the user understand how and why their article may be deleted.

For the "Use the Article Wizard" text, the text shall help to establish the process that the user will undergo.

Both options will have a "Continue" button. Clicking on the "Continue" button will bring the user directly to the chosen action (either the beginning steps of the Article Wizard or the Editor pane).

Additionally, in the "Create the article myself" display, a checkbox shall be included: "Skip this step in the future". Checking this shall save a preference for the user so that they will not see the interstitial display in the future (i.e., clicking on the "Create the article myself" button will take them directly to the editor).
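
One possible reading of the button and "Skip this step in the future" behavior is sketched below in Python. The names `UserPrefs`, `on_option_clicked`, and `on_continue` and the screen labels are hypothetical; the sketch only illustrates the state transitions described above and is not a proposed implementation.

```python
from dataclasses import dataclass


@dataclass
class UserPrefs:
    # Hypothetical per-user preference set by the "Skip this step" checkbox.
    skip_create_interstitial: bool = False


def on_option_clicked(choice: str, prefs: UserPrefs) -> str:
    """Return the screen shown after clicking a Landing Page button."""
    if choice == "leave":
        return "previous_page"
    if choice == "self_create" and prefs.skip_create_interstitial:
        # The user opted out earlier, so go straight to the editor.
        return "editor"
    return "interstitial"


def on_continue(choice: str, prefs: UserPrefs, skip_checked: bool = False) -> str:
    """Return the screen shown after clicking "Continue" in the interstitial."""
    if choice == "self_create":
        if skip_checked:
            prefs.skip_create_interstitial = True
        return "editor"
    return "article_wizard"
```

In this reading the preference only affects the "Create the article myself" path; the Article Wizard interstitial is always shown.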

Analytics
The following is a list of data we would like to track, subject to our privacy policy.

Workflow evaluation:
 * Number of sessions per day that enter the new workflow (i.e., view the landing page) through:
 * Redlinks
 * Search
 * Direct entry through url
 * Web-analysis of workflow (fallout):
 * Number of sessions-views of the landing page per day
 * Number of users who click on Create
 * Number of users who click on Article Creation Wizard
 * Number of users who leave page
 * Number of users who click on the other text links (AFC, search etc.)
 * Per day, the number of the following generated through this workflow:
 * Number of new articles created directly
 * Number of these articles that are not deleted after [x time]
 * Number of new articles created through the Article Creation Wizard
 * Number of these articles that are not deleted after [x time]
 * Number of AFC requests
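
As a rough illustration of the fallout analysis, the sketch below counts Landing Page views and the clicks that follow them. The event format (a list of `(session_id, action)` pairs) and the action names are assumptions; the actual clicktracking data may look different.

```python
from collections import Counter

# Hypothetical action names for the clicks listed above.
FUNNEL_ACTIONS = ("create", "wizard", "leave", "other_link")


def funnel_report(events):
    """Summarize fallout from the Landing Page.

    events: iterable of (session_id, action) pairs for one day.
    Returns, per action, its count and its share of Landing Page views.
    """
    counts = Counter(action for _session, action in events)
    views = counts["view_landing"]
    report = {"view_landing": {"count": views, "share": 1.0 if views else 0.0}}
    for action in FUNNEL_ACTIONS:
        # Avoid dividing by zero on days without any Landing Page views.
        share = counts[action] / views if views else 0.0
        report[action] = {"count": counts[action], "share": share}
    return report
```

For example, four views followed by one "create" click would give "create" a share of 0.25.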

Additional analysis:
 * Effect of article deletion on retention rate

Workspace Editing
The next phase of improving the article creation workflow will involve creating a "Workspace" for new articles which would serve as a sort of "safe harbor" for articles. This will be a specialized namespace where new articles are:


 * Not indexed or visible via search engines, unless the user is explicitly searching within the namespace on Wikipedia
 * Possibly subject to automatic deletion after certain criteria are met (such as 60 days without being edited)
 * Open to multiple editors, who are encouraged to work on the article together (as opposed to userspace drafts)

When creating a new article, the editor will be modified in such a way as to not impede existing user workflows while providing new users a workflow that will allow them to make mistakes, add sources, fix notability issues, and other problems without falling prey to the "two minute speedy deletion" process.

Users will be given the choice to save the article in the Workspace or to Publish immediately to the Main namespace.

Workspace articles will be able to utilize templates (and categories) from the Main namespace, but they will not appear in those categories (outside of the Workspace).

Articles that already exist in the Workspace may be published to the Main namespace at any time. Doing so will "move" the article from the Workspace to the Main namespace. Article history will be retained.

If an article is saved to the Workspace, it will appear in a "Workspace Articles" list that will be searchable by anyone, as well as a "My Workspace" list.

It will not be possible to create two articles of the same name within both the Workspace and the Main space. Only one can exist at a time. Similarly, the community will have the ability to "salt" particular titles in the Workspace which it knows to be targets of abuse or very unlikely to become viable articles.

Articles may be unpublished from the Main namespace. In this event, the article will be moved back to the Workspace.
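
The Workspace rules above (one title across both namespaces, publishing as a move that retains history, unpublishing, and salting) can be sketched as a small in-memory model. The `Wiki` class, its method names, and the `TitleUnavailable` exception are hypothetical and exist only to make the rules concrete.

```python
class TitleUnavailable(Exception):
    """Raised when a title already exists or is salted."""


class Wiki:
    def __init__(self):
        self.pages = {}      # title -> {"namespace": str, "history": list}
        self.salted = set()  # titles blocked from the Workspace

    def create(self, title, text, publish=False):
        # A title may exist in the Workspace or in Main, never in both.
        if title in self.pages:
            raise TitleUnavailable(f"{title!r} already exists")
        if not publish and title in self.salted:
            raise TitleUnavailable(f"{title!r} is salted in the Workspace")
        namespace = "Main" if publish else "Workspace"
        self.pages[title] = {"namespace": namespace, "history": [text]}

    def publish(self, title):
        # Publishing moves the page; its history is untouched.
        self.pages[title]["namespace"] = "Main"

    def unpublish(self, title):
        # Unpublishing moves the page back to the Workspace.
        self.pages[title]["namespace"] = "Workspace"

    def workspace_articles(self):
        return sorted(
            title for title, page in self.pages.items()
            if page["namespace"] == "Workspace"
        )
```

Keying pages by title alone, with the namespace stored as an attribute, is what enforces the "only one can exist at a time" rule in this sketch.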

Workspace Tagging
The third phase of this project involves adding the concept of "tagging" to articles within the Workspace. Experienced editors will be able to review articles within the Workspace and then apply one or more tags to the article (such as "Needs photos" or "Needs better sources").

This will help new users by providing them with a list of issues to work from. When an issue has been addressed, the editor marks the tag as having been addressed, which then places it back into the review queue.

Tags are metadata applied on the object (the Page) and not templates within the article. Because of this, they can be searched and intersected in a manner such as:


 * Show me all Workspace articles that Need Photos and do NOT Require Sources
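
Because tags are metadata rather than templates, a query like the one above reduces to set operations. The following Python sketch assumes tags are stored as a set of strings per article title; the function name `find_articles` and the tag spellings are illustrative only.

```python
def find_articles(tags_by_title, include=(), exclude=()):
    """Return titles that carry every tag in include and none in exclude."""
    wanted, unwanted = set(include), set(exclude)
    return sorted(
        title
        for title, tags in tags_by_title.items()
        if wanted <= tags and not (unwanted & tags)
    )


# Hypothetical Workspace articles and their tags.
workspace_tags = {
    "Article A": {"Needs photos"},
    "Article B": {"Needs photos", "Requires sources"},
    "Article C": {"Requires sources"},
}

# "Show me all Workspace articles that Need Photos and do NOT Require Sources"
matches = find_articles(
    workspace_tags, include=["Needs photos"], exclude=["Requires sources"]
)
```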

Automated Checks
Another set of features to be considered is the automatic checking of articles being created. For example, we could perform a simple check for references to make sure that any article created has at least one reference. If a user tries to create an article without any sources, they will receive a message letting them know that sources are required.

These automated checks could also be handled in a manner similar to AbuseFilter rules and be written by the community itself. For example, a rule to prevent abuse of a biography of a living person could be constructed thus:


 * IF the article contains the "Living Persons Infobox"
 * AND there are no references
 * DO NOT allow publication

or, to combat spam:


 * IF the article contains greater than X external links
 * AND the article is less than Y kilobytes of non-links,
 * DO NOT allow publication
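
The two example rules might look roughly like this if expressed in Python rather than in AbuseFilter syntax. Everything here is an assumption for illustration: the regular expressions are a crude stand-in for real wikitext parsing, the template is matched by the literal name "Living Persons Infobox", and the thresholds standing in for X and Y are arbitrary placeholders.

```python
import re

# Crude detection of <ref> tags and bare external links (assumption:
# real checks would need proper wikitext parsing).
REF_PATTERN = re.compile(r"<ref[\s>]", re.IGNORECASE)
LINK_PATTERN = re.compile(r"https?://\S+")


def has_references(text):
    return bool(REF_PATTERN.search(text))


def blp_without_references(text):
    """IF the Living Persons Infobox is present AND there are no references."""
    return "{{living persons infobox" in text.lower() and not has_references(text)


def link_heavy_stub(text, max_links=5, min_kilobytes=1):
    """IF more than X external links AND fewer than Y kilobytes of non-links.

    max_links and min_kilobytes are placeholder values for X and Y.
    """
    links = LINK_PATTERN.findall(text)
    non_link_bytes = len(LINK_PATTERN.sub("", text).encode("utf-8"))
    return len(links) > max_links and non_link_bytes < min_kilobytes * 1024


RULES = [
    ("blp_without_references", blp_without_references),
    ("link_heavy_stub", link_heavy_stub),
]


def blocking_rules(text):
    """Return the names of all rules that would block publication."""
    return [name for name, rule in RULES if rule(text)]
```

An empty result from `blocking_rules` would mean the article may be published; any returned name could be mapped to a message explaining what the author needs to fix.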

MoodBar comments
MoodBar is a simple tool that allows new editors to report confusion or frustration. Among the themes emerging from more than 2,500 comments by new editors, confusion about new page creation is one of the most common. Pending a quantitative evaluation of the comments, the following selection illustrates the theme:


 * dont know how to create a page
 * For beginners it is difficult to understand what must be done to bring up a page to create an article.
 * so difficult to make a page
 * I'm not sure how to write a whole new article on an artist. Where's the wizard??
 * Trying to figure out how to create an article is VERY confusing, not at all intuitive
 * I dont understand how I could write an article. The Sandbox feature is not user friendly.
 * because all i want to do i create a page but i cannot see anything that says that
 * i am not able to add a new page. how to add?? please send me the link.
 * I have no idea how to create a biography
 * I would like to publish an article and cannot find the way to do it properly from my account.
 * I dont know how to create my article in wikipedia !!! i am so confused....

New users are also confused by the instructions, often in ways that could be resolved by some human help:


 * Respectfully, I didn't quite understand what was meant by 'reliable 3rd party sources' disregarding external links
 * It is very complicated, and the guidelines are confusing.
 * It's my first page - forgive me. I understand that I need more verifiable sources, but I'm not sure exactly what needs more verification.
 * Ridiculous to say the least. Too many procedures and policies. Ready to move on to new system/site that is more user friendly

Increase in templated warnings toward new users
We have plenty of evidence that new users, both during new page creation and in general, are receiving an ever-increasing number of automated or semi-automated negative messages on the English Wikipedia.



This indicates severe problems with the new user editing experience, and provides an incentive for exploring "safe spaces" where new editors can learn and collaborate without an expectation of compliance with the myriad guidelines and policies Wikipedia requires them to learn.

Speedy deletion
The top two reasons articles are speedily-deleted on English Wikipedia are:


 * 1) Notability (~37%)
 * 2) Unambiguous advertising or promotion (~8%)

That trend is one we suspect holds true for other Wikipedias as well. Since attack pages, pure vandalism, and nonsense do not account for a majority of deletions, this data suggests that effectively educating potential authors about what kinds of articles we want in the encyclopedia could help stem the tide at New Page Patrol and in the fall-off of new editors.

Data Requirements
Per Dario:

Focus on case 1: Search & Create

A/B test hypothesis: the new NPC funnel results in a higher proportion of pages created that survive speedy-deletion

Feature data:
 * npc_page_title (title of page to be created)
 * npc_accept_timestamp (timestamp of click on link to edit screen)
 * npc_complete_timestamp (timestamp of successful creation of new page, NULL otherwise)
 * npc_page_id (page_id of new page if page is created, NULL otherwise)
 * npc_rev_id (revision id of the first edit if page is created, NULL otherwise)
 * npc_rev_len (length of first revision of new page if page is created, NULL otherwise)
 * npc_user_id (user ID)
 * npc_user_contribs (lifetime edits of page creator)
 * npc_user_bucket

Clicktracking data:
 * npc_create_pitch_0
 * npc_create_pitch_1
 * npc_accept_0
 * npc_accept_1
 * npc_complete_0
 * npc_complete_1
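
To make the A/B hypothesis concrete, the sketch below models one row of the feature data and computes, per bucket, the proportion of created pages that were not deleted. The field names come from the list above; everything else is an assumption, including the reading of `npc_user_bucket` (and of the `_0`/`_1` suffixes) as control versus new funnel, the use of `None` for NULL, and the `deleted_page_ids` input, which would have to come from deletion data that this document does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NpcRecord:
    npc_page_title: str
    npc_accept_timestamp: str
    npc_user_id: int
    npc_user_contribs: int
    npc_user_bucket: int  # assumed: 0 = existing flow, 1 = new funnel
    npc_complete_timestamp: Optional[str] = None
    npc_page_id: Optional[int] = None
    npc_rev_id: Optional[int] = None
    npc_rev_len: Optional[int] = None


def survival_rate(records, deleted_page_ids, bucket):
    """Share of pages created in a bucket that were not deleted.

    Returns None when the bucket has no created pages.
    """
    created = [
        r for r in records
        if r.npc_user_bucket == bucket and r.npc_page_id is not None
    ]
    if not created:
        return None
    survived = sum(1 for r in created if r.npc_page_id not in deleted_page_ids)
    return survived / len(created)
```

Comparing `survival_rate(records, deleted, 0)` with `survival_rate(records, deleted, 1)` would give the two proportions the hypothesis refers to; whether the difference is meaningful would still need a proper statistical test.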

To Do

 * Screen mockups for
 * Landing page for anonymous users
 * Landing page for search results
 * Better example for the UX for mouse hovers and selections
 * Screen shot of Article Wizard interstitial