Upload dialog/Design/Bad uploads

There have been several bad uploads using this tool, most of them copyright violations. An A/B test was conducted to find a better way to educate users about Commons, but the results are not in yet. More details are available at Multimedia/December 2015 cross-wiki upload A/B test.

Possible reasons for bad uploads

 * Not understanding what kind of files are helpful for Commons and other projects (selfies are useful for social networks, for example).
 * Not understanding the meaning of Own work.
 * Not understanding the basics of Copyright law.
 * Malice.
 * Disregard for legal notices (they aren't enforced as often on other websites).
 * The context of the upload (that is, the page it is being uploaded from) is forgotten or ignored.
 * Just testing a new feature, unaware of side effects. Since the user is in an in-progress edit (until saving), it may not be obvious that the image becomes public as soon as it is added. (Do we have numbers on how many of the bad uploads were never saved on the page?)
 * Repeated behaviour. The user uploaded a bad image before without noticing it, and now does it again.

Possible directions to help with each issue
Not understanding what kind of files are helpful for Commons and other projects:
 * Provide more specific examples of the kind of content that is helpful for Commons.
 * Focus on a clear subset of the valid content (even if some valid content is left out) and encourage the user to be conservative (e.g., "in case of doubt, don't upload it.").
 * Provide counter-examples of content that is not welcome (based on frequent bad uploads).
 * Anticipate part of the classification stage by asking about the kind of content. The answer is used to provide specific guidelines (e.g., discourage uploading selfies if the user indicated the content depicted is a person) and to find relevant categories (in the next step).

Not understanding the meaning of Own work:
 * Use more familiar terms: "pictures you took, media you created".

Not understanding the basics of Copyright law:
 * Focus on the authorship aspect (which may be easier to understand).

Malice:
 * Convey that the effort is pointless (e.g., the file will be deleted quickly).
 * Surface the negative impact of the actions (e.g., wasting contributors' time, harming an educational project).

Disregard for legal notices:
 * Provide succinct information users can understand with no effort.
 * Require explicit interaction (with caution to avoid adding too much friction).

The context of the upload:
 * Mention the page the file is being added to.

Just testing a new feature:
 * Discard files that were added but are no longer present when the page is finally saved (or mark them to make them easier for the community to track).

Repeated behaviour:
 * Provide a stronger message for users who recently had files deleted.
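
The idea of tracking files that were uploaded during an edit but dropped before saving amounts to a set difference between the session's uploads and the files used in the saved page. The following is a minimal sketch under stated assumptions, not the upload dialog's actual implementation: the function name `find_orphaned_uploads` and the premise that the dialog keeps a list of file names uploaded during the edit session are hypothetical. The regular expression only recognises plain `[[File:...]]` and `[[Image:...]]` links, so files embedded through galleries or templates would be missed.

```python
import re

# Simplified pattern for [[File:Name.jpg|...]] and [[Image:Name.jpg]] links.
FILE_LINK = re.compile(r"\[\[\s*(?:File|Image)\s*:\s*([^|\]]+)", re.IGNORECASE)


def normalize(name: str) -> str:
    """Treat underscores and spaces alike and uppercase the first letter."""
    name = name.replace("_", " ").strip()
    return name[:1].upper() + name[1:]


def find_orphaned_uploads(session_uploads, saved_wikitext):
    """Return uploads from this edit session that the saved page does not use."""
    used = {normalize(m) for m in FILE_LINK.findall(saved_wikitext)}
    return sorted(n for n in map(normalize, session_uploads) if n not in used)


# Hypothetical session: two files uploaded, only one kept in the saved text.
uploads = ["Album_cover.jpg", "town hall.jpg"]
text = "Intro text. [[File:Town hall.jpg|thumb|The town hall]]"
orphans = find_orphaned_uploads(uploads, text)
print(orphans)
```

Whether the resulting files are discarded or only flagged for patrollers is a policy choice that the sketch leaves open.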

Example scenarios
Some scenarios based on the usual personas:
 * Good will, bad image. Henry finds out that the Wikipedia page for his favourite music album lacks a picture of the cover. He decides to help by scanning the cover of the album he owns and adding it to the article. Henry does not understand the license or authorship issues, and does not spend much time reading all the instructions for something as simple as adding an image.
 * Vandalism. Jack saw a politician make an inappropriate comment on TV and decides to add a funny picture to the politician's page on Wikipedia as a kind of personal revenge. He wants to share this with his like-minded friends. Jack is not aware of the impact of his actions on the community.

Proposed solutions
There are several possible directions we can explore based on the above issues and ideas:

Confirm contextual information

Connecting the information to the specific file the user is uploading puts it in context (e.g., "you created this album cover?"). Asking the user to confirm this information as part of the upload action encourages them to give it some thought. The example below also includes some additional information along the lines of "anticipate the consequences of bad uploads", to try to discourage vandalism.



Better educate users

Providing specific examples of good and bad uploads can help users understand which files are welcome. To avoid presenting too much at once, the information can be distributed across two steps.



Anticipate consequences of bad uploads

Surfacing the impact of bad uploads can discourage vandals and make other users more conservative about what they upload (only what they are completely sure is OK).



Explicit choice

Asking users to explicitly select a choice can help prevent them from going on autopilot. Presenting the choice as the action that moves to the next stage can also avoid adding too much friction.



Another possibility for selection is to ask users to do a quick classification. That provides an opportunity to give more specific indications about the kind of content that is useful. In addition, it can help select categories in the next step, so that this initial selection is not just additional friction.
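
One way to picture the quick classification is as a lookup from the selected kind of content to a tailored hint plus category suggestions for the next step. This is only an illustrative sketch: the content kinds, the hint wording, and the category names below are hypothetical placeholders, and `guidance_for` is not an existing function in the upload dialog.

```python
from dataclasses import dataclass, field


@dataclass
class Guidance:
    hint: str                                  # message shown to the uploader
    suggested_categories: list = field(default_factory=list)


# Hypothetical kinds and wording; real guidance would need community review.
GUIDANCE = {
    "person": Guidance(
        "Personal photos and selfies are usually not useful here.",
        ["People"],
    ),
    "place": Guidance("Photos you took of places are welcome.", ["Places"]),
    "artwork": Guidance(
        "Did you create this work yourself? Covers and logos usually belong to someone else.",
        ["Art"],
    ),
}

# Fallback for kinds without specific guidance: the conservative default.
DEFAULT = Guidance("In case of doubt, don't upload it.")


def guidance_for(kind: str) -> Guidance:
    """Return the hint and category suggestions for the selected kind."""
    return GUIDANCE.get(kind, DEFAULT)


selected = guidance_for("person")
print(selected.hint)
print(selected.suggested_categories)
```

Because the same selection feeds the category step, the question does useful work for the uploader instead of acting only as a gate.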



More approaches

Two types of solutions are being proposed: one that improves education, and another where some kind of data is collected, either discreetly or through a form, and then used to help editors while they are patrolling.
 * https://phabricator.wikimedia.org/T120867#1914007
 * https://phabricator.wikimedia.org/T120867#1903246
 * https://phabricator.wikimedia.org/T120867#1887088
 * https://phabricator.wikimedia.org/T120867#1868598
 * https://phabricator.wikimedia.org/T120867#1884287