User:Jmorgan (WMF)/Commons Upload Wizard user study


  1. To better understand the motivations and experiences of skilled photographers uploading photos to Wikimedia Commons.
  2. To better understand the experience of using Upload Wizard on a tablet computer.


Attracting more skilled photographers to our movement could have a major positive impact on content quality. However, skilled photographers are an important class of "power" content contributors who are neither identified as such nor specifically supported by any current software product.

Skilled photographers often process their photos with specialized photo editing software before uploading them to social media sites. Over the past few years, tablet computer screens and processors have improved significantly, and there are many photo editing apps available for iOS, Android, and Windows tablets. Some photographers now use tablets rather than laptop or desktop PCs to edit their photos. Many photo sharing services, like VSCO, Flickr, and Instagram, offer apps that provide a tablet-optimized experience for photographers as they browse, upload, and organize photos.

Wikimedia currently lacks a dedicated, maintained Commons upload app, so it is useful to evaluate the feasibility of uploading images from a tablet via the vanilla website using the Upload Wizard extension. This gives us a baseline against which to compare any future uploading interfaces.


Evaluate use of the images upload wizard starting at the Commons main page and finishing with a user identifying the jump off point to use their uploaded image in an article.


I worked with a skilled photographer participant on the upload of a photo she had taken and then edited in her preferred editing application (VSCO), from her tablet (Apple iPad Mini 3). The participant had frequently uploaded images to Flickr, Instagram, and Facebook from her tablet. Photography was not the participant's primary means of financial support, although she had been paid for photography in the past. She mostly took and shared photographs for personal fulfillment. The participant had previously released photos under Creative Commons licenses, primarily on Flickr.

The participant has a Wikipedia account and has made several edits to English Wikipedia under that account in the past, but is not a regular contributor. She has uploaded one photograph to Wikimedia Commons before, with the assistance of an experienced Wikipedia editor.


The setting for our study was a small meeting room at the company for which the participant worked, during her lunch hour. For the study, the participant was logged in under her Wikimedia Commons account. The participant used Safari, the default browser for her tablet. Her tablet displays the Mobile MediaWiki stylesheet by default, and that stylesheet does not display an upload option. So, for simplicity's sake, we started the study after the participant had manually switched to the Desktop stylesheet.

She had agreed ahead of time to donate her photo to the Commons, and understood the conditions of the CC-BY-SA license. The study took approximately 20 minutes. The investigator typed notes during the task, and the session was also audio-recorded, with the participant's permission.


The participant was instructed to think aloud during the study, and the investigator prompted her to resume speaking if she lapsed into silence for more than 4-5 seconds. He also asked her questions about her expectations, goals, and impressions at various points throughout the task, for instance if she appeared to exhibit surprise, frustration, or confusion.

Before the participant began the task, the investigator read her the simple task instructions (see "Instructions" below), and then asked her several questions about her background with photography, sharing photos on social media, and with Wikimedia websites (see "Pre-task questions" below). After the participant concluded her task, the investigator asked her several open-ended questions about her experience (see "Post-task questions" below).

Subsequent to the study session, the investigator conducted a personal heuristic walkthrough to understand the image upload process in greater detail and to further contextualize the notes he had taken during the study session.


Instructions

"This is a test of an interface to help people upload images to Wikipedia. Imagine you've decided that, in addition to reading Wikipedia, you want to help build it. Your goal will be to upload a photo that you have taken and edited on your iPad to Wikimedia Commons. Remember, we're testing the interface, not you. If you're having difficulty with something, the problem is with our design. Please think out loud as much as possible; tell us your thought process during each task, and try to explain your choices."

Pre-task questions

  1. Have you edited Wikipedia, even once?
  2. Have you uploaded images to Wikimedia Commons? If so, why?
  3. Have you uploaded images to other websites? If so, which websites, and why?

Post-task questions

  1. What frustrated you the most about this task?
  2. What surprised you the most about this task?
  3. What did you enjoy about the process, if anything?
  4. What improvements could have made this process easier or more enjoyable?
  5. Is there anything else you would like to share about your experience today?


The participant was able to complete the task as specified ("evaluate use of the images upload wizard starting at the Commons main page and finishing with a user identifying the jump off point to use their uploaded image in an article"). She remarked that it was not substantially more difficult than using the upload wizard on a laptop PC (a task she had performed several months previously under the guidance of an experienced Wikipedian).

Below I present the results of this one-person user study, broken down into things that worked well, things that didn't, significant one-off observations, and my own notes from performing the task a second time after the study.

What worked well

Basic form interactions. Although the desktop MediaWiki skin is not optimized for tablets (more on that below), all the basic controls for selecting a file to upload work well. When the user taps "Select media files to share" at the beginning of the workflow, a small modal dialog box pops up in place and offers a view into the user's photo library, organized hierarchically by album. Images are represented by thumbnails, not titles, and tapping a thumbnail selects the image for upload. Filling in fields by tapping on the empty textboxes and selecting dates from a date-picker also work well, though targets were sometimes too small and AJAX-y form validation doesn't always behave as expected. For example, in some cases beginning to type a category name prompted a list of similarly titled categories to pop up under the text field, but this list seemed more finicky and liable to disappear due to "fat fingers" than on desktop.

Tracking progress towards completion. The user several times noted that she appreciated and understood the progress bar, which shows the "steps" in the upload process. However, I personally found its light blue color overshadowed by the bright yellow "Help" notice directly below it, and step 1 doesn't exist! (see heuristic evaluation notes).

What did not work well

Copying formatted links to the image. This was the only task that the participant could not complete on her tablet. At the end of step 5, the participant is prompted to copy the contents of a textbox that contains either a) formatted wikimarkup for a captioned thumbnail image, suitable for placement in a Wikipedia article, or b) a bare hyperlink to the image file page. However, on an iPad it is virtually impossible to copy either string of text: you end up selecting either the entire DIV around the textbox or only a few characters within it. Arguably, copying the text was not part of this study, since it occurs outside of the image upload workflow, but I'm including it because success in this simple task is nearly essential for new uploaders to succeed in the natural next step: using their uploaded image in an article.

Identifying salient calls to action. This was probably the single most difficult class of problem the participant had: at various points in the workflow, the next step that came most "naturally" to the participant was not immediately visible, usually because it was a bare blue hyperlink overshadowed by arguably-less-relevant UI buttons or brightly colored banners.

  • For example, at step 5 the participant wanted to see the image she had uploaded, along with the metadata she had entered about that image, but the most salient calls to action were "Upload another image" or "Go to the main page". The investigator eventually pointed out to her that by clicking the image filename, she would be able to see the image and metadata.
  • Another example: the participant wanted to add a second category, but the blue "add another category" hyperlink was not immediately obvious to her, and she ended up trying to comma-separate her categories first.

Identifying salient notifications. Similar to the bullet above: there were several points in the workflow where the system had served up an interface message to the user in response to some input, but these messages were not very visible. For example, when the user added her first category title, which did not match any existing categories, the words "this category is not in use yet" appeared to the right of the text box. But they appeared in light grey, italic text, so the message looked like basic interface text, and was initially ignored.

(De)activating targets. Although the explicit UI buttons were a good size, trying to tap the many small targets presented to the user (like "?" tooltips and single-line form fields) was a struggle on the iPad. Deactivating the tooltips was even more difficult, since the buttons are stateless and often partially occluded by the small dialog boxes they generate.

Explaining attribution and licenses. Although the user was familiar with the standard CC-BY-SA Commons license and was uploading her own work, she noted that she was not sure what other licenses were available, why one would choose one of those over the default, or how one was supposed to license work that was not one's own. She was interested in learning more, but she was hesitant to explore these options (see "Anything that involved clicking away" below).

Surfacing information about categories. The user did not know what categories to use for her image. She eventually decided to add the category "New York non-profits". About 30 seconds after she entered that category name, she noticed the "No category exists with this title" message to the right of the added category title, and was dismayed but resolved to deal with it later. She noted that she was "not sure what the best practice [was]" when categorizing, and whether it was good or bad to create new categories. Again, she seemed to be interested in more information but was hesitant to investigate, either because it was tedious, or because she didn't want to risk losing her work.

Adding multiple categories. After adding her first category, the participant tried to add a second category after the first one, in the same text box, separated by a comma, as you would add tags to content on other websites. She then noticed (again, almost a minute later) the blue link "Add another category" and clicked it, causing another category textbox to appear below the first. After that, she re-typed the first word of the second category she had thought of for her image ("LGBT...") into the new textbox and, after battling with the inconsistently activated Live Filter for a few seconds, managed to select the pre-existing category "LGBT community centres". The user was satisfied with these two categories as a start.

Anything that involved clicking away. The participant was notably hesitant to click on any links that looked like they might kick her out of the upload workflow. She explained that she was concerned that clicking a link would redirect her browser tab to a new page, and that she would have to start the upload process again from scratch, losing any input she had entered in the meantime.

(Pre)viewing uploaded image. Both the preview of the image in step 4 ("describe") and the thumbnail of the image in step 5 ("use") are tiny on an iPad Mini screen, even though they are surrounded by plenty of empty space (see Figure 1). They aren't big enough to be useful at all, and no interface cue makes it clear that you can see a blown-up version of the image in a "lightbox" by clicking on the thumbnail. Additionally, in step 5, it was not immediately clear to the participant that she could click on the hyperlinked filename below the tiny thumbnail and go to the File: page to view the full-sized image (see also "Identifying salient calls to action").

Notes from heuristic evaluation

  • The Wizard starts on step 2 ("upload"); step 1 ("learn") is skipped and inaccessible. Did it ever exist?
  • There is no back button, and you can't click the title of a previous step on the progress bar to return to that step. I suspect this is a consequence of the way the wizard is built. Still, it is not impossible to cache input from previous steps, and modern users expect to be able to return to a previous page when filling out multi-step webforms.
  • The yellow "Please visit Commons:Help..." banner is visually deafening. Worse, clicking on it does not open up Commons:Help in a new window, so any user who clicks on it loses their current upload progress and needs to start again.
  • Tooltip text for "Other information" is not helpful: Any other information you want to include about this work — geographic coordinates... but the previous field is Geographic coordinates! The only other suggestion given for "other information" is "other versions of this work", but it's not immediately clear why that should be listed in a separate textbox than the description.
  • Lots of redundant validation in step 2. After you upload, you see the words "uploaded", and a green check mark indicating success, and the words "all uploads were successful!", and also "1 of 1 files uploaded". But most of this is just black text (see "Identifying salient notifications" above).
  • The help resources that you get to when you click on links in the workflow (Commons:Categories and Commons:Help) are not very helpful. The first is a link to a policy, which comes across as a baffling non sequitur in a user help context: the sections of that document that may be relevant to a new uploader learning about categories are way below the fold. The second is a link to a low-traffic help forum with confusing instructions; the Teahouse gadget would be ideal here.

Other notes

  • A couple of aspects of the image upload process are probably not the fault of UploadWizard itself, but are noted here for later investigation because they created pain points. In step 2, the image was re-titled "image.jpg", and none of the usual metadata (such as the datestamp), which the user stated was in the original file, was pre-populated in the fields in step 3.
  • The participant also provided useful feedback on the organization of the image File: page, which she explored after uploading her image. But I'm leaving that out for now because this report has gotten long enough, and the File: page wasn't explicitly part of the task anyway.


Figure 1. Step 3 of the workflow, with giant yellow help banner and tiny preview image. The most important CtA on this page--specifying attribution--is not visually salient at all.
Figure 2. Tiny image of the clone repository widget on GitHub. Used to illustrate a UI pattern that we should be following in step 5 of Extension:UploadWizard.

What we can (and should) fix now

Add a "copy" button next to the formatted image links in step 5. Rather than making people tap, drag, and copy, make this a one-click operation, like GitHub does (see Figure 2). This would help people who are copying these links on desktop computers as well.
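
A one-click copy control could be sketched roughly as follows. This is a minimal TypeScript illustration, not UploadWizard's actual code: the names `ClipboardLike`, `thumbnailWikitext`, and `copyToClipboard` are hypothetical, and the exact markup the Wizard offers in step 5 may differ from the standard captioned-thumbnail wikitext assumed here.

```typescript
// Hypothetical minimal shape of a clipboard. In a browser, the standard
// navigator.clipboard object satisfies this interface.
interface ClipboardLike {
  writeText(text: string): PromiseLike<void>;
}

// Build wikitext for a captioned thumbnail (assumed format; the string
// shown by the Wizard in step 5 may differ).
function thumbnailWikitext(fileName: string, caption: string): string {
  return "[[File:" + fileName + "|thumb|" + caption + "]]";
}

// Copy the text in a single step and report success or failure to the
// caller, so that the button can show feedback such as "Copied".
function copyToClipboard(
  text: string,
  clipboard: ClipboardLike,
  onDone: (ok: boolean) => void
): void {
  clipboard.writeText(text).then(
    () => onDone(true),
    () => onDone(false)
  );
}
```

In a page, the button's tap handler might call `copyToClipboard(thumbnailWikitext("Example.jpg", "An example caption"), navigator.clipboard, showResult)`, where `showResult` is a hypothetical function that updates the button label. Passing the clipboard in as a parameter keeps the logic independent of the browser, which makes it easier to check on its own.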

Use a consistent visual language around interactive page elements. This should be a relatively easy fix, since we have a style guide. The biggest issue here is plain blue links: clicking a blue link...

  • sometimes opens up a new page in a new tab, but
  • sometimes redirects you to a new page in the same tab (exiting the wizard);
  • other times clicking a blue link opens up a new (hidden) page section;
  • on still other occasions clicking a blue link removes an attached file.

Other interactive elements should also be made consistent:

  • the "Upload" button that triggers the Wizard on the Commons main page is styled differently than the buttons within the Wizard
  • you go from step 2 to step 3 by clicking "Continue", but "Next" is the label for that button everywhere else in the workflow (although it might make sense to use "Finish" instead for the button that takes you to the final step).

Use a consistent visual language around notification and form validation messages. Current behavior: if you try to enter a name for your image and that name is taken already, you get a red warning above the image. But if you enter the name of a non-existent category in the category field on the same page, you receive a black, italic notification next to that field. Notifications that your file uploaded successfully are black, but not italicized.

Maintain a call to action hierarchy. The most important action you want a user to perform should be the most salient. Determining what that action is may require more testing, but in some areas it's clear that the hierarchy is currently skewed: for example...

  • the "Help" banner should be made less salient (smaller and less colorful, see Figure 1) because it currently overshadows all other buttons, and
  • on step 5, "View your image" should be more salient than "Return to Commons homepage".

All links should open in new tab. A user should never have to worry that by asking for more information they are risking losing the work they've done so far.
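
One way this could be approached is to mark every help link inside the wizard so that it opens in a new tab. The sketch below is illustrative TypeScript only: `LinkLike` and `openLinksInNewTab` are hypothetical names, and how the real extension builds its links is not covered by this study. The `target` and `rel` properties are the standard HTML anchor attributes.

```typescript
// Hypothetical minimal shape of a link. In a browser, HTMLAnchorElement
// objects have these two string properties.
interface LinkLike {
  target: string;
  rel: string;
}

// Mark each link so that following it opens a new tab and leaves the
// wizard tab (and the user's input) untouched. Returns how many links
// were updated.
function openLinksInNewTab(links: LinkLike[]): number {
  for (const link of links) {
    link.target = "_blank";
    // "noopener" stops the new tab from controlling the wizard tab.
    link.rel = "noopener noreferrer";
  }
  return links.length;
}
```

The caller decides which links to pass in. Links that only expand a hidden page section or remove an attached file should be left out, since they never leave the page.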

Remove outdated or redundant text. The redundant "uploaded" validation messages in step 2 should be reduced; the unused "learn" step should be removed; and the "geographic coordinates" example should be removed from the "Other information" tooltip.

Create a simple infographic to explain licensing options. The individual license deeds (linked from the Wizard) written in plain language are helpful, but there is no way for a user to compare the features of two licenses side by side. We should provide them with enough information to make an informed decision about how they want to release their work.

Increase the default image thumbnail size and indicate that users can zoom in. It's too small to be useful as it is, and there's no indication that if you click on it, you can view a larger version (elsewhere on wiki, clicking an image takes you to the file page, so this inconsistent behavior should be called out visually).

Link to a better help page on categories, and make that link more visible. Until/unless we want to invest in recommending categories, we should at least provide better end-user documentation that explains how categories work and why people should categorize, and that provides some of the "best practices" that my participant was asking for.

Thinking longer term

Cache input from previous steps and add a back button. When we do this, make the progress bar headers clickable for all steps the user has completed.
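
The idea can be sketched as a small state object that remembers the fields entered at each step and tracks the furthest step reached. This is a hedged TypeScript illustration, not a description of how UploadWizard is built: `WizardState`, `Fields`, and all method names are hypothetical, and the step names are supplied by the caller.

```typescript
type Fields = { [name: string]: string };

class WizardState {
  private readonly steps: string[];
  private readonly saved: { [step: string]: Fields } = {};
  private current = 0;
  private furthest = 0; // highest step index the user has reached

  constructor(steps: string[]) {
    if (steps.length === 0) {
      throw new Error("A wizard needs at least one step");
    }
    this.steps = steps.slice();
  }

  currentStep(): string {
    return this.steps[this.current];
  }

  // Cache a copy of the fields entered on the current step.
  save(fields: Fields): void {
    this.saved[this.currentStep()] = { ...fields };
  }

  // Return a copy of the cached fields for the current step, or an
  // empty object if nothing has been saved there yet.
  restore(): Fields {
    return { ...(this.saved[this.currentStep()] || {}) };
  }

  next(): void {
    if (this.current < this.steps.length - 1) {
      this.current += 1;
      this.furthest = Math.max(this.furthest, this.current);
    }
  }

  // The back button: move one step back without discarding input.
  back(): void {
    if (this.current > 0) {
      this.current -= 1;
    }
  }

  // A progress bar header is clickable only for steps already reached.
  canGoTo(step: string): boolean {
    const index = this.steps.indexOf(step);
    return index >= 0 && index <= this.furthest;
  }

  goTo(step: string): boolean {
    if (!this.canGoTo(step)) {
      return false;
    }
    this.current = this.steps.indexOf(step);
    return true;
  }
}
```

A wizard created with `new WizardState(["upload", "rights", "describe", "use"])` (where "rights" is a placeholder name for step 3) would call `save` before leaving a step and `restore` when a step is shown again, so that returning to an earlier step does not lose input.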

Add structured fields for additional, specific metadata beyond "Other information". One suggestion the participant made was a field for street address, which is often much easier for people to find manually than lat/long coordinates, and can be used to approximate those after the fact. Other possibilities include the option to specify camera make, model, and settings (if not embedded in image metadata), as well as processing software used and processing steps applied.

Provide better category suggestions. Categories are in many ways the most important metadata of all: content on Commons is only useful if it can be located. Giving uploaders better suggestions of categories to use probably involves a combination of faceted categorization prompts (for example, having different fields asking specifically for location categories, subject-matter categories, etc) and some good algorithmic work. One option would be to suggest relevant existing categories based on the categories the user has already specified. Or, ask people to enter a string of keywords, rather than entering categories themselves, and then click a button to "search" Commons for related categories, using those keywords as search terms to return a list of categories that the user can choose from.
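
The keyword-based option could look something like the sketch below, which ranks existing category names by how many of the user's keywords they contain. It is a deliberately simple TypeScript illustration under stated assumptions: `suggestCategories` is a hypothetical name, the list of existing categories is passed in rather than fetched from Commons, and plain substring matching stands in for whatever search or algorithmic work a real implementation would use.

```typescript
// Rank existing categories by the number of keywords each one contains
// (case-insensitive substring match). Ties are broken alphabetically.
// Categories that match no keyword are left out.
function suggestCategories(
  keywords: string[],
  existing: string[],
  limit: number = 5
): string[] {
  const terms = keywords
    .map((keyword) => keyword.trim().toLowerCase())
    .filter((keyword) => keyword.length > 0);

  const scored = existing
    .map((name) => {
      const lower = name.toLowerCase();
      const score = terms.filter((term) => lower.indexOf(term) !== -1).length;
      return { name: name, score: score };
    })
    .filter((candidate) => candidate.score > 0);

  scored.sort((a, b) => b.score - a.score || a.name.localeCompare(b.name));
  return scored.slice(0, limit).map((candidate) => candidate.name);
}
```

For example, given the keywords "LGBT" and "community", a category such as "LGBT community centres" would rank above a category that matches only one of the two words. The returned list could then be shown to the uploader as a set of choices instead of a free-text field.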

Build apps. Or start maintaining the ones we already have once more, and create tablet-specific versions. Perhaps a more responsive mobile stylesheet that reveals upload functionality when expanded to tablet dimensions could be an interim solution?