Wikimedia Research/Usability testing

This page describes the basics of how, when, and why to do usability testing. Usability testing is an essential part of developing usable products and, ultimately, helps create more usable and user-centered wikis. This primer covers basic usability testing practices and tools. It shows how to find the right participants, design a study, structure data collection, and analyze and synthesize the results so that the findings can guide iteration toward more usable functionality. It should be of particular benefit to engineers who have had their code blocked by usability bugs.

This document was prepared in support of the Wikimania 2015 presentation Operationalizing usability testing for Mediawiki engineers.

What usability testing is good for

 * Figuring out whether users can complete important tasks, like finding an article, uploading an image, or adding a citation
 * Figuring out how long it takes users to complete tasks
 * Comparing different interface options for the same kind of user, like Visual Editor vs. Wikitext editor for casual Wikipedia contributors
 * Comparing the same product or feature for different kinds of users, like Visual Editor for casual Wikipedia contributors vs experienced editors
 * Comparing a new version of a product or feature with an earlier iteration, like the first release of Echo vs. a newer version
 * Learning what surprises, delights, or frustrates users about your product
 * Learning what users expect to happen when they perform a new task for the first time

What usability testing is not good for

 * Learning users' opinions about why they would use your software
 * Learning users' opinions about whether they would choose your software over your competitors'
 * Learning what users think would make your software better
 * Testing everything about your product
 * Testing how every user will use your product

Write a test plan
Create a written summary of why you want to perform a usability test and how the results will be used. This will help you decide which tasks to include in your test (Create a protocol), which kinds of users to test with, and how many to recruit (Recruit participants).

Create a protocol
A protocol is a list of tasks that test participants will perform with your software. To determine which tasks to include, start out by writing a list of questions that you want to answer with the study.

The protocol should contain basic instructions about the task the users are being asked to perform. The goal is to give the user enough information that they understand the nature of the task, and where to start, but not to tell them how to accomplish it. These instructions should be written out, so that all users get the same amount of explanation.

Some advice for writing task instructions
 * Use plain language and short sentences. Avoid using technical jargon ("diff", "revert", "nav bar", "wikilink"), unless you are 100% sure your users understand exactly what those terms mean.
 * Make the goal of the task clear. Tell the user what you want them to accomplish, and why they are being asked to perform this task, but don't tell them how to do it.
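
The jargon advice above can be partly checked by machine before you hand instructions to a participant. The following is a minimal Python sketch, not an established tool: the function name `find_jargon` and the `JARGON_TERMS` list are hypothetical, and the list holds only the four example terms mentioned above, so you would need to extend it for your own product.

```python
import re

# Hypothetical starter list: only the example terms named in this guide.
JARGON_TERMS = ["diff", "revert", "nav bar", "wikilink"]


def find_jargon(instruction, terms=JARGON_TERMS):
    """Return the jargon terms that appear as whole words in a task instruction."""
    found = []
    for term in terms:
        # \b keeps "diff" from matching inside "different".
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, instruction, flags=re.IGNORECASE):
            found.append(term)
    return found


print(find_jargon("Please revert the last change using the nav bar."))
```

A check like this only catches terms you already thought of. Reading the instructions aloud to someone outside the project is still the more reliable test.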

Pre-test questionnaire
To understand your users better, consider developing a pre-test questionnaire and asking your users to fill it out before they begin their first task (or ask them verbally and write down their answers; see Take notes). This will help during analysis, when you compare findings from different users. For example, someone who hasn't edited Wikipedia before, but who has extensive experience with MediaWiki in a corporate intranet environment, will probably be more successful at mastering wiki syntax than a new user who is completely unfamiliar with wikis.

To save time, you can email your users a link to a questionnaire that you built using SurveyMonkey, Qualtrics, Google Forms, or another online survey tool. Ask them to fill it out before the session begins.

Recruit participants
You should recruit test participants whose goals, needs, and characteristics are similar to those of the people who use your product. Every piece of software is used by people with diverse backgrounds and levels of experience. Particular tasks or product features will be more important to some users than others. Generally you will want to test the usability of your product for a particular type of user, or to compare the experience of 2-3 classes of user (for example, new users vs. casual users vs. "power" users).

Several studies have established that you can identify most "serious" usability issues with a product after testing it with as few as 3-5 users. If you are testing your product on several different classes of user, you should aim for at least 3-5 users per class.
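
One commonly cited model behind this rule of thumb is the Nielsen and Landauer problem-discovery formula, in which the expected share of problems seen after n sessions is 1 - (1 - p)^n. This page does not name the studies it refers to, so treating this as the relevant model is an assumption. The Python sketch below uses p = 0.31, the average per-user detection rate usually quoted with that formula. Your own product may have a very different rate, so the numbers are illustrative only.

```python
def share_of_problems_found(n_users, p_detect=0.31):
    """Expected share of usability problems seen at least once after n_users sessions.

    Assumes each user independently reveals any given problem with
    probability p_detect (0.31 is an assumed, commonly quoted average).
    """
    return 1 - (1 - p_detect) ** n_users


for n in (1, 3, 5):
    print(n, round(share_of_problems_found(n), 2))
```

Under these assumptions the curve flattens quickly, which is why adding a sixth or seventh user of the same class tends to reveal little that earlier sessions did not.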

Schedule more participants than you need. Recruiting and scheduling test participants can be one of the most time-consuming parts of a usability study. Many of the people who respond to your solicitation drop out before they have completed the test. With some respondents, you won't be able to find a time that they are available. Others will fail to show up at the scheduled time. As a general rule, try to schedule at least 50% more participants than you absolutely need: if you absolutely need 6 people to complete the study, schedule 9 or even 10!
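
The overbooking rule above is simple arithmetic, sketched here in Python. The function name `participants_to_schedule` and the `overbook_rate` parameter are hypothetical; the default of 0.5 mirrors the "at least 50% more" guideline and can be raised if your drop-out rate is higher.

```python
import math


def participants_to_schedule(needed, overbook_rate=0.5):
    """Number of sessions to book so that `needed` completed sessions are likely.

    overbook_rate=0.5 mirrors the "at least 50% more" rule of thumb.
    """
    # Round up: you cannot schedule a fraction of a participant.
    return math.ceil(needed * (1 + overbook_rate))


print(participants_to_schedule(6))  # 9
```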

Facilitate usability tests

 * Use a script.
 * Make it clear to the user that you are not testing them, their technical skills, or their mind-reading abilities. In usability testing, the users are the experts: they are helping you test the software, so you can make it better.
 * Lead them to the starting point. Don't make users do unfamiliar things that are outside the scope of the tasks you've assigned them. If you are testing whether people can change their preferences, help them log in first. If you are testing an image upload workflow, direct them to a page with an "Upload an image" link.
 * Remind them to think aloud. Ask the user to describe what they are thinking, what they are trying to do, and what they expect to happen, right now. You will probably have to remind them repeatedly: that's expected. Make sure to remind them gently.
 * Don't lead them through the task. The goal of a usability test is to understand whether users can complete tasks on their own, under "normal" circumstances. When a user struggles with a task, it can be tempting to jump in and tell them what to do. Don't give in to this temptation.
 * Note: if the participant wanders so far afield that they are no longer performing the task you assigned them, feel free to direct them back to the starting point (in a polite and supportive way). Similarly, if a participant can't complete the task at all after multiple attempts, and has completely given up in frustration, you can give them hints to get them back on task.
 * Ask questions. In most usability testing sessions, the most valuable findings come from asking questions in the moment. If a participant seems confused, surprised, delighted, frustrated, etc. by something they see, or something that happens, ask them to explain what they are thinking or feeling, and why. It can also be useful to ask users what they expect will happen when they do something, right before they do it. Then you can follow up with them after they see the results of the action (clicking a link, entering a search term) by asking "So what happened? Is this what you expected? Why or why not?"

Take notes

 * Have a designated note-taker: ideally, you should have two people performing the test. One person facilitates the test--their job is to guide the participant through their task, and ask and answer questions. The other person devotes their time to taking notes. It's possible for one person to fill both of these roles, but you'll have an easier time and get more out of the test if you divide responsibilities.


 * Record the session: With the participant's permission, make an audio or video recording of the testing session. Video recording is especially useful if you can capture the participant's screen (less useful if you have the camera pointed at their face the whole time).


 * Write down important quotes: Direct quotes from participants are a great tool for communicating the significance of your findings in your report. Quotes can serve as evidence that someone is confused about a task. They might say they're doing one thing, but are actually doing something different. They also might express surprise, dismay, frustration, or curiosity--these responses are often signs that something interesting is happening.


 * Record the time for important notes: If you use audio or video recording, record the timestamp of the quote so that you can find it later. Record the times that a user started a particular task, so that you can easily go back and review their performance on that task.


 * Ask any viewers to take their own notes as well: Different people will notice different things when they watch the same user test. If you invited collaborators (developers, designers, project managers, executives) to watch you conduct your test, ask them to take notes. If possible, have them use the same note-taking template as your designated note-taker. This will make it easier to consolidate all of your findings later.
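
If every note-taker uses the same template, the notes are easier to merge and to match against the recording. The following Python sketch shows one possible shape for such a template. The `Note` class, its field names, and the sample values are all hypothetical and invented for illustration; a shared spreadsheet with the same columns would serve equally well.

```python
from dataclasses import dataclass


@dataclass
class Note:
    participant: str      # anonymized ID, e.g. "P1"
    task: str             # which protocol task was under way
    offset_seconds: int   # seconds since the recording started
    kind: str             # e.g. "quote", "task_start", "observation"
    text: str

    def timestamp(self):
        """Format the offset as H:MM:SS for looking up the recording."""
        minutes, seconds = divmod(self.offset_seconds, 60)
        hours, minutes = divmod(minutes, 60)
        return f"{hours}:{minutes:02d}:{seconds:02d}"


# Invented example entry.
note = Note("P1", "upload an image", 754, "quote", "Where did my picture go?")
print(note.timestamp(), note.text)
```

Recording a `task_start` note for each task gives you the jump-off points needed to review a participant's performance on that task later.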

Analyze results

 * Debrief right after the session: Memory is a fickle thing. Have a quick discussion with your co-facilitator (and any observers) right after the session is over. This will help you make sure you have captured important observations or ideas that came up during the session, before they slip away. Each person should review their notes and comment on anything they noticed during the session that was particularly interesting or important. You can also use this time to refine your test plan or your protocol. Was a participant confused by the way the task instructions were worded? Consider changing them before the next session. Have you noticed that a particular task isn't generating good results? Consider replacing that task with one that helps you gather more relevant information, or drop it entirely to free up time to focus more on other tasks.


 * Type up, review, and consolidate your notes.
 * Note common problems across users.
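
Once notes are consolidated, spotting common problems amounts to counting how many different participants ran into each issue. The Python sketch below is one way to do that, assuming you have already coded each observation with a short issue label. The function name `common_problems`, the labels, and the sample data are hypothetical.

```python
from collections import defaultdict


def common_problems(observations, min_participants=2):
    """Return (issue, participant_count) pairs seen by at least min_participants.

    observations: iterable of (participant_id, issue_label) pairs.
    A participant who hits the same issue twice is counted once.
    """
    participants_by_issue = defaultdict(set)
    for participant, issue in observations:
        participants_by_issue[issue].add(participant)
    counts = [(issue, len(people)) for issue, people in participants_by_issue.items()]
    shared = [pair for pair in counts if pair[1] >= min_participants]
    # Most widespread first; ties broken alphabetically for stable output.
    return sorted(shared, key=lambda pair: (-pair[1], pair[0]))


# Invented sample data for illustration only.
sample = [
    ("P1", "could not find Save button"),
    ("P2", "could not find Save button"),
    ("P2", "could not find Save button"),
    ("P3", "confused by upload wizard"),
    ("P1", "confused by upload wizard"),
    ("P3", "could not find Save button"),
    ("P2", "missed preview link"),
]
print(common_problems(sample))
```

Issues seen by only one participant are filtered out here, but they are not necessarily unimportant. With 3-5 users per class, a single severe failure can still deserve attention.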

Report results

 * Prioritize based on severity.
 * Prioritize based on design goals.
 * Share quotes and clips.
 * Provide suggestions and next steps.
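
As a rough illustration of ordering findings for a report, the Python sketch below sorts by severity first and by number of affected participants second. The three-level severity scale, the `prioritize` function, and the sample findings are assumptions made for this example, not a scale defined on this page. Weighing findings against design goals is a judgment call that this sketch does not attempt.

```python
# Hypothetical severity scale; lower rank means more urgent.
SEVERITY_RANK = {"blocker": 0, "major": 1, "minor": 2}


def prioritize(findings):
    """Sort findings by severity, then by how many participants were affected.

    findings: list of dicts with "issue", "severity", and "participants" keys.
    """
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f["severity"]], -f["participants"]),
    )


# Invented sample findings for illustration only.
findings = [
    {"issue": "missed preview link", "severity": "minor", "participants": 4},
    {"issue": "could not save edit", "severity": "blocker", "participants": 2},
    {"issue": "confused by upload wizard", "severity": "major", "participants": 3},
]
for finding in prioritize(findings):
    print(finding["severity"], finding["participants"], finding["issue"])
```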

Who can perform a usability test?
Anyone can perform a usability test. Formal training in usability testing methods can help you ask the right questions and interpret your results, but it's not necessary. If you read through some of the resources listed below and perform some practice sessions with friends or colleagues to get the hang of it, you'll be in a good position to start testing with real users.

What parts of my product should I test?
Any part of your product that users are going to need to understand and use can be tested. If a user needs to find, browse, click on, fill out, create, save, or delete a part of your product in order to complete a task, that part of the product is "fair game" for usability testing. Since you probably can't test everything, consider what the most important tasks or workflows are and what parts of the product a user needs to understand to complete those tasks or workflows. For example, if you are building a wiki that anyone can edit, it's probably more important to test whether people can find the "edit" button and save an edit than to test whether they can create a user account.

As you are testing your "high priority" features with real users, you might find that some other features are actually more important than you expect. For example, if one of the high-priority editing activities you want to support is adding images of local landmarks to wiki pages about those landmarks, you might discover while testing that it is critically important to test whether users can successfully upload an image to Wikimedia Commons.

Who should I recruit to participate in a usability test?
Ideally, you should recruit people who actually use (or intend to use) your product. But it's usually sufficient to recruit people who are similar to your actual users in terms of their goals, interests, level of experience, and demographics. Some important things to keep in mind: try to recruit people...
 * with the same general level of technological familiarity as your target users
 * with the same level of reading/speaking competency as your target users (in whatever language you use in your product)
 * who use the same kind of technology to access your product (for example, if you're testing mobile Wikipedia editing, recruit people who own a smartphone).
 * who share some goals or interests with your target users (for example, if you're testing the Commons image upload workflow, recruit people who like to take photos, whether or not they are professional photographers).

Other demographic factors like race, age, gender, and profession may or may not be important. It depends on the product you're building and what it's supposed to be used for. For example, if you are designing an improved code editor for MediaWiki Gadgets, try to recruit people who regularly build gadgets--or at least have some experience with JavaScript or web development. If you are designing a Content Translation tool, test with users who are multilingual.

How many participants should I test with?
Three to five people per user group is usually plenty. How you define a "user group" is up to you, but you can generally identify 1-2 groups that you are primarily building your product for. For example, if you need to evaluate the usability of the mobile editing experience, test with 3-5 new users (who likely don't have an account, aren't as familiar with wikitext, and will edit as IPs) and with 3-5 experienced Wikipedians (who may perform more complex edits, edit different pages, use "advanced" features like watchlists and talk pages, and will want to edit while logged in to their account).

My software is already completed and deployed. Do I still need to do usability tests?
Yes! Usability testing is a crucial part of iterative user interface design. Even if people are already using your product successfully, there are bound to be areas where you can make their experience better by tweaking the text, updating the interface, or adding new features. Improving the user experience can help increase your user base, and make your existing users more satisfied with your product. In the case of collaboration-focused products like MediaWiki, improving the user experience can also help increase the quantity and quality of the users' contributions.

My software isn't complete yet. How can I perform usability tests?
You don't need to have a feature-complete product to do user testing. If you have functioning code and a basic UI for one part of your product, you can test that specific part. You can also build prototypes of various kinds. Prototypes allow you to mimic the functionality of a piece of software, and the experience of using it to perform specific tasks, without the need to write all the code first.

Prototypes can be made out of HTML and CSS, with special prototyping software, with general-purpose tools like Microsoft PowerPoint or GIMP, or even sketched on a piece of paper. The prototype doesn't need to look exactly like the finished product: in fact, sometimes testing with a prototype is more effective than testing with a full-featured product, because you can focus the user's attention on the specific tasks, interface elements, etc. that need evaluation.