Wikimedia Research/Usability testing

This page describes the basics of how, when, and why to do usability testing. Usability testing is an essential part of developing usable products, and ultimately helps create more usable, user-centered wikis. This primer covers basic usability testing practices and tools: how to find the right participants, design a study, structure data collection, and analyze and synthesize the results into findings that drive iteration toward more usable functionality. It will be of particular benefit to engineers who have ever had their code blocked by usability bugs.

This document was prepared in support of the Wikimania 2015 presentation "Operationalizing usability testing for Mediawiki engineers".

What usability testing is good for

 * Figuring out whether users can complete important tasks, like finding an article, uploading an image, or adding a citation
 * Figuring out how long it takes users to complete tasks
 * Comparing different interface options for the same kind of user, like Visual Editor vs. Wikitext editor for casual Wikipedia contributors
 * Comparing the same product or feature for different kinds of users, like Visual Editor for casual Wikipedia contributors vs experienced editors
 * Comparing a new version of a product or feature with an earlier iteration, like the first release of Echo vs. a newer version
 * Learning what surprises, delights, or frustrates users about your product
 * Learning what users expect to happen when they perform a new task for the first time

What usability testing is not good for

 * Learning users' opinions about why they would use your software.
 * Learning users' opinions about whether they would choose your software vs. your competitors.
 * Learning what users think would make your software better.
 * Testing everything about your product
 * Testing how every user will use your product

Write a test plan
Create a written summary of why you want to perform a usability test and how the results will be used. This will help you decide which tasks to include in your test (see Create a protocol), which kinds of users to test with, and how many to recruit (see Recruit participants).

Create a protocol
A protocol is a list of tasks that test participants will perform with your software. To determine which tasks to include, start out by writing a list of questions that you want to answer with the study.

The protocol should contain basic instructions about the task the users are being asked to perform. The goal is to give the user enough information that they understand the nature of the task, and where to start, but not to tell them how to accomplish it. These instructions should be written out, so that all users get the same amount of explanation.

Some advice for writing task instructions
 * Use plain language and short sentences. Avoid using technical jargon ("diff", "revert", "nav bar", "wikilink"), unless you are 100% sure your users understand exactly what those terms mean.
 * Make the goal of the task clear. Tell the user what you want them to accomplish, and why they are being asked to perform this task, but don't tell them how to do it.

Pre-test questionnaire
In order to understand your users better, consider developing a pre-test questionnaire and asking your users to fill it out before they begin their first task (or ask them verbally, and write down their answers. See Take notes.) This will help you out during analysis as you compare findings from different users. For example: someone who hasn't edited Wikipedia before, but who has extensive experience with MediaWiki in a corporate intranet environment, will probably be more successful at mastering wiki syntax than a new user who is completely unfamiliar with wikis.

Recruit participants
You should recruit participants for your test who have similar goals, needs, and characteristics as the people who use your product. Every piece of software is used by people with diverse backgrounds and levels of experience. Particular tasks or product features will be more important to some users than others. Generally you will want to test the usability of your product for a particular type of user, or to compare the experience of 2-3 classes of user (for example, new users vs. casual users vs. "power" users).

Several studies have established that you can identify most "serious" usability issues with a product after testing it with as few as 3-5 users. If you are testing your product on several different classes of user, you should aim for at least 3-5 users per class.
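The "3-5 users" rule of thumb is often attributed to Nielsen and Landauer's problem-discovery model, which estimates the proportion of usability problems found as 1 - (1 - p)^n, where p is the probability that a single user exposes a given problem (commonly estimated at around 0.31) and n is the number of participants. A rough sketch, with the 0.31 figure taken as an assumption rather than a property of your own product:

```python
def proportion_found(n_users, p=0.31):
    """Estimated share of usability problems uncovered by n_users,
    per the 1 - (1 - p)^n discovery model. p is the assumed chance
    that one user exposes a given problem."""
    return 1 - (1 - p) ** n_users

for n in (1, 3, 5, 10):
    print(f"{n} users -> ~{proportion_found(n):.0%} of problems found")
```

With p = 0.31, five users are expected to surface roughly 85% of problems, which is why small per-class samples are usually considered enough to catch the "serious" issues.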

Schedule more participants than you need. Recruiting and scheduling test participants can be one of the most time-consuming parts of a usability study. Many of the people who respond to your solicitation will drop out before they have completed the test. With some respondents, you won't be able to find a time when they are available. Others will fail to show up at the scheduled time. As a general rule, schedule at least 50% more participants than you need: if you absolutely need 6 people to complete the study, schedule 9 or even 10!
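The 50% overbooking rule above is simple arithmetic, but it's easy to under-round when planning several user classes at once. A minimal helper, assuming a flat buffer applied per class:

```python
import math

def participants_to_schedule(needed, buffer=0.5):
    """Number of participants to schedule, given how many completed
    sessions you need and an over-recruiting buffer (default 50%),
    rounded up to absorb no-shows and dropouts."""
    return math.ceil(needed * (1 + buffer))

# Example from the text: 6 completed sessions needed -> schedule 9.
print(participants_to_schedule(6))
```

If no-show rates in your community run higher than this, raise the buffer accordingly; the text's "9 or even 10" reflects exactly that judgment call.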

Facilitate usability tests

 * Use a script.
 * Make it clear to the user that you are not testing them, their technical skills, or their mind-reading abilities. In usability testing, the users are the experts: they are helping you test the software, so you can make it better.
 * Lead them to the starting point. Don't make users do unfamiliar things that are outside the scope of the tasks you've assigned them. If you are testing whether people can change their preferences, help them log in first. If you are testing an image upload workflow, direct them to a page with an "Upload an image" link.
 * Remind them to think aloud. Ask the user to describe what they are thinking, what they are trying to do, and what they expect to happen, right now. You will probably have to remind them repeatedly: that's expected. Make sure to remind them gently.
 * Don't lead them through the task. The goal of a usability test is to understand whether users can complete tasks on their own, under "normal" circumstances. When a user struggles with a task, it can be tempting to jump in and tell them what to do. Don't give in to this temptation.
 * Note: if the participant wanders so far afield that they are no longer performing the task you assigned them, feel free to direct them back to the starting point (in a polite and supportive way). Similarly, if a participant can't complete the task at all after multiple attempts, and has completely given up in frustration, you can give them hints to get them back on task.
 * Ask questions. In most usability testing sessions, the most valuable findings come from asking questions in the moment. If a participant seems confused, surprised, delighted, frustrated, etc by something they see, or something that happens, ask them to explain what they are thinking/feeling and why. It can also be useful to ask users what they expect will happen when they do something, right before they do it. Then you can follow up with them after they see the results of the action (clicking a link, entering a search term) by asking "So what happened? Is this what you expected? Why or why not?"

Take notes

 * Have a designated note-taker: ideally, you should have two people performing the test. One person facilitates the test--their job is to guide the participant through their task, and ask and answer questions. The other person devotes their time to taking notes. It's possible for one person to fill both of these roles, but you'll have an easier time and get more out of the test if you divide responsibilities.


 * Record the session: With the participant's permission, make an audio or video recording of the testing session. Video recording is especially useful if you can capture the participant's screen (less useful if you have the camera pointed at their face the whole time).


 * Write down important quotes: Direct quotes from participants are a great tool for communicating the significance of your findings in your report. Quotes can serve as evidence that someone is confused about a task. They might say they're doing one thing, but are actually doing something different. They also might express surprise, dismay, frustration, or curiosity--these responses are often signs that something interesting is happening.


 * Record the time for important notes: If you use audio or video recording, record the timestamp of the quote so that you can find it later. Record the times that a user started a particular task, so that you can easily go back and review their performance on that task.


 * Ask any viewers to take their own notes as well: Different people will notice different things when they watch the same user test. If you invited collaborators (developers, designers, project managers, executives) to watch you conduct your test, ask them to take notes. If possible, have them use the same note-taking template as your designated note-taker. This will make it easier to consolidate all of your findings later.

Analyze results

 * Debrief right after the session: Memory is a fickle thing. Have a quick discussion with your co-facilitator (and any observers) right after the session is over. This will help you make sure you have captured important observations or ideas that came up during the session, before they slip away. Each person should review their notes and comment on anything they noticed during the session that was particularly interesting or important. You can also use this time to refine your test plan or your protocol. Was a participant confused by the way the task instructions were worded? Consider changing them before the next session. Have you noticed that a particular task isn't generating good results? Consider replacing that task with one that helps you gather more relevant information, or drop it entirely to free up time to focus more on other tasks.


 * Type up, review, and consolidate your notes.
 * Note common problems across users.

Report results

 * Prioritize based on severity.
 * Prioritize based on design goals.
 * Share quotes and clips.
 * Provide suggestions and next steps.