Design Systems Team/Codex Manual Testing Guidelines



These guidelines give contributors simple, direct instructions for using manual testing to sign off implementation tasks that introduced or modified Codex components.

In the context of the Codex project, manual testing is subsidiary to the automated functional, visual and accessibility testing processes integrated in the library. Nevertheless, the following practices are fundamental for maintaining the quality of components, as they can help identify first-hand issues that could otherwise remain undetected.

🔍 Testing approach and process

At any stage of implementing a Codex task that involves visual or interactive changes, team members (Designer, Engineer or Test Engineer) might need to review the affected components against the relevant scenarios in order to identify any missed requirements. The goal is to test as early as possible, to reduce redesign effort later in development.

When a task needs to be manually signed off, evaluate the following:

  1. If a new component has been added to the Codex library: that the component follows the visual and interactive requirements provided in the Figma design specifications (see example), and that the acceptance criteria defined in the ticket are met.
  2. If a library component has been updated or fixed: that the changes meet the visual and interactive definitions provided in the specs and/or the task’s acceptance criteria, and that no other properties were altered as a result of those changes.

To verify the functional behavior and evaluate the components’ visual features against the mentioned design specs and acceptance criteria, designers, engineers and/or testers will interact with components directly. They’ll need to trigger the different states using any of their available device(s), just like end-users would do. Nevertheless, it’s recommended for manual testers to use browser inspection capabilities too in order to verify the value of CSS styles, as a way of supporting visual detection.

Where to test Codex changes

🔵 Active patch

If the patch containing the changes to be tested hasn’t been merged yet, you can access the staged Codex demo page in a Netlify build.

To access the build, simply add the patch number to be reviewed in the following URL (replacing “PATCHNUMBER”):

The patch number (which is appended to the patch’s URL) can be easily found in Gerrit:

Finding patch number on Gerrit

Or extracted from the link provided by gerritbot in the relevant Phabricator ticket:

Finding patch number in gerritbot link
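
The substitution described above can also be scripted. The following is a minimal sketch, assuming the patch number is the last path segment of the patch URL; the helper names and the example.org URLs are hypothetical placeholders, not the real Gerrit or Netlify addresses, and the real URL template has to be supplied by the tester.

```python
import re


def extract_patch_number(patch_url: str) -> str:
    """Return the number at the very end of a patch URL.

    Assumption: the patch number is the final path segment. A URL that ends
    in a patchset number instead would need different handling.
    """
    match = re.search(r"(\d+)/?$", patch_url.strip())
    if match is None:
        raise ValueError(f"No trailing patch number found in {patch_url!r}")
    return match.group(1)


def build_preview_url(template: str, patch_url: str) -> str:
    """Replace the PATCHNUMBER placeholder in a preview URL template."""
    if "PATCHNUMBER" not in template:
        raise ValueError("Template must contain the PATCHNUMBER placeholder")
    return template.replace("PATCHNUMBER", extract_patch_number(patch_url))


if __name__ == "__main__":
    # Both URLs below are made-up examples for illustration only.
    print(
        build_preview_url(
            "https://PATCHNUMBER.preview.example.org/",
            "https://gerrit.example.org/r/c/design/codex/+/123456",
        )
    )
```

Raising an error when no number is found keeps a mistyped link from silently producing a preview URL that still contains the placeholder.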

🟢 Merged patch

Testing on the Codex Demo site

Once a patch has been merged, all changes are automatically published to the Codex demo site and can be reviewed there. Engineers will add an interactive demo page for each new component as part of the component development process.

Testing in MediaWiki

New components should also be tested inside a MediaWiki instance that more closely resembles our production environment. The Design Systems Team maintains a MediaWiki extension called VueTest that can be used for this purpose. The same component demos that have been prepared for the Codex Demo site can also be used in a MediaWiki instance on the Special:VueTest page. To test a new Codex component in MediaWiki, follow these steps:

  • Developers will ensure that the new component demos are available in the VueTest extension prior to moving the component task over to the “QTE Sign-off” column. Even pre-released “experimental” components can be tested in this way once the relevant patch has been merged.
  • Set up a new PatchDemo instance (this creates a custom Wiki with appropriate software installed). Under the “configuration preset” settings, choose “Custom” and select the options below. For convenience, you can also set the “Landing Page” option to: Special:VueTest.
PatchDemo configuration presets for testing in MediaWiki
  • Once the wiki has finished building, you can navigate to mywikiURL/wiki/Special:VueTest/codex. Once there, navigate to the appropriate component; here you can find the same interactive demos that exist on the Codex Docs site. Once a PatchDemo wiki has been created, it will persist for a while and can be used by multiple testers (useful if you are testing out different browser/OS/device combinations).
  • If you need to test in a way that the VueTest demos don’t cover, let DST engineers know and they’ll help figure out a solution.

How to test: process

The manual testing process consists of the following steps:

Step 1: Gather and analyze requirements

Check the acceptance criteria included in the task and access the appropriate version of the component design specs in Figma (they should be linked in the task description).

Step 2: Create a test plan

The Spec Sheet is the source of truth for the visual and functional requirements of components. Derive test cases from it that cover all of its specifications, and check the Test Case document (often linked in the Test Cases section of the task) to see whether any test cases have already been defined. Add any test cases that are specific to the component under review.

Test cases should be based on the goal of the technical task (i.e. the mentioned list of acceptance criteria), and they should allow testers to verify the result of applying any visual and interactive changes against the expected designs (i.e. the relevant component specs in Figma).

Verifying a component against the entirety of its Figma specifications (see image below) is especially relevant when the component is completely new to the library. You can use the different sections of the spec sheet to help define the specific test cases for the component being verified:

Component specifications for Codex's combobox component

Mapping the specification sections, typical test cases would cover the following:

  • Visual
    • Global stylistic properties: padding, sizes, fonts, etc.
    • States: active, hover, disabled, etc. display the specified styles.
  • Functionality
    • Making sure the right states and actions are displayed when the user clicks, selects, searches, enters text, etc.
    • Use cases
    • Minimum and maximum examples (e.g. text overflow behavior).
  • Responsiveness
  • RTL behavior
  • Accessibility and keyboard navigation specs

Test cases should both reflect the predefined component specs and properties, and attempt to explore how to break the component.

Step 3: Execute the test cases

  • If a totally new component is being reviewed, we recommend executing all of the test cases defined in every test scenario listed below (browser compatibility, responsiveness, accessibility and internationalization).
  • For changes to existing components, execute the test cases only in the scenarios below that are relevant to the scope of the change, and record any visual or interaction bugs detected.

The scope of the change – which component properties and behaviors were altered in the last implementation effort – will determine the type of test or tests that would need to be performed as part of the verification process (e.g. if the color of an icon was updated, there’s no need to test for responsiveness or internationalization). In case of doubt, you can reach out to your friendly test engineer to help you define the best manual testing strategy.

The following are the possible test approaches, to be applied depending on their suitability for the task goals and acceptance criteria. Team members should feel free to choose the most convenient combination of tests and their scope, based on their expertise and available resources.

Browser compatibility

When to test browser compatibility: Always. Providing basic browser support is essential for our project, so browser compatibility tests should be carried out in all cases.

Validate the defined visual and functional test cases in the following selection of browser versions:

  • Chrome: Current and previous version
  • Firefox: Current and previous version
  • Safari: Current and previous version

Tests can be performed on any operating system. Simply open the relevant demo or Netlify link in each of the browser versions listed, either directly on your computer or in BrowserStack, and execute the tests.

The browser compatibility list is based on Wikimedia’s browser support matrix, and has been refined with data from the Browser usage breakdown dashboard. The selection was simplified by keeping only one representative of each of the three most used browser engines.
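
To keep track of coverage, the browser list above can be expanded into one row per browser, version and test case. This is only an illustrative sketch: the names BROWSERS, VERSIONS and browser_matrix are hypothetical, and the sample test cases are made up.

```python
from itertools import product

# Browsers and versions taken from the list in this section.
BROWSERS = ["Chrome", "Firefox", "Safari"]
VERSIONS = ["current", "previous"]


def browser_matrix(test_cases):
    """Return one row per (browser, version, test case) combination."""
    return [
        {"browser": browser, "version": version, "test_case": case, "passed": None}
        for browser, version, case in product(BROWSERS, VERSIONS, test_cases)
    ]


if __name__ == "__main__":
    # Hypothetical test cases; real ones come from the test plan in Step 2.
    for row in browser_matrix(["hover state", "disabled state"]):
        print(f"[ ] {row['browser']} ({row['version']}): {row['test_case']}")
```

Each row starts with "passed" set to None so that untested combinations remain distinguishable from failed ones.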


Responsiveness

When to test components’ responsiveness: Specifically when new components are introduced to the Codex library. Also necessary if the task’s goal was to modify the responsive behavior of the element being tested.

Test on the relevant device/screen sizes to evaluate whether the component adjusts and behaves as specified in the designs (i.e. as specified in the “Responsive behavior” section in the Figma specification sheets).

Overall responsive performance

Given that Codex’s focus is on web and mobile web (basic responsiveness), it is sufficient to use the device tools provided by any of the specified browsers to test the responsiveness of components in the following breakpoints*:

  • 2000px
  • 1200px
  • 1000px
  • 720px
  • 320px

If specific media queries, new responsive behaviors or breakpoints were introduced, be sure to evaluate the component’s behavior and visual aspect in each relevant screen size, as well as between them (i.e. check for responsiveness transition issues).

* Please note that breakpoints are currently being revisited and might be subject to change (see T303522). The information in this section will be updated accordingly.
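
As a rough aid for checking both the breakpoints and the ranges between them, the sketch below derives a list of viewport widths from the breakpoints listed above, adding the midpoint of each adjacent pair. The midpoint choice and the names BREAKPOINTS_PX and widths_to_check are assumptions made for illustration; any width between two breakpoints would serve the same purpose.

```python
# Breakpoints (in px) as listed in this section; they may change (see T303522).
BREAKPOINTS_PX = [320, 720, 1000, 1200, 2000]


def widths_to_check(breakpoints=BREAKPOINTS_PX):
    """Return each breakpoint plus the midpoint between adjacent breakpoints."""
    ordered = sorted(set(breakpoints))
    widths = set(ordered)
    for lower, upper in zip(ordered, ordered[1:]):
        # An in-between width helps reveal transition issues.
        widths.add((lower + upper) // 2)
    return sorted(widths)


if __name__ == "__main__":
    for width in widths_to_check():
        print(f"Resize the viewport to {width}px and verify the component")
```

The resulting widths can be entered one by one in the responsive design mode of the browser's developer tools.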

In-depth mobile testing

For more in-depth mobile testing, we recommend executing the test cases in as many of the following mobile operating system and browser combinations as possible:

Browser            iOS                             Android
Mobile Safari      Current and previous version    -
Chrome Mobile      Current and previous version    Current and previous version
Samsung Internet   -                               Current and previous version

The test matrix above is based on data from the Browser usage breakdown dashboard.


Accessibility

When to test components’ accessibility: Especially required when new components are added to the library. Also a must in cases where the component’s interaction flow or keyboard shortcuts were modified, or to verify the contrast in case of color adjustments.

Verify the accessibility test cases using assistive technology in order to make sure that the component can be navigated and fully interacted with using the keyboard shortcuts specified in the design spec sheet (see the relevant “Keyboard shortcuts” section in the Figma spec sheet). Check the Screen reader and keyboard navigation testing section for more information.

In general, accessibility testing might include any combination of the following optional and recommended practices:

Reviewing color contrast (Optional)

In general, designers aim to apply color combinations with sufficient contrast when defining and specifying components. Nevertheless, oversights are possible. We aim to comply with the AA level of contrast (check Success Criterion 1.4.3 Contrast (Minimum) in WCAG 2.2). If you suspect that a text might not have enough contrast against its background, please report the issue as described in Step 4 of the How to test: process section.

Recommended online contrast-checking tools: WebAIM Contrast Checker, Colour Contrast Checker.

Please take font size into account when evaluating contrast, as described in Success Criterion 1.4.3 Contrast (Minimum) in WCAG 2.2.
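
For a quick check without an online tool, the contrast ratio can be computed directly from the WCAG definitions of relative luminance and contrast ratio. The sketch below is an illustration, not part of Codex; the function names are hypothetical. It applies the AA thresholds from Success Criterion 1.4.3: 4.5:1 for normal text and 3:1 for large text (at least 18pt, or 14pt bold).

```python
def _linearize(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#rrggbb'."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"Expected a color like '#rrggbb', got {hex_color!r}")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio between two colors, from 1 up to 21."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """Check the AA threshold: 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


if __name__ == "__main__":
    ratio = contrast_ratio("#000000", "#ffffff")
    print(f"Black on white: {ratio:.2f}:1, AA: {meets_aa('#000000', '#ffffff')}")
```

Black text on a white background gives the maximum ratio of 21:1. The colors to check should be read from the computed styles in the browser inspector, since the rendered color can differ from the one in the spec.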

Legibility (Optional)

Make sure that the text included in components or patterns is visible and recognizable, and that it can be read with ease. The following font properties play an important role in ensuring legibility and should be observed:

  • Font-size: Font sizes should match the design specifications, and never be smaller than 12px. Pay attention to compounding effects that might render font-sizes smaller than intended in context. Users with low vision and other sight disabilities commonly (3% of users) define bigger default font sizes in their browsers in order to keep content readable for them. User-interface text should always adjust to these user settings. You can easily verify this by increasing the font size to 24px (Firefox or Chrome) in the Appearance settings of the browser you’re using to perform the tests.
  • Color: Text should display a sufficient contrast ratio against its background to remain easily visible for all users (a minimum AA contrast level is expected). Check the Color contrast section for more details.
  • Line height and spacing: The right spacing (defined by line-height) will prevent lines of text from looking too close or too distant to each other, conditions that would impact legibility. System font styles have predefined, optimal line-heights applied to them, based on their size and hierarchy. Validating that said specifications are followed is key to ensure a correct vertical text spacing, and thus an optimal level of legibility.
  • Line length: Lines of text should have an appropriate length to ensure the readability of a paragraph. In general, a maximum line length of 80 characters should be observed (40 characters for CJK languages) (Source: WCAG 2.2: Success Criterion 1.4.8 Visual Presentation).

Screen reader and keyboard navigation testing (Recommended)

Execute the defined test scenarios using screen reader technology in order to make sure that the right components, elements, states and content are announced (find recommended screen reader software in the section below).

While working with screen readers, make sure to interact with components using only your keyboard. This is helpful to verify that all content is accessible, and that all states can be triggered using the specified keyboard shortcuts (see the relevant “Keyboard shortcuts” section in the Figma component specs).

It is worth noting that this step is entirely voluntary during design sign-off processes: most accessibility testing will be performed during development and during the Functional Testing stage. Automated a11y testing introduced to Codex CI will serve as an additional step to catch current issues and, even more importantly, future breakages (which will be constantly tested for). If more complex components or patterns need to be evaluated, it is possible to set up testing with people with disabilities with the help of external contractors.

Recommended screen reader and browser combinations

There are dozens of possible combinations of browsers and screen readers. For pragmatic reasons, we have collected the most widely used combinations here; testers can choose among them at their convenience (Sources: Accessibility Developer Guide, WebAIM’s Screen Reader User Survey #9).

Testing assistive technology on desktop

Here are the suggested screen readers with which to conduct manual accessibility testing on desktop devices:

NVDA (Windows), Narrator (built into Windows) or VoiceOver (built into macOS) on Chrome, Firefox and/or Safari.

Aside from trying to reproduce the regular test cases, please make sure to verify that the specified Keyboard shortcuts – which can be found at the bottom of the component’s specification sheet in Figma (see an example) – have been correctly applied.

Testing accessibility on mobile

Here are the suggested screen readers with which to conduct accessibility testing on mobile devices:

VoiceOver (built into iOS) or TalkBack (built into Android) on Safari, Chrome or Firefox.


Internationalization

The Codex demo site allows you to toggle between an LTR and RTL display of components. You can use this feature to check whether a given element follows the bidirectionality specifications provided in the Figma spec sheet (read the “RTL behavior” section).

When to test internationalization: Especially required when new components are added to the library.

Step 4: Report bugs and visual fixes related to the patch

You can document the issues found either in a comment or in the description of the Phabricator task being reviewed (see example).

Make sure to provide a checklist of the issues, and add clear individual descriptions, context (device, operating system, browser) and visual media (screenshots, videos, gifs) per problem if necessary. Don’t forget to ping the engineer who worked on the changes.
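
A report in this shape can be drafted with a small helper. The sketch below is an assumption-laden illustration: the Issue fields and the format_checklist name are hypothetical, and it assumes that the task comment accepts "- [ ]" checklist items and "@username" mentions. The sample issue and username are made up.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    """One problem found during manual testing (hypothetical structure)."""
    description: str
    device: str
    operating_system: str
    browser: str
    media: str = ""  # optional reference to a screenshot, video or gif


def format_checklist(issues, engineer: str) -> str:
    """Build a checklist comment that pings the engineer and lists each issue."""
    lines = [f"@{engineer} the following issues were found during manual testing:"]
    for issue in issues:
        context = f"{issue.device}, {issue.operating_system}, {issue.browser}"
        line = f"- [ ] {issue.description} ({context})"
        if issue.media:
            line += f" {issue.media}"
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example data for illustration only.
    sample = [
        Issue(
            description="Hover state uses the wrong border color",
            device="Laptop",
            operating_system="macOS",
            browser="Firefox (current)",
            media="(screenshot attached)",
        )
    ]
    print(format_checklist(sample, "example-engineer"))
```

Keeping the context on the same line as each item lets the engineer reproduce a problem without asking which setup was used.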

  1. If unrelated bugs are found during the testing process, report them separately, using the bug report template in Phabricator. New bugs can be added to Needs triage (Incoming request) in Codex’s Phabricator board.
  2. Test again once the bug/design fixes have been implemented in order to re-verify the changes.

If any needed requirements were missing from the specs and weren’t implemented, rather than adding them to the current task, they should be included in a separate ticket and tagged with Design.

Step 5: Add your final approval message

Check items off the checklist as individual fixes are applied, and add your final sign-off approval message to the relevant Phabricator ticket once there’s nothing left to fix. Don’t forget to mention which types of tests were performed and in which setup. Once signed off, the manually tested ticket should be moved to the “Functional testing” column in the Design Systems Sprint Phabricator board.

✨ Manual testing checklist

A simplified list of the steps involved in the manual testing process:

  • [ ] Gather relevant resources such as the component’s spec sheet
  • [ ] Define the test cases (what needs to be tested) based on the acceptance criteria defined in the Phabricator task
  • [ ] Access the relevant testing environment: this can be either the Netlify build of the Codex library (in case the patch is active) or the Codex demo page
  • [ ] Execute the test cases and make sure to validate:
    • [ ] That the component displays the specified functional states and visual properties (depending on the scope of the task) correctly in the current and previous versions of Chrome, Firefox and Safari on your operating system.
    • [ ] That the component displays the correct responsive behavior, at least at the current breakpoints of 320px, 720px, 1000px and 2000px. More about mobile testing browsers and breakpoints.
    • [ ] That the accessibility and keyboard navigation specs are followed: Test using JAWS (Windows), NVDA (Windows) or VoiceOver (built into macOS) on Chrome, Firefox or Safari (one of these is recommended). Find out more about testing for accessibility.
    • [ ] That the component follows the bi-directionality design specifications
  • [ ] Report bugs: Provide a list of needed fixes in the shape of a checklist in the relevant task. Make sure to include individual descriptions and visual media if necessary. Don’t forget to ping the engineer that worked on the changes.
  • [ ] In case unrelated bugs are found during the testing process, report them separately, using the bug report template in Phabricator. Add new bugs to the Needs triage (Incoming request) in Codex’s Phabricator board.
  • [ ] Test again once the bug/design fixes have been implemented in order to re-verify the changes.
  • [ ] Check items from the checklist of fixes as they’re solved, and add your final sign off approval message to the relevant Phabricator ticket once there’s nothing left to fix. Again, don’t forget to mention which tests were performed and in which set up, and to ping the person assigned to the task.
  • [ ] Once signed-off, a manually tested ticket should be moved to the “Product Sign-Off” column in the Design-Systems-Sprint Phabricator board.