Quality Assurance/Strategy

Goals

 * Improve Wikimedia software products:
   * User-perceived quality.
   * Areas difficult to cover with automated testing.
 * Grow the Wikimedia technical community:
   * Accessible entry point for Wikimedia users and editors. No technical skills required.
   * Good motivation for experienced and professional testers.

We still need a central "these are our QA priorities" page.

Volunteer profiles

 * Wikipedia/Wikimedia editors and motivated users willing to try what's coming next.
 * Experienced / professional testers willing to contribute.
 * Companies developing products where Wikipedia/Wikimedia software needs to run well.
 * ... and of course other regular contributors at https://bugzilla.wikimedia.org/ willing to get involved in a more structured way.

Consolidating a testing team
We need to identify and empower those experienced in testing and QA, and those experienced in Wikimedia software and community, and let them lead.

We must build a healthy meritocratic structure, with a dose of fun and incentives for those making great progress and helping others progress as well.

Based on this, we have another view of the profiles required:


 * Senior testers - can help and teach others.
 * Organizers - can increase the quantity and quality of QA activities.
 * Connectors - can bridge QA volunteers' efforts with development teams.
 * Promoters - can help reach out to new volunteers.

Activities
In theory almost all combinations apply:


 * Testing / bug triaging.
 * Online / on-site.
 * DIY / team sprint.
 * Synchronous / asynchronous.

However, not all combinations are equally productive in every context or for every goal. For instance, a face-to-face team sprint requires a well-defined scope and goals, and heavy involvement from the development team. Individual tasks can provide great results as long as they are not tied to urgent deliveries or critical paths.

We need good documentation so that at least these cases can be replicated efficiently:


 * Online testing sprint: how to organize, announce, perform, evaluate.
   * Proposed: right after deployment of new MediaWiki versions to non-Wikipedias.
   * Proposed: right after feature deployments.
   * Note: this requires availability of effective announcements and release notes.
   * See Mozilla Test Days, Fedora Test Days and our own Weekend Testing Americas held on 2012-05-05 and Article Feedback Testing on 2012-06-09.
 * Individual testing: tasks that a person can perform and report about anytime / anywhere.
 * Individual bug triaging: reports to look at and instructions to improve their status.

Reaching out
We need to go beyond sporadic, isolated efforts and build a continuous, incremental flow of activities. The success of each activity must contribute to future successes.

We need to let people know about ongoing / DIY opportunities as well as events. We need to reach out to the current MediaWiki / Wikimedia communities as well as to external groups and potential new contributors.


 * Calendar: a central place where activities are announced. It should be possible to subscribe and receive notifications of new activities.
 * QA communities: reaching out and having processes in place to promote our activities.
 * Work with promoters to spread the news.
 * Contact companies testing Wikipedia in their products e.g. browser developers.
 * Organize on-site activities engaging local groups.

Follow-up activities
Testing events require a follow-up to:


 * Evaluate and announce the outcome.
 * Triage and process the feedback received into the regular development flow.
 * Keep the contributors engaged.
 * Warm up for the next event.

For instance, it is a good idea to organize an online bug triaging sprint after a testing event.

Measuring success
Team activities:
 * Number of events.
 * Diversity of events across development areas, promoters and locations.
 * Number of participants.
 * Effectiveness in welcoming new contributors.
 * Effectiveness in retaining and promoting current contributors.
 * Quantity and quality of reports created / triaged.
 * Impact on software releases and Bugzilla statistics.
 * Contributions to the goals of the Wikimedia Foundation.
 * Feedback from the related development team.

Individual activities:
 * How easy it is to make a first contribution, and how long it takes.
 * Positive / negative feedback, complaints, bugs.
 * Individual contributors showing up in community channels and team activities.
 * Statistics on individual contributors (to be defined).
 * QA contributor retention.
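
Since the statistics on individual contributors are still to be defined, the following is only a sketch of how QA contributor retention and newcomer counts could be derived from per-event participant lists. The event names, the user names, and the definition of retention used here (the share of one event's participants who return for the next event) are assumptions for illustration, not an agreed metric.

```python
def retention_between_events(attendance):
    """Return (previous_event, next_event, rate) for consecutive events.

    `attendance` maps event name -> set of participant names, in
    chronological order. The rate is the share of the previous event's
    participants who also took part in the next one (an assumed
    definition; the strategy leaves this metric open).
    """
    events = list(attendance.items())
    results = []
    for (prev_name, prev_people), (next_name, next_people) in zip(events, events[1:]):
        if not prev_people:
            rate = 0.0  # nobody to retain; avoid division by zero
        else:
            rate = len(prev_people & next_people) / len(prev_people)
        results.append((prev_name, next_name, rate))
    return results


def newcomers(attendance):
    """Return event name -> set of participants not seen in any earlier event."""
    seen = set()
    first_timers = {}
    for name, people in attendance.items():
        first_timers[name] = set(people) - seen
        seen |= set(people)
    return first_timers


# Hypothetical data: participant lists for three events, oldest first.
sample = {
    "testing-sprint-1": {"alice", "bob", "carol", "dave"},
    "triage-sprint-1": {"alice", "carol", "erin"},
    "testing-sprint-2": {"alice", "erin", "frank"},
}

for prev_name, next_name, rate in retention_between_events(sample):
    print(f"{prev_name} -> {next_name}: {rate:.0%} retained")
```

Keeping the input as plain participant sets means the same functions could be fed from event sign-up pages or from Bugzilla activity, whichever source ends up being used.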

Community incentives
To be defined. Some ideas:


 * Tester barnstar.
 * "I test Wikipedia" shirt.
 * Sponsored training e.g. AST courses.
 * Sponsored travel to Wikimania.

Test automation

 * We have a huge dev-to-QA ratio, so the work will have to be done mostly by devs.
 * Selenium is more Chris's area. Write tests that are robust and will last for a few years.
 * How do we make it easy to write a unit test?
 * Do the tests run against a production or a test environment? Example: you can test ProofreadPage, 99%, without being logged in, and it's always there.  And Labs is not robust enough.  Doesn't return pages fast enough.  Labs needs a reliable automated deploy process.
 * Think about UploadWizard, which is fragile and breaks every release! Need a Selenium test for that, or is this too deep for automation?  There would be at least 1 basic test.  Core piece of Commons, which has strategic importance.  But how far do we go?  Every media type?  No. Just get a basic test: get logged in, make 2 or 3 styles of edits, upload an image, 1 or 2 for Wikisource, 1 or 2 for Wikinews, a FlaggedRevs test
 * Note that any test involving logging in will have to involve a user that can't do damage, since the user/password will probably be public. Also note that any automated browser test that uses files on disk (like for uploading) will be subject to a whole lot of extra maintenance, since the automated test should run in/from any suitable environment: labs, production, local.
 * If you're concerned about a path through an app, that's a good candidate for a browser test. UploadWizard is a great candidate.  But the tough thing is managing files on disk.
 * A zillion little elements that each do a different thing in the UI are tough. Example: ProofreadPage. It is hard to test with a unit test but makes sense with a browser test.
 * Focusing - what tests are easier to write and maintain over time? PP - we can really use a minimal level of smoke testing.  Focus on what a lot of people see but we don't get reports on.  Like, major breakage in IE7.  People just leave and do not come back.
 * What do we use? We can't use the VMs in the office. Something like Sauce is probably ultimately our best bet.  Biggest concern: getting useful tests in place in the first place.  Then figure out where to host them.
 * Ruby still makes a lot of sense.
 * Selenium tests in Ruby? Selenium WebDriver is in the process of being approved by W3C as a standard - IE, Firefox, Chrome, & Opera have signed on. Languages: Python, C#, Java, & Ruby.  The Python stuff (see Mozilla for example) requires a lot of custom scaffolding we'd have to write ourselves.
 * Outside unit tests & Selenium tests, are we looking into any other kind of testing? Benchmark or smoke tests? We don't have reliable ongoing data on performance on, e.g., uploads.
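
As one possible shape for the minimal smoke and performance checks raised in these notes, the sketch below fetches a page, confirms that an expected piece of text is present, and compares the response time against a budget. It is written in Python with the standard library only, purely for illustration. The function names, the 3-second budget, and the example URL are assumptions; this is not a browser test and does not replace the Selenium tests discussed above.

```python
import time
import urllib.request


def fetch(url, timeout=10):
    """Fetch a page and return (HTTP status, body bytes)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, response.read()


def smoke_check(url, must_contain, budget_seconds=3.0,
                fetcher=fetch, clock=time.perf_counter):
    """Report whether a page loaded, contained the expected text, and was fast enough.

    `fetcher` and `clock` are injectable so the check can be exercised
    without network access.
    """
    started = clock()
    try:
        status, body = fetcher(url)
    except OSError as error:  # urllib's URLError and HTTPError derive from OSError
        return {"url": url, "ok": False, "reason": f"fetch failed: {error}"}
    elapsed = clock() - started
    if status != 200:
        reason = f"unexpected status {status}"
    elif must_contain.encode("utf-8") not in body:
        reason = "expected text missing"
    elif elapsed > budget_seconds:
        reason = f"too slow: {elapsed:.2f}s"
    else:
        reason = ""
    return {"url": url, "ok": reason == "", "reason": reason, "seconds": elapsed}


# Stand-in fetcher so the sketch runs offline; a real run would use `fetch`.
def fake_fetcher(url):
    return 200, b"<html><body>Page:Example</body></html>"


result = smoke_check("https://example.org/wiki/Page:Example", "Page:Example",
                     fetcher=fake_fetcher)
print(result["ok"], result["reason"])
```

Recording the `seconds` value from each run over time would also give a rough start on the ongoing performance data that is currently missing.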


 * Consider test case management tools, like Litmus and MozTrap?
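
Two constraints from the notes above are that any login must use an account that cannot do damage, and that upload tests should run from labs, production, or a local machine without depending on files kept on disk. One way to approach both is sketched below. Python is used only for illustration (the notes lean towards Ruby for the browser tests themselves), and the environment variable names `QA_BASE_URL` and `QA_TEST_USER`, the default values, and the helper names are all hypothetical. The upload file is generated at run time, so no fixture has to be maintained per environment.

```python
import os
import struct
import tempfile
import zlib


def load_settings(env=os.environ):
    """Read target-specific settings from the environment (variable names are hypothetical)."""
    return {
        # Which wiki to test: labs, production, or a local install.
        "base_url": env.get("QA_BASE_URL", "http://localhost/wiki/"),
        # Must be an account that cannot do damage; its credentials may be public.
        "user": env.get("QA_TEST_USER", "Selenium_user"),
    }


def _png_chunk(kind, data):
    """Build one PNG chunk: length, type, data, then CRC over type + data."""
    body = kind + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body) & 0xFFFFFFFF)


def make_test_png(path, width=1, height=1):
    """Write a tiny solid red RGB PNG so upload tests need no stored fixture."""
    # IHDR: width, height, bit depth 8, colour type 2 (RGB), then
    # compression, filter and interlace methods all 0.
    header = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    # Each scanline is a filter byte (0) followed by one red pixel per column.
    raw = (b"\x00" + b"\xff\x00\x00" * width) * height
    png = (
        b"\x89PNG\r\n\x1a\n"
        + _png_chunk(b"IHDR", header)
        + _png_chunk(b"IDAT", zlib.compress(raw))
        + _png_chunk(b"IEND", b"")
    )
    with open(path, "wb") as handle:
        handle.write(png)
    return path


settings = load_settings()
with tempfile.TemporaryDirectory() as workdir:
    upload_path = make_test_png(os.path.join(workdir, "qa-upload-test.png"))
    # A browser test would hand `upload_path` to the upload form here.
    print(settings["base_url"], os.path.getsize(upload_path))
```

The temporary directory is removed when the block exits, so the same test can run repeatedly in any environment without leaving files behind.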

Follow up: Jenkins