Quality Assurance/Review February 2013

Follow up from QA Review
Agenda: Discussion to capture items that need followup by/at next quarterly review.

Followups (meetings/whatever):
 * Manual database updates in beta cluster: Can we automate this, or add it to checklist? Discussion with Chris, Antoine, Rob, maybe Asher.
 * Prioritizing both test environment work and exploratory testing/automation targets. Howie, Rob, Chris
 * Getting Search working in Beta: Ram, Antoine, Chris
 * Community test strategy - including success metrics
 * Useful browser testing on beta (db is correct, policies support beta -- all code gets tested on beta before it hits a production branch)
 * Greater visibility of browser test results (e.g. Jenkins integration, getting people set up on CloudBees, IRC notifications)
 * Metrics for browser testing -> bugs

Need to ask Erik which conversations he wants to make sure we have, and need to spend

Staffing ask for 2013-14
 * More help on test environments? Beta
 * More help on exploratory testing -- what is the right ratio of product to QA?
 * More help on volunteer QA management? Depends on performance of activities in the remainder

Tabled for this quarter:
 * "Cluster-in-a-box" - Labs doesn't yet give you a ready made MediaWiki instance with all of the bells and whistles. It won't for a little while.

Notes from QA Quarterly review - 2013-02-08
Slides are up as "QA Review Feb 2013"

Overview of testing areas (11:10-11:30 PST)

Questions on each area
 * Test environments (11:30-11:45)
 * Exploratory/manual testing (11:45-12:00)
   * What do testers get from us ahead of time? Test charter.
   * How do they submit bug reports? Mostly BZ.
   * Should we focus exploratory testing energy on AFT, given that it will probably never be at 100% on en.wp? Yes, because of fr & de.  But balance it with other needs.
 * Browser test automation (12:00-12:15)
   * Right now, test failures get reported at a public Jenkins instance? Emails get sent to relevant people?
   * wmf.ci.cloudbees.com
   * Can those errors get reported in Gerrit against the changeset that caused them? That takes a lot of investigation.
 * Volunteer engagement (12:15-12:30)
   * How do we get visibility to the community re different testing activities?
   * How large a community of testers are we aiming for? We want a mix of regulars and drop-ins, maybe 50 each week total.  We're far from that right now.
   * What motivation strategies? 1) Involvement from teams working on cool stuff. 2) Motivating community members to get involved (didn't catch specifics).

Detailed discussion topics:
 * Getting prototype branches on beta cluster - how?
   * Change policy?
   * Change the code that does deployments to support more than just the master branch
   * Well, why not just merge into master, to avoid long-lived branches?
   * Maybe the heart of this problem is that standalone single-instance prototype wikis are too hard to set up
 * Database changes
   * In production they are managed manually; in beta they are not handled and the updater is not being run
   * Smaller Platform group to handle this question
 * On the beta cluster, supporting more extensions
   * MF enabled; everything supporting that should be approved/merged next week (varnish etc.); next sprint will be squid; SSL stuff is deferred to a future sprint
   * What next? VisualEditor? E3?
   * How do we decide what to prioritize? Stuff where it's useful to test before production.
 * What's the most fragile piece of core infrastructure, other than mobile, where we wish we'd been catching bugs sooner? Suggestions:
   * caching (Antoine says, "like 99% of our infrastructure is about caching so caching is the top root cause of issues :-]")
   * search <--- Lucene could be set up in beta
   * jobqueue
   * thumbnails
   * Lucene
   * bits.wikimedia.org
   * parsercache
   * ResourceLoader
   * swift
 * Lots of this is Platform/DevOps. Features are often isolated or rapidly developed enough that we don't need a sophisticated test stage.  But VE will get there eventually.  Target that for Q2 of this calendar year?  (default deployment is by July 1 2013)  But talk to their team first!
 * TODO: Chris & RobLa to talk about how to do this prioritization
 * Can we get subject matter experts to guide testing efforts by writing "how this should work" descriptions?
 * Can we actually get volunteer testers from the Wikimedia community?
   * Let's stop trying to do outreach that scales and start doing personal one-to-one outreach in emails and on talk pages
   * Feb 24 - plan for mobile photo upload test event
   * One strategy: try to replicate what the localisation community is doing. We iterate...  A hybrid of individual outreach & building an ongoing community.
   * Do we need more communications channels? We need to talk onwiki.  Maybe run a query to find users who have edited VP:T or software-testing-related pages, and post to their talk pages.
 * Where should we be targeting our automated test energy: test2? The beta cluster? Git/Gerrit/Jenkins? If we want automated test results to show up automatically someplace public -- so that people care about adding to the test suite and failures get noticed & investigated -- how do we focus on that?
   * That would be expensive.
   * We could run them after every commit. Ideal situation: every commit/merge to master starts a Jenkins job, creates a MW instance, triggers unit tests & linting, and after that, browser tests.  Browser tests are ready for that.  We do that for MF.
   * Takes about 10 min; the more tests we add, the longer these suites will take.
   * If a test fails, it would be nice to have the same VM available for investigation.
   * Interim step: go to beta on a periodic basis. Stop updates to beta, run the tests, then resume updates.
   * Or merge the features into an integration branch; test it out, and if OK, roll to production.
   * All the code is public & mirrored on GitHub []
 * Erik is asking for the "every day we perform these x tests against y target" overview -- will follow up later.
 * Test results: []
 * []
 * Could we have prevented building the wrong tool / getting a bad reception?
   * In product development decisions... when we engage volunteers, let's regard them as people using the product & giving feedback.
   * AFT actually had very high community engagement compared to many other WMF engineering projects.
   * This kind of product management work -- putting it on QA seems a little wrong; community liaising on this should be done, but by the product team.
   * But if we notice feedback-style things along the way we should feed them back into product.
 * Can we respond to bugs that we find?
   * Depends on the project; AFT is low-staffed, Echo is high-staffed.
 * How do we decide what to focus exploratory testing energy on? How do we balance it on a per-area basis?  How do we get those teams to accept the help?
   * If it's Chris going around and asking "who wants QA," that may lead to skewed priorities; Fabrice's responsiveness may have led to an overemphasis on AFT.
   * It does depend on where in the lifecycle the team is. The product manager does need to have discretion to say "doesn't work right now."
   * Product should look at the QA level desired...
   * Example: the Account Creation UX (ACUX) redesign. Has run as an experiment on en.wp since December.  Increased conversion rates.  Now we want to productize it.  Not quite solid yet.  Issues with form validation - unknown to us.  Legit to make a case to SWalling that pre-deployment is a good time to run tests.
   * We need to sync more.
   * But there's a cost to Product Managers in working with QA to develop requirements documents.
   * Should this be less opportunistic, more deliberate & formal?
     * Yes, & include dev leads, product managers, & Chris.
     * Specifically, let's talk Echo & Flow.
 * TODO: Chris, RobLa, Howie, & some others to develop an actual process for prioritizing exploratory testing
 * We will talk about similar topics in the next quarterly review. Hope we've made progress by then.

Budget question for 2013-2014: can you foreshadow needing any additional resources?
 * Test environment support.

Post meeting notes:
 * Need metrics for browser testing ->bugs
 * Integrate

Old agenda
My mandate was and is to provide exploratory testing and browser test automation.

QA testing is a service provided to software development projects. Software development teams have a QA testing practice for the same reason that hockey teams have goalies. You can play hockey without a goalie, but the chance of the other team scoring is a lot higher.

QA testing serves software development projects, and software development projects serve the goals of the Foundation. Testing improves the value of these projects: https://strategy.wikimedia.org/wiki/Product_Whitepaper#Product_priority_recommendations

Software development is only a part of the Wikipedia movement. Of the goals of the movement, QA testing serves to help stabilize infrastructure and also to encourage innovation. https://strategy.wikimedia.org/wiki/Strategic_Plan/Movement_Priorities#Goals

Initial challenges:


 * Earn the trust of software development projects by providing value and improving quality
 * Create and maintain useful test environments
 * Provide guidance and documentation of good testing practice, both exploratory/manual testing and automated testing

Who we are:

Testers:
 * Chris McMahon (QA Lead, January 2012)
 * Željko Filipin (Test automation, October 2012)
 * Michelle Grover (Mobile testing, October 2012)

Community:
 * Quim Gil (Technical contributors, November 2012)
 * Sumana Harihareswara (Engineering community)

Support:
 * Antoine Musso (WMF Jenkins, unit testing, beta labs)
 * Andre Klapper (Bug Wrangler, October 2012)

Teams we work with:

Closely:
 * Editor Engagement
 * Mobile
 * Platform
 * Features (UploadWizard, etc.)

Less closely but gaining:
 * Visual Editor
 * Language

Teams we'd like to work with:
 * E3
 * Yours?

Test environments
Test environments are a precondition for testing.

Then:
 * Most testing was done in production, mostly reacting to user reports of issues
 * testwiki/test2wiki were underused and crufty
 * beta labs was a hot mess under the covers, unreliable and unmaintainable

Now:
 * test2 is fully utilized as a target for automated regression tests, run in a timely manner in the course of regular deployment
 * much of the cruft on test2 has been addressed
 * beta labs is puppetized, maintainable, and communicates with git
 * beta labs hosts the testable AFTv5 code, primarily as a target for automated tests

Future:
 * beta labs to host testable MobileFrontend
 * beta labs better served by Jenkins
 * policy changes so beta will serve development projects better (i.e. find a way to deploy useful extension code without merging to master)

Lessons learned:
 * Reliable test environments require investment and maintenance
 * Policies governing what is and is not deployed to test environments are difficult and may need to shift over time

Input wanted:
 * More focus and more support on beta labs, more extensions supported and configured. MobileFrontend is underway, but we could use more.
 * Database updates are particularly thorny right now. These are always done manually in production deployments, and beta is languishing.  Ideally this could be handled as part of the move to a more DevOps/continuous-deployment style.
 * We really need a policy and infrastructure that will support experimental features and extensions; the way forward is not clear right now.

Three month timeline:
 * MobileFrontend on beta labs
 * Fix test2wiki so that PageTriage works so we can get some automated tests for NewPagesFeed workflow. https://bugzilla.wikimedia.org/show_bug.cgi?id=44065
 * Work on supporting E3 in beta labs
 * Work on supporting experimental versions of AFT on beta labs
 * Possibly get more deploy builds into Jenkins and out of daemon/cron jobs
 * Eye toward testing DevOps work in beta labs

Exploratory/human/manual testing
Then:
 * no organized or dedicated testing
 * no testing community
 * existing test plans were scattered and not well considered
 * little or no community outreach or communication

Now:
 * early Proof Of Concept to test rapid deployment didn't work too well; the mandate was too broad and too complicated
 * POC was in conjunction with the Weekend Testers Americas group and the Telerik Test Summit conference
 * early POC to test AFTv5 was successful, important issues identified that influenced the course of the project
 * POC was in conjunction with OpenHatch and involved people from Weekend Testers who remained excited about the exercise
 * hiring Quim Gil moves community testing forward with scheduled events and Groups
 * good documentation and examples exist on mediawiki.org

Future:
 * Build the testing community, continue to uncover issues in software and features
 * Make the community self-sustaining to the extent possible
 * Create a framework within which others may contribute besides just testing, for example volunteer-organized test events, volunteer test plans, bug management in particular areas.

Lessons learned:
 * Test events require narrow focus and clear intent
 * Scheduling test events is not trivial. Software development projects shift priorities frequently and stakeholders have misunderstandings and conflicting goals.
 * Adequate test environments are not always available and sometimes require preliminary work and investment.
 * "Community is harder than most people think." -Marlena Compton, formerly of Mozilla WebQA, in a private conversation

Input wanted:
 * SMEs to be involved in guiding test efforts
 * Outreach to potential volunteer testers, particularly from the Wikipedia community (as opposed to the greater software testing community)

Three month timeline:
 * Test event for AFT
 * Test event for Echo
 * Test event for Mobile
 * Community outreach and marketing.
 * Move testing Groups out of "Proposals"

Browser test automation
Then:
 * at least one browser test automation project failed after significant investment
 * tools and practice were primitive and very expensive
 * other browser test initiatives were scattershot, not maintainable, not standard, used inferior tools and practices

Now:
 * significant research into best available tools and practice
 * Test Automation Bazaar and Telerik Test Summit conferences were invaluable
 * proof of concept made public on GitHub, sparked discussion in the community (best example: http://watirmelon.com/2012/06/22/rspec-page-objects-and-user-flows/)
 * hired Zeljko and the project took off. Bottom line: our browser tests find issues.  Regression problems with UW and cross-browser issues found when testing IE versions are the biggest success so far
 * inexpensive hosted services are key
 * scalable with Jenkins on cloudbees and Sauce Labs
 * good reporting
 * maintainable architecture
 * support for mobile
 * works with our deployment practice, not against it
 * our framework is best-of-breed, absolutely world-class
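The "maintainable architecture" point above is largely the page-object pattern discussed at the watirmelon link earlier in this section. Here is a minimal pure-Ruby sketch; the class and method names are illustrative only (not from the actual suite), and the element IDs assume MediaWiki's standard login form:

```ruby
# Page-object sketch (hypothetical names, not the real test suite).
# Element locators live in one class per page, so tests read as user
# flows and a markup change is fixed in exactly one place.
# `browser` is any object that can find fields and buttons by id
# (Watir's browser object in a real suite; a stub works for a demo).
class LoginPage
  def initialize(browser)
    @browser = browser
  end

  # One user-flow step: fill in both fields and submit the form.
  # wpName1 / wpPassword1 / wpLoginAttempt are MediaWiki's
  # login-form element ids.
  def log_in_as(username, password)
    @browser.text_field('wpName1').set(username)
    @browser.text_field('wpPassword1').set(password)
    @browser.button('wpLoginAttempt').click
  end
end
```

A test would then say `LoginPage.new(browser).log_in_as('Alice', 'secret')` instead of repeating three locators in every scenario, which is what keeps a growing suite maintainable.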

Future:
 * More test coverage! More features tested, more depth to existing tests
 * Build up the backlog of tests to be automated
 * Contributions from extensions people, product people, community
 * Cucumber opens a communication channel between automaters and non-programmers
 * Code contributions from programmers

Lessons learned:
 * Expertise in the area is required. Starting browser test automation is easy; maintaining it is less so.
 * Good tools and good practice attract interest
 * Expanding test coverage takes significant time and effort
 * Automated tests have shown that we don't support IE very well

Input wanted:
 * Test scenarios. I would like to have a large backlog of tests to be automated.
 * Developer interest in creating browser tests
 * Community contributions from competent browser test developers

Three month timeline:
 * First Visual Editor basic test (we don't want to invest too much here since VE is subject to radical changes, but we want to be able to move quickly as VE matures)
 * First PageTriage test (depends on test environment timeline above)
 * First test to use the API, probably for delete/restore test
 * Check for JavaScript errors when opening pages
 * More test scenarios in the backlog!
 * First public events for browser testing will be education about creating test scenarios, explaining Given/When/Then syntax for Cucumber
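For anyone new to Cucumber, Given/When/Then scenarios read like plain-language test scripts. A hypothetical sketch (the feature and step wording here are illustrative only, not taken from our suite):

```gherkin
Feature: Log in
  As a registered user
  I want to log in
  So that my edits are attributed to me

  Scenario: Successful login
    Given I am on the login page
    When I enter valid credentials
    Then I should see my username in the personal toolbar
```

Each Given/When/Then line maps to a step definition written by a programmer, which is what lets non-programmers propose and review test scenarios.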

Volunteer Engagement
Groups we've worked with:


 * Weekend Testers Americas
 * OpenHatch
 * Software Testing Club

Far future
It is a historical accident that software testing is erroneously conflated with Quality Assurance in the software development arena. Testing is not QA.

Software testing is "...a process of gathering information about (the system) with the intent that the information could be used for some purpose." (-Gerald Weinberg) It is a process of investigation and reporting.

Quality Assurance, on the other hand, is methodology work. It is the examination of processes for the purpose of improving those processes.

As testing becomes more trusted and more routine, I would like to spend more effort improving the software development process itself rather than just investigating software behavior.