Quality Assurance/Review February 2013

Follow up from QA Review

Agenda: Discussion to capture items that need followup by/at next quarterly review.

Followups (meetings/whatever):

Manual database updates in beta cluster: Can we automate this, or add it to checklist? Discussion with Chris, Antoine, Rob, maybe Asher.
Prioritizing both test environment work and exploratory testing/automation targets. Howie, Rob, Chris
Getting Search working in Beta: Ram, Antoine, Chris
Community test strategy - including success metrics
Useful browser testing on beta (db is correct, policies support beta-- all code gets tested on beta before it hits a production branch)
Greater visibility of browser test results (e.g. Jenkins integration, getting people set up on CloudBees, IRC notifications)
Metrics for browser testing ->bugs

Need to ask Erik which conversations he wants to make sure we have, and need to spend

Staffing ask for 2013-14

More help on test environments? Beta
More help on exploratory testing -- what is the right ratio of product to QA?
More help on volunteer QA management? Depends on performance of activities in the remainder

Tabled for this quarter:

"Cluster-in-a-box" - Labs doesn't yet give you a ready made MediaWiki instance with all of the bells and whistles. It won't for a little while.

Notes from QA Quarterly review - 2013-02-08

Slides are up as "QA Review Feb 2013"

Overview of testing areas (11:10-11:30 PST)

Questions on each area

Test environments (11:30-11:45)
Exploratory/manual testing (11:45-12:00)
- What do testers get from us ahead of time? test charter.
- How do they submit bug reports? mostly BZ.
- Should we focus exploratory testing energy on AFT, given that it will probably never be at 100% on en.wp? Yes, because of fr & de. But balance it with other needs.
Browser test automation (12:00-12:15)
- Right now, test failures get reported at a public Jenkins instance? emails get sent to relevant people?
- wmf.ci.cloudbees.com
- Can those errors get reported in Gerrit against the changeset that caused them? takes a lot of investigation
Volunteer engagement (12:15-12:30)
- How do we get visibility to the community re different testing activities?
- How large a community of testers are we aiming for? We want a mix of regulars and drop-ins, maybe 50 each week total. We're far from that right now.
- What motivation strategies? 1) involvement from teams working on cool stuff. 2) Motivating community members to get involved (didn't catch specifics)

Detailed discussion topics:

getting prototype branches on beta cluster - how?
- change policy?
- change code that does deployments to support more than just master branch
- Well, why not just merge into master, to avoid longlived branches?
- maybe the heart of this problem is that standalone single-instance prototype wikis are too hard to set up.
database changes
- in production they are managed manually. In beta not handled and updater is not being run.
- Smaller Platform group to handle this question
On beta cluster, supporting more extensions
- MF enabled, everything supporting that should be approved/merged next week (varnish etc.); next sprint will be squid; SSL stuff is deferred to a future sprint
- What next? VisualEditor? E3?
  - How do we decide what to prioritize? Stuff where it's useful to test before production.
  - What's the most fragile piece of core infrastructure, other than mobile, where we wish we'd been catching bugs sooner? Suggestions:
    - caching (Antoine says, "like 99% of our infrastructure is about caching so caching is the top root cause of issues :-]")
    - search <--- Lucene could be set up in beta
    - jobqueue
    - thumbnails
    - Lucene
    - bits.wikimedia.org
    - parsercache
    - ResourceLoader
    - swift
  - Lots of this is Platform/DevOps. Features are often isolated or rapidly developed enough that we don't need a sophisticated test stage. But VE will get there eventually. Target that for Q2 of this calendar year? (default deployment is by July 1 2013) But talk to their team first!
  - TODO: Chris & RobLa to talk about how to do this prioritization
- Can we get subject matter experts to guide testing efforts by writing "how this should work" descriptions?
- Can we actually get volunteer testers from Wikimedia community?
  - Let's stop trying to do outreach that scales and start doing personal one-to-one outreach in emails and talkpages
  - Feb 24 - plan for mobile photo upload test event
  - One strategy: try to replicate what the localisation community is doing. We iterate... hybrid of individual outreach & building an ongoing community
  - Do we need more communications channels? We need to talk onwiki. Maybe run a query to find users who have edited VP:T or software testing-related pages, and post to their talk pages.
- Where we should be targeting our automated test energy: test2? beta cluster? Git/Gerrit/Jenkins? If we want automated test results to automatically show up someplace public to be effective in getting people to care about adding to the test suite & to get those failures noticed & investigated, how do we focus on that?
  - That would be expensive.
  - We could run them after every commit. Ideal situation: every commit/merge to master starts a Jenkins job, creates a MW instance, triggers unit tests & linting, and after that, browser tests. Browser tests are ready for that. We do that for MF.
    - Takes about 10 min.... the more tests we add, the longer these suites will take.
    - If test fails, would be nice to have the same VM available for investigation.
  - Interim step: go to beta on a periodic basis. Stop updates to beta. run tests, resume them
  - or merge the features in an integration branch. Test it out, if ok, roll in production
  - All the code is public & mirrored in GitHub [[1]]
    - Erik is asking for the "every day we perform these x tests against y target" overview -- will follow up later.
    - Test results: [[2]]
      - [[3]]
- Could we have avoided building the wrong tool, or prevented a bad reception?
  - in product development decisions.... when we engage volunteers, let's regard them as people using the product & giving feedback
  - AFT had actually very high community engagement compared to many other WMF engineering projects.
  - This kind of product management work -- putting it on QA seems a little wrong, community liaising on this should be done but should be by product team
    - but if we notice feedback-style things along the way we should feed it back into product
- Can we respond to bugs that we find?
  - Depends on the project; AFT is low-staffed, Echo is high-staffed
- How do we decide what to focus exploratory testing energy on? How do we balance it on a per-area basis? How do we get those teams to accept the help?
  - If it's Chris going around and asking "who wants QA," that may lead to skewed priorities; Fabrice's responsiveness may have led to overemphasis on AFT
  - It does depend on the place in the lifecycle the team is in. Product manager does need to have discretion to say "doesn't work right now"
  - Product should look at QA level desired...
    - Example: Account Creation UX (ACUX) redesign. Ran as experiment on en.wp since Dec. Increased conversion rates. Now, want to productize it. Not quite solid yet. Issues with form validation - unknown to us. Legit to make a case to SWalling that before deployment it's a good time to run tests.
    - We need to sync more.
    - But there's a cost to Product Managers to working with QA to develop requirements documents.
  - Should this be less opportunistic, more deliberate & formal?
    - Yes, & include dev leads, product managers, & Chris.
    - Specifically, let's talk Echo & Flow.
    - TODO: Chris, RobLa, Howie, & some others to develop an actual process for prioritizing exploratory testing
We will talk about similar topics in the next quarterly review. Hope we've made progress.
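
As a rough illustration of the ideal per-commit flow raised in the discussion above (create a MediaWiki instance, run unit tests and linting, then browser tests), here is a minimal Python sketch. The stage names and the placeholder callables are assumptions made for illustration; they stand in for real Jenkins job steps and do not reflect the actual WMF Jenkins configuration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A stage is a (name, callable) pair; the callable returns True on success.
Stage = Tuple[str, Callable[[], bool]]


@dataclass
class PipelineResult:
    passed: bool = True
    stages_run: List[str] = field(default_factory=list)
    failed_stage: Optional[str] = None


def run_pipeline(stages: List[Stage]) -> PipelineResult:
    """Run stages in order and stop at the first failure.

    Stopping early means browser tests (the slowest stage) only run
    once the cheaper checks have passed.
    """
    result = PipelineResult()
    for name, stage in stages:
        result.stages_run.append(name)
        if not stage():
            result.passed = False
            result.failed_stage = name
            break
    return result


# Hypothetical stages; each lambda is a placeholder for a real job step.
STAGES: List[Stage] = [
    ("create_mediawiki_instance", lambda: True),
    ("unit_tests", lambda: True),
    ("lint", lambda: True),
    ("browser_tests", lambda: True),
]

if __name__ == "__main__":
    outcome = run_pipeline(STAGES)
    print(outcome.passed, outcome.stages_run, outcome.failed_stage)
```

Recording the failed stage by name is one simple way to point an investigator at the right step; keeping the same VM around for investigation, as suggested in the notes, would be a separate concern.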

Budget question for 2013-2014: can you foreshadow needing any additional resources?

Test environment support.

Post meeting notes:

Need metrics for browser testing ->bugs
Integrate

Old agenda

My mandate was and is to provide exploratory testing and browser test automation.

QA testing is a service provided to software development projects. Software development teams have a QA testing practice for the same reason that hockey teams have goalies. You can play hockey without a goalie, but the chance of the other team scoring is a lot higher.

QA testing serves software development projects, and software development projects serve the goals of the Foundation. Testing improves the value of these projects: https://strategy.wikimedia.org/wiki/Product_Whitepaper#Product_priority_recommendations

Software development is only a part of the Wikipedia movement. Of the goals of the movement, QA testing serves to help stabilize infrastructure and also to encourage innovation. https://strategy.wikimedia.org/wiki/Strategic_Plan/Movement_Priorities#Goals

Initial challenges:

Earn the trust of software development projects by providing value and improving quality
Create and maintain useful test environments
Provide guidance and documentation of good testing practice, both exploratory/manual testing and automated testing

Who we are:

Testers:

Chris McMahon (QA Lead, January 2012)
Željko Filipin (Test automation, October 2012)
Michelle Grover (Mobile testing, October 2012)

Community:

Quim Gil (Technical contributors, November 2012)
Sumanah Harihareswara (Engineering community)

Support:

Antoine Musso (WMF Jenkins, unit testing, beta labs)
Andre Klapper (Bug Wrangler, October 2012)

Teams we work with:

Closely:

Editor Engagement
Mobile
Platform
Features (UploadWizard, etc.)

Less closely but gaining:

Visual Editor
Language

Teams we'd like to work with:

E3
Yours?

Test environments

Test environments are a precondition for testing.

Then:

Most testing was done in production, mostly reacting to user reports of issues
testwiki/test2wiki were underused and crufty
beta labs was a hot mess under the covers, unreliable and unmaintainable

Now:

test2 is fully utilized as a target for automated regression tests, run in a timely manner in the course of regular deployment
much of the cruft on test2 has been addressed
beta labs is puppetized, maintainable, and communicates with git
beta labs hosts the testable AFTv5 code, primarily as a target for automated tests

Future:

beta labs to host testable MobileFrontend
beta labs better served by Jenkins
policy changes so beta will serve development projects better (i.e. find a way to deploy useful extension code without merging to master)

Lessons learned:

Reliable test environments require investment and maintenance
Policies governing what is and is not deployed to test environments are difficult and may need to shift over time

Input wanted:

More focus and more support on beta labs, more extensions supported and configured. MobileFrontend is underway, but we could use more.
- Database updates are particularly thorny right now. These are always done manually in production deployments, and beta is languishing. Ideally this could be done as part of the move to a more DevOps/Continuous Deployment style.
We really need a policy and infrastructure that will support experimental features and extensions; the way forward is not clear right now.

Three month timeline:

MobileFrontend on beta labs
Fix test2wiki so that PageTriage works so we can get some automated tests for NewPagesFeed workflow. https://bugzilla.wikimedia.org/show_bug.cgi?id=44065
Work on supporting E3 in beta labs
Work on supporting experimental versions of AFT on beta labs
Possibly get more deploy builds into Jenkins and out of daemon/cron jobs
Eye toward testing DevOps work in beta labs

Exploratory/human/manual testing

Then:

no organized or dedicated testing
no testing community
existing test plans were scattered and not well considered
little or no community outreach or communication

Now:

early Proof Of Concept to test rapid deployment didn't work too well, mandate was too broad and too complicated
- POC was in conjunction with Weekend Testers Americas group and Telerik Test Summit conference
early POC to test AFTv5 was successful, important issues identified that influenced the course of the project
- POC was in conjunction with OpenHatch and involved people from Weekend Testers who remained excited about the exercise
hiring Quim Gil puts community testing forward with scheduled events and Groups
good documentation and examples exist on mediawiki.org

Future:

Build the testing community, continue to uncover issues in software and features
Make the community self sustaining to the extent possible
Create a framework within which others may contribute besides just testing, for example volunteer-organized test events, volunteer test plans, bug management in particular areas.

Lessons learned:

Test events require narrow focus and clear intent
Scheduling test events is not trivial. Software development projects shift priorities frequently and stakeholders have misunderstandings and conflicting goals.
Adequate test environments are not always available and sometimes require preliminary work and investment.
"Community is harder than most people think." -Marlena Compton, formerly of Mozilla WebQA, in a private conversation

Input wanted:

SMEs to be involved in guiding test efforts
Outreach to potential volunteer testers, particularly from the Wikipedia community (as opposed to the greater software testing community)

Three month timeline:

Test event for AFT
Test event for Echo
Test event for Mobile
Community outreach and marketing.
Move testing Groups out of "Proposals"

Browser test automation

Then:

at least one browser test automation project failed after significant investment
tools and practice were primitive and very expensive
other browser test initiatives were scattershot, not maintainable, not standard, used inferior tools and practices

Now:

significant research into best available tools and practice
- Test Automation Bazaar and Telerik Test Summit conferences were invaluable
proof of concept made public in github, sparked discussion in the community (best example: http://watirmelon.com/2012/06/22/rspec-page-objects-and-user-flows/)
hired Željko and the project took off. Bottom line: our browser tests find issues. Regression problems with UW and cross-browser issues found when testing IE versions are the biggest success so far
- inexpensive hosted services are the key
- scalable with Jenkins on cloudbees and Sauce Labs
- good reporting
- maintainable architecture
- support for mobile
- works with our deployment practice, not against it
our framework is best-of-breed, absolutely world-class

Future:

More test coverage! More features tested, more depth to existing tests
Build up the backlog of tests to be automated
- Contributions from extensions people, product people, community
- Cucumber opens a communication channel between automaters and non-programmers
Code contributions from programmers

Lessons learned:

Expertise in the area is required. Starting browser test automation is easy; maintenance is less so.
Good tools and good practice attract interest
Expanding test coverage takes significant time and effort
Automated tests have shown that we don't support IE very well

Input wanted:

Test scenarios. I would like to have a large backlog of tests to be automated.
Developer interest in creating browser tests
Community contributions from competent browser test developers

Three month timeline:

First Visual Editor basic test (we don't want to invest too much here since VE is subject to radical changes, but we want to be able to move quickly as VE matures)
First PageTriage test (depends on test environment timeline above)
First test to use the API, probably for delete/restore test
Check for JavaScript errors when opening pages
More test scenarios in the backlog!
- First public events for browser testing will be education about creating test scenarios, explaining Given/When/Then syntax for Cucumber
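
For readers unfamiliar with Given/When/Then, the sketch below shows what such a scenario can look like and how its steps break down. The scenario text is a made-up example, not one of our actual tests, and the tiny `parse_steps` helper is a hypothetical illustration in Python; it is not how Cucumber itself parses feature files.

```python
# A hypothetical scenario written in Given/When/Then form.
SCENARIO = """\
Feature: Log in
  Scenario: Registered user logs in
    Given I am on the login page
    When I submit a valid username and password
    Then I should see my username in the page header
"""

# Step keywords: Given sets up state, When performs an action,
# Then states the expected outcome. And/But continue the previous kind.
KEYWORDS = ("Given", "When", "Then", "And", "But")


def parse_steps(text):
    """Return (keyword, step text) pairs for each step line in the scenario."""
    steps = []
    for line in text.splitlines():
        stripped = line.strip()
        for keyword in KEYWORDS:
            if stripped.startswith(keyword + " "):
                steps.append((keyword, stripped[len(keyword) + 1:]))
                break
    return steps


if __name__ == "__main__":
    for keyword, step in parse_steps(SCENARIO):
        print(f"{keyword:5} | {step}")
```

The point for non-programmers is that the scenario itself is plain language: someone can contribute the three step lines without writing any of the automation behind them.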

Volunteer Engagement

Groups we've worked with:

Weekend Testers Americas
OpenHatch
Software Testing Club

Far future

It is a historical accident that software testing is erroneously conflated with Quality Assurance in the software development arena. Testing is not QA.

Software testing is "...a process of gathering information about (the system) with the intent that the information could be used for some purpose." (-Gerald Weinberg) It is a process of investigation and reporting.

Quality Assurance, on the other hand, is methodology work. It is the examination of processes for the purpose of improving those processes.

As testing becomes more trusted and more routine, I would like to spend more effort improving the software development process itself rather than only investigating software behavior.