Core Platform Team/Initiative/Add API integration tests/Epics, User Stories, and Requirements

Features, MVP:

 * the test runner executes each test case against the given API and reports any failures to comply with the expected results
 * tests are specified in yaml. Each test contains an interaction sequence consisting of request/response pairs (aka steps).
 * for each step, the request as specified in the interaction step is sent to the server, and the response received is compared to the response specified in the step.
 * Responses with a JSON body are compared to a JSON/YAML structure in the response part of the step. All keys present in the expected response must be present in the actual response. For primitive values, the actual value must match the expected value.
 * Expected primitive values can either be given verbatim (so the actual value is expected to be exactly the same), or yaml tags can be used to specify the desired type of match.
 * For the MVP, the only kind of match supported beyond equivalence is one based on regular expressions.
 * the test runner generates human readable plain text output
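
The matching rules above (every expected key must be present, primitives compared verbatim or by regular expression) can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the runner's implementation. The `Regex` class is a hypothetical stand-in for whatever YAML tag ends up marking a regex match, and the `$.key` path notation in the messages is also an assumption. Lists are compared strictly here, because the requirements do not say how they should be handled.

```python
import re
from dataclasses import dataclass


@dataclass
class Regex:
    """Hypothetical stand-in for a YAML-tagged 'match by regex' value."""
    pattern: str


def mismatches(expected, actual, path="$"):
    """Return human-readable mismatch descriptions (empty list if all match)."""
    if isinstance(expected, Regex):
        if isinstance(actual, str) and re.search(expected.pattern, actual):
            return []
        return [f"{path}: {actual!r} does not match /{expected.pattern}/"]
    if isinstance(expected, dict):
        if not isinstance(actual, dict):
            return [f"{path}: expected an object, got {actual!r}"]
        problems = []
        for key, value in expected.items():
            if key not in actual:
                problems.append(f"{path}.{key}: missing key")
            else:
                problems.extend(mismatches(value, actual[key], f"{path}.{key}"))
        # Keys present only in the actual response are deliberately ignored.
        return problems
    # Primitive values (and, by assumption, lists) must be equal.
    if expected != actual:
        return [f"{path}: expected {expected!r}, got {actual!r}"]
    return []
```

With this sketch, an expected structure of `{"edit": {"result": "Success"}}` accepts any response that contains at least that key and value, regardless of additional keys in the response.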

Features required to fulfill this project's goals:

 * support for variables
   * extracted from response
   * randomized
   * loaded from config file
 * support for cookies and sessions
 * discover tests by recursively examining directories
 * support fixtures
   * execute fixtures in order of dependency
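
Executing fixtures in order of dependency amounts to a topological sort. The following is a minimal sketch in Python (for illustration only), assuming each fixture declares the names of the fixtures it depends on. The mapping format and the function name `fixture_order` are hypothetical; `graphlib` is part of the Python standard library (3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def fixture_order(dependencies):
    """Order fixtures so that each one runs after everything it depends on.

    `dependencies` maps a fixture name to the names it depends on
    (a hypothetical format, not taken from the requirements).
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as error:
        # A cycle means no valid execution order exists.
        raise ValueError(f"circular fixture dependency: {error.args[1]}") from error
```

For example, if a page fixture needs a user, and the user needs the root account, `fixture_order({"page": {"user"}, "user": {"root"}, "root": set()})` yields the root account first and the page last.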

Features expected to become useful or necessary later, or for stretch goals:
(roughly in order of relevance)


 * discover tests specified by extensions (needs MW specific code?)
 * allow test suites to be filtered by tag
   * execute only fixtures needed by tests with the tag (or other fixtures that are needed)
 * allow for retries
 * allow tests to be skipped (preconditions)
 * support JSON output
 * Support running the same tests with several sets of parameters (test cases)
 * support for cookie jars
 * allow tests for interactions spanning multiple sites
 * support HTML output
 * parallel test execution
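
Filtering by tag while running only the fixtures that are actually needed could look roughly like the sketch below (Python, for illustration only). The `tags` and `fixtures` keys on a test, and the `fixture_deps` mapping, are hypothetical names chosen for this example.

```python
def select_by_tag(tests, fixture_deps, tag):
    """Return the tests carrying `tag` and the fixtures they need.

    Fixtures are collected transitively: a fixture required by a needed
    fixture is needed as well.
    """
    selected = [test for test in tests if tag in test.get("tags", ())]
    needed = set()
    pending = [name for test in selected for name in test.get("fixtures", ())]
    while pending:
        name = pending.pop()
        if name not in needed:
            needed.add(name)
            pending.extend(fixture_deps.get(name, ()))
    return selected, needed
```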

Requirements that seem particular to (or especially important for) the Wikimedia use case:

 * HTTP-centric paradigm, focusing on specifying the headers and body of requests, and running assertions against headers and body of the response.
 * Support for running assertions against parts of a structured (JSON) response (JSON-to-JSON comparison, with the ability to use the more human friendly YAML syntax)
 * filtering by tags (because we expect to have a large number of tests)
 * parallel execution (because we expect to have a large number of tests)
 * yaml based declarative tests and fixtures: tests should be language agnostic, so that they are easy to write for people working in different language ecosystems and code bases. This also avoids lock-in to a specific tool, since yaml is easy to parse and convert.
 * generalized fixture creation, entirely API based, without the need to write "code" other than specifying requests in yaml.
 * randomized fixtures, so we can create privileged users on potentially public test systems.
 * control over cookies and sessions
 * ease of running in dev environments without the need to install additional tools / infrastructure (this might be a reason to switch to python for implementation; node.js is also still in the race).
 * ease of running in WMF's CI environment (jenkins, quibble)
 * option to integrate with PHPUnit
 * discovery of tests defined by extensions.
 * dual use for monitoring a live system, in addition to testing against a dummy system

Usage of variables and fixtures:

 * have a well known root user with fixed credentials in config.
 * name and credentials for that user are loaded into variables that can be accessed from within the yaml files.
 * yaml files that create fixtures, such as pages and users, can read but also define variables.
 * Variable values are either random with a fixed prefix, or they are extracted from an http response.
 * Variable value extraction happens using the same mechanism we use for checking/asserting responses
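
A minimal sketch of how variables could work, in Python for illustration only. It assumes a `${name}` placeholder syntax and a `Capture` marker for extraction; both are hypothetical, since the requirements do not fix a syntax. Extraction walks the expected-response structure in the same way an assertion would, which is one reading of "the same mechanism" above.

```python
import re
import secrets
from dataclasses import dataclass

PLACEHOLDER = re.compile(r"\$\{(\w+)\}")  # hypothetical placeholder syntax


@dataclass
class Capture:
    """Hypothetical marker: store the actual value under this variable name."""
    name: str


def random_value(prefix):
    """Random value with a fixed prefix, e.g. for user or page names."""
    return f"{prefix}{secrets.token_hex(4)}"


def substitute(template, variables):
    """Replace ${name} placeholders in strings nested anywhere in a structure.

    An unknown variable name raises KeyError.
    """
    if isinstance(template, str):
        return PLACEHOLDER.sub(lambda m: str(variables[m.group(1)]), template)
    if isinstance(template, dict):
        return {key: substitute(value, variables) for key, value in template.items()}
    if isinstance(template, list):
        return [substitute(value, variables) for value in template]
    return template


def extract(spec, actual, variables):
    """Walk the expected structure and record values marked with Capture."""
    if isinstance(spec, Capture):
        variables[spec.name] = actual
    elif isinstance(spec, dict) and isinstance(actual, dict):
        for key, value in spec.items():
            if key in actual:
                extract(value, actual[key], variables)
```

A fixture file could then define a page name with `random_value("TestPage_")`, use it in a request via `${page}`, and capture the new revision ID from the response for later steps.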

Epic 2: Baseline implementation of Phester test runner
T221088: Baseline implementation of Phester test runner

Functional requirements:

 * Run the test suites in sequence, according to the order given on the command line
 * Execute the requests within each suite in sequence.
 * The runner can be invoked from the command line
 * Required input: the base URL of a MediaWiki instance
 * Required input: one or more test description files.
 * The test runner executes each test case against the given API and reports any failures to comply with the expected results
 * human readable plain text output
 * Support regular expression based value matching
 * Test definitions are declarative (YAML)
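
The sequential execution model and plain text reporting can be sketched as follows (Python, for illustration only). The suite layout with `name`, `steps`, `request`, and `response` keys is a hypothetical shape for the parsed YAML, and `send` is an injected callable standing in for the HTTP client. The comparison is simplified to top-level keys to keep the example short.

```python
def run_suites(suites, send, out=print):
    """Run suites in the given order, and the steps of each suite in order.

    Returns the number of failed steps and reports progress as plain text.
    """
    failures = 0
    for suite in suites:
        out(f"Suite: {suite['name']}")
        for index, step in enumerate(suite["steps"], start=1):
            response = send(step["request"])
            # Simplified check: every expected top-level key must match exactly.
            wrong = [
                key for key, value in step["response"].items()
                if response.get(key) != value
            ]
            if wrong:
                failures += 1
                out(f"  step {index}: FAIL, mismatch in {', '.join(wrong)}")
            else:
                out(f"  step {index}: ok")
    out(f"{failures} failed step(s)")
    return failures
```

Passing `send` in as a parameter keeps the loop independent of any particular HTTP library.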

Rationale for using declarative test definition and YAML:


 * test definition not bound to a specific programming language (PHP, JS, python)
 * keeps tests simple and "honest", with well defined input and output, no hidden state, and no loops or conditionals
 * no binding to additional tools or libraries, tests stay "pure and simple"
 * Easy to parse and process, and thus to port away from, use as a front-end for something else, or analyze and evaluate.
 * YAML is JSON compatible. JSON payloads can just be copied in.
 * Creating a good DSL is hard, evolving a DSL is harder. YAML may be a bit ugly, but it's generic and flexible.

Implementation notes:

 * The test runner should be implemented in PHP. Rationale: It is intended to run in a development environment used to write PHP code. Also, we may want to pull this into the MediaWiki core project via composer at some point.
 * Use the Guzzle library for making HTTP requests
 * The test runner should not depend on MediaWiki core code.
 * The test runner should not hard code any knowledge about the MediaWiki action API, and should be designed to be usable for testing other APIs, such as RESTbase.
 * The test runner MUST ask for confirmation that it is ok for any information in the given target API to be damaged or lost (unless --force is specified)
 * No cleanup (tear-down) is performed between tests.
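
The command line contract (a base URL, one or more test description files, and a confirmation prompt that `--force` skips) is sketched below. The runner itself is meant to be written in PHP; this Python version only illustrates the intended behavior. The argument names and the prompt wording are assumptions.

```python
import argparse


def parse_args(argv):
    """Parse the command line: base URL, test files, optional --force."""
    parser = argparse.ArgumentParser(description="Run API integration tests")
    parser.add_argument("base_url", help="base URL of the target API")
    parser.add_argument("suites", nargs="+", help="test description files, run in order")
    parser.add_argument("--force", action="store_true",
                        help="do not ask for confirmation before running")
    return parser.parse_args(argv)


def confirmed(args, ask=input):
    """Return True if it is ok to damage or lose data on the target."""
    if args.force:
        return True
    answer = ask(f"Tests may damage or destroy data on {args.base_url}. Continue? [y/N] ")
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to "no" on empty input keeps an accidental Enter from running destructive tests against the wrong system.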

Story: Try writing basic tests
T225614: Create an initial set of API integration tests

The tests below should work on the MVP, so they must not require variables. So no login.

For MediaWiki, relevant stories to test are:
 * anonymous page creation and editing, verify change in content (verifying access to old revisions requires variable support) via API:Edit
 * re-parse of dependent pages (red links turning blue, missing templates getting used after being created) via API:Edit and then fetching the rendered page from the article path (index.php?title=xyz)
 * page history with edit summary, size diff (testing minor edits and user names requires variables for login) see API:Revisions
 * recent changes with edit summary, size diff, etc. via API:RecentChanges
 * renaming/moving a page (basic) via API:Move
 * pre-save transform (PST) (via API:Edit)
 * template transclusion via API:Parsing wikitext
 * parser functions via API:Parsing wikitext (some)
 * magic words via API:Parsing wikitext (some)
 * diffs (use relative revision ids) via API:Compare
 * fetching different kinds of links / reverse links (see API:Query)
 * listing category contents (see API:Query)

In addition, it would be useful to see one or two basic tests for Kask and RESTbase.

It would be nice to see what these tests look like in our own runner (phester) and for some of the other candidates like tavern, behat, codeception, or dredd. It's not necessary to write all the tests for all the different systems, though.

Story: Try writing tests using variables
T228001: Create a set of API integration tests using variables

Without support for variables in phester, these tests have to be written "dry", with no way to execute them.


 * watchlist (watch/unwatch/auto-watch)
 * changing preferences
 * bot edits (interaction of bot permission and bot parameter)
 * undo
 * rollback
 * diffs with fixed revision IDs (test special case for last and first revision)
 * patrolling, patrol log
 * auto-patrolling
 * listing user contributions
 * listing users
 * page deletion/undeletion (effectiveness)
 * page protection (effectiveness, levels)
 * user blocking (effectiveness, various options)
 * MediaWiki namespace restrictions
 * user script protection (can only edit own)
 * site script protection (needs special permission)
 * minor edits
 * remaining core parser functions
 * remaining core magic words (in particular REVISIONID and friends)
 * Pre-save transform (PST), signatures, subst templates, subst REVISIONUSER.
 * Special pages transclusion
 * newtalk notifications
 * media file uploads (need to be enabled on the wiki) (needs file upload support in phester)
 * site stats (may need support for arithmetic functions)

It would be nice to see what these tests look like in our own runner (phester) and for some of the other candidates, like tavern, behat, codeception, or dredd.

Story: Decide whether to invest in Phester, and decide what to use instead
T222100: Decide whether creating Phester is actually worth while

See: https://docs.google.com/spreadsheets/d/1G50XPisubSRttq4QhakSij8RDF5TBAxrJBwZ7xdBZG0

Candidates:


 * Phester
 * strst
 * tavern
 * behat
 * codeception
 * cypress.io
 * RobotFramework
 * SoapUI
 * dredd

Criteria:


 * runtime
 * execution model
 * ease of running locally
 * ease of running in CI
 * test language
 * ease of editing
 * ease of migration
 * scope/purpose fit


 * stability/support
 * cost to modify/maintain
 * control over development
 * documentation
 * license model
 * license sympathy


 * recursive body matches
 * regex matches


 * variables
 * variable injection


 * global fixtures


 * JSON output


 * scan for test files
 * filter tests by tag


 * parallel execution

Phabricator Task: T227999: Extended implementation of test runner
Personas:


 * Developer of the functionality being tested
 * Test author
 * CI engineer
 * Operations engineer

Non-functional requirements:


 * Test runner has a low cost integrating with CI infrastructure
 * As a test author, I want IDE integration including autocompletion and validation
 * Test runner must be distributed under an open source license
 * Test runner should be well-documented, well-supported, and commonly used
 * Tests are written in a well-documented, familiar language
 * Tests should be runnable from the command line
 * Reduce vendor lock-in
 * Test runner should have a good fit in scope and purpose for running API tests over HTTP
 * Minimize the amount of code we have to maintain

Story: Implement initial set of tests
Implement the tests defined during the evaluation phase (Epic 3) for the actual runner. If we go with Phester, the tests should not have to change at all, or just need minor tweaks. If we go with a different framework, the experimental tests need to be ported to that framework.

Story: Document test runner
Create documentation that allows others to create tests and run them.

Write comprehensive suite of tests covering core actions that modify the database
Any API module that returns true from needsToken

Write comprehensive suite of tests covering core query actions
Any API module extending ApiQueryModule