Core Platform Team/Initiative/Add API integration tests/Epics, User Stories, and Requirements

Features, MVP:

 * The test runner executes each test case against the given API and reports any failures to comply with the expected results
 * human readable plain text output
 * Support regular expression based value matching

Features required to fulfill this project's goals:

 * support for variables
 * extracted from response
 * randomized
 * loaded from config file
 * support for cookies and sessions
 * discover tests by recursively examining directories
 * support fixtures
 * execute fixtures in order of dependency

Features expected to become useful or necessary later, or for stretch goals:
(roughly in order or relevance)


 * discover tests specified by extensions (needs MW specific code?)
 * allow test suites to be filtered by tag
 * execute only fixtures needed by tests with the tag (or other fixtures that are needed)
 * allow for retries
 * allow tests to be skipped (preconditions)
 * support JSON output
 * Support running the same tests with several sets of parameters (test cases)
 * support for cookie jars
 * allow test for interactions spanning multiple sites
 * support HTML output
 * parallel test execution

Requirements that seem particular to (or especially important for) the Wikimedia use case:

 * HTTP-centric paradigm, focusing on specifying the headers and body of requests, and running assertions against headers and body of the response.
 * Support for running assertions against parts of a structured (JSON) response (JSON-to-JSON comparison, with the ability to use the more human friendly YAML syntax)
 * filtering by tags (because we expect to have a large number of tests)
 * parallel execution (because we expect to have a large number of tests)
 * yaml based declarative tests and fixtures: tests should be language agnostic, it should be easy to write tests for people involved with different language ecosystems and code bases. This also avoids lock-in to a specific tool, since yaml is easy to parse and convert.
 * generalized fixture creation, entirely API based, without the need to write "code" other than specifying requests in yaml.
 * randomized fixtures, so we can create privileged users on potentially public tests systems.
 * control over cookies and sessions
 * ease of running on in dev environments without the need to install additional tools / infrastructure (this might by a reason to switch to python for implementation; node.js is also still in the race).
 * ease of running in WMF's CI environment (jenkins, quibble)
 * option to integrate with PHPUnit
 * discovery of tests defined by extensions.
 * dual use for monitoring a live system, in addition to testing against a dummy system

Usage of variables and fixtures:

 * have a well known root user with fixed credentials in config.
 * name and credentials for that user are loaded into variables that can be accessed from within the yaml files.
 * yaml files that create fixtures, such as pages and users, can read but also define variables.
 * Variable values are either random with a fixed prefix, or they are extracted from an http response.
 * Variable value extraction happens using the same mechanism we use for checking/asserting responses

Epic 2: Baseline implementation of Phester test runner
T221088: Baseline implementation of Phester test runner

Functional requirements:

 * Run the test suites in sequence, according to the order given on the command line
 * Execute the requests within each suite in sequence.
 * The runner can be invoked from the command line
 * Required input: the base URL of a MediaWiki instance
 * Required input: one or more test description files.
 * The test runner executes each test case against the given API and reports any failures to comply with the expected results
 * human readable plain text output
 * Support regular expression based value matching
 * Test definitions are declarative (YAML)

Rationale for using declarative test definition and YAML:


 * test definition not bound to a specific programming language (PHP, JS, python)
 * keeps tests simple and "honest", with well defined input and output, no hidden state, and no loops or conditionals
 * no binding to additional tools or libraries, tests stay "pure and simple"
 * Easy to parse and process, and thus to port away from, use as a front-end for something else, or analyze and evaluate.
 * YAML is JSON compatible. JSON payloads can just be copied in.
 * Creating a good DSL is hard, evolving a DSL is harder. YAML may be a bit ugly, but it's generic and flexible.

Implementation notes:

 * The test runner should be implemented in PHP. Rationale: It is intended to run in a development environment used to write PHP code. Also, we may want to pull this into the MediaWIki core project via composer at some point.
 * Use the Guzzle library for making HTTP requests
 * The test runner should not depend on MediaWiki core code.
 * The test runner should not hard code any knowledge about the MediaWiki action API, and should be designed to be usable for testing other APIs, such as RESTbase.
 * The test runner MUST ask for confirmation that it is ok for any information in the given target API to be damaged or lost (unless --force is specified)
 * No cleanup (tear-down) is performed between tests.

Story: Try writing basic tests
T225614:Create an initial set of API integration tests

The tests below should work on the MVP, so they must not require variables. So no login.

For MediaWiki, relevant stories to test are:
 * anonymous page creation and editing, verify change in content (verifying access to old revisions requires variable support) via API:Edit
 * re-parse of dependent pages (red links turning blue, missing templates getting used after being created) via API:Edit and then fetching the rendered page from the article path (index.php?title=xyz)
 * page history with edit summary, size diff (testing minor edits and user names requires variables for login) see API:Revisions
 * recent changes with edit summary, size diff, etc API:RecentChanges
 * renaming/moving a page (basic) via API:Move
 * pre-save transform (PSR) (via API:Edit)
 * template transclusion via API:Parsing wikitext
 * parser functions via API:Parsing wikitext (some)
 * magic words via API:Parsing wikitext (some)
 * diffs (use relative revision ids) via API:Compare
 * fetching different kinds of links / reverse links (see API:Query)
 * listing category contents (see API:Query)

In addition, it would be useful to see one or two basic tests for Kask and RESTbase.

It would be nice to see what these tests look like in our own runner (phester) and for some of the other candidates like tavern, behat, codeception, or dredd. It's not necessary to write all the tests for all the different systems, though.

Story: Try writing tests using variables
T228001: Create a set of API integration tests using variables

Without support for variables in phester, these tests have to be written "dry", with no way to execute them.


 * watchlist (watch/unwatch/auto-watch)
 * changing preferences
 * bot edits (interaction of bot permission and bot parameter)
 * undo
 * rollback
 * diffs with fixed revision IDs (test special case for last and first revision)
 * patrolling, patrol log
 * auto-patrolling
 * listing user contributions
 * listing users
 * page deletion/undeletion (effectiveness)
 * page protection (effectiveness, levels)
 * user blocking (effectiveness, various options)
 * MediaWiki namespace restrictions
 * user script protection (can only edit own)
 * site script protection (needs special permission)
 * minor edits
 * remaining core parser functions
 * remaining core magic words (in particular REVISIONID and friends)
 * Pre-save transform (PST), signatures, subst templates, subst REVISIONUSER.
 * Special pages transclusion
 * newtalk notifications
 * media file uploads (need to be enabled on the wiki) (needs file upload support in phester)
 * site stats (may need support for arithmetic functions)

It would be nice to see what these tests look like in our own runner (phester) and for some of the other candidates, like tavern, behat, codeception, or dredd.

Story: Decide whether to invest in Phester, and decide what to use instead
T222100: Decide whether creating Phester is actually worth while

See: https://docs.google.com/spreadsheets/d/1G50XPisubSRttq4QhakSij8RDF5TBAxrJBwZ7xdBZG0

Candidates:


 * Phester
 * strst
 * tavern
 * behat
 * codeception
 * cypress.io
 * RobotFramework
 * SoapUI
 * dredd

Criteria:


 * runtime
 * execution model
 * ease of running locally
 * easy of running in CI
 * test language
 * ease of editing
 * easy of migration
 * scope/purpose fit


 * stability/support
 * cost to modify/maintain
 * control over development
 * documentation
 * license model
 * license sympathy


 * recursive body matches
 * regex matches


 * variables
 * variable injection


 * global fixtures


 * JSON output


 * scan for test files
 * filter tests by tag


 * parallel execution

Story: Add any missing functionality to the selected runner.
T227999: Extended implementation of Phester test runner

If we decide on continuing phester, this means implementing the features listed as "required to fulfill this project's goals" above.

Story: Implement initial set of tests
Implement the tests defined during the evaluation phase (Epic 3) for the actual runner. If we go with Phester, the tests should not have to change at all, or just need minor tweaks. If we go with a different frameworks, the experimental tests need to be ported to that framework.

Store: Document test runner
Create documentation that allows others to create tests and run them.

Write comprehensive suite of tests covering core actions that modify the database
Any API module that returns true from needsToken

Write comprehensive suite of tests covering core query actions
And API module extending ApiQueryModule