Parsoid/Round-trip testing

The Parsoid code includes a round-trip testing system for checking code changes. It consists of a server that hands out tasks and presents results, and clients that run the tests and report back to the server. The code is in the testreduce repo, which is an automatic mirror of the repo in Gerrit. The round-trip testing code has been fully puppetized.

There's a private instance of the server on testreduce1001 that currently tests a representative set of about 160,000 pages from different Wikipedia languages. FIXME: serving CSS from testreduce1001 still needs some troubleshooting, so for now you are better off setting up an SSH tunnel to testreduce1001 and accessing the web results through it (see below).

Private setup

The instructions to set up a private instance of the round-trip test server can be found here. A MySQL database is needed to keep the set of pages and the testing results.

RT-testing setup

The coordinator and the RT-testing clients both run on testreduce1001. You need access to a bastion server to reach testreduce1001; see SSH configuration for access to production on wikitech. The clients access the Parsoid REST API that runs on scandium.

The clients are managed/restarted by systemd and the config is in /lib/systemd/system/parsoid-rt-client.service. Please do not modify the config on testreduce1001 directly (changes will be overwritten by puppet runs every 30 minutes). Any necessary changes should be made in puppet and deployed.

To {stop,restart,start} all clients on a VM (not normally needed):

# On testreduce1001
sudo service parsoid-rt-client stop
sudo service parsoid-rt-client restart
sudo service parsoid-rt-client start

Client logs are in systemd journals and can be accessed as:

### Logs for the parsoid-rt-client service on testreduce1001
# equivalent to tail -f <log-file>
sudo journalctl -f -u parsoid-rt-client
# equivalent to tail -n 1000
sudo journalctl -n 1000 -u parsoid-rt-client

### Logs of the parsoid-rt testreduce server
sudo journalctl -f -u parsoid-rt

### Logs for the parsoid service
sudo journalctl -f -u parsoid

In the current setup, the testreduce clients talk to a global parsoid service that runs on scandium. So, look at the logs of the parsoid service on scandium to find problems / bugs with the parsoid code being tested. These logs are also mirrored to Kibana which you can find on this dashboard.

Starting a test run

Before starting, check that no test run is currently in progress on another parsoid commit. Run sudo journalctl -f -u parsoid-rt-client on testreduce1001 and verify that the clients report "The server does not have any work for us right now".
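
The idle check can also be scripted. The following is a minimal sketch and not part of the Parsoid repository: rt_clients_idle is a hypothetical helper name, and it assumes the idle message quoted above appears verbatim in the most recent journal line. It reads log text on standard input so that it can be fed from journalctl.

```shell
# Hypothetical helper: succeeds (exit status 0) if the most recent log
# line on stdin contains the idle message quoted above.
rt_clients_idle() {
    tail -n 1 | grep -q -F 'The server does not have any work for us right now'
}

# Usage on testreduce1001, with the journalctl invocation shown earlier:
#   sudo journalctl -n 20 -u parsoid-rt-client | rt_clients_idle && echo "clients are idle"
```

Only the last line is inspected, so an older idle message followed by newer test activity is not mistaken for an idle state.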

To start rt-testing a particular parsoid commit, run the following bash script on your local computer from your checked-out copy of Parsoid:

bin/ <your-bastion-id> <sha-of-parsoid-code-to-rt-test>
# Ex: bin/ ssastry 645beed2

This updates the parsoid checkout on scandium and testreduce1001, and restarts the parsoid-php and parsoid-rt-client services.

Updating the round-trip server code

The rt-server code lives in /srv/testreduce and runs off the ruthenium branch.

After review/merge on main, check out the ruthenium branch locally and merge main into it. If you need to update node_modules/, do that as well and commit it. Then push the branch to Gerrit (you should have push rights).

cd /srv/testreduce
## Please verify you are in the ruthenium branch before the git pull
git pull
sudo service parsoid-rt restart
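
The branch check requested in the comment above can be automated. This is a sketch under assumptions: on_branch is a hypothetical helper, not something shipped with testreduce, and it relies only on the standard git symbolic-ref command.

```shell
# Hypothetical guard: succeeds only if the checkout in directory $1
# is currently on branch $2 (fails on a detached HEAD).
on_branch() {
    [ "$(git -C "$1" symbolic-ref --short HEAD 2>/dev/null)" = "$2" ]
}

# Usage on the rt server host:
#   on_branch /srv/testreduce ruthenium && git -C /srv/testreduce pull
```

Chaining the pull with && means nothing is pulled when the checkout is on any other branch.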

Running the regression script

After an rt run, we compare diffs with previous runs to determine whether we've introduced new semantic differences. However, since the runs happen on different dates and use production data, there will be some natural churn to account for. The regression script automates the process of rerunning the rt script on a handful of pages to determine whether any of the reported regressions are true positives.

# on local machine
# Setup an ssh tunnel to get the results of the rt run
ssh -L 8003:localhost:8003 testreduce1001.eqiad.wmnet
# Copy the regressions from the following sources to some file on your local machine
# http://localhost:8003/commits        [click on the commit pair to check, then "Regressions"]
# http://localhost:8003/rtselsererrors [XXX CSA XXX this isn't right?]

# on local machine
# Make sure that an rt run isn't in progress on testreduce1001 / scandium
php tools/regression-testing.php -u <bastion-uid> -t <title-file> <oracle> <commit>

Note that the script will check out the specified commits while running, but it does nothing about dependencies: Parsoid will be running in integrated mode with the latest production mediawiki version(s) and their corresponding mediawiki-vendor packages (depending on the Host field used in the request). So, at present, the script isn't appropriate when dependency versions are bumped between the two commits.

The <oracle> ("known good") and <commit> ("to be tested") can be anything that git recognizes, including tag names. They don't necessarily have to correspond to the hashes you provided to the rt server, although usually that's what you'll use.

Crashers prevent the script from running and may need to be pruned. They can then be tested individually as follows on testreduce1001.

cd /srv/parsoid-testing
git checkout somecommit
node bin/roundtrip-test.js --proxyURL http://scandium.eqiad.wmnet:80 --parsoidURL http://DOMAIN/w/rest.php --domain "Sometitle"
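
Pruning crashers can be done with standard tools. The sketch below assumes that pruning means removing entries from the title file passed with -t, that this file holds one entry per line, and that the crashing entries have been collected in a second file in the same format. None of this is confirmed by the script's documentation, and prune_crashers is a hypothetical helper name.

```shell
# Hypothetical helper: print the entries of title file $1 that do not
# appear, as exact whole-line matches, in crasher file $2.
prune_crashers() {
    grep -v -x -F -f "$2" "$1"
}

# Usage (file names are placeholders):
#   prune_crashers titles.txt crashers.txt > titles-pruned.txt
```

The -F and -x flags make grep treat each crasher entry as a literal whole line, so titles containing regex metacharacters are handled safely.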

It's also a good idea to check the parsoid-tests dashboard for notices and errors.

Running Parsoid tools on scandium

Parsoid will run in integrated mode on scandium from /srv/parsoid-testing but it requires use of the MWScript.php wrapper in order to configure mediawiki-core properly. More information on mwscript is at Extension:MediaWikiFarm/Scripts. A sample command would look like:

$ echo '==Foo==' | sudo -u www-data php /srv/mediawiki/multiversion/MWScript.php /srv/parsoid-testing/bin/parse.php --wiki=hiwiki --integrated

Parts of that command are often abbreviated as a helper alias mwscript in your shell to make invocations easier.
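
One possible definition of such an alias is sketched below. The exact alias people use is not given on this page, so treat the definition as an assumption built from the sample command above.

```shell
# Assumed definition; wraps the sudo/php/MWScript.php prefix of the sample command.
alias mwscript='sudo -u www-data php /srv/mediawiki/multiversion/MWScript.php'

# With the alias in place, the sample command shortens to:
#   echo '==Foo==' | mwscript /srv/parsoid-testing/bin/parse.php --wiki=hiwiki --integrated
```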

Todo / Roadmap

Please look at the general Parsoid roadmap.

Server UI and other usability improvements

We recently changed the server to use a templating system to separate the code from the presentation. Further improvements can now be made to the presentation itself.

Ideas for improvement:
  • Improve pairwise regressions/fixes interface on commits list (bug 52407). Done!
  • Flag certain types of regressions that we currently search for by eye: create views with
    • Regressions introducing exactly one semantic/syntactic diff into a perfect page, and
    • Other introductions of semantic diffs to pages that previously had only syntactic diffs.
  • Improve diffing in results views:
    • Investigate other diffing libraries for speed,
    • Apply word-based diffs on diffed lines,
    • Diff results pages between revisions to detect new semantic/syntactic errors,
    • Currently new diff content appears before old, which is confusing; change this.