Parsoid/Round-trip testing

Parsoid includes a round-trip testing system for checking code changes. It consists of a server that hands out tasks and presents results, and clients that run the tests and report back to the server. The code lives in the testreduce repo, which is an automatic mirror of the repo in Gerrit. The round-trip testing setup on scandium is fully puppetized.

There's a private instance of the server on scandium that currently tests a representative set of about 160,000 pages from different Wikipedia languages. You can access it by setting up an SSH tunnel to scandium; with the tunnel open, the web service is available at http://localhost:8003 on your own computer.
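As an illustration of the tunnel mentioned above, here is a minimal sketch that builds a standard OpenSSH local port forward for port 8003. The helper names `tunnel_command` and `open_tunnel` are hypothetical, and the host alias passed in is an assumption: the sketch presumes your SSH config already resolves that alias and routes it through the bastion.

```python
import subprocess
from typing import List


def tunnel_command(host: str, port: int = 8003) -> List[str]:
    """Build an ssh command forwarding localhost:port to the same port on host."""
    # -N: do not run a remote command; -L: set up a local port forward.
    return ["ssh", "-N", "-L", f"{port}:localhost:{port}", host]


def open_tunnel(host: str, port: int = 8003) -> subprocess.Popen:
    """Start the tunnel in the background; call .terminate() on the result to close it."""
    return subprocess.Popen(tunnel_command(host, port))
```

Calling `tunnel_command("scandium")` yields the argument list for `ssh -N -L 8003:localhost:8003 scandium`, where `scandium` stands in for whatever host alias your own SSH config defines.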

Private setup
The instructions to set up a private instance of the round-trip test server can be found here. A MySQL database is needed to keep the set of pages and the testing results.

RT-testing setup
The coordinator runs on scandium. The RT-testing clients also run on scandium and exit on their own when the revision of the checkout they are testing changes. You need bastion access on wikimedia.org to reach scandium.

The clients are managed and restarted by systemd, and the config is in /lib/systemd/system/parsoid-rt-client.service. Please do not modify the config on scandium directly: it will be overwritten by Puppet, which runs every 30 minutes. Make any necessary changes in Puppet and deploy them.

To stop, restart, or start all clients on a VM (not normally needed), run systemctl with the matching action against the parsoid-rt-client service.

Client logs are in the systemd journal and can be read with journalctl for the parsoid-rt-client unit.

In the current setup, the testreduce clients talk to a global Parsoid service that runs on scandium, so look at the logs of that service to find problems or bugs in the Parsoid code being tested. These logs are also mirrored to Kibana, which you can find on this dashboard.

Updating the code under test (the code run by the clients)
UPDATE ME: CSA updated the scandium configuration (via gerrit Ia9026b1da57d1e4397a946cce50b708b0e953c62 and I3cfbd3e9ea68513cf4a16d1f5b49f0261cabdddc) to load Parsoid from a git checkout. You can cd to that directory and git pull to update the version of Parsoid in use. The instructions below refer to a different checkout, and the scripts have not yet been updated for the new state of the world. FIX ME

To update rt-testing code, run the following on scandium:

This updates the parsoid checkout, restarts the parsoid service, and the parsoid-rt-client service.

In order to kick off a new round of testing, edit  and add a new line with a new entry. Since the RT testing code still has code to let us run tests against Parsoid/JS or Parsoid/PHP, you need to add a "PHP:" prefix to ensure the Parsoid/PHP test run actually kicks off. Normally, we've used a "PHP:" as a test run id. To ensure the test run starts right away (vs after the clients notice the change), you can. Then watch the logs for parsoid-rt service (see the intro in this section with info about logs).

Updating the round-trip server code
The rt-server code lives in /srv/testreduce and runs off the ruthenium branch.

After a change has been reviewed and merged on master, check out the ruthenium branch locally and merge master into it. If node_modules/ needs updating, do that as well and commit it, then push the branch to Gerrit (you should have push rights).

Todo / Roadmap
Please look at the general Parsoid roadmap.

Server UI and other usability improvements
We recently changed the server to use a templating system that separates the code from the presentation. Further improvements can now be made to the presentation itself.

Ideas for improvement:

 * Improve pairwise regressions/fixes interface on commits list. Done!
 * Flag certain types of regressions that we currently search for by eye. Create views with:
   * Regressions introducing exactly one semantic/syntactic diff into a perfect page, and
   * Other introductions of semantic diffs to pages that previously had only syntactic diffs.
 * Improve diffing in results views:
   * Investigate other diffing libraries for speed,
   * Apply word-based diffs on diffed lines,
   * Diff results pages between revisions to detect new semantic/syntactic errors,
   * Currently new diff content appears before old, which is confusing; change this.
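
The two regression types to flag in the list above could be detected along these lines. This is only a sketch: the `PageResult` record, the flag names, and the assumption that each run is available as a mapping from page title to diff counts are all hypothetical and do not reflect the actual testreduce schema.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PageResult:
    """Diff counts for one page at one revision (hypothetical record shape)."""
    semantic: int   # number of semantic diffs
    syntactic: int  # number of syntactic diffs

    @property
    def perfect(self) -> bool:
        return self.semantic == 0 and self.syntactic == 0


def classify(old: PageResult, new: PageResult) -> Optional[str]:
    """Return a flag name if the change matches one of the two rules, else None."""
    # Rule 1: a perfect page gained exactly one diff of either kind.
    if old.perfect and new.semantic + new.syntactic == 1:
        return "single-diff-on-perfect-page"
    # Rule 2: a page that had only syntactic diffs now has semantic diffs.
    if old.semantic == 0 and old.syntactic > 0 and new.semantic > 0:
        return "new-semantic-diff"
    return None


def flag_regressions(
    old_run: Dict[str, PageResult], new_run: Dict[str, PageResult]
) -> Dict[str, str]:
    """Map page title to flag for pages present in both runs that match a rule."""
    flags = {}
    for title, new in new_run.items():
        old = old_run.get(title)
        if old is None:
            continue  # page was not tested in the older run
        flag = classify(old, new)
        if flag is not None:
            flags[title] = flag
    return flags
```

A view built on this would list only the flagged pages for a pair of revisions, leaving pages that were already imperfect in both runs to the general regressions list.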