Talk:Wikimedia Release Engineering Team/CI Futures WG/Requirements

Devs sshing into machine - only 'would be nice'?

Summary by LarsWirzenius

Requirement "Must allow developer to replicate locally the tests that CI runs" satisfies this.

EEggleston (WMF) (talkcontribs)

I see this is under "would be nice":

  • Developers should have an option to ssh to VM/container that CI used to run the tests for debugging.

That's been pretty indispensable in debugging some unit tests in my experience, especially more complex custom things like the civicrm setup.

Jdforrester (WMF) (talkcontribs)

sshing into the temporary container created in CI (hosted on CI) to test the patch isn't currently supported, unless I'm missing something? Doing the same on a local machine is indeed a key debugging technique, but not quite the same.

EEggleston (WMF) (talkcontribs)

Ah, right, with the Docker images I had to create a snapshot before the test finished, and then run bash in that. As long as there's SOME way to get in and debug tests I'm happy!
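For illustration, the snapshot-then-shell approach described above could look roughly like the sketch below. It only builds the command lines and assumes the standard `docker commit` and `docker run` subcommands; the container name, the image tag and the helper function are hypothetical and not part of any existing CI tooling.

```python
import shlex


def snapshot_debug_commands(container, image_tag="ci-debug-snapshot"):
    """Build the docker commands for snapshotting a still-running test
    container and then opening a shell in that snapshot.

    Both arguments are hypothetical placeholders chosen for this sketch.
    """
    return [
        # Freeze the container's current filesystem state into an image.
        ["docker", "commit", container, image_tag],
        # Start an interactive bash session in a throwaway container
        # created from that image.
        ["docker", "run", "--rm", "-it", image_tag, "bash"],
    ]


if __name__ == "__main__":
    for command in snapshot_debug_commands("ci-job-container"):
        print(shlex.join(command))
```

The commands are returned as lists rather than executed, so they can be reviewed or passed to `subprocess.run` by whoever is debugging.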

LarsWirzenius (talkcontribs)

Does the requirement "Must allow developer to replicate locally the tests that CI runs. This is necessary to allow lower friction in development, as well as to aid debugging." suffice for this?

EEggleston (WMF) (talkcontribs)

Sure, that sounds good.

Dependency caching (pip/composer/etc.) is a must

Summary by Legoktm

It is now listed in the "hard requirements" section.

Legoktm (talkcontribs)

If there isn't dependency caching, then we run into all sorts of rate limits, errors, etc. Also, it cuts down on a huge source of flakiness (I'm looking at you npm!).

BDavis (WMF) (talkcontribs)

Don't forget how often Composer has flaked out due to rate limiting from GitHub's side (phab:T106452)!

LarsWirzenius (talkcontribs)

There is a requirement "Must support dependency caching – we have castor, maybe we could do better? Maybe some CI systems have this figured out?", which is meant to cover this. The wording could certainly be improved. Suggestions?

Legoktm (talkcontribs)

When I initially wrote the comment, it was listed in the "would be nice to have" category - but you moved it a short while after, so all good here.
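As a rough illustration of what dependency caching involves, one common approach is to key the cache on a hash of the lock file, so that an unchanged `composer.lock` or `package-lock.json` reuses previously downloaded packages instead of hitting the network. The sketch below is an assumption-laden example, not a description of how castor works; the function name and key format are made up for illustration.

```python
import hashlib


def cache_key(tool, lockfile_bytes):
    """Derive a cache key from the package manager name and the exact
    contents of its lock file (hypothetical key format)."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()
    # A shortened digest keeps the key readable; collisions are
    # unlikely at this scale but possible in principle.
    return f"{tool}-{digest[:16]}"


if __name__ == "__main__":
    print(cache_key("composer", b'{"packages": []}'))
```

Identical lock files map to the same key, so a job can restore the cached dependency directory; any change to the lock file produces a new key and a fresh download.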

Proposed new requirement: Ability to push 'draft' patches, skipping CI

Jdforrester (WMF) (talkcontribs)
LarsWirzenius (talkcontribs)

Is marking a change as WIP in Gerrit sufficient for this?

Jdforrester (WMF) (talkcontribs)

Well, we'd need to have a way to configure CI to not run in those circumstances (and make clear to users that that was happening). And then we'd need a way to manually trigger CI to run even on a draft patch. But if you think that's a reasonable requirement, yes.
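The behaviour described above (skip CI on draft/WIP changes, tell the user, and still allow a manual trigger) can be sketched as a small decision function. This is only an illustration: the `Change` class, its field names and the message texts are assumptions, not an existing Gerrit or CI API.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical minimal view of a code review change."""
    number: int
    work_in_progress: bool


def ci_decision(change, manually_triggered=False):
    """Return (should_run, message) so the reason can be shown to the user."""
    if manually_triggered:
        # A manual request always wins, even on a draft/WIP change.
        return True, "CI run requested manually"
    if change.work_in_progress:
        return False, "CI skipped: change is marked work-in-progress"
    return True, "CI run triggered by new patch set"


if __name__ == "__main__":
    print(ci_decision(Change(number=1, work_in_progress=True)))
```

Returning the message alongside the decision addresses the "make clear to users that that was happening" part of the comment.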

Tight integration with Gerrit

Legoktm (talkcontribs)

"Must allow changing git repository, code review, and ticketing systems from Gerrit and Phabricator."

I worry a bit that requiring the solution to support other repo/CR systems will give us a worse result. I'd like to see much tighter integration with Gerrit, like inline comments and uploading new patches, etc. (as explained in my wishlist). I think overall solutions that are more tightly integrated will provide a better user experience than ones that are more abstract and provide general interfaces.

Legoktm's CI wishlist

Legoktm (talkcontribs)

Mostly unsorted late night thoughts right now. Here's what I want:

  • If a CI tool spits out a report of errors on specific lines, then we should be able to show those errors inline instead of having to look up the line numbers in the Jenkins console.
  • Multiple tools support auto fixing errors. It should be possible to automatically apply those fixes without manually doing it locally.
  • I want to be able to use dependencies defined in a repository when checking out git repos, so dependencies don't have to be defined centrally in CI config (some kind of recursive dependency resolver, probably).
  • I would like the list of "jobs" to be run for a repo like phan and npm and composer to be defined in the repository itself (like .travis.yml) BUT the job definitions about how to execute "phan" etc. should be defined centrally in CI configuration.
  • I would like gate jobs to be run regularly on a cron-ish schedule. Right now extension tests depend upon plenty of inputs that aren't necessarily in that repo, and it's common for things to start failing on random patches because those external factors have changed. And if no one submits a patch in a while, it never gets noticed. We could notice it if we ran tests on a cron schedule of some kind, and had a dashboard for failing stuff.
  • Smarter tracking of flaky errors, maybe something similar to http://status.openstack.org/elastic-recheck/
  • Ability to prototype/debug/change jobs before committing them to version control, as we currently do with jjb.
  • Ability to test patches in one repo against all other repositories. Say I make a large/high-risk change to a structure test or class in MediaWiki core, and want to ensure I didn't break anything, or want to gauge the breakage. I want to be able to trigger all extensions to run all of their tests against my MediaWiki core patch. Fedora has a rebuild branch concept that is similar to this IMO.
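To make the "recursive dependency resolver" item in the list above more concrete, here is a minimal sketch. It assumes each repository declares its direct dependencies somewhere CI can read them; the mapping format, the repository names and the function name are all hypothetical.

```python
def resolve_dependencies(repo, declared):
    """Return `repo` and all of its transitive dependencies, ordered so
    that every dependency appears before the repositories that need it.

    `declared` maps a repository name to the list of repositories it
    directly depends on (hypothetical format for this sketch).
    """
    order = []
    visiting = set()  # repositories on the current resolution path
    done = set()      # repositories already fully resolved

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"dependency cycle involving {name}")
        visiting.add(name)
        for dep in declared.get(name, []):
            visit(dep)
        visiting.discard(name)
        done.add(name)
        order.append(name)

    visit(repo)
    return order


if __name__ == "__main__":
    declared = {"ExtA": ["ExtB", "ExtC"], "ExtB": ["ExtC"], "ExtC": []}
    print(resolve_dependencies("ExtA", declared))
```

With per-repository declarations like these, CI would only need to know the repository under test; the checkout list follows from the declarations instead of from central configuration.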

Maybe some more later. HTH.
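The split proposed in the wishlist (the list of jobs lives in the repository, while how each job is executed stays in central CI configuration) might be sketched like this. The job names come from the wishlist; the commands attached to them and the function name are illustrative placeholders, not real job definitions.

```python
# Central CI configuration: how each named job is executed.
# These commands are placeholders invented for this sketch.
CENTRAL_JOB_DEFINITIONS = {
    "phan": ["composer", "phan"],
    "npm": ["npm", "test"],
    "composer": ["composer", "test"],
}


def resolve_jobs(requested, definitions=CENTRAL_JOB_DEFINITIONS):
    """Map the job names a repository asks for onto central definitions.

    `requested` stands for the list a repository would keep in its own
    config file (similar in spirit to .travis.yml).
    """
    unknown = [name for name in requested if name not in definitions]
    if unknown:
        # A repository may only pick from jobs that CI admins defined.
        raise ValueError("jobs not defined centrally: " + ", ".join(unknown))
    return [(name, definitions[name]) for name in requested]


if __name__ == "__main__":
    for name, command in resolve_jobs(["phan", "npm"]):
        print(name, command)
```

Rejecting unknown names keeps the "how" under central control while leaving the "which" to each repository.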

"Must be fast enough that it isn't perceived as a bottleneck"

Legoktm (talkcontribs)

Based on this requirement, it's a little unclear to me what's in scope here. I don't think anyone perceives Jenkins or Zuul as the bottleneck; we all know that the bottleneck is usually the tests themselves.

But for most developers, the test execution *is* Jenkins, so the perception will always be that Jenkins is slow - which isn't necessarily wrong.

Test output UX

Tgr (WMF) (talkcontribs)

The part of CI developers interact with the most is displaying test output on failure (other than the test itself, if we count waiting for it as interaction). Even small changes in usability (e.g. is the system intelligent enough to highlight the part of a thousand-line log that contains the error?) have a fairly large effect on developer productivity.
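As an example of the kind of "intelligence" meant here, even a simple pattern search that pulls out the failing lines with a little context spares the reader from scrolling through the whole log. The sketch below is deliberately naive; the pattern, the context size and the function name are assumptions for illustration, and a real system would need tool-specific rules.

```python
import re

# Naive, hypothetical pattern; real logs need per-tool rules.
ERROR_PATTERN = re.compile(r"\b(error|fatal|exception|failed)\b", re.IGNORECASE)


def highlight_errors(log_text, context=2):
    """Return a list of (line_number, excerpt) pairs, where excerpt is
    the matching line plus `context` lines on either side."""
    lines = log_text.splitlines()
    excerpts = []
    for index, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            start = max(0, index - context)
            end = min(len(lines), index + context + 1)
            # Line numbers are 1-based, as in most log viewers.
            excerpts.append((index + 1, lines[start:end]))
    return excerpts


if __name__ == "__main__":
    log = ["step %d ok" % n for n in range(1000)]
    log[499] = "PHP Fatal error: something broke"
    for number, excerpt in highlight_errors("\n".join(log)):
        print(number, excerpt)
```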

Dependent patches

Tgr (WMF) (talkcontribs)

Support for dependent patches (i.e. Depends-On) is not mentioned; is that part of supporting Gerrit? It should be a hard requirement IMO.
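For context, Depends-On is expressed as a footer line in the commit message, so the first step for any CI system supporting it is to extract those footers. A minimal sketch of that parsing step follows; the function name is hypothetical, the change identifier in the example is made up, and resolving the referenced changes is not shown.

```python
import re

# Matches footer lines such as "Depends-On: <reference>".
# [ \t]* is used instead of \s* so a match never spans a newline.
DEPENDS_ON = re.compile(r"^Depends-On:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def parse_depends_on(commit_message):
    """Return every Depends-On reference in the message, in order."""
    return DEPENDS_ON.findall(commit_message)


if __name__ == "__main__":
    message = (
        "Use the new hook in the extension\n"
        "\n"
        "Depends-On: I" + "a" * 40 + "\n"  # made-up change identifier
        "Change-Id: I" + "b" * 40 + "\n"
    )
    print(parse_depends_on(message))
```

A CI system would then have to fetch each referenced change and test the patch on top of it, which is where most of the real complexity lies.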
