Topic on Talk:Wikimedia Release Engineering Team/MW-in-Containers thoughts

On number of versions and rolling back

LarsWirzenius (talk | contribs)

Here's an idea, possibly too crazy: we build new container versions, in sequential order somehow: v1, v2, ...


We deploy each new version, and after it has run acceptably with production traffic for time T, we label it golden. Any non-golden version can be rolled back (possibly automatically, based on error rate or UBN; possibly manually by RelEng/SRE/CPT/...). If rolling back one version isn't enough, roll back further, until we reach the newest golden version.


Every rollback results in an alert to RelEng, SRE, CPT, and anyone with changes between the newest golden version and the rolled-back version.
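
To make the proposal concrete, here is a rough sketch of the bookkeeping it implies. It is only an illustration, not an existing tool: the names `Rollout`, `Version`, `record_healthy_run`, and `soak_seconds` (standing in for time T) are all made up for this example, and the health signal is assumed to arrive from somewhere else as "seconds of acceptable production traffic".

```python
from dataclasses import dataclass

# Teams that are always alerted on a rollback, as listed in the proposal.
TEAMS = frozenset({"RelEng", "SRE", "CPT"})


@dataclass
class Version:
    number: int
    authors: frozenset  # people with changes in this version (hypothetical field)
    golden: bool = False


class Rollout:
    """Hypothetical tracker for sequential versions v1, v2, ..."""

    def __init__(self, soak_seconds):
        self.soak_seconds = soak_seconds  # "time T" from the proposal
        self.live = []    # deployed versions, oldest first; the last one serves traffic
        self.alerts = []  # (rolled-back version number, recipients)
        self._next = 1

    def deploy(self, authors):
        version = Version(self._next, frozenset(authors))
        self._next += 1
        self.live.append(version)
        return version

    def record_healthy_run(self, seconds):
        """Label the current version golden once it has run acceptably for T."""
        current = self.live[-1]
        if seconds >= self.soak_seconds:
            current.golden = True
        return current.golden

    def roll_back(self):
        """Roll back one version, never going past the newest golden one."""
        if not self.live:
            raise RuntimeError("nothing is deployed")
        current = self.live[-1]
        if current.golden:
            raise RuntimeError("already at the newest golden version")
        if len(self.live) == 1:
            raise RuntimeError("no older version to roll back to")
        # Alert the teams plus everyone with changes in any version newer than
        # the newest golden one, up to and including the one being rolled back.
        recipients = set(TEAMS)
        for version in reversed(self.live):
            if version.golden:
                break
            recipients |= version.authors
        self.live.pop()
        self.alerts.append((current.number, frozenset(recipients)))
        return self.live[-1]
```

Calling `roll_back()` repeatedly walks back one version at a time and raises once the current version is golden, which matches "roll back further, until the newest golden version". Whether the trigger is automatic (error rate, UBN) or manual is left outside this sketch.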

Jdforrester (WMF) (talk | contribs)

Definitely agreed on the "golden" label with possible auto-rollback, but sometimes golden labels turn sour over time, either temporarily (e.g. a configuration setting for which endpoint the DBs are loaded from) or permanently (e.g. a feature is intentionally removed), so we may need more humans in the loop than is ideal.
