Rolling deploys vs. globally atomic deploys
In the table PHP now claims to want rolling deploys (as a 'SHOULD'), while earlier there was discussion of "Cluster-wide atomicity" in https://etherpad.wikimedia.org/p/DeploymentSystemRequirements. Is this change intentional?
- "Rolling deploys" is not exactly the right terminology for the desired outcome for MW. What Greg and I have discussed would be more properly described as 'gated' or 'phased' deploys. This would be a big change to infrastructure configuration and thinking, whereby changes are rolled out to increasing percentages of incoming traffic via discrete clusters of servers. This is just a wishlist dream at this point. I don't have the full plan worked out, but conceptually the idea would be to stage new releases (e.g. 1.23wmf19) to a percentage of traffic across all wikis rather than to all traffic on a percentage of wikis. This is really more like a series of progressively larger canary deploys than a rolling-restart type of deploy as desired for services like parsoid and elasticsearch. In the ultimate expression this enables so-called "continuous deploys", where the unit of change is kept very small and the progression from 1% -> 3% -> 8% -> 21% -> 55% -> 100% of traffic happens in hours or minutes instead of weeks. --BDavis (WMF) (talk) 04:30, 20 March 2014 (UTC)
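
For illustration only, the "percentage of traffic across all wikis" idea could be sketched roughly as below. This is not a worked-out plan and not how any existing deployment tooling behaves. It assumes each request carries some stable key (a hypothetical `request_key`, such as a session or client identifier) that can be hashed into one of 100 buckets, and it glosses over the mapping from buckets to discrete clusters of servers. The names `STAGES`, `bucket` and `pick_version`, and the old version string `1.23wmf18`, are made up for the example.

```python
import hashlib

# Stage schedule taken from the percentages mentioned in the comment above.
STAGES = [1, 3, 8, 21, 55, 100]


def bucket(request_key: str) -> int:
    """Map a request key deterministically to a bucket in the range 0..99."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def pick_version(request_key: str, percent: int,
                 new_version: str, old_version: str) -> str:
    """Send roughly `percent` percent of keys to the new version.

    The split is by request key, not by wiki, so every wiki sees the
    new version for a share of its traffic.
    """
    return new_version if bucket(request_key) < percent else old_version


if __name__ == "__main__":
    # Hypothetical keys and old version name, used only for the demo.
    keys = [f"client-{i}" for i in range(10000)]
    for percent in STAGES:
        on_new = sum(
            pick_version(k, percent, "1.23wmf19", "1.23wmf18") == "1.23wmf19"
            for k in keys
        )
        print(f"stage {percent:3d}%: {on_new / len(keys):.1%} of keys on new version")
```

Because a key's bucket never changes, a key that reaches the new version at one stage stays on it at every later stage, so each stage strictly widens the previous canary rather than reshuffling traffic.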