Wikimedia Release Engineering Team/Monthly notable accomplishments

This page lists notable accomplishments for the month as we come up with them during our weekly team meetings.

Jan

 * Folks clamoring to use pipeline
 * Dan & Ahmon jumping in to help devs
 * Chad moved!
 * Instance-wide runners!
 * First non-us/non-ci repo deployed via GitLab Pipeline: https://docker-registry.wikimedia.org/repos/data-engineering/mediawiki-event-enrichment/tags/
 * No more repos on diffusion!
 * Kokkuri's Python now!


 * Buildkit upstream patch to make client connections more robust in case of loss
 * Smooth wmf.22 train
 * docker-pkg build --list \o/

Oct

 * Blubber's gitlab ci file is enough to move the project over
 * Setup in gitlab ci: jwt auth and buildkit
 * New features in blubber: builders improvement from Jaime + feature for running variant from Jaime
 * Jaime + Antoine upgraded scap in the dev-tools project
 * Team likes each other
 * Train feels less stressful lately
 * GitLab runners in K8s now running buildkit with caching—nothing to a k8s cluster
 * Scap builds docs on GitLab

Nov

 * Scap repo is fully moved over to GitLab
 * Internship opportunities!
 * Dan's daughter can now pedal a bike
 * Critical systems list
 * Antoine fixed mixed-case usernames in Gerrit
 * Gerrit upgrade
 * Scap3 dev env: https://gitlab.wikimedia.org/repos/releng/scap3-dev
 * Phabricator is now hosted on a new box at phab1004 and deployed with scap
 * registry-based caching
 * reggie is in use and working
 * Autoscaling
 * Kokkuri

Dec

 * Antoine replaced Docker with Podman
 * https://wikitech.wikimedia.org/wiki/Gitlab/Phabricator_integration
 * Provision Horizontal Pod Autoscaler (HPA) for GitLab cloud runners https://phabricator.wikimedia.org/T323164
 * certmanager for DO k8s registry
 * MW-on-k8s routing traffic Soon™ - our part works \o/ woo
 * Dan's GitLab CI presentation to tech-all!

July

 * We don't specifically have any reason to think our GitLab instance has been owned, necessarily.
 * Small merges for mwpresync
 * GitLab runner config management changes merged!
 * Increased the team knowledge of scap3
 * Pending major update for GitLab
 * Scap-o-scap installed in beta! \o/

August

 * Train-blockers toolforge scrapes from phab \o/
 * Nagged GitLab into updating their FAQ: https://gitlab.com/gitlab-org/gitlab/-/issues/363212#note_1066797431
 * Clare used scap backport for real
 * Phabricator (probably) deploys from scap 3
 * Beta exists still
 * Chad re-earning t-shirt
 * Upgraded Gerrit from 3.4.4 to 3.4.5
 * Scap-backport improvements, seeing increased use
 * Renewed GitLab relationship!
 * Moved Gerrit replica server!
 * Yet another successful train, automatic edition this time!
 * Team reviews are fast!
 * Gitlab JWT STUFF MERGEDDDDDD \o/

Sept

 * GitLab meeting with Bryan at GitLab—ultimate is free if we want it, license compliance thing (https://docs.gitlab.com/ee/user/compliance/license_compliance/#license-compliance)
 * Stage-train automatic mode ran all the way through!
 * MW-to-k8s deploy via scap
 * Blocker/resource conversation
 * GitLab trusted runners testing can progress now that we can hit the internet
 * Successfully built an image that fetched node and python packages from the internet
 * sooo close to deploying phab with scap
 * Tyler fixed the toolforge script generating the Deployment page (got broken in May https://wikitech.wikimedia.org/wiki/Special:Contributions/DeploymentCalendarTool ).
 * Antoine's first Go patch \o/ \o/
 * Build images on GitLab trusted runners!
 * https://phabricator.wikimedia.org/phame/post/view/297/scap_backport_makes_deployments_easy/
 * Finished migrating the last two services to PipelineLib
 * Antoine can share his screen in firefox! :D
 * Blubber and scap review!
 * One command scap release—a 10,000% increase in productivity vs 1 year ago

April

 * Jaime deployed the train
 * Jaime rolled back the train
 * GPG keysigning
 * Fixed bug in proxy balancing in scap
 * Scap deploy-promote
 * Scap 4.7.0 fully out; 4.7.1 going out this week!
 * Scap 4.7.1 fixes cross-datacenter pulls!
 * New Phatality deploy
 * Scap backport

May

 * Our long, grinding efforts at deployment training are finally starting to result in more people doing deploys (well, ok, they've resulted in Clare doing deploys, anyway) \o/
 * Rolled back train five times
 * Deployment tooling just kind of sucks less than it used to
 * Merged scap backport
 * scap stage-train \o/
 * Finally got rid of the generic service-pipeline-* jobs and migrated remaining 23 projects to use bespoke `.pipeline/config.yaml` based jobs
 * gitlab-a-thon
 * We found a whooooole lot of blockers
 * Dan being a good open-source citizen: https://github.com/moby/buildkit/pull/2868
 * JWT implements oauth2
 * Could be used to authorize push access to namespace based on project path
 * Root access on phabricator
 * Updated Jenkins for Security—which broke Jenkins for a while
 * I think I finally remember how to use a standalone puppetmaster
 * https://wikitech.wikimedia.org/wiki/Help:Standalone_puppetmaster
 * ERC going well and DEI moving ahead
 * Dan got changes to buildkit merged upstream
 * Seems like we're pretty close to how auth will work for publishing images from GL
 * serviceops are plodding ahead on GitLab physical machines
 * CI for blubber in gitlab
 * update scap backport to work with new zuul plugin
 * new tests for scap backport
 * scap tests run without deprecation warnings (for stretch, buster, and bullseye)
 * Giuseppe plans to enable always-restart-php-fpm on Thursday.
 * Docs for GitLab are somewhat less crappy than they were a week ago
 * Upgraded Gerrit in train-dev to match production
 * Hired Backfill
 * Phabricator deployment runbook https://wikitech.wikimedia.org/wiki/Phabricator/Deployment

June

 * GitLab Sprint summary by Brennen https://phabricator.wikimedia.org/phame/post/view/288/gitlab-a-thon/
 * We have GitLab on new metal, and can probably enable GL Container Registry \o/
 * We know more about git than we did in May
 * Functional scap already self-installed in prod
 * JWT presentation!
 * Phab deployment has a runbook https://wikitech.wikimedia.org/wiki/Phabricator/Deployment
 * scap scap
 * successfully used it
 * ITCs getting done
 * somebody noticed phab
 * scap backport revert
 * scap rollback
 * scap pushing commits with a shared ssh key
 * Chad has some equipment: monitor and docking station required
 * Dan's rested
 * uneventful train
 * Antoine got rest api in gerrit working for jsonschema

January

 * New release of Java Gearman plugin
 * Gerrit upgrade! (3.3.6 → 3.3.9)
 * Ahmon debugging of contint1001
 * mwcli coolest tool honorable mention
 * Production benchmarks of yaml parsing for MW
 * New test environment progress for GitLab
 * Plan for trusted runners coming together
 * Scap prep auto working in beta (on the way to prod)
 * Gerrit 3.3→3.4 prework to solve javascript incompat
 * CI-agents upgrade prep for stretch -> bullseye
 * MWCLI progress and GoLand for IntelliJ

February

 * logspam-watch is fast now because Ahmon is good at computers and can tolerate modifying Perl
 * Wrapping up the MediaWiki settings loader expedition: caching abstractions for MediaWiki config settings, which can declare how they're cached and for how long (with probabilistic early expiry to help with lock contention)
 * Mukunda has freed himself from train :D
 * Jaime joined, already has a scap patchset
 * Things we didn't do but benefit from: New GitLab test instance in devtools project: https://gitlab.devtools.wmcloud.org/ --> same puppet as prod!
 * Tyler's DigitalOcean exploration spike
 * Scap prep auto! is neat <3
 * Antoine's qemu learnings!
 * moving train-dev to helm3
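
For readers unfamiliar with the "probabilistic early expiry" mentioned in the settings loader item above, here is a minimal sketch of the general technique. It is not the MediaWiki implementation: the function name, parameters, and the `beta` knob are all hypothetical, chosen only for illustration. The idea is that each reader occasionally decides to recompute a cached value slightly before it expires, so that many readers do not all hit the expiry at once and contend for the same lock.

```python
import math
import random
import time


def should_refresh_early(expiry, compute_seconds, beta=1.0, now=None, rand=random.random):
    """Decide whether this caller should recompute a cached value early.

    All names here are hypothetical. `expiry` is the absolute expiry time,
    `compute_seconds` is roughly how long a recompute takes, and `beta`
    scales how eagerly callers refresh (higher means earlier).
    """
    if now is None:
        now = time.time()
    # rand() is in [0, 1), so u is in (0, 1] and log(u) is always defined.
    u = 1.0 - rand()
    # -log(u) is a non-negative, exponentially distributed random number:
    # usually small, occasionally large.
    early_by = compute_seconds * beta * -math.log(u)
    return now + early_by >= expiry
```

Most callers see a small random offset and keep using the cached value; an occasional caller draws a large offset and refreshes ahead of time, which spreads recomputation out instead of piling it up at the expiry instant.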

March

 * Scap 4.4.1 released (includes container image building stuff) \o/
 * Brennen talked publicly and was not shamed by it
 * Onboarding Jaime!
 * Dan's blubber demo!
 * docker build can use a blubber file directly now
 * Supports bd808 and developer tooling
 * Opens up for a more flexible release model
 * Job post posted
 * scap backport exits if change is not mergeable
 * Trainsperiment was instructive AND SUCCESSFUL!
 * Got a working mw container image deployed from deploy1002 woohoo 🐳🎉
 * Fixed check-new-errors script (extremely tiny win) \o/
 * Jaime's first scap release!
 * I don't have to press enter any more! (added -n)
 * Train schedule worked!
 * deploy-promote upstreamed to scap!

December

 * Greg now director of FR tech!😅
 * Performance scap improvements (beta-scap-sync-world takes less time than ever)
 * Scap backtrace much cleaner (fewer of them for common error situations)
 * Several small Phabricator improvements including:
 * improvements to in-progress status for workboards
 * anti-vandalism feature to prevent merging more than 5 tasks in a single transaction.
 * https://phabricator.wikimedia.org/T298063
 * https://phabricator.wikimedia.org/T288956
 * https://phabricator.wikimedia.org/T295934
 * https://phabricator.wikimedia.org/T297249
 * Test wiki that has NO LocalSettings.php — just uses yaml
 * Pipelinelib for helm3
 * Scap backport validation

November

 * Deploy MediaWiki manually in Train-Dev to k8s!
 * Pipeline supports copying files out of containers as published artifacts in Jenkins!
 * Deployed Tuesday with a script thing!
 * Scap 4.0.3 release!
 * record time for scap release
 * Worked out a path forward for GitLab runner architecture with SRE; moved projects to top-level group with runners: https://gitlab.wikimedia.org/repos \o/


 * https://gitlab.wikimedia.org/thcipriani/bacon-stats#-bacon-window-stats
 * Data³ dashboard new stuff!

October

 * Scap 4 release!! Now with more Python
 * GitLab upgrades
 * Gerrit added to train-dev unblocks scap backport dev
 * Data^3d is functional and ready for demos
 * More people using train dev
 * Antoine uses chrome^Hium |not sure that is a win|
 * Gerrit upgrade to 3.3.6 (fixes some minor ui glitches)
 * Client side errors are blocking the train
 * GitLab is open to all
 * dashboards getting close to demo-worthy?  http://173.17.185.55:8001/-/dashboards/project-metrics?project=PHID-PROJ-uier7rukzszoewbhj7ja

September

 * Upgraded GitLab to 14.x release
 * Migrated dev-images repo to GitLab
 * GitLab usernames fixed
 * Productive collaboration with an upstream https://github.com/rclement/datasette-dashboards/issues/9
 * Successfully set up gitlab ci on ddd: https://gitlab.wikimedia.org/releng/ddd/-/pipelines/665

August

 * Started dev-images to buster
 * Gerrit 3.3
 * Successful php_fpm_always_restart: true test (https://phabricator.wikimedia.org/T266055)
 * GitLab soft launch
 * migrated mw-cli to gitlab, got docker-in-docker integration tests working (thanks addshore)
 * https://gitlab.wikimedia.org/releng/cli/-/blob/master/.gitlab-ci.yml
 * Finished dev-images to buster
 * Merged workboard metrics code!
 * Reviewed on GitLab
 * GitLab code review experience ftw
 * A successful (?) interaction with GitLab upstream
 * GitLab upstream merge request in progress
 * Node 14 patch updated
 * Emacs installed on releasesXXXX servers
 * Mukunda learned how to extend datasette with ddd/phab functionality

July

 * Projects exist on GitLab
 * Gerrit upgrade pairing
 * Published local dev cli

April

 * Scap 3.17.1 tagged
 * GitLab Ansible code review
 * Deployment trainings

May

 * Quibble 0.0.47
 * Jenkins upgrade to latest LTS
 * Released new upstream Jenkins Gearman plugin
 * Wikitech Gerrit docs updated
 * data³ used successfully to extract train blocker stats from Phabricator
 * Added transaction metadata to Phabricator task transactions api so that tools can get more detailed transaction details required for the train blockers analysis.
 * Quibble weekly meetings
 * gitlab.wikimedia.org is running (still needs cas registration)
 * Documented the process for adding languages to phabricator, as well as maintaining the translation strings from translatewiki. All of this is now documented in the README for the phabricator translation repo. That change can be seen here: https://phabricator.wikimedia.org/rPHTR0de9c13ef996326a99d6320f4c26669901f3aff4

June

 * Knowledge transfer on Gerrit deployments
 * Running gitlab.wikimedia.org, real use now
 * Giuseppe reports: curl -H 'Host: en.wikipedia.org' https://staging.svc.eqiad.wmnet:4444/wiki/Main_Page works
 * Automatic notification of security patch application failures.  One real use so far.

January

 * Update dev images to split apache and php containers for local dev
 * Gerrit security bug discovery and deployed fix by Antoine
 * In sync with Gerrit upstream war (Java compiled code)
 * Target releases for apt packages in blubber deployed so wuvi can use npm

February

 * PipelineLib fully working on releases-jenkins.wikimedia.org
 * Rust introduction talk (not strictly RelEng business)
 * logspam-watch
 * Minimum hits consolidation feature
 * Error histograms, at-a-glance status indicators (emoji, it's emoji), improved UTF-8 handling and terminal resizing
 * Gearman plugin deployed. Merged bunch of pending changes + a fork from GoodData company which adds support for Pipeline jobs

March

 * PipelineLib fully working on releases-jenkins.wikimedia.org
 * Credentials added to pipelinelib
 * S&F contractors underway with production GitLab configuration
 * Terrible script for finding status of production errors on logstash dashboard
 * Ability to deploy phatality updates
 * scap apply-patches much improved

October

 * GitLab consultation

November

 * Gerrit security upgrade
 * Gerrit grafana dashboard
 * Created pipelinelib-experimental cloud project for working on pipelinelib
 * Scap 3.16.0 release (tagged, waiting on SRE now)
 * logspam-watch improvements
 * apparently scap apply-patches may possibly work in some circumstances
 * Upstream fix for shallow cloning in git: https://github.com/git/git/commit/fb3d1a083f776f02caa514cad8b232d8b974641f

December

 * Scap 3.16.0 released and deployed
 * Dropped scap plugins from mw-config
 * unconditional restart on deploy for opcache corruption deployed
 * https://doc-stage.wmcloud.org/, staging area for doc.wikimedia.org. Next prod then update related docs.
 * Scap source formatted with Black now
 * Runnable runbook blog

July

 * CI now supports REL1_35 branches (and ignores REL1_33).
 * Eliminate elasticsearch dependency from Phabricator search engine
 * Cassandra Docker image
 * Jenkins node Docker image cleanup & re-onlining after disk space recovers
 * Collection of disk space stats on Jenkins workers
 * Credentials and environment variables in PipelineLib
 * Blubber now correctly supports multi-stage artifact copies

August

 * Reduced the number of non-failure FAILURE messages in CI
 * After 9 months, Aphlict is finally back.
 * Scap version 3.15.0 released (in git, if not as .deb yet)

September

 * Scap 3.15.0 was deployed to all servers
 * We have trained another person to conduct the train
 * New phabricator metrics / stats in the project reports (currently deployed to cloud, prod coming soon)
 * image promotion in CI
 * Blog post about this: https://phabricator.wikimedia.org/phame/post/view/208/ci_now_updates_your_deployment-charts/
 * Tiny incremental improvements to logspam-watch (it now shows seconds)
 * GitLab consultation well underway
 * Released Quibble 0.0.45 https://doc.wikimedia.org/quibble/changelog.html
 * Local development mailing list and updates page

April

 * Docker images published on buster-based contint2001 (as part of general temporary switch-over from contint1001 to 2001 for buster migration)
 * Composer is now authenticated with github
 * Dropped basic PHP 7.1 testing from CI
 * Published Kubernetes migration tutorial
 * Phabricator milestone columns can now be moved on workboards
 * Phabricator workboards can be sorted by most recent activity.
 * Tech talk on PGP basics
 * "Cache of wmf-config/InitialiseSettings often 1 step behind" fixed!

May

 * The release train branch cut is now an automatic job
 * Wikimedia Portals build and WDQS data release jobs moved to docker
 * The Continuous Integration instances on WMCS have been fully migrated off Jessie! T236576
 * Scap 1.14.0 released (by releng) and deployed (by serviceops)
 * Documentation for setting up a local dev environment for Phabricator: https://www.mediawiki.org/wiki/Phabricator/Local_Dev_Environment
 * CI server (contint) migrated to buster!

June

 * Scap plugins will move from mediawiki-config to scap git repository with the next release.
 * Deployment script added to deployment-charts for deploying to k8s
 * MediaWiki branch cuts are fully automated, at last!!!!
 * TMH job runner works in MediaWiki-Docker
 * Interactive logspam-watch
 * Gerrit 3.2.2

February

 * Production releases of Parsoid/PHP now also go through final pre-production tests
 * Scap release 1.13.0
 * Local development MediaWiki docker environment has shipped and been announced - https://lists.wikimedia.org/pipermail/wikitech-l/2020-February/093109.html / https://www.mediawiki.org/wiki/Docker

March

 * scap has its first integration test
 * MediaWiki tarball / Wikimedia production are now PHP 7.4-compatible.
 * All extension and skin repos are now being tested against PHP 7.4.
 * Analytics Refinery release job isolated into a Docker container.

December

 * PHP 7.4 testing was available in CI the first "business day" after 7.4.0 was released.
 * Revived "This week in logspam" email
 * Auto DBLists
 * PGP Key repo
 * Production config now has pre-merge diff reports, e.g.: https://integration.wikimedia.org/ci/job/operations-mw-config-php72-composer-diffConfig-docker/86/console

November

 * branch.py for cutting the branch for train
 * logspam-watch for tailing logfiles

October

 * Dev images are now automatically created as part of postmerge via the pipeline for:
 * Parsoid
 * Soon: RestBASE
 * (different from RESTbase? ;-))
 * Selenium documentation updated https://www.mediawiki.org/wiki/Selenium/Node.js
 * Quibble 0.0.36 released https://lists.wikimedia.org/pipermail/wikitech-l/2019-October/092658.html
 * Quibble 0.0.37 released https://lists.wikimedia.org/pipermail/wikitech-l/2019-October/092660.html
 * Quibble 0.0.38 & 0.0.39 released for mediawiki/tools/api-testing
 * Introducing Phatality - Streamlined error reporting from Kibana to Phabricator https://phabricator.wikimedia.org/phame/post/view/177/introducing_phatality/
 * HHVM removed from CI and MediaWiki.
 * Gerrit is on gerrit1001 now
 * … and so is most of the code review. ;-) :)
 * Unforked Jenkins Job Builder

September

 * Scap 3.12.1-1 released/deployed
 * Refactored Zuul layout to use per-branch pipelines
 * Lets you run arbitrary code against a working MediaWiki install
 * The phabricator "Report Error Code" form (https://phabricator.wikimedia.org/maniphest/task/edit/form/46/ ) has been updated with separate fields for the stack trace and error code/request id.
 * T232608 Delete selenium-daily-beta-EXTENSION Jenkins jobs that are broken more than 30 days
 * Write cached config to JSON as well as serialised PHP https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/533592/ (first step towards a saner config)
 * MediaWiki PHP support target modernised from 7.0+ to 7.2+ for 1.34 onwards. https://phabricator.wikimedia.org/T228342
 * Quibble 0.0.35 release
 * 1.34.0-wmf.24 branch cut was done /mostly/ with branch.py instead of make-wmf-branch.php (some small bugs remain to work out but it's very close)
 * Creating accounts was broken on beta cluster since 2019-09-08. It was fixed today (2019-09-25). https://phabricator.wikimedia.org/T232796
 * Phatality extension for Kibana deployed to production and used for reporting production errors into Phabricator.
 * Train blocker tasks created for 1.35.0-wmf.1—1.35.0-wmf.25
 * Dev images are now automatically created as part of postmerge via the pipeline for MediaWiki

August

 * Read only "gerrit-replica" active, handling 10% of all traffic (read from phab)
 * https://time.releng.team ¯\_(ツ)_/¯
 * Scap 3.12.0-1 in production

July

 * Migrated all generic CI jobs from PHP 7.0 to PHP 7.2 https://phabricator.wikimedia.org/T225457
 * Three new folks have been spun up on and have successfully run the Train, by end-of-month
 * it-phabricator plugin updated; fixes errors in All-Users repo in Gerrit
 * Completed first book club iteration: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club/Continuous_Delivery
 * Unit vs Integration test split announcement: https://phabricator.wikimedia.org/phame/post/view/169/changes_and_improvements_to_phpunit_testing_in_mediawiki/
 * Gerrit 2.15.14 deployed
 * Contint1001 now storing docker images on separate partition
 * Blubber 0.8.0 deployed - https://lists.wikimedia.org/pipermail/wikitech-l/2019-July/092344.html
 * Deployment Pipeline docs published on Wikitech - https://wikitech.wikimedia.org/wiki/Deployment_pipeline

June

 * Speculative CI meta-architecture published within WMF for feedback (two versions)
 * Old image versions automatically removed from jenkins agents when /var/lib/docker space > 80%
 * scap 3.10.0 cut
 * Jenkins build timings reports: https://people.wikimedia.org/~dduvall/jenkins/
 * Helped Kask team sketch an outline of its architecture (https://www.mediawiki.org/wiki/Kask)
 * Fatal Monitor with marker lines for deployments: https://logstash.wikimedia.org/app/kibana#/dashboard/77cc3e90-aa27-11e7-9109-51bd3197f7a9?_g=
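
The image cleanup item in this month's list triggers when `/var/lib/docker` passes 80% usage. A rough sketch of such a threshold check follows; it is an illustration only, not the actual Jenkins agent job. The function name and the injectable `usage` parameter are hypothetical, and the step that actually removes old images is deliberately left out.

```python
import shutil


def over_threshold(path, threshold=0.80, usage=shutil.disk_usage):
    """Return True if the filesystem holding `path` is more than `threshold` full.

    Hypothetical helper: `usage` defaults to shutil.disk_usage and can be
    swapped out for testing.
    """
    total, used, _free = usage(path)
    # Guard against a zero-sized filesystem to avoid dividing by zero.
    return total > 0 and used / total > threshold
```

A cleanup script could call `over_threshold("/var/lib/docker")` periodically and prune old image versions only when it returns True.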

May

 * Repository-hosted CI/CD pipeline configurations now supported (.pipeline/config.yaml) - https://phabricator.wikimedia.org/T210267
 * Train notes published on branch cut
 * Codehealth pipeline beta - https://phabricator.wikimedia.org/phame/live/1/post/160/introducing_the_codehealth_pipeline_beta/
 * Some baseline local development images published

April

 * Phabricator vandalism rollback tool completed 🎉 (blog post? 😉)
 * Upgrade Zuul to 2.5.1-wmf6 (which unblocks the Gerrit upgrade to 2.16) - https://phabricator.wikimedia.org/T208426
 * Team offsite in Chicago

March

 * CI tooling future WG started, blogged
 * GerritBot comments on patches going through the pipeline (with fancy badges and the like)
 * Train deploy notes are now automatically generated on branch push
 * Scap 3.9.2-1 released in production
 * Phabricator upgrade: https://phabricator.wikimedia.org/phame/post/view/147/projects_forms_and_subtypes_oh_my/
 * Published the ISOSTWG results and recommendation on officewiki and announced: https://office.wikimedia.org/wiki/Internal_Support_for_Open_Source_Tools_Working_Group
 * swat tags now show up in the deployment schedule (via lua magic)
 * Blog post: https://phabricator.wikimedia.org/phame/post/view/152/help_my_ci_job_fails_with_exit_status_-11/
 * CI future WG report: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/CI_Futures_WG/Report
 * Blog post: https://phabricator.wikimedia.org/phame/post/view/155/quibble_hibernated_it_is_time_to_flourish/
 * Published a CLI tool to roll back vandalism in phabricator.

Feb

 * blubber uses blubberoid.wikimedia.org in the pipeline and pipeline is almost there for end-to-end functionality (can't yet deploy to production, but nearly can)
 * scap development back on gerrit -- new contributors
 * local-charts repo created
 * docker SIG announced/setup
 * Developer satisfaction survey results https://www.mediawiki.org/wiki/Developer_Satisfaction
 * Scap 3.9.0-1 released in production
 * Deployed wmf.18
 * Updated Phabricator to 2019-02-20 release, blog posted detailing some changes:  https://phabricator.wikimedia.org/phame/post/view/145/phab_phebruary/