Wikimedia Release Engineering Team/Monthly notable accomplishments

From mediawiki.org

This page lists notable accomplishments for the month as we come up with them during our weekly team meetings.

22/23-Q2

Oct

  • Blubber's gitlab ci file is enough to move the project over
  • Setup in gitlab ci: jwt auth and buildkit
  • New features in blubber: builders improvement from Jaime + feature for running variant from Jaime
  • Jaime + Antoine upgraded scap in the dev-tools project
  • Team likes each other
  • Train feels less stressful lately
  • GitLab runners in K8s now running buildkit with caching—nothing to a k8s cluster
  • Scap builds docs on GitLab

22/23-Q1

July

  • We don't specifically have any reason to think our GitLab instance has been owned, necessarily.
  • Small merges for mwpresync
  • GitLab runner config management changes merged!
  • Increased the team knowledge of scap3
  • Pending major update for GitLab
  • Scap-o-scap installed in beta! \o/

August

  • Train-blockers toolforge scrapes from phab \o/
  • Nagged GitLab into updating their FAQ: https://gitlab.com/gitlab-org/gitlab/-/issues/363212#note_1066797431
  • Clare used scap backport for real
  • Phabricator (probably) deploys from scap 3
  • Beta exists still
  • Chad re-earning t-shirt
  • Upgraded Gerrit from 3.4.4 to 3.4.5
  • Scap-backport improvements, seeing increased use
  • Renewed GitLab relationship!
  • Moved Gerrit replica server!
  • Yet another successful train, automatic edition this time!
  • Team reviews are fast!
  • Gitlab JWT STUFF MERGEDDDDDD \o/

Sept

21/22-Q4

April

  • Jaime deployed the train
  • Jaime rolled back the train
  • GPG keysigning
  • Fixed bug in proxy balancing in scap
  • Scap deploy-promote
  • Scap 4.7.0-fully out; 4.7.1 going out this week!
  • Scap 4.7.1 fixes cross-datacenter pulls!
  • New Phatality deploy
  • Scap backport

May

  • Our long, grinding efforts at deployment training are finally starting to result in more people doing deploys (well, ok, they've resulted in Clare doing deploys, anyway) \o/
  • Rolled back train five times
  • Deployment tooling just kind of sucks less than it used to
  • Merged scap backport
  • scap stage-train \o/
  • Finally got rid of the generic service-pipeline-* jobs and migrated remaining 23 projects to use bespoke `.pipeline/config.yaml` based jobs
  • gitlab-a-thon
    • We found a whooooole lot of blockers
    • Dan being a good open-source citizen: https://github.com/moby/buildkit/pull/2868
      • JWT implements oauth2
      • Could be used to authorize push access to namespace based on project path
  • Root access on phabricator
  • Updated Jenkins for Security—which broke Jenkins for a while
  • I think I finally remember how to use a standalone puppetmaster
  • ERC going well and DEI moving ahead
  • Dan got changes to buildkit merged upstream
  • Seems like we're pretty close to how auth will work for publishing images from GL
  • serviceops are plodding ahead on GitLab physical machines
  • CI for blubber in gitlab
  • update scap backport to work with new zuul plugin
  • new tests for scap backport
  • scap tests run without deprecation warnings (for stretch, buster, and bullseye)
  • Giuseppe plans to enable always-restart-php-fpm on Thursday.
  • Docs for GitLab are somewhat less crappy than they were a week ago
  • Upgraded Gerrit in train-dev to match production
  • Hired Backfill
  • Phabricator deployment runbook https://wikitech.wikimedia.org/wiki/Phabricator/Deployment

June

  • GitLab Sprint summary by Brennen https://phabricator.wikimedia.org/phame/post/view/288/gitlab-a-thon/
  • We have GitLab on new metal, and can probably enable GL Container Registry \o/
  • We know more about git than we did in May
  • Functional scap already self-installed in prod
  • JWT presentation!
  • Phab deployment has a runbook https://wikitech.wikimedia.org/wiki/Phabricator/Deployment
  • scap scap
    • successfully used it
  • ITCs getting done
  • somebody noticed phab
  • scap backport revert
  • scap rollback
  • scap pushing commits with a shared ssh key
  • Chad has some equipment: monitor and docking station required
  • Dan's rested
  • uneventful train
  • Antoine got rest api in gerrit working for jsonschema

21/22-Q3

January

  • New release of Java Gearman plugin
  • Gerrit upgrade! (3.3.6 → 3.3.9)
  • Ahmon debugging of contint1001
  • mwcli coolest tool honorable mention
  • Production benchmarks of yaml parsing for MW
  • New test environment progress for GitLab
  • Plan for trusted runners coming together
  • Scap prep auto working in beta (on the way to prod)
  • Gerrit 3.3→3.4 prework to solve javascript incompat
  • CI-agents upgrade prep for stretch -> bullseye
  • MWCLI progress and GoLand for IntelliJ

February

  • logspam-watch is fast now because Ahmon is good at computers and can tolerate modifying Perl
  • Wrapping up MediaWiki settings loader expedition: caching abstractions for MediaWiki config settings, which can declare how they're cached and for how long (with probabilistic early expiry to help with lock contention)
  • Mukunda has freed himself from train :D
  • Jaime joined, already has a scap patchset
  • Things we didn't do but benefit from: New GitLab test instance in devtools project: https://gitlab.devtools.wmcloud.org/ --> same puppet as prod!
  • Tyler's DigitalOcean exploration spike
  • Scap prep auto! is neat <3
  • Antoine's qemu learnings!
  • moving train-dev to helm3
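
For readers unfamiliar with the "probabilistic early expiry" mentioned in the settings loader item above, here is a minimal sketch of the general idea. It is not the MediaWiki implementation: the function name, parameters, and the `beta` tuning knob are all hypothetical. Each reader independently decides to refresh slightly before the real expiry, so that many readers are unlikely to find the entry expired at the same moment and contend for the same regeneration lock.

```python
import math
import random
import time


def should_refresh_early(expiry, compute_seconds, beta=1.0, now=None, rng=random.random):
    """Return True if this caller should regenerate the cached value now.

    expiry          -- absolute timestamp at which the cached value expires
    compute_seconds -- how long the last regeneration took
    beta            -- hypothetical tuning knob; larger values refresh earlier
    now, rng        -- injectable for testing (assumptions, not a real API)
    """
    now = time.time() if now is None else now
    # rng() is in [0, 1), so u is in (0, 1] and log(u) is defined and <= 0.
    u = 1.0 - rng()
    # -log(u) is an exponentially distributed "head start": usually small,
    # occasionally large, and scaled by how expensive regeneration is.
    head_start = -compute_seconds * beta * math.log(u)
    return now + head_start >= expiry


# Example: a value that took 0.2s to compute and expires in 1 second.
refresh = should_refresh_early(time.time() + 1.0, 0.2)
```

The further a value is from expiry, the less likely any single caller is to volunteer for the refresh; once the expiry has passed, every caller returns True.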

March

  • Scap 4.4.1 released (includes container image building stuff) \o/
  • Brennen talked publicly and was not shamed by it
  • Onboarding Jaime!
  • Dan's blubber demo!
    • docker build can use a blubber file directly now
    • Supports bd808 and developer tooling
    • Opens up for a more flexible release model
  • Job post posted
  • scap backport exits if change is not mergeable
  • Trainsperiment was instructive AND SUCCESSFUL!
  • Got a working mw container image deployed from deploy1002 woohoo 🐳🎉
  • Fixed check-new-errors script (extremely tiny win) \o/
  • Jaime's first scap release!
  • I don't have to press enter any more! (added -n)
  • Train schedule worked!
  • deploy-promote upstreamed to scap!

21/22-Q2

December

November

  • Deploy MediaWiki manually in Train-Dev to k8s!
  • Pipeline supports copying files out of containers as published artifacts in Jenkins!
  • Deployed Tuesday with a script thing!
  • Scap 4.0.3 release!
    • record time for scap release
  • Worked out a path forward for GitLab runner architecture with SRE; moved projects to top-level group with runners: https://gitlab.wikimedia.org/repos \o/

October

  • Scap 4 release!! Now with more Python
  • GitLab upgrades
  • Gerrit added to train-dev unblocks scap backport dev
  • Data^3d is functional and ready for demos
  • More people using train dev
  • Antoine uses chrome^Hium |not sure that is a win|
  • Gerrit upgrade to 3.3.6 (fixes some minor ui glitches)
  • Client side errors are blocking the train
  • GitLab is open to all
  • dashboards getting close to demo-worthy?  http://173.17.185.55:8001/-/dashboards/project-metrics?project=PHID-PROJ-uier7rukzszoewbhj7ja

21/22-Q1

September

August

  • Started dev-images to buster
  • Gerrit 3.3
  • Successful php_fpm_always_restart: true test (https://phabricator.wikimedia.org/T266055)
  • GitLab soft launch 
  • migrated mw-cli to gitlab, got docker-in-docker integration tests working (thanks addshore)
  • Finished dev-images to buster
  • Merged workboard metrics code!
    • Reviewed on GitLab
    • GitLab code review experience ftw
  • A successful (?) interaction with GitLab upstream
    • GitLab upstream merge request in progress
  • Node 14 patch updated
  • Emacs installed on releasesXXXX servers
  • Mukunda learned how to extend datasette with ddd/phab functionality

July

  • Projects exist on GitLab
  • Gerrit upgrade pairing
  • Published local dev cli

20/21-Q4

April

  • Scap 3.17.1 tagged
  • GitLab Ansible code review
  • Deployment trainings

May

  • Quibble 0.0.47
  • Jenkins upgrade to latest LTS
  • Released new upstream Jenkins Gearman plugin
  • Wikitech Gerrit docs updated
  • data³ used successfully to extract train blocker stats from Phabricator
    • Added transaction metadata to Phabricator task transactions api so that tools can get more detailed transaction details required for the train blockers analysis.
  • Quibble weekly meetings
  • gitlab.wikimedia.org is running (still needs cas registration)
  • Documented the process for adding languages to phabricator, as well as maintaining the translation strings from translatewiki. All of this is now documented in the README for the phabricator translation repo. That change can be seen here: https://phabricator.wikimedia.org/rPHTR0de9c13ef996326a99d6320f4c26669901f3aff4

June

  • Knowledge transfer on Gerrit deployments
  • Running gitlab.wikimedia.org, real use now
  • Giuseppe reports: `curl -H 'Host: en.wikipedia.org' https://staging.svc.eqiad.wmnet:4444/wiki/Main_Page` works
  • Automatic notification of security patch application failures.  One real use so far.

20/21-Q3

January

  • Update dev images to split apache and php containers for local dev
  • Gerrit security bug discovery and deployed fix by Antoine
  • In sync with Gerrit upstream war (Java compiled code)
  • Target releases for apt packages in blubber deployed so wuvi can use npm

February

  • PipelineLib fully working on releases-jenkins.wikimedia.org
  • Rust introduction talk (not strictly RelEng business)
  • logspam-watch
    • Minimum hits consolidation feature
    • Error histograms, at-a-glance status indicators (emoji, it's emoji), improved UTF-8 handling and terminal resizing
  • Gearman plugin deployed. Merged bunch of pending changes + a fork from GoodData company which adds support for Pipeline jobs

March

  • PipelineLib fully working on releases-jenkins.wikimedia.org
  • Credentials added to pipelinelib
  • S&F contractors underway with production GitLab configuration
  • Terrible script for finding status of production errors on logstash dashboard
  • Ability to deploy phatality updates
  • scap apply-patches much improved

20/21-Q2

October

  • GitLab consultation

November

  • Gerrit security upgrade
  • Gerrit grafana dashboard
  • Created pipelinelib-experimental cloud project for working on pipelinelib
  • Scap 3.16.0 release (tagged, waiting on SRE now)
  • logspam-watch improvements
  • apparently scap apply-patches may possibly work in some circumstances
  • Upstream fix for shallow cloning in git: https://github.com/git/git/commit/fb3d1a083f776f02caa514cad8b232d8b974641f

December

  • Scap 3.16.0 released and deployed
  • Dropped scap plugins from mw-config
  • unconditional restart on deploy for opcache corruption deployed
  • https://doc-stage.wmcloud.org/ , staging area for doc.wikimedia.org. Next prod then update related docs.
  • Scap source formatted with Black now
  • Runnable runbook blog

20/21-Q1

July

  • CI now supports REL1_35 branches (and ignores REL1_33).
  • Eliminate elasticsearch dependency from Phabricator search engine
  • Cassandra Docker image
  • Jenkins node Docker image cleanup & re-onlining after disk space recovers
  • Collection of disk space stats on Jenkins workers
  • Credentials and environment variables in PipelineLib
  • Blubber now correctly supports multi-stage artifact copies

August

  • Reduced the number of non-failure FAILURE messages in CI
  • After 9 months, Aphlict is finally back.
  • Scap version 3.15.0 released (in git, if not as .deb yet)

September

19/20-Q4

April

  • Docker images published on buster-based contint2001 (as part of general temporary switch-over from contint1001 to 2001 for buster migration)
  • Composer is now authenticated with github
  • Dropped basic PHP 7.1 testing from CI
  • Published Kubernetes migration tutorial
  • Phabricator milestone columns can now be moved on workboards
  • Phabricator workboards can be sorted by most recent activity.
  • Tech talk on PGP basics
  • "Cache of wmf-config/InitialiseSettings often 1 step behind" fixed! - task T236104

May

  • The release train branch cut is now an automatic job
  • Wikimedia Portals build and WDQS data release jobs moved to docker
  • The Continuous Integration instances on WMCS have been fully migrated off Jessie! T236576
  • Scap 1.14.0 released (by releng) and deployed (by serviceops)
  • Documentation for setting up a local dev environment for Phabricator: https://www.mediawiki.org/wiki/Phabricator/Local_Dev_Environment
  • CI server (contint) migrated to buster!

June

  • Scap plugins will move from mediawiki-config to scap git repository with the next release.
  • Deployment script added to deployment-charts for deploying to k8s
  • MediaWiki branch cuts are fully automated, at last!!!!
  • TMH job runner works in MediaWiki-Docker
  • Interactive logspam-watch
  • Gerrit 3.2.2

19/20-Q3

January

February

March

  • scap has its first integration test
  • MediaWiki tarball / Wikimedia production are now PHP 7.4-compatible.
  • All extension and skin repos are now being tested against PHP 7.4.
  • Analytics Refinery release job isolated into a Docker container.

19/20-Q2

December

November

  • branch.py for cutting the branch for train
  • logspam-watch for tailing logfiles

October

19/20-Q1

September

  • Scap 3.12.1-1 released/deployed
  • Refactored Zuul layout to use per-branch pipelines
  • `quibble -c` lets you run arbitrary code against a working MediaWiki install
  • The phabricator "Report Error Code" form (https://phabricator.wikimedia.org/maniphest/task/edit/form/46/ ) has been updated with separate fields for the stack trace and error code/request id.
  • T232608 Delete selenium-daily-beta-EXTENSION Jenkins jobs that are broken more than 30 days
  • Write cached config to JSON as well as serialised PHP https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/533592/ (first step towards a saner config)
  • MediaWiki PHP support target modernised from 7.0+ to 7.2+ for 1.34 onwards. https://phabricator.wikimedia.org/T228342
  • Quibble 0.0.35 release
  • 1.34.0-wmf.24 branch cut was done /mostly/ with branch.py instead of make-wmf-branch.php (some small bugs remain to work out but it's very close)
  • Creating accounts was broken on beta cluster since 2019-09-08. It was fixed today (2019-09-25). https://phabricator.wikimedia.org/T232796
  • Phatality extension for Kibana deployed to production and used for reporting production errors into Phabricator.
  • Train blocker tasks created for 1.35.0-wmf.1—1.35.0-wmf.25
  • Dev images are now automatically created as part of postmerge via the pipeline for MediaWiki

August

  • Read only "gerrit-replica" active, handling 10% of all traffic (read from phab)
  • https://time.releng.team ¯\_(ツ)_/¯
  • Scap 3.12.0-1 in production

July

18/19-Q4

June

May

April

  • Phabricator vandalism rollback tool completed 🎉 (blog post? 😉)
  • Upgrade Zuul to 2.5.1-wmf6 (which unblocks the Gerrit upgrade to 2.16) - https://phabricator.wikimedia.org/T208426
  • Team offsite in Chicago

18/19-Q3

March

Feb