Wikimedia Release Engineering Team/Monthly notable accomplishments

From mediawiki.org

This page lists notable accomplishments for the month as we come up with them during our weekly team meetings.

23/24 Q4

Aug

July

single image container images.

  • Andre's "scap train" mods deployed.
  • scap changes to better support alternate stage_dir:
  • Southparkfan hammering away at beta
  • streaming logs from Catalyst environments
  • It seems like the new deployment box is pretty much working by now.
    • Jaime had to fix git flag placement for git 2.20 -> git 2.30
    • Jaime had to fix scap deploy for heterogeneous python versions within the cluster
  • Merged changes in catalyst for MediaWiki helm charts to spin up new MediaWiki instances with the PatchDemo provisioning scripts
  • Progress on merging the PatchDemos
  • Split Puppet 5 and 7 compiler output since some hosts no longer support v5 and that was confusing SREs (screenshots: https://phabricator.wikimedia.org/T371407#10028859 )
  • Added a "(diff)" link to the notification that https://schedule-deployment.toolforge.org/ gives you after adding a new backport to the schedule. phab:T367948
  • git.wikimedia.org is finally dead (a win insofar as maybe we never have to talk about it again)
  • Quarterly phabricator queries updated
  • Upstream opengraph diff was merged for phab, so that link previews may start working in slack at some point
  • Adding milestone description copying to lessen suffering

June

 (plus followup from Jaime)

May

  • scap k8s deployment progress reporting
  • scap release-scripts/perform-release rewritten in Python, and added wait for the tag pipeline.
  • Jaime is a PHP expert now, successfully running patchdemo
  • Patches upstream for Phorge viewing reports while not logged in (https://we.phorge.it/D25608) etc
  • buildkitd upgraded to v0.13.2
  • scap clean improvements
  • First changes to Catalyst Patchdemo
  • Scap3 broken symlink up for review
  • Skins available in the catalyst environment
  • Upstream buildkit mod merged: https://github.com/moby/buildkit/pull/4899
  • Moved wmf buildkit helm chart to its own repo for easier maintenance: https://gitlab.wikimedia.org/repos/releng/buildkit-chart/
  • integration/config: jjb-diff improvement (don't assume stdout wants ansi)
  • Docker gc config for CI: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1031045
  • docker-hub-mirror upstream bug workarounds
  • Phab: made good progress removing tech debt in Phabricator, all deployed thanks to Brennen: https://phabricator.wikimedia.org/maniphest/query/cGaRtbNWQSd1/#R . Disabled ~12 ancient Herald rules. Hackathon. Phorge upstream stuff. etc.
  • Hackathon a good time generally
  • Wikibugs got initial gitlab integration during the hackathon and has a couple of improvements since. Next step is wiring up a bot to configure more webhooks so the bot can see CI runs and code review comments. https://www.mediawiki.org/wiki/Wikibugs
  • Contint1002 is now running on bullseye along with python2 zuul, but this is the LAST TIME! (thanks dzahn)
  • Hack found for Wikibugs network instability issue talking to https://gitlab-webhooks.toolforge.org. Bypassing Kubernetes ingress by talking directly to the service makes things much more stable for long-lived connections.
  • Bryan working with Eoghan to get secrets provisioned for adding GitLab account block/unblock to Wikitech block cascade.
  • Upgrading SyntaxHighlight to work with the newest Pygments is stalled because the new version needs Python 3.8+. Prod, test, and default dev environments are all currently Buster with Python 3.7. <https://phabricator.wikimedia.org/T364249>
  • Jelto unblocked Wikibugs tests from calling Phabricator by creating a GitLab shared runner that can be used by projects in /toolforge-repos/
  • Deployed protection for https://phabricator.wikimedia.org/T282893 (Various CI jobs failing after "mkdir: cannot create directory ‘log’: Permission denied"). That revealed a few places where a root:root cache or log directory was previously being auto-created by docker. Added fixes for that. Plus a fix for codehealth checks from Tyler
  • Blubber Python builder: Always use a virtualenv https://phabricator.wikimedia.org/T357548 . blubber/buildkit 0.23.0 released
  • docker-gc resiliency improvement deployed.
  • Simplified gitlab-trusted-runner projects.json (removed project-ids)
  • Fixed a recently discovered problem with gitlab-mentions-bot: it was getting email notifications for MRs that it had made a note on, and those emails went to releng.
  • Andre's 1st TRAINNNNN \o/
  • Backfilled bugzilla tickets in phabricator to fix stats after 10 years - https://phabricator.wikimedia.org/T107254
  • Phabricator OGP previews upstream patch - https://we.phorge.it/D25668
  • SRE Collaboration Services has a dedicated IRC channel now irc://irc.libera.chat/wikimedia-sre-collab
  • https://phabricator.wikimedia.org/T313624
    • Reproduced the issue locally and identified that it occurs when the keyholder key is either not specified in scap.cfg or is missing from /etc/keyholder.d. According to OpenSSH behavior, if no specific key is provided, it tries all authentication methods up to the MaxAuthTries limit. Since these configurations are on the target and not modifiable, increasing MaxAuthTries is not a viable solution.
    • To resolve this, I updated the code to abort the program and prompt the user for a rollback if the key is missing.
  • Patchdemo checkbox with ooui https://patchdemo.catalyst-qte.wmcloud.org/
  • GitLab account blocking/unblocking has been integrated with Wikitech. Blocking a Developer account on Wikitech now also blocks the user's associated accounts in Cloud VPS & Toolforge, Gerrit, GitLab, and potentially Phabricator. The Phabricator block depends on the user having previously linked their Developer account with Phabricator. Unblocking a Developer account on Wikitech reverses the associated account blocks as well. This makes disabling a Developer account a lightweight and reversible process, which in turn makes it easier to use a "block first; investigate more deeply later" approach to combating abuse. https://phabricator.wikimedia.org/T316418
  • commit-message-validator v2.1.0 (<https://www.mediawiki.org/wiki/Commit-message-validator>) now supports two optional trailing spaces after Bug: Tnnnn and Change-Id: Ixxxx footers to improve support for GitLab markdown rendering of merge request descriptions which can become commit messages when doing non-fast-forward merges. https://phabricator.wikimedia.org/T351253
  • Used new Blubber Python builder to resolve https://phabricator.wikimedia.org/T346226
    • https://gerrit.wikimedia.org/r/c/mediawiki/services/machinetranslation/+/1035009 (CI: Revive use of tox for tests)
    • https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/39
  • pvc-cleaner: Added connection retries to reduce Pod restart events.
  • Seen while rolling the train to group0: Finished sync-prod-k8s (duration: 03m 03s) (less than half of time prior to today)
  • Finalized T359643: Get rid of the /srv/mediawiki/php symbolic link by removing the symlink from operations/mediawiki-config
    • Finally!
  • Lots of positive feedback about scap backport recently.
  • Config file and speedup (thanks Bryan) for gitlab-settings/configure-projects/
    • Learned that puppet compiler button on gerrit is neat
  • Undeployed Blubberoid!
  • Phabricator deployment of cleanup: https://phabricator.wikimedia.org/maniphest/?ids=366075,364720,361997,230590,228507,140448#R
    • Rotting list of Gerrit repos in Phabricator
    • 2fa + Oauth doesn't make you re-auth with mediawiki
    • Email pre-filled when you Oauth with MediaWiki in Phab
  • Button for running the puppet compiler in Gerrit
  • Content Transform team successfully used Quibble to debug https://doc.wikimedia.org/quibble/
  • Catalyst API generates user tokens
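The footer rule described in the commit-message-validator v2.1.0 item above can be sketched roughly as follows. This is a minimal illustration and not the tool's actual code: `FOOTER_RE` and `is_valid_footer` are hypothetical names, the pattern is an assumption, and it assumes the trailing whitespace is either absent or exactly two spaces (the form Markdown renderers treat as a hard line break). The Change-Id shape (`I` followed by 40 hex digits) follows Gerrit's usual format.

```python
import re

# Hypothetical pattern: a "Bug: Tnnnn" or "Change-Id: Ixxxx" footer,
# optionally followed by exactly two trailing spaces.
FOOTER_RE = re.compile(
    r"(?:Bug: T[0-9]+|Change-Id: I[0-9a-f]{40})(?: {2})?"
)


def is_valid_footer(line: str) -> bool:
    """Return True if the whole line is an accepted footer."""
    return FOOTER_RE.fullmatch(line) is not None


if __name__ == "__main__":
    for candidate in ("Bug: T351253", "Bug: T351253  ", "Bug: T351253 "):
        print(repr(candidate), is_valid_footer(candidate))
```

Using `fullmatch` rather than `match` keeps a single stray trailing space, or any other trailing text, from slipping through.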

April

  • new puppetserver in devtools!
  • phorge inbound mail patches upstream
  • Phorge PoC patch to thwart webcrawlers—every line number is a separate page
  • Phabricator 6 mo stale ticket pings
  • Hiding non-canonical diffusion links WIP
  • Phab deploy!
  • Gerrit 3.7 upgrade had a slowdown (https://phabricator.wikimedia.org/T355529). Upstream fixed it. I upgraded on Monday morning and... it works!
  • Scap user interactions consolidated into two functions
  • /srv/mediawiki/php symlink usage is dead \o/
  • Train-dev gerrit auto-upgrade
  • Scap handles pending rollback state from helm
  • SpiderPig demo
  • Patchdemo is running in k8s
  • Phab/Phorge translations are going again, ish
  • scap release process: Added checks to avoid surprises
  • scap backport: Improved behavior on bad change number input.
  • train-dev: Up-to-date docker compose plugin is required now
  • scap backport: support deployable non-production branch backport \o/
  • phabricator upstream patches for error log explosions
  • phabricator -- patch for restricting visibility of priority field
  • catalyst deploying extensions with a MediaWiki environment

23/24 Q3

Mar

  • Train: Merged the change that updates Phabricator tasks on nightly security patch failures; ready to release
  • GitLab CI: Merged deploys-in-progress reset script
  • Scap3: Two repos have patches for git-fat → git-lfs
  • scap: replaced canary swagger checks with test server httpbb checks
  • Phorge integration with GitLab in its third round of review
  • GitLab webhooks also still going, looks like it'll go through
  • People like scap backport - more patches, fewer things typed into terminals.
  • Train: Security patch notification now working!
  • GitLab webhooks have a more accurate regex for "Bug: TXX"
  • Working towards getting rid of the /srv/mediawiki/php symlink
  • Upgraded GitLab k8s/cloud cluster to new k8s version and documented the process
  • Phab deploy is out (but stuff is broken (not terribly (probably)))
  • scap backport now works for non-extension submodules
  • gitlab cloud runner dependencies
  • scap backport -2 fix merged, need to release
  • scap web demo
  • catalyst project has a patchdemo vm
  • Jenkins is upgraded to the latest version
  • Logstash dashboard for Phab errors is nice and clean

Feb

  • Phabricator dump script works again (but also probably isn't necessary?)
  • extended docpub to run multiple doc build jobs
  • Sandeep able to run train-dev environment
  • Gerrit 3.8 upgrade prepped
  • buildkitd security upgrade on gitlab-runners.
  • logstash_checker.py updated to check mw-on-k8s canaries. Merged this morning. Working on scap part today.
  • python2 wheels for zuul2 for bullseye - unlock ability to upgrade Debian on contint boxes
  • Patch for scap backport branch about to be live! (paired w/Jeena)
  • Prepped new Phorge release: https://phabricator.wikimedia.org/T358610
  • Deployed new scap with support for canary checks and git-lfs fixes
  • In progress: updating security issues with patch issues

Jan

  • Gerrit train-dev fixed!
  • bd808's account approver thing seems working, maybe?
  • Requested extra Phab/Phorge hardware as backup
  • Deployed persistent volume cleaner
  • Docker images with pyenv with multiple versions of python
  • Worked with upstream phorge to hide the audit application (the post-merge review application for source code)
  • Webhook payloads for GitLab merge request changes—tracked down a bug in upstream
  • phab1004 distro upgrade is done
    • We know we can run in codfw if we have to (it sucks though, let's not)
  • scap clean now works better
  • PVC story nearing completion
  • Helm pending release story, in progress, testing needed
  • Sandeep a very happy linux user
  • We're on Gerrit 3.7
  • Added GitLab support to Git/Reviewers mediawiki.org page.
  • scap stage-train working! We're deleting old versions.

23/24 Q2

Dec

  • https://extloc.toolforge.org/ is live and works
  • Fixed cas3 → openid providers in gitlab
  • Differential is dead!
  • Andre is now a "blessed committer" in upstream Phorge and can +2 other folks' patches
  • Not my win, but bd808 has a gitlab account approver bot just about working
  • https://phabricator.wikimedia.org/T351478 in progress!
  • Jenkins job builder jobs into gitlab-ci.yml jobs
  • Systemd-managed dockerfiles for zuul

Nov

Oct

  • Bitergia user data is now a webapp and we can use it
  • Project Catalyst is underway -- https://wikitech.wikimedia.org/wiki/Catalyst
  • RelEngers can downtime things for phabricator deploys without SRE (although cookbooks are less fiddly)
  • Ran train with Andre!
  • Regular Phab deploy!
  • libs/metrics-platform moved over
  • Clarified language on backports: https://phabricator.wikimedia.org/T344409
  • Investigated hidden repositories in GitLab
  • Prettified puppet compiler output
  • fresnel updates for dependencies
  • Jaime working on project catalyst
  • Gerritlab revised branch naming landed

23/24–Q1

Sep

  • Image published for Blubber that is native LLB, no dockerfile anymore
    • implications
      • dockerfile is unnecessary since no one sees the dockerfile—we can customize each llb instruction and what it displays to the users: a name that corresponds to the blubber.yaml config
      • now we have the ability to create our own instructions
      • dockerfile2llb gone! No more external helper images that haven't been maintained just to copy files around—no more cross-platform compatibility/emulation issues
      • llb gets new stuff first—ex: diffop/mergeop https://www.docker.com/blog/mergediff-building-dags-more-efficiently-and-elegantly/
  • Phorge working on the scap3∞ deployment environment
  • Landed 3 upstream phorge patches; 1 is one we've had for years that blocks some tasks from rendering (T284397)
  • Patch for T&S that outputs the MediaWiki SUL account along with the phab username (T344303)
  • Wrote a plugin for tox to keep supporting [tox:jenkins] CI config with tox v4 https://gerrit.wikimedia.org/g/integration/tox-jenkins-override unblocking part of the migration from tox v3.
  • Moved Civi CRM CI from Stretch to Bullseye and to php 7.4 (aligning with prod). Paired with Ejegg from FR-tech.
  • Ahmon refactoring GitLab nodepools, buildkitd persistent volume, and containerd debugging
  • Dan deep debugging of source code for containerd
  • GerritLab uses the git credential helper
  • Trusted runners now not running untrusted jobs
  • Authentication working in phorge dev environment
  • Phabricator housekeeping for open tickets assigned for more than two years
  • Phabricator logstash dashboard with filters
  • We've got a weekly Phorge deploy window, of sorts (and can ask for other things)
  • Workaround for security patches touching l10n—fixes bug!
  • Tox v4 migration in progress
  • Added phorge to the scap3 development environment!
  • Fixed a logspam-watch bug (SIGPIPE)

Aug

  • Developer Satisfaction Survey got presented
  • Gerrit repo archiving script for GitLab migrations \o/
  • Gerritlab adoption
  • JWT auth changes
  • T272693 - reviewed non-standard phabricator policies
  • Downstream phabricator patches for php8 + logspam
  • Upstream phorge patches for logspam
  • Overwrote feed transaction default query in conduit (T344232#9092848)
  • Scap3 can now be configured to disable the service on secondary hosts: https://phabricator.wikimedia.org/T343447
  • Kokkuri is now using the new gitlab id tokens: https://phabricator.wikimedia.org/T337474
  • We're on Phorge!
  • Gitlab CI-built kask container image deployed today. (https://phabricator.wikimedia.org/T335691)
  • Gitlab local hacks in progress
  • Ahmon passed his CKA! Read Kubernetes in action
  • Merged 3 fixes to Phorge upstream for phab logspam
  • 🎉 Delayed announcement: Jeena's back, and she's a senior software engineer
  • Blubber refactor ripping out dockerfile passing acceptance tests—straight to llb
  • Added another pool to our DO cloud runner pool (memory-optimized)
  • Refactored the patch to tune down staging substantially
    • Now there are 4 runner-controller runners running + 4 nodes ready to go
  • GerritLab commits merged to speed up sending patches and do the right thing given GitLab's weirdness
  • scap backport bugfix

Jul

  • git::clone puppet resource updated
  • LDAP group sync to GitLab
  • Git blame on stack traces within Phatality
  • Buildkitd allowed image list deployed
  • Onboarding Andre Klapper, all sorts of new permissions: phab-root, contint-root, gerrit-root, gerritadmin
  • Created a bunch of GitLab accounts (~200) for the mediawiki/* namespace
  • Assisted in GitLab migrations, notably datahub: https://gitlab.wikimedia.org/repos/data-engineering/datahub

22/23–Q4

April

  • Mr. Widget doesn't seem to have broken again.
  • Job to test train branch cut on a daily basis
  • Successfully debugged an obscure buildkitd -> registry interaction
  • A plan exists for Phorge migration
  • Abstract Wikipedia showed up asking for help with a GitLab migration
  • Jelto deployed the privileged buildkitd commit
  • Moving scap backport tests, win in progress
  • Aphlict on a new box---nothing exploded, nobody yelling
  • Jenkins releases configuration fully automated
  • scap train

May

Jun

  • Blubber acceptance tests
  • docpub in Jenkins
  • Antivandalism patch deployed! (one down; one to go)
  • Learned that we needed to restart php
  • git::clone changes in puppet for specifying a tag
  • git::clone upstream changes now changes the origin
  • WMCS instance caches for NPM via "npm cache verify" to GC the cache
  • buildctl --wait
  • dev-images image rebuilding
  • train backport on Saturday

22/23-Q3

Jan

Feb

  • Buildkit upstream patch to make client connections more robust in case of loss
  • Smooth wmf.22 train
  • docker-pkg build --list \o/
  • Isolating build containers in buildkit in privilege mode
  • Mr. Widget should be deployable, I think (as soon as secrets are actually stashed)
  • Pushed up patches for JWT (hopefully the final ones :D)
  • Deployed production releases-jenkins using scap3!
  • Patched scap to be smarter about interrupted helm deploys!
  • See Slack://WikiLove thread re: scap backport
    • "90% reduction in time spent in existential dread" ← going on a slide deck somewhere!
  • fixed scap backport for dependencies

Mar

  • New scap self-install in production
  • Learned too much about iptables
  • Moved gitlab-cloud-runner Helm stuff to Terraform :)
  • Thundering herd testing passed: k8s can handle 100 concurrent jobs
  • Phab release prepped
  • docker-gc is blubberized and kokkurized
  • found and deleted docker image based on obsolete debian version (stretch)
  • made terraform plan run before merge
  • Tentative optimism for CKA
  • Monte's having success using the phab api to build different views of tasks
  • Mr Widget deployed
  • gitlab-cloud-runner stress tests successful!
  • Dockerhub mirror admission controller
  • Reggie JWT auth enabled in gitlab cloud runners
  • Gitlab cloud runners ready to be made available instancewide
  • We made a staging cluster
  • Docker-gc repo using Kokkuri
  • Gerrit progress bars

22/23-Q2

Oct

  • Blubber's gitlab ci file is enough to move the project over
  • Setup in gitlab ci: jwt auth and buildkit
  • New features in blubber: builders improvement from Jaime + feature for running variant from Jaime
  • Jaime + Antoine upgraded scap in the dev-tools project
  • Team likes each other
  • Train feels less stressful lately
  • GitLab runners in K8s now running buildkit with caching—nothing to a k8s cluster
  • Scap builds docs on GitLab

Nov

  • Scap repo is fully moved over to GitLab
  • Internship opportunities!
  • Dan's daughter can now pedal a bike
  • Critical systems list
  • Antoine fixed mixed-case usernames in Gerrit
  • Gerrit upgrade
  • Scap3 dev env: https://gitlab.wikimedia.org/repos/releng/scap3-dev
  • Phabricator is now hosted on a new box at phab1004 and deployed with scap
  • registry-based caching
  • reggie is in use and working
  • Autoscaling
  • Kokkuri

Dec

22/23-Q1

July

  • We don't specifically have any reason to think our GitLab instance has been owned, necessarily.
  • Small merges for mwpresync
  • GitLab runner config management changes merged!
  • Increased the team knowledge of scap3
  • Pending major update for GitLab
  • Scap-o-scap installed in beta! \o/

August

  • Train-blockers toolforge scrapes from phab \o/
  • Nagged GitLab into updating their FAQ: https://gitlab.com/gitlab-org/gitlab/-/issues/363212#note_1066797431
  • Clare used scap backport for real
  • Phabricator (probably) deploys from scap 3
  • Beta exists still
  • Chad re-earning t-shirt
  • Upgraded Gerrit from 3.4.4 to 3.4.5
  • Scap-backport improvements, seeing increased use
  • Renewed GitLab relationship!
  • Moved Gerrit replica server!
  • Yet another successful train, automatic edition this time!
  • Team reviews are fast!
  • Gitlab JWT STUFF MERGEDDDDDD \o/

Sept

21/22-Q4

April

  • Jaime deployed the train
  • Jaime rolled back the train
  • GPG keysigning
  • Fixed bug in proxy balancing in scap
  • Scap deploy-promote
  • Scap 4.7.0-fully out; 4.7.1 going out this week!
  • Scap 4.7.1 fixes cross-datacenter pulls!
  • New Phatality deploy
  • Scap backport

May

  • Our long, grinding efforts at deployment training are finally starting to result in more people doing deploys (well, ok, they've resulted in Clare doing deploys, anyway) \o/
  • Rolled back train five times
  • Deployment tooling just kind of sucks less than it used to
  • Merged scap backport
  • scap stage-train \o/
  • Finally got rid of the generic service-pipeline-* jobs and migrated remaining 23 projects to use bespoke `.pipeline/config.yaml` based jobs
  • gitlab-a-thon
    • We found a whooooole lot of blockers
    • Dan being a good open-source citizen: https://github.com/moby/buildkit/pull/2868
      • JWT implements oauth2
      • Could be used to authorize push access to namespace based on project path
  • Root access on phabricator
  • Updated Jenkins for Security—which broke Jenkins for a while
  • I think I finally remember how to use a standalone puppetmaster
  • ERC going well and DEI moving ahead
  • Dan got changes to buildkit merged upstream
  • Seems like we're pretty close to how auth will work for publishing images from GL
  • serviceops are plodding ahead on GitLab physical machines
  • CI for blubber in gitlab
  • update scap backport to work with new zuul plugin
  • new tests for scap backport
  • scap tests run without deprecation warnings (for stretch, buster, and bullseye)
  • Giuseppe plans to enable always-restart-php-fpm on Thursday.
  • Docs for GitLab are somewhat less crappy than they were a week ago
  • Upgraded Gerrit in train-dev to match production
  • Hired Backfill
  • Phabricator deployment runbook https://wikitech.wikimedia.org/wiki/Phabricator/Deployment

June

  • GitLab Sprint summary by Brennen https://phabricator.wikimedia.org/phame/post/view/288/gitlab-a-thon/
  • We have GitLab on new metal, and can probably enable GL Container Registry \o/
  • We know more about git than we did in May
  • Functional scap already self-installed in prod
  • JWT presentation!
  • Phab deployment has a runbook https://wikitech.wikimedia.org/wiki/Phabricator/Deployment
  • scap scap
    • successfully used it
  • ITCs getting done
  • somebody noticed phab
  • scap backport revert
  • scap rollback
  • scap pushing commits with a shared ssh key
  • Chad has some equipment: monitor and docking station required
  • Dan's rested
  • uneventful train
  • Antoine got rest api in gerrit working for jsonschema

21/22-Q3

January

  • New release of Java Gearman plugin
  • Gerrit upgrade! (3.3.6 → 3.3.9)
  • Ahmon debugging of contint1001
  • mwcli coolest tool honorable mention
  • Production benchmarks of yaml parsing for MW
  • New test environment progress for GitLab
  • Plan for trusted runners coming together
  • Scap prep auto working in beta (on the way to prod)
  • Gerrit 3.3→3.4 prework to solve javascript incompat
  • CI-agents upgrade prep for stretch -> bullseye
  • MWCLI progress and GoLand for IntelliJ

February

  • logspam-watch is fast now because Ahmon is good at computers and can tolerate modifying Perl
  • Wrapping up MediaWiki settings loader expedition caching: abstractions for MediaWiki config settings that can declare how they're cached and for how long (with probabilistic early expiry to help with lock contention)
  • Mukunda has freed himself from train :D
  • Jaime joined, already has a scap patchset
  • Things we didn't do but benefit from: New GitLab test instance in devtools project: https://gitlab.devtools.wmcloud.org/ --> same puppet as prod!
  • Tyler's DigitalOcean exploration spike
  • Scap prep auto! is neat <3
  • Antoine's qemu learnings!
  • moving train-dev to helm3
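The early expiry mentioned in the settings-loader caching item above is not spelled out on this page. One common way to do it is an exponential-jitter check like the sketch below: each reader occasionally treats a still-valid entry as expired, so that a single reader tends to refresh it before everyone hits the hard expiry at once. Whether MediaWiki's settings loader uses this exact formula is an assumption, and `should_refresh_early`, `recompute_seconds`, and `beta` are illustrative names, not MediaWiki APIs.

```python
import math
import random
import time


def should_refresh_early(expiry, recompute_seconds, beta=1.0, now=None,
                         rng=random.random):
    """Decide whether to recompute a cached value before it expires.

    expiry            -- absolute time (seconds) at which the entry expires
    recompute_seconds -- how long the last recomputation took
    beta              -- values above 1 favor earlier refreshes
    rng               -- source of uniform random numbers in [0, 1)
    """
    if now is None:
        now = time.time()
    # 1 - rng() lies in (0, 1], so log() never receives 0.
    # -log(u) is exponentially distributed, giving a non-negative jitter.
    jitter = -recompute_seconds * beta * math.log(1.0 - rng())
    return now + jitter >= expiry


if __name__ == "__main__":
    expires_at = time.time() + 5.0
    print(should_refresh_early(expires_at, recompute_seconds=0.5))
```

The closer the entry is to expiry, and the more expensive the recomputation, the more likely a given reader is to refresh early, which spreads refreshes out instead of piling them onto one lock.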

March

  • Scap 4.4.1 released (includes container image building stuff) \o/
  • Brennen talked publicly and was not shamed by it
  • Onboarding Jaime!
  • Dan's blubber demo!
    • docker build can use a blubber file directly now
    • Supports bd808 and developer tooling
    • Opens up for a more flexible release model
  • Job post posted
  • scap backport exits if change is not mergeable
  • Trainsperiment was instructive AND SUCCESSFUL!
  • Got a working mw container image deployed from deploy1002 woohoo 🐳🎉
  • Fixed check-new-errors script (extremely tiny win) \o/
  • Jaime's first scap release!
  • I don't have to press enter any more! (added -n)
  • Train schedule worked!
  • deploy-promote upstreamed to scap!

21/22-Q2

December

November

  • Deploy MediaWiki manually in Train-Dev to k8s!
  • Pipeline supports copying files out of containers as published artifacts in Jenkins!
  • Deployed Tuesday with a script thing!
  • Scap 4.0.3 release!
    • record time for scap release
  • Worked out a path forward for GitLab runner architecture with SRE; moved projects to top-level group with runners: https://gitlab.wikimedia.org/repos \o/

October

  • Scap 4 release!! Now with more Python
  • GitLab upgrades
  • Gerrit added to train-dev unblocks scap backport dev
  • Data^3d is functional and ready for demos
  • More people using train dev
  • Antoine uses chrome^Hium |not sure that is a win|
  • Gerrit upgrade to 3.3.6 (fixes some minor ui glitches)
  • Client side errors are blocking the train
  • GitLab is open to all
  • dashboards getting close to demo-worthy?  http://173.17.185.55:8001/-/dashboards/project-metrics?project=PHID-PROJ-uier7rukzszoewbhj7ja

21/22-Q1

September

August

  • Started dev-images to buster
  • Gerrit 3.3
  • Successful php_fpm_always_restart: true test (https://phabricator.wikimedia.org/T266055)
  • GitLab soft launch 
  • migrated mw-cli to gitlab, got docker-in-docker integration tests working (thanks addshore)
  • Finished dev-images to buster
  • Merged workboard metrics code!
    • Reviewed on GitLab
    • GitLab code review experience ftw
  • A successful (?) interaction with GitLab upstream
    • GitLab upstream merge request in progress
  • Node 14 patch updated
  • Emacs installed on releasesXXXX servers
  • Mukunda learned how to extend datasette with ddd/phab functionality

July

  • Projects exist on GitLab
  • Gerrit upgrade pairing
  • Published local dev cli

20/21-Q4

April

  • Scap 3.17.1 tagged
  • GitLab Ansible code review
  • Deployment trainings

May

  • Quibble 0.0.47
  • Jenkins upgrade to latest LTS
  • Released new upstream Jenkins Gearman plugin
  • Wikitech Gerrit docs updated
  • data³ used successfully to extract train blocker stats from Phabricator
    • Added transaction metadata to Phabricator task transactions api so that tools can get more detailed transaction details required for the train blockers analysis.
  • Quibble weekly meetings
  • gitlab.wikimedia.org is running (still needs cas registration)
  • Documented the process for adding languages to phabricator, as well as maintaining the translation strings from translatewiki. All of this is now documented in the README for the phabricator translation repo. That change can be seen here: https://phabricator.wikimedia.org/rPHTR0de9c13ef996326a99d6320f4c26669901f3aff4

June

  • Knowledge transfer on Gerrit deployments
  • Running gitlab.wikimedia.org, real use now
  • Giuseppe reports: curl -H 'Host: en.wikipedia.org' https://staging.svc.eqiad.wmnet:4444/wiki/Main_Page works
  • Automatic notification of security patch application failures.  One real use so far.

20/21-Q3

January

  • Update dev images to split apache and php containers for local dev
  • Gerrit security bug discovery and deployed fix by Antoine
  • In sync with Gerrit upstream war (Java compiled code)
  • Target releases for apt packages in blubber deployed so wuvi can use npm

February

  • PipelineLib fully working on releases-jenkins.wikimedia.org
  • Rust introduction talk (not strictly RelEng business)
  • logspam-watch
    • Minimum hits consolidation feature
    • Error histograms, at-a-glance status indicators (emoji, it's emoji), improved UTF-8 handling and terminal resizing
  • Gearman plugin deployed. Merged bunch of pending changes + a fork from GoodData company which adds support for Pipeline jobs

March

  • PipelineLib fully working on releases-jenkins.wikimedia.org
  • Credentials added to pipelinelib
  • S&F contractors underway with production GitLab configuration
  • Terrible script for finding status of production errors on logstash dashboard
  • Ability to deploy phatality updates
  • scap apply-patches much improved

20/21-Q2

October

  • GitLab consultation

November

  • Gerrit security upgrade
  • Gerrit grafana dashboard
  • Created pipelinelib-experimental cloud project for working on pipelinelib
  • Scap 3.16.0 release (tagged, waiting on SRE now)
  • logspam-watch improvements
  • apparently scap apply-patches may possibly work in some circumstances
  • Upstream fix for shallow cloning in git: https://github.com/git/git/commit/fb3d1a083f776f02caa514cad8b232d8b974641f

December

  • Scap 3.16.0 released and deployed
  • Dropped scap plugins from mw-config
  • unconditional restart on deploy for opcache corruption deployed
  • https://doc-stage.wmcloud.org/ , staging area for doc.wikimedia.org. Next prod then update related docs.
  • Scap source formatted with Black now
  • Runnable runbook blog

20/21-Q1

July

  • CI now supports REL1_35 branches (and ignores REL1_33).
  • Eliminate elasticsearch dependency from Phabricator search engine
  • Cassandra Docker image
  • Jenkins node Docker image cleanup & re-onlining after disk space recovers
  • Collection of disk space stats on Jenkins workers
  • Credentials and environment variables in PipelineLib
  • Blubber now correctly supports multi-stage artifact copies

August

  • Reduced the number of non-failure FAILURE messages in CI
  • After 9 months, Aphlict is finally back.
  • Scap version 3.15.0 released (in git, if not as .deb yet)

September

19/20-Q4

April

  • Docker images published on buster-based contint2001 (as part of general temporary switch-over from contint1001 to 2001 for buster migration)
  • Composer is now authenticated with github
  • Dropped basic PHP 7.1 testing from CI
  • Published Kubernetes migration tutorial
  • Phabricator milestone columns can now be moved on workboards
  • Phabricator workboards can be sorted by most recent activity.
  • Tech talk on PGP basics
  • "Cache of wmf-config/InitialiseSettings often 1 step behind" fixed! - task T236104

May

  • The release train branch cut is now an automatic job
  • Wikimedia Portals build and WDQS data release jobs moved to docker
  • The Continuous Integration instances on WMCS have been fully migrated off Jessie! T236576
  • Scap 1.14.0 released (by releng) and deployed (by serviceops)
  • Documentation for setting up a local dev environment for Phabricator: https://www.mediawiki.org/wiki/Phabricator/Local_Dev_Environment
  • CI server (contint) migrated to buster!

June

  • Scap plugins will move from mediawiki-config to scap git repository with the next release.
  • Deployment script added to deployment-charts for deploying to k8s
  • MediaWiki branch cuts are fully automated, at last!!!!
  • TMH job runner works in MediaWiki-Docker
  • Interactive logspam-watch
  • Gerrit 3.2.2

19/20-Q3

January

February

March

  • scap has its first integration test
  • MediaWiki tarball / Wikimedia production are now PHP 7.4-compatible.
  • All extension and skin repos are now being tested against PHP 7.4.
  • Analytics Refinery release job isolated into a Docker container.

19/20–Q2

December

November

  • branch.py for cutting the branch for train
  • logspam-watch for tailing logfiles

October

19/20-Q1

September

  • Scap 3.12.1-1 released/deployed
  • Refactored Zuul layout to use per-branch pipelines
  • quibble -c Lets you run arbitrary code against a working MediaWiki install
  • The phabricator "Report Error Code" form (https://phabricator.wikimedia.org/maniphest/task/edit/form/46/ ) has been updated with separate fields for the stack trace and error code/request id.
  • T232608 Delete selenium-daily-beta-EXTENSION Jenkins jobs that are broken more than 30 days
  • Write cached config to JSON as well as serialised PHP https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/533592/ (first step towards a saner config)
  • MediaWiki PHP support target modernised from 7.0+ to 7.2+ for 1.34 onwards. https://phabricator.wikimedia.org/T228342
  • Quibble 0.0.35 release
  • 1.34.0-wmf.24 branch cut was done /mostly/ with branch.py instead of make-wmf-branch.php (some small bugs remain to work out but it's very close)
  • Creating accounts was broken on beta cluster since 2019-09-08. It was fixed today (2019-09-25). https://phabricator.wikimedia.org/T232796
  • Phatality extension for Kibana deployed to production and used for reporting production errors into Phabricator.
  • Train blocker tasks created for 1.35.0-wmf.1—1.35.0-wmf.25
  • Dev images are now automatically created as part of postmerge via the pipeline for MediaWiki

August

  • Read only "gerrit-replica" active, handling 10% of all traffic (read from phab)
  • https://time.releng.team ¯\_(ツ)_/¯
  • Scap 3.12.0-1 in production

July

18/19-Q4

June

May

April

  • Phabricator vandalism rollback tool completed 🎉 (blog post? 😉)
  • Upgrade Zuul to 2.5.1-wmf6 (which unblocks the Gerrit upgrade to 2.16) - https://phabricator.wikimedia.org/T208426
  • Team offsite in Chicago

18/19-Q3

March

Feb
