Wikimedia Release Engineering Team/Checkin archive/20171011

= 2017-10-11 =

Vacations/Important dates

 * https://office.wikimedia.org/wiki/HR_Corner/Holiday_List
 * How to do it


 * Week of Oct 23: thcipriani
 * October 25 (Wednesday): Željko at a conference
 * November 1 (Wednesday): Željko local holiday (All Saints' Day)
 * Nov 10 (Fri) - Veterans Day
 * Nov 20th - Dec 1st: Greg vacation
 * Nov 23+24 - Thanksgiving
 * Dec 25-Jan 1 - End of year/new year holidays

Rotating positions and absences
Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-fmcvjrkfvvzz3gxavs3a&statuses=open%28%29&group=none&order=newest#R

Oct 9 and Oct 16

 * Train: Chad
 * wmf.3
 * wmf.4
 * SoS: Mukunda
 * Out
 * October 4-10th: vacation all I ever wanted
 * Oct 9 - Indigenous Peoples' Day

This week

 * Blocking
 * Blocked
 * Updates
 * MW deployment train for this week is behind by a day; we plan to catch up today (doing both group0 and group1)

Last week

 * Blocking
 * Blocked
 * Need Ops review of patches for https://phabricator.wikimedia.org/T146381#3447319
 * Updates

Logspam / Last week's train updates

 * https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Roles#Train_Conductor

Still good, pleasantly and surprisingly.

Other Team Business

 * Antoine https://phabricator.wikimedia.org/T145819
 * "Jobs invoking SiteConfiguration::getConfig cause HHVM to fail updating the bytecode cache due to being filesize limited to 512MBytes"
 * Can we have scap trigger a regeneration of /var/cache/hhvm/cli.hhbc.sq3, or maybe just delete it? It keeps growing until it reaches the ulimit imposed by wfShellExec.
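
The "just delete it" option could be sketched roughly as below. This is a minimal illustration in Python, not scap's actual behavior: the function name `prune_bytecode_cache`, the `margin` parameter, and the idea of running it from a deploy step are assumptions. Only the cache path and the 512 MB figure come from the task. It also assumes HHVM recreates the file on its next CLI run, which would need to be verified.

```python
import os

# Path and size limit as quoted in the task; everything else is hypothetical.
HHBC_CACHE = "/var/cache/hhvm/cli.hhbc.sq3"
LIMIT_BYTES = 512 * 1024 * 1024


def prune_bytecode_cache(path=HHBC_CACHE, limit=LIMIT_BYTES, margin=0.9):
    """Delete the cache file once it reaches margin * limit bytes.

    Returns True if the file was removed, False otherwise.
    """
    try:
        size = os.path.getsize(path)
    except FileNotFoundError:
        return False  # no cache file, nothing to prune
    if size < limit * margin:
        return False  # still comfortably below the limit
    os.remove(path)
    return True
```

Deleting only near the limit, rather than on every deploy, keeps the cache warm most of the time; whether that trade-off is worth it is an open question for the task.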

Q2 goal/project check-in

 * All of it in table form: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Goals/201718Q2

Program 1: Outcome 5: Milestone 1: Migrate majority of developers to JavaScript based browser test framework (webdriver.io)

 * Due: End of this quarter
 * Quarter Goal Task: Port Selenium tests from Ruby to Node.js -


 * T176315 Automated browser tests cannot create pages on the Beta Cluster as anonymous user in RelatedArticles tests
 * Working on it. I thought I had fixed it, but it looks like I did not.


 * T171852 WebdriverIO tech talk
 * Rachel suggested November 1, but it's a local holiday, so I asked for another date


 * Mobile and Search teams active.

Ruby

 * T167432 Run Wikibase daily browser tests on Jenkins
 * Tests are running, but failing. Working on it.


 * T177924 Run Popups Selenium tests daily targeting beta cluster
 * Job set up but tests are failing, waiting for Jon to fix them

Program 3: Outcome 1: Objective 1: Define a set of code stewardship levels (from high to low expectations)

 * Due: End of this quarter
 * Quarterly Goal task: -


 * Started discussion with Tim re what Code Ownership and Maintenance means.

Program 3: Outcome 1: Objective 2: Identify and find stewards for high-priority/high use code segment orphans

 * Due: End of next quarter
 * Quarterly Goal task: -


 * Received feedback from several folks that I reached out to regarding code ownership.

Program 3: Outcome 2: Objective 1: Define a “Technical Debt Project Manager” role that regularly communicates with all Foundation engineering teams regarding their technical debt

 * Due: End of this quarter

Program 3: Outcome 2: Objective 2: Define and implement a process to regularly address technical debt across the Foundation

 * Due: End of next quarter


 * The first blog post in the series of Tech Debt posts went live today.
 * https://blog.wikimedia.org/2017/10/11/mediawiki-code-health-group/

==== Program 6: Outcome 2: Objective 2: Set up a continuous integration and deployment pipeline to publish new versions of an application to production via testing and staging environments that reliably reproduce production ====
 * Due: End of this quarter
 * Complete build phase of release pipeline


 * Build test variant
 * Run test entrypoint w/developer feedback - services dependency
 * Build production variant w/developer feedback - services dependency
 * Tag production container
 * Push to production docker registry - ops dependency - staging namespace
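
As a rough illustration of the build phase steps listed above, the sequence could be expressed as plain docker commands. This is a sketch under assumptions, not the pipeline's actual tooling: it assumes a multi-stage Dockerfile with stages named `test` and `production`, and the function name, image names, and registry hostname are all hypothetical. The function only assembles the command lines; it does not run them.

```python
def build_phase_commands(name, version, registry, namespace="staging"):
    """Return the docker command lines for the build phase, in order.

    Assumes a multi-stage Dockerfile with 'test' and 'production' stages
    (hypothetical names) in the current directory.
    """
    test_image = f"{name}:test"
    prod_image = f"{name}:{version}"
    remote = f"{registry}/{namespace}/{name}:{version}"
    return [
        # Build test variant
        ["docker", "build", "--target", "test", "-t", test_image, "."],
        # Run test entrypoint
        ["docker", "run", "--rm", test_image],
        # Build production variant
        ["docker", "build", "--target", "production", "-t", prod_image, "."],
        # Tag production container
        ["docker", "tag", prod_image, remote],
        # Push to the registry, under the staging namespace
        ["docker", "push", remote],
    ]


if __name__ == "__main__":
    # 'registry.example.org' and 'mysvc' are placeholders.
    for cmd in build_phase_commands("mysvc", "1.0", "registry.example.org"):
        print(" ".join(cmd))
```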

Tracking: https://phabricator.wikimedia.org/T157469
 * current status: https://phabricator.wikimedia.org/project/view/2453/


 * https://wikitech.wikimedia.org/wiki/Streamlined_Service_Delivery_Design
 * https://gerrit.wikimedia.org/r/#/c/382608/ under review

Program 1: Outcome 1: Objective 1: Scap (Tech Debt Sprint FY201718-Q2)

 * workboard


 * Trying to pin down a time for Chad and Mukunda to meet (weekly) about this objective, not yet scheduled


 * Investigated the issue with git submodules wasting a lot of space on duplicated git packs/objects
 * https://phabricator.wikimedia.org/T137124
 * Have not arrived at a definite conclusion yet, however...
 * I think the best solution is going to be to just nuke the .git metadata for scap deployed revisions
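
To judge how much space nuking the .git metadata would actually free, something like the following could be run against a deployed tree. It is a minimal sketch: the function name `git_metadata_bytes` is made up for illustration, and it simply sums the size of every file that lives under a directory named `.git` (which includes `.git/modules/...` for submodules). It does not try to detect which objects are duplicated.

```python
import os


def git_metadata_bytes(root):
    """Sum the sizes of all regular files under any '.git' directory in root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # Only count files whose path, relative to root, passes through '.git'.
        rel_parts = os.path.relpath(dirpath, root).split(os.sep)
        if ".git" not in rel_parts:
            continue
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total
```

Comparing this number with the total size of the tree gives an upper bound on the savings.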

Program 1: Outcome 5: Objective 1: Maintain existing shared Continuous Integration infrastructure

 * Goal: A generalized POC for a docker-based CI.
 * https://phabricator.wikimedia.org/project/view/3008/ (shipyard workboard)


 * struggling with building an image using docker, limited and easy to screw up :)
 * Homegrown CI tool to build vs. the operations one for prod images
 * Still "docker pull" random images from Docker Hub
 * Not sure yet how to exactly run the commands in the container
 * We need CACHE (similar to Castor)

There are a few experimental jobs here and there. But nothing will be seriously migrated until the infrastructure is ready.

Program 1: Outcome 6: Milestone 2: Maintain Phabricator

 * Created tools for managing custom forms
 * Copy an existing form, maintaining its default values / hidden fields and field ordering.
 * Changed custom fields to default to hidden
 * Updated the script for moving project hierarchies
 * Now it's possible to move an existing subproject into a different parent project
 * This is not particularly safe and it is generally inadvisable to use it except for extraordinary situations.
 * The script does not deal with project members so it is especially not advisable for projects where the membership matters

Team Kanban Board Review and Triage

 * Closed and touched in the last 7 days
 * No update for 4 weeks
 * No update for 3 weeks
 * No update for 2 weeks
 * No update for 1 week
 * All Open
 * Review To Triage column of #releng


 * Assigned
 * Unassigned

Once / month-ish review of backlog(s)

 * releng Review To Triage column of #releng
 * releng-kanban Review unassigned in kanban
 * releng-kanban Review 'backlog' column of -kanban
 * releng-next - Review for things we need to put on our kanban backlog
 * releng-backlog - oh my, the huge backlog of things...

Kanban stats

 * Burnup chart