Wikimedia Platform Engineering/MediaWiki Core Team/Quarterly review, October 2014/Notes

The following are notes from the Quarterly Review meeting with the Wikimedia Foundation's MediaWiki Core team, October 2, 2014, 3:00PM - 4:30PM PDT.

Meeting overview page: https://www.mediawiki.org/wiki/Wikimedia_MediaWiki_Core_Team/Quarterly_review,_October_2014


 * Attending

Present (in the office): Aaron Schulz, Gabriel Wicke, Chad Horohoe, Tomasz Finc, Rachel diCerbo, Toby Negrin, Ori Livneh, Erik Moeller, Damon Sicore, Tilman Bayer (taking minutes), Rob Lanphier, Lila Tretikov, Dan Garry; participating remotely: Greg Grossmeier, Brad Jorsch, Nikolas Everett, Tim Starling, Chris Steipp, Chris McMahon, Arthur Richards, Bryan Davis

''Please keep in mind that these minutes are mostly a rough transcript of what was said at the meeting, rather than a source of authoritative information. Consider referring to the presentation slides, blog posts, press releases and other official material.''



RobLa:

Welcome

this will be a somewhat abbreviated version of the planned quarterly review

[slide 2] This is the team, and what they worked on since July

[slide 3] Intro round, with everyone describing what they have been working on this q:

Ori: perf engineer, worked on HHVM

Tim: HHVM + architecture

Aaron: worked on HHVM until recently

ChrisM: consumer of everything this team does ;)

ChrisS: Security, CentralAuth/SUL Finalization reviews

Dan: PM mainly on mobile apps, but also some platform stuff like SUL finalization

Bryan: SUL finalization, IEG support, Beta Labs, Vagrant

RobLa: Kunal (not here today) did a ton of work on SUL finalization

Brad: SecurePoll, API stuff, and random bugs, code review

Chad: worked on Search for >1y with Nik, and various other things that came up

Nik: Search, with Chad

Arthur: team practices, will do workshop with core team in a few weeks

Greg: release team manager, work closely with platform team

[slide 4] RobLa: will put bulk of energy in library infrastructure

SecurePoll

Search: once that's fully deployed, it will free up Chad a bit

[slide 5] Bigger projects from Q1:

[slide 6] Ori: historically, addressed performance with lots of caching

that masked perf deficiencies [at least for logged-out users]

(shows https://people.wikimedia.org/~ori/metrics/pngs/anons-vs-editors.png ) penalty of 300-500 milliseconds for logging in :(

(shows ): it's actually diverging further

This motivated strong emphasis on improving perf of web app

even though 97[?]% of views cached: those 3-5% contain editing activities, i.e. the lifeblood of the site

Dan: basically, anything interactive

[slide 7] Ori: coupled PHP version update with Ubuntu version update, carried out successfully

previously, Ops managed everything but MW, now more shared ownership

Tim got credited in HHVM release notes for one particularly relevant upstream contribution

Ori: (shows PHP5 vs. HHVM Editor performance graph https://grafana.wikimedia.org/#/dashboard/db/Edit%20Performance )

Lila: so this is basically before vs. after? yes

this is fantastic

very impressed, Ori deserves huge props for stepping up, getting everyone on board

Ori: still substantial effort to convert remaining servers to HHVM - handled by Ops team

decided to get big chunk of improvements first, with some work later:

JIT compilers suffer from slow startup time, have to analyze code first

HHVM can do that in advance, one can tell it to not constantly check file on disk [for changes]

[still need to do such configuration work]
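
[Illustration added in editing, not something shown at the meeting: a toy Python sketch of the idea Ori describes, namely caching compiled code and optionally skipping the per-request check of the file on disk. The class name CompiledCache and the "authoritative" flag are hypothetical and only mirror the concept; this is not HHVM's actual implementation or configuration.]

```python
import os
import tempfile


class CompiledCache:
    """Toy cache of compiled source files, keyed by path (hypothetical)."""

    def __init__(self, authoritative=False):
        # authoritative=True means: trust the cache, never re-check the disk.
        self.authoritative = authoritative
        self._cache = {}  # path -> (mtime_ns, code object)
        self.stat_calls = 0

    def load(self, path):
        entry = self._cache.get(path)
        if entry is not None and self.authoritative:
            return entry[1]  # no filesystem access at all
        # Otherwise check the modification time on every load.
        self.stat_calls += 1
        mtime = os.stat(path).st_mtime_ns
        if entry is not None and entry[0] == mtime:
            return entry[1]
        with open(path, encoding="utf-8") as f:
            code = compile(f.read(), path, "exec")
        self._cache[path] = (mtime, code)
        return code


def demo():
    """Load the same file five times with each kind of cache."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "page.py")
        with open(path, "w", encoding="utf-8") as f:
            f.write("x = 1\n")
        checking = CompiledCache(authoritative=False)
        trusting = CompiledCache(authoritative=True)
        for _ in range(5):
            checking.load(path)
            trusting.load(path)
        return checking.stat_calls, trusting.stat_calls
```

In this sketch the checking cache touches the disk on every load, while the trusting cache does so only once; the trade-off is that the trusting cache will not notice a changed file until it is rebuilt.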

Lila: what is the timeline for this?

Ori: handled by Ops, e.g. 25% of reader traffic by Nov 3[?]

Tim: I will scale back HHVM work in next quarter

Lila: so it will be complete next q from Ops perspective? yes

reader (IPs)?

why not prioritize this too, what else is in conflict?

Ori: Tim and Aaron are among the most prolific code reviewers and contributors; tying them up in HHVM has negative distance effects throughout the ecosystem

Lila: OK, but in general, it would be great to keep some people working on perf constantly

Erik: we updated recommended MW dev environment (Vagrant) to HHVM framework months ago already

Damon: what about metrics/analytics, how does that change with HHVM?

Ori: it has a lot of stats capabilities

Damon: e.g. garbage collection, ...?

Tim: ...has several profiles

Damon: excellent

Chad: and, not losing instrumentation we already have

Toby: ...

Lila: also, right now on different clusters, so can still compare

Gabriel: are we going to distribute HHVM too?

Ori: HHVM landed in Debian Testing a month ago

apt.wikimedia.org is already included as package source in our Vagrant

[slide 8] Dan: SUL finalization

84 million accounts, around 5% of them...

Lila: even with single sign-on, still have different user pages on different projects

Dan: lots of renaming work

1st goal: everyone can have a global account

enable development without such corner cases

2nd goal: make sure accounts stay unified

developed organically, separate table for each wiki

[slide 9] request a rename: huge community engagement task.

want to make it easier for the community to handle this by themselves

current rename process can solve issues for...

Lila: how much completed this q?

Dan: all of the engineering work this q, but might still [take longer for the other work]

most of it [engineering work] completed

RobLa: several weeks, had needed to pull Bryan off for IEG project committed to previously

Lila: so the October deadline is for all the work listed here? yes

Damon: (question about monolithic code base)

Erik: this is separate, won't provide OpenID, etc

i.e. this is just one slice of the identity pie - but an important one

also, this is just about public wikis which are world-writable

e.g. WMF wiki is still separate

Dan: ChrisS helped solve lots of CentralAuth issues, Keegan helped with CE, ...

tentative date: 15 April

work will still be going on, but not engineering (Rachel, Keegan, myself)

Tomasz: what's your time split between apps and this?

Dan: tricky, had to neglect some things in apps

Erik: possible proxy owner for SUL?

Rachel: let's discuss...

[slide 10] Nik: Search

old search didn't really work any more

was a big Java application, depending on lots of outdated libraries

move more of it into MediaWiki, where we have more expertise

also incorporated new features, which was made easier by this

but hit roadblock with backend

contributed upstream to Elasticsearch

Damon: how much do people use our search?

Nik: about 1 million hits/hour

but yes, primary way to stumble on our articles is via Google

one important customer: editors who search for typos, want to keep them happy

Dan: also, e.g. Google can not search by namespace (e.g. talk pages only)

Nik: ...

RobLa: (explains) Elasticsearch is the underlying search engine we are using, it's an external open source project

CirrusSearch is our own project, built on that

Lila: also keep in mind that it's getting [important to get search on mobile right]

Dan: on app, so far only prefix search - one typo, and you find nothing

Lila: even without typo, you find nothing

Tomasz: we should never show no search results
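
[Illustration added in editing, not part of the discussion: a minimal Python sketch of why prefix-only search returns nothing after a single typo, and how a fuzzy fallback can avoid an empty result page. The title list and function names are made up, and difflib from the Python standard library is only a stand-in; it is not how Elasticsearch or CirrusSearch match queries.]

```python
import difflib

# Hypothetical sample of page titles.
TITLES = ["Albert Einstein", "Alberta", "Elasticsearch", "Einsteinium"]


def prefix_search(query, titles):
    """Return titles that start with the query (case-insensitive)."""
    q = query.casefold()
    return [t for t in titles if t.casefold().startswith(q)]


def search_with_fallback(query, titles, cutoff=0.6):
    """Try prefix search first; if it finds nothing, fall back to fuzzy matches."""
    hits = prefix_search(query, titles)
    if hits:
        return hits
    lowered = {t.casefold(): t for t in titles}
    close = difflib.get_close_matches(
        query.casefold(), list(lowered), n=3, cutoff=cutoff
    )
    # get_close_matches returns the best match first.
    return [lowered[c] for c in close]
```

With these sample titles, prefix_search("Albret", TITLES) is empty because of the transposed letters, while search_with_fallback("Albret Einstein", TITLES) ranks "Albert Einstein" first.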

Bernd worked with search team

Nik: yes, he reviewed my code, very useful

team = Nik + 0.5 * Chad

and some hours per week from AndrewO and Filippo from Ops

now deployed on all except enwiki (50% of search traffic), dewiki, frwiki, zhwiki

hardware: will have 4x the I/O

[slide 11] RobLa: Bigger projects proposed for next quarter

purpose of library: peel off components from MediaWiki, "componentize" neatly

start at bottom of stack, find pieces of MW that touch everything

Damon: does that mean literal PHP library? or API..

RobLa: this is a bit more basic

Toby: how does this interact with services?

Bryan: a bit of both

if you have 3-4 years, can understand MW ;)

disentangle so that [it's not required to understand the entirety of MW]

it leads to service orientation, by isolating behaviors in code

e.g. once we have all storage for (something) behind an API, we could then rewrite in another language (than PHP)

Damon: ...

Tim: yes, taking components

see plan page on mw.org

candidates: logging, ResourceLoader, ...

Chad: also, for testing

Damon: right, APIs make good testing

RobLa: this will take bulk of our effort in coming q

get a good estimate

get good "inertia" on this (so it's moving on its own)

several extensions pull in their own libraries, in their own way

sometimes different libraries for same task

Lila: what's the final output this q? a design document?

four people on this? yes

Bryan: library decomposition is bigger than an epic, it's a theme, say 2 years

epic for this q: initial foundational work

use Composer (package manager for PHP)

figure out what it means for our deployment, release management, GitHub mirroring

I wrote an RfC

needs wiki work for documentation

how to track bugs, promote use of library, ...

deliverables:

complete (extraction of) logging into a library

and something else, like ResourceLoader

document how to use Composer

Lila: OK, but it's important to have plan

Erik: have we reached out to possible reusers outside MW Core?

RobLa: e.g. Timo (in Core)

outside: talked to e.g. Wikidata, but nothing concrete yet

Erik: important to socialize horizontally, e.g...., jQuery,

Bryan: not my primary focus, but yes, should have messaging across extension world

also, move stuff into core that belongs there

Lila: this is probably one of the most complex projects...

looks like more conceptual exercise than actual coding

RobLa: we'll focus on logging as concrete example

still some scoping to do

want to have at least 2-3

Gabriel: backend, storage interface (talked about this at architecture meeting)

eventually, should have MW as API consumer

would be good to work on that too

Erik: would be good to integrate Gabriel into a lot of these conversations

Ori: but there's some more basic work to do

lack of knowledge on how to integrate a library (also external ones)

leads to a lot of (redundant effort)

e.g. releases

some unsexy work like that

RobLa: just separating one library like logging (connected to a lot of things) will keep us busy

Lila: I tell all teams to plan conservatively

this strikes me as one of those unknown projects - how deep is the rabbit hole?

Dan: the more I hear, the more I think it's an amazing idea

Lila: as first step, write down what we want to achieve, how is it improving things, success criteria

not redesign for the sake of it, but because it solves deep problems

can be open process

not frame as RfC

RobLa: RfC can have different meanings [in the Wikimedia world]

Lila: who drives MW RfCs, where do they reside?

RobLa: us / on mw.org

ChrisM (on chat): would the "Editor Performance" work also extend to e.g. performance in Flow? I ask because we've found some performance issues there recently that might take advantage of Core improvements, but I'm not sure

Erik: still a lot of perf issues both in old wikitext editor and in VE

and mobile in each

need the data

with VE, also need to measure internal perf - e.g. after pressing a key

this is in scope (for improving editing perf)

Flow is not in scope

save rates, initial load time, e.g. would [actual VE] section editing help?

Ori: worry about clear cross-team commitments

Dan: on app: conversion rate (tap edit -> save)

Ori: that sounds more like purview of analytics

Toby: some overlap, but yes

Tomasz: ..

Damon: What about regression policy on library infrastructure - are we allowed to regress perf?

Ori, Aaron: shouldn't have much of an effect

RobLa: ...

Erik: good opportunity to model that out (if x, should be reverted)

VE (save) times

Damon: I'm very interested in regression policy

Ori: ...

Gabriel: we did some of that [in VE/Parsoid], was surprisingly hard, a lot of noise, fight with very basic stuff

we did discover some issues that way and it was valuable, but...
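
[Illustration added in editing, not a policy agreed at the meeting: one way to write down a rule of the form "if x, should be reverted" is to compare median timings before and after a change against a fixed threshold. The function name and the 5% threshold below are assumptions chosen for the example. The median is used because, as Gabriel notes, such measurements are noisy and a few slow outliers would distort a mean.]

```python
from statistics import median


def is_regression(baseline_ms, candidate_ms, max_increase=0.05):
    """Return True if the candidate median is more than max_increase slower.

    baseline_ms and candidate_ms are non-empty lists of timings in
    milliseconds; max_increase is a relative threshold (0.05 means 5%).
    """
    base = median(baseline_ms)
    cand = median(candidate_ms)
    return (cand - base) / base > max_increase


# Hypothetical save timings; the 4000 ms outlier does not move the median.
baseline = [900, 950, 1000, 1050, 4000]
small_change = [1000, 1020, 1040, 1060, 1100]
large_change = [900, 1100, 1150, 1200, 1250]
```

Here is_regression(baseline, small_change) is False (the median rises by 4%), while is_regression(baseline, large_change) is True (the median rises by 15%).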

Erik: realistically this is a Q2+Q3 thing

Nov + 2 weeks in Dec, then continue after holidays

RobLa: yes, and need to have followup on this review anyway

Lila: your main priority should be to finish what you started (with e.g. Ori's additional work on HHVM)

don't overextend, e.g. you have SecurePoll

Erik: don't want to be too fetishist about q boundaries, it's OK to launch in mid-q

Lila: do SecurePoll, finish perf, some smaller projects

put this [libraries] into stretch goals

Aaron: should come up with KPIs, have dashboards

Erik, Lila, RobLa: agree

Erik: continue using editing perf for that

(slide 12, 13: skipped)