MediaWiki 1.17/Release postmortem

This is a short-format postmortem purely for getting all of the issues on the table.

There are two sections: "What went well?" and "What do we need to look at?" We spend 30 minutes brainstorming the list as quickly as possible, adding to both sections simultaneously. RobLa will take that list, organize it, and maybe elaborate in parts (or follow up if he's confused about something) and post the notes to wikitech-l.

For now, limit input to small (<140 character) bullet items. We can elaborate in a subsequent mailing list discussion.

So...this is specifically about the tarball release, rather than the 1.17 deployment
 * Ok :)

Also, don't spend a lot of time elaborating on others' points :)

To jog your memory: http://www.mediawiki.org/wiki/Release_checklist

Timeline
1.16.0 tagged 2010-07-28

1.17 branched 2010-12-07

1.17wmf1 branched 2011-02-03

1.17.0beta1 tagged 2011-05-05

1.18 branched 2011-05-06

1.17.0rc1 tagged 2011-06-14

1.17.0 tagged 2011-06-22
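Several of the items below are about elapsed time between these milestones, so here is a rough sketch for turning the timeline into day counts. The dates are copied from the list above; the `milestones` dict, the `days_between` helper, and the choice of spans are illustrative only. Note that it uses the 1.17wmf1 branch date as a stand-in for "WMF deployment", which is an assumption, since the actual deployment date isn't listed here.

```python
from datetime import date

# Dates copied from the timeline above.
milestones = {
    "1.16.0 tagged": date(2010, 7, 28),
    "1.17 branched": date(2010, 12, 7),
    "1.17wmf1 branched": date(2011, 2, 3),
    "1.17.0beta1 tagged": date(2011, 5, 5),
    "1.18 branched": date(2011, 5, 6),
    "1.17.0rc1 tagged": date(2011, 6, 14),
    "1.17.0 tagged": date(2011, 6, 22),
}


def days_between(start: str, end: str) -> int:
    """Return the number of days from one named milestone to another."""
    return (milestones[end] - milestones[start]).days


# Spans chosen to match the discussion points (hypothetical selection).
spans = [
    ("1.16.0 tagged", "1.17.0 tagged"),        # release to release
    ("1.17 branched", "1.17.0 tagged"),        # branching to release
    ("1.17wmf1 branched", "1.17.0 tagged"),    # wmf1 branch (deployment proxy) to tarball
    ("1.17.0beta1 tagged", "1.17.0rc1 tagged"),  # length of the beta 1 period
    ("1.17.0rc1 tagged", "1.17.0 tagged"),     # RC to final
]

for start, end in spans:
    print(f"{start} -> {end}: {days_between(start, end)} days")
```

Swapping in the real deployment date, or adding the 1.18 milestones later, only needs new entries in `milestones`.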

What went well

 * Release notes were well-written
 * Loads of new features
 * Loads of bugfixes
 * We put a 17 on the box
 * Not an immediate 1.17.1 (high quality release)
 * RC was indeed fairly close to the final release (about a week)
 * Beta/RC period was long enough that we managed to get a few extra bug reports from early adopters
 * Didn't mop up a lot of time from people outside of GenEng
 * Released tarball was high-quality -- we aren't getting any showstopper bug reports from users
 * Except maybe bug 29531
 * 1.16.0 -> 1.17.0 was less than a year :) [does 11 months count...]
 * "1.17" and other similar tags for noting "things to backport still" was very useful -- as long as people remember to untag once they've merged
 * the actual logistics of putting up the tarball etc. seemed fine? no ops issues
 * Don't know why there would be, this process hasn't changed in years
 * well, yay then! :-)

What do we need to look at

 * Time from branching to release
 * Time from WMF deployment to tarball
 * Time to do initial code review backlog
 * Relied too heavily on Tim
 * Somewhat Chad's fault - schedule got unexpectedly busy
 * Chad doesn't have access to the download.wm.o box or posting rights to mediawiki-announce -> couldn't have pushed the tarball anyway
 * Checklist was too focused on the last steps before the release; many steps before were missing
 * Release notes -- the process of finding release notes that weren't added and then backporting them was a huge pain for Tim
 * Wasn't this mainly for the backports etc?
 * Yes, but people were backporting stuff without backporting release notes too. This ended up being a huge time-waster that should've been handled by people doing the backports
 * Communication, in the last few weeks, among Tim & Roan & other key personnel?
 * Beta 1 period was too long, we should have had a beta 2
 * "Who is doing the backports" was in question several times. It switched between Roan, Chad, Tim and random hangers-on.
 * Some unreviewed changes were backported (or directly applied) to the release branch, causing confusion and delay
 * How much could we fairly limit general users backporting stuff during stabilisation?
 * Actually we already have a policy on this (http://www.mediawiki.org/wiki/Commit_access_requests#Guidelines_for_applying_patches - bullet points 4 & 5). Might be time to refresh everyone's memory on the list.
 * Time to iterate security fixes (yes, not really helped by the person deciding to say "it's not fixed" after each improvement had been released)
 * We're still better than Microsoft
 * Need to find PostgreSQL maintainer
 * I'm kind of miffed that this happened. We have 2 maintainers who are usually fairly active in telling us we've broken PG. This time trying to engage them in the process was like pulling teeth.
 * Some confusion on the tags (e.g. 1.18 vs. 1.17revert) (this might be more about deploy)
 * scoping -- roadmap and deciding what features would go in
 * code review backlog from last fall (this has been largely addressed/discussed ad infinitum)
 * still less frequent than the historical 2-3 releases/year norm
 * We're still better than Microsoft :)
 * Maybe try and set some longer term targets in advance?
 * MW 1.18 - codename Longhorn?
 * communication/momentum at the end. need daily scrum in last 2 weeks or so?
 * release process doesn't have clear phases like other projects (e.g. Ubuntu)
 * We should expand the checklist then. Make it a bunch of small incremental steps that anyone in the group can tackle.