Git/Conversion

This page tracks the effort to move from our current Subversion repository to Git. The move first affects MediaWiki core and the extensions used on Wikimedia sites. As of February 2012, the plan is to complete it by the end of March 2012.

Rationale
Our current Subversion-based version control system has served us well, but we need one better suited to our development effort. Our community is highly distributed and runs many parallel efforts whose features need to be integrated. After long consideration, we've decided to move from Subversion to Git.

Some advantages of git:


 * "I love git just because it allows me to commit locally (and offline)." - Guillaume Paumier
 * "[Y]ou can create commits locally and push them to the server later (great for working without wifi), you can tell it 'save my work so I can go do something else now' in one command, and it'll allow us to review changes before they go into "trunk" (master).... without human intervention in merging things into trunk. Gerrit automates this process." - Roan Kattouw

Affected development projects
MediaWiki Core (/trunk/phase3/) and MediaWiki extensions that WMF deploys will move to git in March 2012. Afterwards, any other extensions, tools, or projects that wish to move can do so. These might include operations, fundraising, pywikipediabot, etc.

We will leave some codebases in Subversion and not bother migrating them, because those extensions or tools have been abandoned. Some developers will choose to move their projects to GitHub or some other git site. We will also leave svn.wikimedia.org up for several years at least; for the subdirectories holding projects that have moved to git, the repository will be read-only.

The Git conversion team will publicize any changeover date with at least 2 weeks' notice. As of right now (1 Feb 2012) there are no specific cutover dates set.

Chad would like to gradually migrate all projects currently on Wikimedia's Subversion repository so that he can make all of svn.wikimedia.org read-only by the middle of 2013. Each project could move to WMF's git repo or to another host; Chad can help project owners decide and migrate.


 * MediaWiki extensions (not used by WMF)
 * Starting in March or April 2012, Chad will move alphabetically through all extensions (that are not deployed on Wikimedia Foundation sites) and offer each of them choices as to when and whether to shift.


 * Pywikipediabot
 * The pywikipediabot community has not yet decided on whether to move, but is strongly leaning towards staying with SVN for now. Sumana Harihareswara, Wikimedia Foundation Volunteer Development Coordinator 00:23, 8 February 2012 (UTC)


 * Wikimedia Foundation fundraising
 * Will most likely move to git; timeframe unknown.


 * Wikimedia Foundation operations
 * Ops is already aware of this, since they have started their own git move. They are migrating piecemeal as they're ready.


 * Ariel Glenn's dumps infrastructure
 * Just two paths, /trunk/backups and /branches/ariel/. Should be pretty trivial, history's not complicated.
 * Can convert: as soon as we're ready, just give Ariel a day's notice or so.
 * Moved to operations/dumps.git on 15-Feb-2012. svn made r/o.


 * Wikimedia Foundation data mining and analytics, including Community Department


 * Toolserver internationalisation


 * Daniel Kinzler's WikiWord project
 * Per IRC: No rush, will move casually after main migration. Not under active development right now.


 * mwdumper
 * Not being actively developed right now. Can move this whenever.
 * Dumping this right now. Should have it moved to mediawiki/tools/mwdumper.git today


 * WM planet configuration
 * Was moved into operations/puppet.git (in files/planet) by Dzhan. This whole system needs redoing anyway (see bug 27208, GSoC 2012 project idea), but it'll do for now.
 * Should probably make svn r/o for this.

December 2011

 * Preliminary test conversions early in month [DONE, December 5ish]
 * Git workflow architecture review [DONE, December 19]
 * Agree on implementation strategies regarding remaining development process questions, e.g. how to handle multi-repo commits [DONE as of 1 Feb]

January 2012

 * CI tests and linting get run when a developer chooses to push to the stage between their branch and the mainline branch (see Ideal Workflow Document and https://labsconsole.wikimedia.org/wiki/Git-review ) (still in progress as of 1 Feb)

February 2012

 * Finish code review on trunk (progress at Code Review stats and MediaWiki 1.19/Revision report)
 * 2 weeks before migration of MediaWiki core, start communicating about cutover date -- give date & links to all the documentation with the 3 most frequently asked questions
 * techblog post
 * wikitech-l
 * mediawiki-l
 * add !gitconversion to mw-bot
 * Cut 1.19 release branch
 * Finish up specific Git management scripts / changes
 * to support WMF workflow
 * stage git-based tree on fenari
 * update documentation
 * i18n updates
 * Make Gerrit behave like we want it to -- TODO
 * Making permissions right [Mostly done]
 * Making hooks correct [In progress]
 * 1.19 release from SVN
 * 2 weeks before migration of MediaWiki extensions, start communicating about cutover date to extensions developers -- give date & links to all the documentation with the 3 most frequently asked questions (wikitech-l, mediawiki-l)

March 2012

 * Git migration -- core & extensions [scheduled for first weekend in March]
 * Make trunk/phase3 & trunk/extensions r/o
 * change links on mediawiki.org
 * Deauthorize SVN commits via a pre-commit hook that prints an informative error message if someone tries to commit to MW core -- "Subversion is dead, we have moved to git, read Git conversion"
 * Git migration -- ANYTHING ELSE
 * Make paths r/o on case-by-case basis.
 * Ongoing, slow process.
 * Move towards git-based development and release process
 * First deployment from git mainline development branch
 * Move towards continuous integration via git, goalpost: weekly deployment
 * Jenkins (Testswarm/PHPUnit tests) on git branches
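The pre-commit hook item in the list above could be sketched roughly as follows. This is only an illustration, not the hook that will be deployed: it assumes the hook is written in Python, that the frozen paths are exactly `trunk/phase3` and `trunk/extensions`, and that `svnlook` is on the server's PATH. The helper name `blocked_paths` is hypothetical.

```python
#!/usr/bin/env python3
"""Sketch of a Subversion pre-commit hook that rejects commits to migrated paths."""
import subprocess
import sys

# Assumption: the paths frozen at cutover, taken from the timeline above.
READ_ONLY_PREFIXES = ("trunk/phase3/", "trunk/extensions/")

MESSAGE = "Subversion is dead, we have moved to git, read Git conversion"


def blocked_paths(changed_lines, prefixes=READ_ONLY_PREFIXES):
    """Return the changed paths that fall under a read-only prefix.

    Each input line is expected to look like `svnlook changed` output:
    a status column, whitespace, then a path without a leading slash.
    """
    blocked = []
    for line in changed_lines:
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank or malformed line
        path = parts[1].strip()
        if path.startswith(prefixes):
            blocked.append(path)
    return blocked


def main(argv):
    # Subversion invokes the hook as: pre-commit REPOS-PATH TXN-NAME
    repos, txn = argv[1], argv[2]
    output = subprocess.run(
        ["svnlook", "changed", "-t", txn, repos],
        check=True, capture_output=True, text=True,
    ).stdout
    if blocked_paths(output.splitlines()):
        # Text written to stderr is relayed to the committing client.
        sys.stderr.write(MESSAGE + "\n")
        return 1  # a non-zero exit status rejects the commit
    return 0


# Only act when called with the two arguments Subversion passes to hooks.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

Keeping the path check in a pure function means it can be tested without a live repository; extending `READ_ONLY_PREFIXES` would cover paths made read-only later on a case-by-case basis.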

April 2012

 * First release from git mainline development branch
 * Is this a target for the conversion? Maybe shouldn't be on the roadmap...

Unscheduled items
Other cool things to do:
 * ❌ Write some documentation about git usage for our developers (existing page Subversion could be used as a starting point) and a list of useful links.
 * See Git/Guide and improve
 * ❌ Convert the Bugzilla code to recognize the new SHA-1 commits.
 * ❌ Create database of SVN revision ids -> Git SHA-1's for useful lookups.
 * Info is included in Git commits, just need to make a DB mapping of them.
 * ❌ Some ops stuff is being moved piecemeal to git by Mark Bergsma et al.
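For the SVN revision to Git SHA-1 lookup mentioned above, one possible sketch follows. It assumes the converted commits carry a `git-svn-id:` line in the form git-svn writes; this page only says that the information is included in the Git commits, so the exact trailer format is an assumption. The table name `svn_to_git` and all function names are hypothetical.

```python
import re
import sqlite3
import subprocess

# Assumption: each converted commit message contains a line such as
#   git-svn-id: https://svn.example.org/repo/trunk/phase3@12345 <uuid>
SVN_ID = re.compile(r"^\s*git-svn-id: \S+@(\d+) ", re.MULTILINE)


def read_log(repo_dir):
    """Return every commit in repo_dir as 'sha, newline, message, NUL' records."""
    return subprocess.run(
        ["git", "-C", repo_dir, "log", "--all", "--format=%H%n%B%x00"],
        check=True, capture_output=True, text=True,
    ).stdout


def parse_log(log_text):
    """Yield (svn_revision, git_sha) pairs from NUL-separated log records."""
    for record in log_text.split("\x00"):
        record = record.strip()
        if not record:
            continue
        sha, _, body = record.partition("\n")
        match = SVN_ID.search(body)
        if match:
            yield int(match.group(1)), sha


def build_mapping(db_path, pairs):
    """Store the pairs in SQLite; one SVN revision may map to several commits."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS svn_to_git ("
        "svn_rev INTEGER, git_sha TEXT, PRIMARY KEY (svn_rev, git_sha))"
    )
    conn.executemany("INSERT OR IGNORE INTO svn_to_git VALUES (?, ?)", pairs)
    conn.commit()
    return conn


def lookup(conn, svn_rev):
    """Return all Git SHA-1s recorded for one SVN revision number."""
    rows = conn.execute(
        "SELECT git_sha FROM svn_to_git WHERE svn_rev = ? ORDER BY git_sha",
        (svn_rev,),
    )
    return [row[0] for row in rows]
```

The composite key is deliberate: once the repository is split, a single SVN revision that touched both core and an extension can correspond to more than one Git commit.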

Split up and convert repositories
A naïve conversion of the entire repository (with branches) weighs in at around 7.8GB (November 2011). It makes no sense to create a single Git repository for all of MediaWiki; it should be split up.

In Subversion, everything gets squashed into one giant repository. In Git, repositories are split at the boundaries that code does not cross.

Splitting
We have a test repository up, but in February 2012 we will redo the split to create a permanent git repo.


 * MediaWiki will go in mediawiki/core.git
 * Extensions will go in mediawiki/extensions/foo.git
 * Two meta-repos, mediawiki/extensions-all.git and mediawiki/extensions-wmf.git, will use submodules to track "all" or "wmf" extensions.
 * Other things across SVN need to find new homes in Git
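The layout above can be expressed as a small path-mapping sketch. It is illustrative only: it assumes extensions live directly under `trunk/extensions/<Name>/`, it ignores branches and tags, and the function names and the `base_url` parameter are hypothetical (no real clone URL is implied).

```python
def target_repo(svn_path):
    """Map an SVN path to the Git repository it would land in, or None."""
    parts = [p for p in svn_path.split("/") if p]
    if parts[:2] == ["trunk", "phase3"]:
        return "mediawiki/core.git"
    if parts[:2] == ["trunk", "extensions"] and len(parts) >= 3:
        return f"mediawiki/extensions/{parts[2]}.git"
    return None  # "other things" still need a home chosen by hand


def gitmodules_text(extension_names, base_url):
    """Render a .gitmodules file for a meta-repo such as extensions-wmf.git."""
    blocks = []
    for name in sorted(extension_names):
        blocks.append(
            f'[submodule "{name}"]\n'
            f"\tpath = {name}\n"
            f"\turl = {base_url}mediawiki/extensions/{name}.git\n"
        )
    return "".join(blocks)
```

For example, `target_repo("/trunk/extensions/Foo/Foo.php")` gives `mediawiki/extensions/Foo.git`, while a path such as `/trunk/backups/x` gives `None` and would need a destination decided separately.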

/Splitting tests

Converting

 * ✅ Every commit needs to be rewritten to give name/email pairs to SVN users. We are using username@users.mediawiki.org for a unified e-mail address scheme for all old commits.
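As an illustration of the unified e-mail scheme, the sketch below generates author-mapping lines in the `login = Name <email>` format that git-svn's authors file uses. Whether the conversion actually uses that tool or file format is an assumption; the function name and the sample logins are hypothetical.

```python
def authors_file_lines(usernames, real_names=None, domain="users.mediawiki.org"):
    """Build one `login = Name <login@domain>` line per distinct SVN user.

    real_names optionally maps a login to a display name; logins without
    an entry fall back to the login itself as the name.
    """
    real_names = real_names or {}
    lines = []
    for login in sorted(set(usernames)):
        name = real_names.get(login, login)
        lines.append(f"{login} = {name} <{login}@{domain}>")
    return lines


if __name__ == "__main__":
    # Hypothetical logins, for demonstration only.
    for line in authors_file_lines(["bob", "alice"], {"alice": "Alice Example"}):
        print(line)
```

Deriving the address from the SVN login keeps old commits attributable without publishing anyone's real e-mail address.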

Ideal state
This is what we'd love to see:

History

 * See history of MediaWiki version control

Working on the conversion

 * User:^demon

Would like to see it happen

 * Aryeh Gregor
 * Ashar Voultoiz (already use git locally)
 * Daniel Friesen
 * Gregory Varnum

Documents

 * User requirements:
 * Specifications: https://labsconsole.wikimedia.org/wiki/Gerrit_bugs_that_matter
 * Software design document:
 * Test plan:
 * Documentation plan:
 * User interface design docs: https://labsconsole.wikimedia.org/wiki/Gerrit_bugs_that_matter
 * Schedule: see Timeline above
 * Task management: Task list from Bugzilla (bug 22596)
 * Release management plan:
 * Communications plan:
 * Status updates

Communications

 * draft plan
 * announcement of test repository
 * "git boot camp" from October 2011 NOLA hackathon GitBootcamp