Git/Conversion

This page discusses the effort to migrate from our current Subversion repository to Git. The current (very preliminary) plan, as of December 2011, is to complete the migration by the end of March 2012.

Rationale
Our current Subversion-based version control system has served us well, but we need one better suited to how we develop. Our community is highly distributed, with many parallel efforts, and it needs to integrate work from many different feature efforts. After long consideration, we've decided to move from Subversion to Git.

October 2011

 * "git boot camp" @ NOLA hackathon

December 2011

 * Preliminary test conversions early in the month [DONE, around December 5]
 * Git workflow architecture review [DONE, December 19]
 * Agree on implementation strategies regarding remaining development process questions, e.g. how to handle multi-repo commits [IN PROGRESS, December 19]

January 2012

 * CI tests and linting run when a developer chooses to push to the stage between their branch and the mainline branch (see Ideal Workflow Document)
 * Finish code review on trunk
 * Cut 1.19 release branch
 * Finish up specific Git management scripts
   * to support WMF workflow
   * i18n updates
   * new developers
   * Pushing git changes downstream
 * Make Gerrit behave like we want it to

February 2012

 * 1.19 release from SVN
 * Git migration -- CORE
   * Make trunk/phase3 read-only
 * Git migration -- EXTENSIONS
   * Make trunk/extensions read-only
 * Git migration -- ANYTHING ELSE
   * Make paths read-only on a case-by-case basis
 * Move towards git-based development and release process

(copied from Roadmap; needs incorporation)
 * Will Gerrit work as a repo browser, or only as a patch manager?
   * It has a built-in gitweb, but it's not pretty.

March 2012

 * First release from git mainline development branch
 * Move towards continuous integration via git, goalpost: weekly deployment
 * Jenkins (Testswarm/PHPUnit tests) on git branches

Unscheduled items
To do a conversion of the repository we need:


 * ❌ Get a copy of the old CVS repository so its history can be reimported properly. According to an IRC conversation between Avar and Brion on 2010-10-26, the current SVN repository suffers from cvs2svn bugs.
 * ✅ sourceforge CVS repository enabled by brion (2010-11-24) http://wikipedia.cvs.sourceforge.net/viewvc/wikipedia/
 * ❌ Get the repository with rsync: rsync -av USER@wikipedia.cvs.sourceforge.net::cvsroot/wikipedia/*
 * ❌ Write documentation about Git usage for our developers. The existing Subversion page could be used as a starting point, along with a list of useful links.
 * ❌ Convert the Bugzilla code to recognize the new SHA-1 commits.
 * ❌ Create a database mapping SVN revision IDs to Git SHA-1s. This is needed for redirecting CodeReview links, and anything else that uses rXXXX, to the new commit IDs.
 * The information is already included in the Git commits; we just need to build a DB mapping from it.
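
The revision-to-SHA-1 database described above could be built along these lines. This is only a sketch: it assumes the converted commits carry a git-svn style `git-svn-id: <url>@<rev> <uuid>` line in their messages (this page does not say exactly how the revision is recorded), and the `svn_to_git` table, the function names, and the use of SQLite are all hypothetical.

```python
import re
import sqlite3

# ASSUMPTION: each converted commit message contains a line of the form
#   git-svn-id: <url>@<rev> <uuid>
# as written by git-svn. Adjust the pattern if the conversion records it differently.
SVN_ID = re.compile(r"^\s*git-svn-id:\s+\S+@(\d+)\s+\S+\s*$", re.MULTILINE)


def parse_log(log_text):
    """Yield (svn_revision, git_sha1) pairs from log text.

    Expects the output of:
        git log --all --format='%x01%H%n%B'
    where every commit record starts with a \\x01 byte, followed by the
    full commit hash, a newline, and the raw commit message.
    """
    for record in log_text.split("\x01"):
        if not record.strip():
            continue
        sha, _, body = record.partition("\n")
        match = SVN_ID.search(body)
        if match:  # commits without the trailer are skipped
            yield int(match.group(1)), sha.strip()


def build_mapping(log_text, db_path=":memory:"):
    """Store the pairs in a (hypothetical) svn_to_git table and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS svn_to_git ("
        " svn_rev INTEGER NOT NULL,"
        " git_sha1 TEXT NOT NULL,"
        " PRIMARY KEY (svn_rev, git_sha1))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO svn_to_git VALUES (?, ?)", parse_log(log_text)
    )
    conn.commit()
    return conn
```

The primary key covers both columns because, once the repository is split, a single SVN revision may plausibly correspond to more than one Git commit, so an rXXXX redirect may need to offer several targets.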

Split up and convert
A naïve conversion of the entire repository (with branches) weighs in at around 7.8 GB (November 2011). It makes no sense to create a single Git repository for all of MediaWiki; it should be split up.

In Subversion, everything gets squashed into one giant repository. In Git, repositories are split at the boundaries over which code does not cross.

Splitting

 * MediaWiki will go in mediawiki/core.git
 * Extensions will go in mediawiki/extensions/foo.git
 * Two meta-repos, mediawiki/extensions-all.git and mediawiki/extensions-wmf.git, will use submodules to maintain the "all" and "wmf" sets of extensions.
 * Other things across SVN need to find new homes in Git
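
As an illustration of the split described above, the following sketch maps an SVN path to its target Git repository. It assumes that core lives under trunk/phase3 and extensions under trunk/extensions/<name>, as the roadmap entries on this page suggest; the function name `target_repo` is hypothetical, and branches and tags are deliberately not handled.

```python
def target_repo(svn_path):
    """Return the Git repository an SVN trunk path would move to, or None.

    ASSUMPTION: trunk/phase3 is MediaWiki core and trunk/extensions/<name>
    holds one extension per directory. Anything else has no new home yet.
    """
    parts = [p for p in svn_path.strip("/").split("/") if p]
    if parts[:2] == ["trunk", "phase3"]:
        return "mediawiki/core.git"
    if parts[:2] == ["trunk", "extensions"] and len(parts) >= 3:
        return f"mediawiki/extensions/{parts[2]}.git"
    return None  # "other things" still need a home in Git
```

Returning None for unmatched paths keeps the open question visible: those paths can be listed and assigned on a case-by-case basis.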

/Splitting tests

Converting

 * ✅ Every commit needs to be rewritten to give name/email pairs to SVN users. We are using username@users.mediawiki.org for a unified e-mail address scheme for all old commits.
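
The address scheme above can be expressed as a small helper. The sketch below also renders the mapping in the `svnuser = Name <email>` format read by git-svn's `--authors-file` option; whether the conversion actually uses git-svn is an assumption, and the function names and sample usernames are hypothetical.

```python
def author_identity(svn_username, display_name=None):
    """Build a 'Name <email>' identity using username@users.mediawiki.org.

    Falls back to the SVN username when no display name is known.
    """
    name = display_name or svn_username
    return f"{name} <{svn_username}@users.mediawiki.org>"


def authors_file(usernames, display_names=None):
    """Render one 'svnuser = Name <email>' line per SVN user, sorted by username.

    ASSUMPTION: the output is meant for git-svn's --authors-file option.
    """
    display_names = display_names or {}
    lines = [
        f"{user} = {author_identity(user, display_names.get(user))}"
        for user in sorted(set(usernames))
    ]
    return "\n".join(lines) + "\n"
```

For example, `authors_file(["bob", "alice"], {"alice": "Alice Example"})` produces one line for alice with her display name and one for bob that reuses his username as the name.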

Ideal state
This is what we'd love to see:

History

 * See history of MediaWiki version control

Working on the conversion

 * User:^demon

Would like to see it happen

 * Aryeh Gregor
 * Ashar Voultoiz (already uses Git locally)
 * Daniel Friesen
 * Gregory Varnum

Documents

 * User requirements:
 * Specifications:
 * Software design document:
 * Test plan:
 * Documentation plan:
 * User interface design docs:
 * Schedule:
 * Task management:
 * Release management plan:
 * Communications plan:
 * Status updates