Git/Conversion/Splitting tests

Primitive bash script to build out stub repos from a SVN checkout (not keeping history -- only suitable for testing layouts!)

Master repo layout
This provisional layout has a single git repository for MediaWiki core, then individual separate ones for each extension:


 * mediawiki/core.git
 * mediawiki/extensions/FooBar.git
 * mediawiki/extensions/QuuxBax.git
 * etc

All of these main repos would need write permissions for core devs & localization team. Additional extensions can be added in when 'officialized', pulling updates in from individual developers' work repos.

Note that other chunks of non-MediaWiki stuff will also need to be broken out, but they're not on my top agenda right now. ;)
 * Most WMF stuff should probably go to puppet. Others, like 'code-utils', could probably go in their own git repo too.

Branch and tag conversions
History from the branches and tags directories should be copied over as well if possible. Some straightforward mapping probably makes sense:


 * /trunk/phase3 -> mediawiki/core.git 'master'
 * /branches/REL1_17/phase3 -> mediawiki/core.git 'REL1_17'
 * /tags/REL1_17_0/phase3 -> mediawiki/core.git 'REL1_17_0'

If we're very brave we could rename the things from REL1_17_0 to rel1.17.0. :)

Same branch/tag setup on extensions ought to also work.
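
The mapping above is mechanical enough to sketch as a pair of shell functions. The names svn_path_to_ref and dotted_name are made up for illustration and are not part of any conversion tool; the sketch only assumes the trunk / branches / tags layout with a trailing phase3 shown in the list.

```shell
# Map an SVN path for core onto the git ref it would become in
# mediawiki/core.git. Assumes the phase3 layout listed above.
svn_path_to_ref() {
    case "$1" in
        /trunk/phase3)
            echo master ;;
        /branches/*/phase3|/tags/*/phase3)
            # Strip the trailing /phase3, then everything up to the last slash.
            ref=${1%/phase3}
            echo "${ref##*/}" ;;
        *)
            echo "unmapped path: $1" >&2
            return 1 ;;
    esac
}

# Optional rename for the very brave, e.g. REL1_17_0 -> rel1.17.0.
dotted_name() {
    printf '%s\n' "$1" | tr 'A-Z_' 'a-z.'
}
```

Extension paths would need their own patterns; anything the function does not recognise is reported on stderr rather than guessed at.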

Checkout layout

 * git clone mediawiki/core.git mediawiki
 * git clone mediawiki/extensions/FooBar.git mediawiki/extensions/FooBar
 * etc

This produces a ready-to-run layout with core in the mediawiki directory, and all the extensions individually checked out into the extensions directory.
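
A rough sketch of scripting that checkout follows. The function name clone_layout and the idea of passing a base location as the first argument are assumptions for illustration; the page does not say where the repos will actually be hosted.

```shell
# Clone core plus the named extensions into a ready-to-run tree.
# $1 = base URL or path holding the repos (hypothetical; use the real host)
# $2 = target directory for core
# remaining arguments = extension names
clone_layout() {
    base=$1
    target=$2
    shift 2
    git clone "$base/mediawiki/core.git" "$target" || return 1
    for ext in "$@"; do
        git clone "$base/mediawiki/extensions/$ext.git" \
            "$target/extensions/$ext" || return 1
    done
}
```

With a placeholder location this would be called as 'clone_layout /srv/git mediawiki FooBar QuuxBax'. It stops at the first failed clone so a half-built tree is noticed early.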

To make commits
To commit a core change, just do the usual 'git commit -a' etc from anywhere in the main repo (preferably the top directory, though it doesn't have to be).

The biggest difference from an SVN checkout here is that git will not automatically dive into the extension subdirectories and commit them too (which SVN does under limited circumstances).

So when committing extension updates, you will have to commit each changed repo individually:

cd extensions/LiquidThreads
git commit -a

This can be looped when doing batch operations, so doesn't require interactive pain:

for x in extensions/*; do
  (cd "$x" && git commit -a -m 'localization batch updates' && git push origin master)
done

This could fairly easily be scripted more solidly.
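
One possible shape for a more solid version is below. The function name is hypothetical; it assumes every extension pushes to 'origin master' as in the loop above, skips directories that are not git checkouts, and leaves repos with no changes to tracked files alone.

```shell
# Commit and push every extension repo that has uncommitted changes.
# $1 = commit message, $2 = extensions directory (defaults to ./extensions)
commit_changed_extensions() {
    msg=$1
    extdir=${2:-extensions}
    failed=0
    for repo in "$extdir"/*; do
        # Skip plain directories that are not git checkouts.
        [ -e "$repo/.git" ] || continue
        (
            cd "$repo" || exit 1
            # Empty output means no tracked file was modified, so there
            # is nothing for 'git commit -a' to pick up.
            [ -n "$(git status --porcelain --untracked-files=no)" ] || exit 0
            git commit -a -m "$msg" && git push origin master
        ) || { echo "failed: $repo" >&2; failed=1; }
    done
    return "$failed"
}
```

Unlike the one-liner, a failure in one repo is reported on stderr and the loop carries on with the rest; the return status says whether anything went wrong.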

To pull updates
Basically the same as making commits -- do a 'git pull origin master' or whatever in each extension's directory as well as in the core repo. This could be scripted.

One possible danger is losing track of what extensions are available -- an automatic checkout of 'everything' needs some sort of master list to work from.
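
As a sketch of how such a master list might be used: assume a plain text file with one extension name per line (the file and its format are hypothetical, nothing like it exists yet). A small function can then report which listed extensions have not been checked out.

```shell
# Print the extensions named in a master list that are not checked out yet.
# $1 = list file, one extension name per line; blank lines and lines
#      starting with # are ignored
# $2 = extensions directory (defaults to ./extensions)
missing_extensions() {
    list=$1
    extdir=${2:-extensions}
    while IFS= read -r name || [ -n "$name" ]; do
        case "$name" in
            ''|'#'*) continue ;;
        esac
        [ -d "$extdir/$name" ] || printf '%s\n' "$name"
    done < "$list"
}
```

The output could be fed straight into a clone loop, so an 'everything' checkout only fetches what is missing.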

To branch
Individual work repos may of course freely create a hojillion local branches to share work. That's the beauty of git!

Keeping version branches/tags as branches in the git masters probably makes sense; though there's also a separate 'tag' concept in git.

To make a local branch:
 * git checkout -b mybranchname

To push that branch upstream:
 * git push origin mybranchname

Note that mass branch-switching can be a little trickier, as you'll have to run it in each subdirectory (the usual scripting etc).
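
The usual scripting could be as small as one helper that runs the same git command in core and in every extension checkout. The name git_everywhere is invented for this sketch, and it assumes it is run from the top of the core checkout.

```shell
# Run one git command in core and then in each extension checkout.
# Run from the top of the core checkout; arguments go to git as-is.
git_everywhere() {
    status=0
    for repo in . extensions/*; do
        # Skip anything that is not a git checkout.
        [ -e "$repo/.git" ] || continue
        echo "== $repo"
        (cd "$repo" && git "$@") || status=1
    done
    return "$status"
}
```

For example, 'git_everywhere checkout REL1_17' switches everything to the release branch, and 'git_everywhere pull origin master' covers the update case too. Repos that lack the branch will fail individually and show up in the return status.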

vs git submodules
It may sometimes be useful to use git submodules -- this facility is similar to svn externals, and mostly is a way to (partially) automate the extra cloning & updating. Probably this would be very handy for a ready-to-run batch checkout.

In particular, submodules might be a good way to make those 'everything' checkouts easy: a composite repo pulls everything from core, and has submodule entries for all maintained extensions. A bot can automatically update the submodule references in this copy as master branches change in the extension repos, so 'git pull && git submodule init' should always update things. (Is the 'git submodule init' needed every time or just once?) (Just once, you `git submodule update` after that. Daniel Friesen (Dantman) 06:08, 5 October 2011 (UTC))

But to avoid confusing anything, let's not write it into our main repos yet.


 * git submodules also track commit ids, so every commit in a separate repo requires a new commit in the main repo just to update the commit ids, which isn't what we want with repos. When you take out commit id tracking git submodules are nothing more than a quick way to git clone / git pull in a loop. Rather than submodules I think we should instead ship with a few scripts that can make handling git repos en-masse easy. The scripts could let us mass clone and mass pull like submodules can. But we can also make it know how to fetch a list of extensions and mass clone everything missing. It can also be made to make mass git commits and pushes. Daniel Friesen (Dantman) 06:08, 5 October 2011 (UTC)


 * Yeah, keeping track of the commit ids makes 'git submodule update' hella fast if most modules don't actually need an update, but maintaining the submodule entries in the main repo would be a bit ugly. A wrapper tool that can do a fast batch-fetch of the head positions could be helpful here by letting us iterate 'git pull' over changed repos & skip unchanged ones without having to tie all the extensions into the main repo and make constant commits to it. --brion 21:34, 5 October 2011 (UTC)
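
A very rough sketch of the wrapper idea, for discussion only: compare each repo's local HEAD with the remote master using 'git ls-remote', and pull only where they differ. The function names are invented, and this still costs one network round trip per repo, so it is not the fast batch-fetch described above; that would probably need help from the server side.

```shell
# Succeed (exit 0) if origin's master differs from the local HEAD of the
# repo in directory $1, i.e. a pull is probably worthwhile.
needs_pull() (
    cd "$1" || exit 2
    # ls-remote prints "<sha><TAB><ref>"; keep only the sha.
    remote=$(git ls-remote origin refs/heads/master | cut -f1)
    local_head=$(git rev-parse HEAD)
    [ -n "$remote" ] && [ "$remote" != "$local_head" ]
)

# Pull only the extension repos whose remote master has moved.
pull_changed_extensions() {
    for repo in extensions/*; do
        [ -e "$repo/.git" ] || continue
        if needs_pull "$repo"; then
            (cd "$repo" && git pull origin master)
        fi
    done
}
```

This assumes each checkout is sitting on master with no unpushed local commits; a repo that is ahead of the remote would also count as 'changed', which is harmless but wasteful.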