Making Subversion faster

The Subversion protocol is extremely inefficient, especially in the number of network round trips needed for operations such as diff, merge and annotate. A fairly ordinary merge operation can take several minutes if you have a high-latency link, as you would if you lived in, say, Australia.

You can use ViewVC instead, which works for diffs and annotates, but it doesn't really work for merges.

Using svnsync
If you can afford 1.5 GB of disk space and you're going to be doing these operations regularly, you can make a local mirror of the repository. Here's how I set up my mirror of MediaWiki.

First, create a local repository:

sudo mkdir -p /var/svn/mediawiki
sudo chown tstarling /var/svn/mediawiki
svnadmin create /var/svn/mediawiki

Create hooks for start-commit and pre-revprop-change that do nothing, to make svnsync stop whinging about permissions. Windows users should create empty start-commit.cmd and pre-revprop-change.cmd files in the hooks directory.

cd /var/svn/mediawiki/hooks/
printf '#!/bin/bash\nexit 0\n' > start-commit
chmod 755 start-commit
cp start-commit pre-revprop-change
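If you end up mirroring more than one repository, the hook setup can be wrapped in a small function. This is only a sketch: `install_noop_hooks` is a name made up for this example, and it assumes the repository has already been created with svnadmin, so that its hooks directory exists.

```shell
# Sketch: install do-nothing start-commit and pre-revprop-change hooks
# into an existing repository. install_noop_hooks is a hypothetical helper,
# not part of Subversion.
install_noop_hooks() {
    local repo="$1" hook
    # Refuse to run on a path that does not look like a repository.
    if [ ! -d "$repo/hooks" ]; then
        echo "No hooks directory under $repo" >&2
        return 1
    fi
    for hook in start-commit pre-revprop-change; do
        # Each hook is a two-line script that always succeeds.
        printf '#!/bin/sh\nexit 0\n' > "$repo/hooks/$hook" || return 1
        chmod 755 "$repo/hooks/$hook" || return 1
    done
}
```

For the mirror described here, the call would be `install_noop_hooks /var/svn/mediawiki`.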

Configure svnsync:

svnsync init file:///var/svn/mediawiki svn+ssh://svn.wikimedia.org/svnroot/mediawiki

Do the initial sync. This takes a while.

svnsync sync file:///var/svn/mediawiki

Then whenever you want to update your mirror, run that command again. It starts from where it left off.
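If you update the mirror often, it may be handy to wrap the sync in a function that also reports how far the mirror moved. The following is a minimal sketch, not something svnsync provides: `update_mirror` and the `MIRROR_PATH` variable are names invented for this example, and it assumes `svnlook` is installed alongside `svnsync`.

```shell
# Sketch: refresh the local mirror and report the revision range fetched.
# MIRROR_PATH defaults to the repository path used on this page; both it
# and update_mirror are hypothetical names for this example.
MIRROR_PATH="${MIRROR_PATH:-/var/svn/mediawiki}"

update_mirror() {
    local before after
    # Youngest revision in the mirror before syncing.
    before=$(svnlook youngest "$MIRROR_PATH") || return 1
    # Same command as the initial sync; it resumes where it left off.
    svnsync sync "file://$MIRROR_PATH" || return 1
    # Youngest revision after syncing.
    after=$(svnlook youngest "$MIRROR_PATH") || return 1
    echo "Mirror updated: r$before -> r$after ($((after - before)) new revisions)"
}
```

Put it in your shell startup file and run `update_mirror` before a session of diffing or merging.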

The general idea is to never check out a copy of the local mirror. If you check it out, you might accidentally change it and then svnsync's whinging about hooks would have been justified. Instead, I just diff and merge with URLs.

But URLs are slow to type, so I use the following shell function:

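function mi {
    # Print the mirror URL for a working-copy path (default: current directory).
    local dir trailing
    if [ -z "$1" ]; then
        dir="."
    else
        dir="$1"
    fi
    dir=$(readlink -f "$dir")
    # Strip the working-copy prefix; if nothing was stripped, the path is outside it.
    trailing=${dir#/home/tstarling/src/mediawiki/}
    if [ "$dir" == "$trailing" ]; then
        echo "No mirror available for $dir" >&2
        return 1
    fi
    echo "file:///var/svn/mediawiki/$trailing"
    return 0
}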

So now instead of

cd includes
svn annotate Skin.php

You can type:

cd includes
svn annotate `mi Skin.php`

which is about a thousand times faster. For the current directory, omit the filename:

cd ..
svn diff -c 42767 `mi`
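The paths in mi are hard-coded for my setup. A variant that takes them from variables might look like the sketch below. The names `mi2`, `MI_WC_ROOT` and `MI_MIRROR_URL` are invented for this example, and it assumes, as mi does, that the directory layout under the working-copy root matches the layout of the repository. Unlike mi, it also maps the working-copy root itself to the root of the mirror.

```shell
# Sketch: configurable variant of mi. MI_WC_ROOT and MI_MIRROR_URL are
# hypothetical variable names; adjust the defaults to your own layout.
MI_WC_ROOT="${MI_WC_ROOT:-$HOME/src/mediawiki}"
MI_MIRROR_URL="${MI_MIRROR_URL:-file:///var/svn/mediawiki}"

mi2() {
    local path root rel
    # Resolve both paths so symlinks and relative paths compare correctly.
    path=$(readlink -f "${1:-.}") || return 1
    root=$(readlink -f "$MI_WC_ROOT") || return 1
    case "$path" in
        "$root")   rel="" ;;                    # the working-copy root itself
        "$root"/*) rel="${path#"$root"/}" ;;    # something below the root
        *)
            echo "No mirror available for $path" >&2
            return 1
            ;;
    esac
    # Append the relative path only when there is one.
    echo "${MI_MIRROR_URL}${rel:+/$rel}"
}
```

It is used the same way as mi, for example svn annotate `mi2 Skin.php`.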

Using SVK
Another option is SVK, a distributed version control system that works on top of Subversion. These instructions are for SVK 2.2.0 (the latest version at the time of writing):

svk mirror svn+ssh://svn.wikimedia.org/svnroot/mediawiki //mirror/mediawiki
svk sync //mirror/mediawiki
svk checkout //mirror/mediawiki/trunk/phase3

All history operations are now local. If you want to go offline:

svk branch --offline

While "offline", you use svk pull and svk push to sync with the master repository.

It would be possible to tremendously speed up mirroring once MediaWiki provides an svn dump of the repository. Then mirroring would be:

svk mirror --bootstrap=mediawiki-repo.svndump //mirror/mediawiki ...