Git/Conversion/translatewiki

translatewiki is a set of MediaWiki extensions hosted on http://www.translatewiki.net/ which lets volunteers translate MediaWiki core, extensions and third-party software. With the git migration, the existing set of scripts and tools needs to be freshened up. This migration is tracked in a bug report assigned to Antoine "hashar" Musso.

For context, the working directory is /home/betawiki, with scripts in /home/betawiki/bin/ and project checkouts in /home/betawiki/projects/.

The very first action was to initialize a local git repository in /home/betawiki/bin to make sure changes are tracked.

svn/git Workflow
As far as subversion is concerned, the rough workflows are:

For importing:
 * a list of i18n files is built
 * those files are fetched individually from the subversion repository using svn co --depth=empty

For exporting:
 * a script generates new i18n files based on the messages in the translatewiki.net instance
 * they are fetched onto a developer's local machine
 * the change is reviewed against trunk@head, amended if needed, and committed.

The git workflow is almost exactly the same. The main difference is that you must clone the whole repository and cannot fetch individual files, at least out of the box.

To adapt to git, two approaches have been taken:

 * 1) individual fetches
 * 2) git submodules

Method 1 : individual fetches
To mimic subversion's ability to fetch individual files, hashar wrote a perl script which fetches files from gitweb. It is an ugly hack, but it gets the job done.

The script is fetch-i18n-files.pl. It is written in perl, which is way easier than shell or php for that kind of task. Basic usage is to provide:
 * 1) a list of all potentially interesting i18n files (--list, defaults to the filename i18nfiles.list).
 * 2) a list of the extensions we are interested in, which is the list of extensions hosted on the WMF server (--wmf, defaults to wmf.list).

The script filters out i18n files which are not part of a WMF extension, fetches the remaining ones from the Gerrit gitweb interface and writes them to the import directory (override with --to).
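The filtering step can be sketched in a few lines of shell. This is only an illustration, not the logic of fetch-i18n-files.pl itself: the function names filter_i18n_files and gitweb_url are hypothetical, the list formats (one path or one extension name per line) are assumed, and the gitweb base URL and repository naming are guesses that would need checking against the real Gerrit setup.

```shell
# Hypothetical helper: print the i18n paths that belong to a listed extension.
# Assumed formats (not confirmed by the real script):
#   $1: one path per line, e.g. extensions/Foo/Foo.i18n.php
#   $2: one extension name per line, e.g. Foo
filter_i18n_files() {
    awk -F/ '
        # First file read: remember every non-empty extension name.
        NR == FNR { if ($0 != "") wanted[$0] = 1; next }
        # Second file read: keep paths whose second component is wanted.
        ($2 in wanted)
    ' "$2" "$1"
}

# Hypothetical helper: build a gitweb "blob_plain" URL for one file.
# The base URL and the repository naming scheme are assumptions.
gitweb_url() {
    ext=$1
    file=$2
    printf '%s?p=mediawiki/extensions/%s.git;a=blob_plain;f=%s;hb=HEAD\n' \
        "https://gerrit.wikimedia.org/r/gitweb" "$ext" "$file"
}
```

Calling filter_i18n_files i18nfiles.list wmf.list would print only the paths under extensions named in wmf.list; each of those could then be turned into a URL and downloaded.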

To make it easier, a shell script wrapper, update-mediawiki-ext-git, takes care of changing to the right path and setting the parameters passed to the script.

Basically:

$ cd /home/betawiki/bin
$ ./update-mediawiki-ext-git

Files are made available in /home/betawiki/projects/wmf-ext/extensions/

TODO: describe wmf-ext being a local git repository.

The key point of this method is that the repository hosted on translatewiki.net ends up tracking all i18n changes.

Method 2 : submodules
In this approach, we set up a local git repository in /home/betawiki/projects/wmf-all/ which could later be hosted on Gerrit (for example as mediawiki/i18n). We then add the mediawiki/core repository as a submodule, and fetch the list of available extensions from WMF Gerrit to add them all as submodules as well.

Creation:

$ mkdir -p /home/betawiki/projects/wmf-all
$ cd /home/betawiki/projects/wmf-all
$ git init

Then execute the shell script /home/betawiki/bin/update-mediawiki-submodules.

Whenever a new extension is added to WMF Gerrit, one can rerun the script, which will happily add the new one.
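How such a rerun can pick up only the new extensions is sketched below as a dry run. It is not the content of update-mediawiki-submodules: the function name plan_new_submodules is hypothetical, the list format (one extension name per line) is assumed, and so are the clone URL layout and the extensions/ path. The function only prints the git submodule add commands it would run.

```shell
# Hypothetical dry run: print a "git submodule add" command for every
# extension in a list that has no directory under extensions/ yet.
#   $1: file with one extension name per line (assumed format)
#   $2: top of the local repository, e.g. /home/betawiki/projects/wmf-all
plan_new_submodules() {
    list=$1
    repo=$2
    # Assumed clone URL prefix; adjust to the real Gerrit layout.
    base_url="https://gerrit.wikimedia.org/r/p/mediawiki/extensions"
    # The "|| [ -n ... ]" part also handles a last line without newline.
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue                    # skip blank lines
        [ -e "$repo/extensions/$name" ] && continue   # already present
        printf 'git submodule add %s/%s.git extensions/%s\n' \
            "$base_url" "$name" "$name"
    done < "$list"
}
```

Because existing directories are skipped, running it again after a new extension shows up in the list prints a single command for that extension.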

To update all submodules (core + extensions), use the git submodule foreach command, which takes any shell command as an argument. To fetch any new objects available in the default upstream repository:

$ git submodule foreach 'git fetch'

foreach stops as soon as the subcommand exits with a non-zero status. You can override that by appending '|| :', which means "or else run the no-op command :", so the overall exit status is always zero. To review all differences made to extensions use:

$ git submodule foreach 'git diff origin/master || :'
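The effect of '|| :' can be tried without git. The sketch below uses a hypothetical foreach_dir function as a stand-in for git submodule foreach: it runs one command in each directory and stops at the first failure, which is the behaviour described above. It is an imitation for illustration, not git's implementation.

```shell
# ':' is the shell's built-in no-op and always exits with status 0,
# so "cmd || :" succeeds whether or not cmd does.

# Hypothetical stand-in for "git submodule foreach": run one command in
# each given directory and stop at the first non-zero exit status.
foreach_dir() {
    cmd=$1
    shift
    for dir in "$@"; do
        echo "Entering '$dir'"
        ( cd "$dir" && sh -c "$cmd" ) || return 1
    done
}
```

With two directories where only the second contains a file named marker, foreach_dir 'test -f marker' a b stops after a, while foreach_dir 'test -f marker || :' a b visits both.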

We could probably restrict that to i18n files with "git diff origin/master *i18n* *alias*"
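A minimal sketch of that restriction follows. The function name diff_i18n is hypothetical. The patterns are quoted so that git, rather than the shell, matches them as pathspecs; left unquoted, the shell would expand *i18n* only against files in the current directory before git ever sees it.

```shell
# Hypothetical helper: show only i18n-related changes against a given ref.
#   $1: ref to compare with (defaults to origin/master)
diff_i18n() {
    ref=${1:-origin/master}
    # "--" separates the ref from the pathspecs; the quotes keep the
    # glob patterns intact so git applies them to the whole tree.
    git diff "$ref" -- '*i18n*' '*alias*'
}
```

Inside foreach the same idea would read: git submodule foreach 'git diff origin/master -- "*i18n*" "*alias*" || :'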

Then to update everything:

$ git submodule foreach 'git pull'

This merges the tracked remote branch (default origin/master) into the current branch (default master).

This method makes it a bit harder to review changes since there is no easy way to diff changes made to i18n files.

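Committing and pushing work the same way. The commit message belongs to git commit, since git push has no -m option:

$ git submodule foreach "git commit -a -m 'some commit message there' || :"
$ git submodule foreach 'git push'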

Exporting
Export changes using /home/betawiki/bin/bpmw-git, which takes a number of hours as its argument. The resulting tarball holds two directories: core holds the MediaWiki core language files and extensions holds the extensions. Using the file hierarchy from method 2, one just has to untar the exported file in the repository, then tweak and review the changes before committing.
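The untar step might look like the sketch below. The function name apply_export is hypothetical, and so are the tarball name and its compression, since the output format of bpmw-git is not described here. Plain tar -xf is used because GNU tar and bsdtar detect compression on their own.

```shell
# Hypothetical helper: unpack an export tarball on top of a checkout that
# has core/ and extensions/ at its top level, then list what it contained.
#   $1: path to the exported tarball (name and compression are assumptions)
#   $2: top of the repository, e.g. /home/betawiki/projects/wmf-all
apply_export() {
    tarball=$1
    repo=$2
    tar -xf "$tarball" -C "$repo" || return 1
    tar -tf "$tarball"    # print the paths that were written
}
```

After unpacking, git status and git diff inside core and each touched extension show what the export changed, which is the review step before committing.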

Mixing svn and git based extensions
The way translatewiki.net works, it needs to have all extensions in the same directory, whether they are hosted on git or subversion. Extensions are registered through the /trunk/translatewiki/MediaWiki/mediawiki-defines.txt file, which lists the extension name followed by some options.

Initial setup
Under method 2 above, we would have the following structure, with extensions being git submodules in ./extensions/:

.
├── core/
│   └── .git/
└── extensions/
    ├── .git/
    ├── git_extension_A/
    │   └── .git/
    └── git_extension_B/
        └── .git/

Since we want to check out extensions from subversion as well, we initialize extensions as a subversion working copy:

$ cd ./extensions/
$ svn co http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions --depth empty .
 U   .
Checked out revision 114372.
$

The new directory structure is thus:

.
├── core/
│   └── .git/
└── extensions/
    ├── .git/
    ├── .svn/   <- it has subversion support as well
    ├── git_extension_A/
    │   └── .git/
    └── git_extension_B/
        └── .git/

From there we can cherry-pick the extensions we want to use, i.e. the ones which have not been migrated to git, and check them out:

$ svn update <list of extension only in subversion>

Migrate from svn to git
When migrating an extension from subversion to git, the following needs to be done on translatewiki.

Take an extension named MyExtension. First make sure no local changes are left behind by running svn status. Then delete the local copy and update the local subversion working copy.

To delete the extension:

$ cd extensions
$ ls
MyExtension
$ rm -fR MyExtension
$ svn status
!       MyExtension
$

Subversion still knows about the MyExtension name because it is kept somewhere in the .svn directory. You will have to freshen your working copy:

$ cd extensions
$ svn up
D    MyExtension
Updated to revision 12345.
$ svn status
$

Once the extension has been migrated to Gerrit, it will be listed at https://gerrit.wikimedia.org/mediawiki-extensions.txt, which lets the git submodule update script find out about it. So just run update-mediawiki-submodules :-)

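You will probably want to set up a local .svnignore to ignore extensions in git and a .gitignore to ignore extensions that stay in subversion. That would avoid cluttering the output of svn/git status. When an extension migrates from svn to git, it is removed from .gitignore and added to .svnignore. Note that Subversion itself reads ignore patterns from the svn:ignore property rather than from a .svnignore file, so that list has to be applied to the property.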
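Both lists can be derived from the directory itself. The sketch below assumes that every git-managed extension carries its own .git entry, as in the tree shown earlier, and that everything else belongs to subversion. The function name write_ignore_lists and the output file names are hypothetical.

```shell
# Hypothetical helper: split the sub-directories of extensions/ into two
# ignore lists, depending on whether each one carries its own .git.
#   $1: the extensions directory
#   $2: output file listing git-managed extensions (for subversion to ignore)
#   $3: output file listing the remaining extensions (for git to ignore)
write_ignore_lists() {
    extdir=$1
    svn_ignore=$2
    git_ignore=$3
    : > "$svn_ignore"    # truncate both output files
    : > "$git_ignore"
    # The */ glob skips hidden directories such as .git and .svn.
    for path in "$extdir"/*/; do
        [ -d "$path" ] || continue    # glob matched nothing
        name=$(basename "$path")
        if [ -e "${path}.git" ]; then
            echo "$name" >> "$svn_ignore"
        else
            echo "$name" >> "$git_ignore"
        fi
    done
}
```

The subversion list could then be applied from inside extensions with svn propset svn:ignore -F followed by the list file and a dot for the current directory, and the git list copied into .gitignore.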

TODO

 * Create an i18n dedicated Gerrit account.
 * Change remote URLs on translatewiki. They currently use HTTPS and should use SSH once the account is created.
 * Figure out why svnignore does not work in /home/betawiki/projects/wmf-all/extensions/

Future
The translation team really needs a few more scripts to help them daily:
 * to show any key modifications made to the English language
 * verify messages.inc against MessagesEn.php
 * a FUZZ marker