Git/Conversion/pywikibot

From mediawiki.org

This page serves as a scratchpad for the pending move of Pywikipedia from SVN -> Git.

Current status

Gerrit, Gitblit and GitHub mirror SVN and are updated every hour at :55. [logs]


Schedule

14 June - initial conversion

26 July - complete switch to Gerrit

Note: changes to the conversion code that result in force pushes to Gerrit need to be transferred manually with full_repush, which updates Gerrit's references 1000 commits at a time. Larger batches make Gerrit puke.
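The batching idea in the note can be sketched roughly as follows. This is not the actual full_repush code: the remote name `gerrit`, the branch name and both function names are assumptions made for illustration, and the sketch assumes a linear history listed oldest first. It only builds the push commands and does not run them.

```python
def batch_tips(commits, batch_size=1000):
    """Return the last commit of each batch, given commits ordered oldest first."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    tips = [commits[i] for i in range(batch_size - 1, len(commits), batch_size)]
    # Make sure the final (possibly shorter) batch ends at the branch head.
    if commits and (not tips or tips[-1] != commits[-1]):
        tips.append(commits[-1])
    return tips


def push_commands(commits, remote="gerrit", branch="master", batch_size=1000):
    """Build one 'git push' command per batch; nothing is executed here."""
    ref = "refs/heads/" + branch
    # --force is needed because the rewritten history replaces what the remote
    # has; after the first batch the later pushes are plain fast-forwards.
    return [
        ["git", "push", "--force", remote, "%s:%s" % (sha, ref)]
        for sha in batch_tips(commits, batch_size)
    ]
```

The commit list could come from something like `git rev-list --reverse master`, and each command could then be run in order, for example with `subprocess.check_call`.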

Current SVN structure

/archive/Before_multifamily_changes/pywikipedia/
/archive/init3/pywikipedia/
/archive/init2/pywikipedia/
/archive/init1/pywikipedia/
/archive/trunk/
/archive/messages/
/archive/i18n/
/archive/old python 2.3 scripts/
/branches/rewrite/
/trunk/pywikiparser/
/trunk/pywikipedia/
/trunk/spelling/
/trunk/threadedhttp/

Proposed Git structure

  • pywikipedia
    • /trunk/pywikipedia/ -> master
    • /branches/rewrite/ -> rewrite
  • pywikipedia/pywikiparser
    • /trunk/pywikiparser/ -> master
There are some notes to be made here - we actually have five distinct repositories at the moment: trunk, rewrite, scripts, families and i18n. Scripts, families and i18n are shared (but not consistently) between trunk and rewrite. I would actually like to split it up further somewhat, but git subrepositories are a PITA, so that might not be the best way to go. In any case, I would like to have trunk and rewrite not as two branches, but as two distinct repositories. Valhallasw (talk) 20:29, 2 June 2013 (UTC)
There is another external named 'simplejson' that gets downloaded when I check out. Its repository isn't on the Wikimedia site but on googlecode.com. Ladsgroup (talk) 13:08, 10 June 2013 (UTC)

Proposed Git structure / valhallasw

  • pywikibot/core
    • /branches/rewrite -> master
    • EXCEPT for i18n
    • SUBMODULE: scripts/i18n -> pywikibot/i18n
  • pywikibot/compat
    • /trunk/pywikipedia -> master
    • SUBMODULE: i18n -> pywikibot/i18n
  • pywikibot/i18n
    • /branches/rewrite/scripts/i18n/ -> master
  • pywikibot/spelling
    • /trunk/spelling
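
To make the submodule part of this layout concrete, here is a small sketch that renders the `.gitmodules` content each superproject would carry. The base URL is a placeholder rather than the real Gerrit address, and `LAYOUT` and `render_gitmodules` are names made up for this example.

```python
BASE_URL = "https://gerrit.example.org/r"  # placeholder, not the real server

# Superproject -> {path inside the superproject: submodule repository}
LAYOUT = {
    "pywikibot/core": {"scripts/i18n": "pywikibot/i18n"},
    "pywikibot/compat": {"i18n": "pywikibot/i18n"},
}


def render_gitmodules(submodules, base_url=BASE_URL):
    """Return .gitmodules text with one section per submodule path."""
    sections = []
    for path, repo in sorted(submodules.items()):
        sections.append(
            '[submodule "%s"]\n\tpath = %s\n\turl = %s/%s.git\n'
            % (path, path, base_url, repo)
        )
    return "".join(sections)


if __name__ == "__main__":
    for project, submodules in sorted(LAYOUT.items()):
        print("# %s/.gitmodules" % project)
        print(render_gitmodules(submodules))
```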


Submodules

As far as I can see, svn2git does not have support for submodules at the moment, so I'm not 100% sure how we can implement the structure above. Ideas:

  1. implementing submodule support (incl. .gitmodules, the mode 160000 directories *and* updates for each upstream change)
  2. implementing partial submodule support (just .gitmodules files)
  3. having the submodules as directories for old commits, then adding a 'remove directory & add submodule' commit after the conversion
  4. no submodules or directories for old commits, then an add submodule commit after the conversion.

Valhallasw (talk) 18:01, 16 June 2013 (UTC)

Ok, I think we can do a couple of things here. For submodules of things that are also in Gerrit (e.g. i18n), Gerrit has support for auto-updating submodules. This will ease the maintenance burden for these repositories and make them behave more like SVN externals (this is what we do for the mediawiki/extensions meta repository, fwiw). For third-party projects that are also in Git elsewhere (e.g. GitHub), I think we can just use normal submodules. Yes, this requires updating manually when an upstream library changes, but this isn't difficult and is generally a good idea (it helps track down when upstream broke something). For upstream projects that aren't in Git (e.g. SVN or $somethingElse), I think we'll just have to copy the upstream code in manually when it updates. Hopefully this won't be much at all. ^demon[omg plz] 14:47, 17 June 2013 (UTC)


Submodule fast-export format:

blob
mark :3
data 86
[submodule "svn2git"]
        path = svn2git
        url = https://github.com/pywikibot/svn2git.git

commit refs/heads/master
mark :4
author Merlijn van Deen <valhallasw+lisilwen@gmail.com> 1371755491 +0200
committer Merlijn van Deen <valhallasw+lisilwen@gmail.com> 1371755491 +0200
data 11
+submodule
from :2
M 100644 :3 .gitmodules
M 160000 a3bc2923a139645ada307fd4b53d94dee74da0c1 svn2git
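
A stream of this shape could be generated with something like the sketch below, for instance for an "add submodule" commit after the conversion. The function names and parameters are made up for illustration; the one detail that is easy to get wrong is that each `data` count is a byte length, and it covers the trailing newline.

```python
def data_block(text):
    """Format a fast-import 'data' section; the count is in bytes."""
    return "data %d\n%s" % (len(text.encode("utf-8")), text)


def submodule_stream(path, url, sha, ident, timestamp, parent_mark):
    """Build a fast-import stream adding .gitmodules and a gitlink entry.

    path        -- submodule path inside the superproject
    url         -- clone URL of the submodule
    sha         -- submodule commit the gitlink (mode 160000) points at
    ident       -- 'Name <email>' used for both author and committer
    timestamp   -- '<unix seconds> <tz offset>', e.g. '1371755491 +0200'
    parent_mark -- mark number of the commit to build on
    """
    gitmodules = '[submodule "{0}"]\n\tpath = {0}\n\turl = {1}\n'.format(path, url)
    blob_mark, commit_mark = parent_mark + 1, parent_mark + 2
    parts = [
        "blob\n",
        "mark :%d\n" % blob_mark,
        data_block(gitmodules),
        "\n",  # optional LF after the blob data, kept for readability
        "commit refs/heads/master\n",
        "mark :%d\n" % commit_mark,
        "author %s %s\n" % (ident, timestamp),
        "committer %s %s\n" % (ident, timestamp),
        data_block("+submodule\n"),
        "from :%d\n" % parent_mark,
        "M 100644 :%d .gitmodules\n" % blob_mark,
        "M 160000 %s %s\n" % (sha, path),
    ]
    return "".join(parts)
```

Called with the path, URL, commit hash and parent mark from the sample above, this produces the same `data 86` and `data 11` counts, provided the `.gitmodules` lines are indented with a tab.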

Groups

Initial members

Current list: [1]

To Do

MUST be sorted BEFORE migration

OK after, but should be fixed

Lower priority

  • Automated unittests + stuff - would be nice if we had a way to run them on Windows too.
  • Gitblit r### links point to Special:CodeReview/MediaWiki, not pywikipedia.
  • ...?


Distribution

Currently, changes are distributed to end-users in two ways:

  • SVN updates
  • Nightlies

The nightlies can easily switch to a git-based system, but this is harder for SVN-based users. We have several options for them:

  • Block SVN and force them to switch to git: add a note to wikipedia.py 'we have moved from SVN to git -- please read <this> guide on how to switch!'. This is the method we used when we switched from CVS in 2007: [2]
  • Use a new SVN repository to (essentially) share nightlies. We can switch people by making /trunk/pywikipedia an svn:external to another repository, or by making people use svn switch (not sure if that works)
  • Something in between: an 'upgrade to git' script that (Windows) downloads a portable git or (Linux) uses the system git to download the code from Gerrit (could also be used as an installer for new users?)
  • Something in between v2: create an update script that downloads & unpacks the latest nightly, and tell people to use that
  • I wrote a crappy shell script [3] that clones the repo and i18n from GitHub, and can also update them. It should work fine for any Unix user (I only tested on OS X though); we would still need a Windows solution. Legoktm (talk) 05:10, 8 July 2013 (UTC)
  • ...?
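
The "download & unpack the latest nightly" option could look roughly like the sketch below, which uses only the Python standard library and so should behave the same on Windows and Unix. It assumes the nightly is published as a zip archive at a known URL; the URL in the usage line is a placeholder and the function names are made up.

```python
import io
import urllib.request
import zipfile


def fetch(url):
    """Download the archive into memory and return its bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def unpack(archive_bytes, target_dir):
    """Extract a zip archive into target_dir and return the member names."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        # Existing files with the same names are overwritten.
        archive.extractall(target_dir)
        return archive.namelist()


def update(url, target_dir):
    """Fetch the nightly from url and unpack it over target_dir."""
    return unpack(fetch(url), target_dir)
```

Usage would be along the lines of `update("https://example.org/nightly/pywikipedia.zip", "pywikipedia")`. Note that this overwrites files but never deletes ones that were removed upstream, so a real update script would need to handle that as well.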