Git/Conversion/pywikibot

This page serves as a scratchpad for the pending move of Pywikipedia from SVN -> Git.

Current status
gerrit/gitblit/github mirroring SVN; updated every hour at :55. [logs]


 * Conversion code available at https://github.com/pywikibot/svn2git
 * Project 'pywikibot' on Tools Labs runs run_sync in a crontab

Schedule
14 June - initial conversion

26 July - complete switch to gerrit

Note. Changes to the conversion code resulting in force pushes to gerrit need to be transferred manually with full_repush, which updates gerrit's references 1000 commits at a time. Large batches make gerrit puke.
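The batching described in the note can be sketched roughly as follows. This is not the actual full_repush code: the function names, the oldest-first commit list (for example from `git rev-list --reverse`) and the refspec layout are assumptions made for illustration only.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 1000  # batch size named in the note above


def batches(commits: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` commits, oldest first."""
    for start in range(0, len(commits), size):
        yield list(commits[start:start + size])


def push_commands(commits: Sequence[str], remote: str, ref: str,
                  size: int = BATCH_SIZE) -> List[List[str]]:
    """Build one `git push --force` command per batch (hypothetical helper).

    Each command moves `ref` on the remote to the last commit of a batch,
    so the remote never receives more than `size` new commits in one push.
    """
    return [
        ["git", "push", "--force", remote, f"{batch[-1]}:{ref}"]
        for batch in batches(commits, size)
    ]


if __name__ == "__main__":
    # Placeholder commit ids; a real run would use actual SHA-1 hashes.
    fake_commits = [f"c{i}" for i in range(2500)]
    for cmd in push_commands(fake_commits, "gerrit", "refs/heads/master"):
        print(" ".join(cmd))
```

Only the first push is a real force update; the later ones are fast-forwards, for which `--force` is harmless.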

Current SVN structure
 * /archive/Before_multifamily_changes/pywikipedia/
 * /archive/init3/pywikipedia/
 * /archive/init2/pywikipedia/
 * /archive/init1/pywikipedia/
 * /archive/trunk/
 * /archive/messages/
 * /archive/i18n/
 * /archive/old python 2.3 scripts/
 * /branches/rewrite/
 * /trunk/pywikiparser/
 * /trunk/pywikipedia/
 * /trunk/spelling/
 * /trunk/threadedhttp/

Proposed Git structure

 * pywikipedia
 * /trunk/pywikipedia/ -> master
 * /branches/rewrite/ -> rewrite
 * pywikipedia/pywikiparser
 * /trunk/pywikiparser/ -> master


 * There are some notes to be made here - we actually have five distinct repositories at the moment: trunk, rewrite, scripts, families and i18n. Scripts, families and i18n are shared (but not consistently) between trunk and rewrite. I would actually like to split it up further somewhat, but git subrepositories are a PITA, so that might not be the best way to go. In any case, I would like to have trunk and rewrite not as two branches, but as two distinct repositories. Valhallasw (talk) 20:29, 2 June 2013 (UTC)
 * There is another external named 'simplejson' that is downloaded when I check out. Its repository is not hosted on the Wikimedia site but on googlecode.com. Ladsgroup (talk) 13:08, 10 June 2013 (UTC)

Proposed Git structure / valhallasw

 * pywikibot/core
 * /branches/rewrite -> master
 * EXCEPT for i18n
 * SUBMODULE: scripts/i18n -> pywikibot/i18n


 * pywikibot/compat
 * /trunk/pywikipedia -> master
 * SUBMODULE: i18n -> pywikibot/i18n


 * pywikibot/i18n
 * /branches/rewrite/scripts/i18n/ -> master


 * pywikibot/spelling
 * /trunk/spelling

Submodules
As far as I can see, svn2git does not have support for submodules at the moment, so I'm not 100% sure how we can implement the structure above. Ideas:


 * 1) implementing submodule support (incl. .gitmodules, the mode 160000 directories *and* updates for each upstream change)
 * 2) implementing partial submodule support (just .gitmodules files)
 * 3) having the submodules as directories for old commits, then adding a 'remove directory & add submodule' commit after the conversion
 * 4) no submodules or directories for old commits, then an add submodule commit after the conversion.

Valhallasw (talk) 18:01, 16 June 2013 (UTC)
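Option 3 above (keep the directory in old commits, then swap it for a submodule in one commit after the conversion) might look roughly like the sketch below. It only builds the command lines and does not run them. The helper name is hypothetical and the URL in the usage part is a placeholder, not the real location of pywikibot/i18n.

```python
from typing import List


def replace_dir_with_submodule(path: str, url: str) -> List[List[str]]:
    """Return the git commands for a 'remove directory & add submodule' commit.

    Hypothetical helper: the commands are meant to be run in order inside the
    converted repository, after the conversion has finished.
    """
    return [
        # Drop the converted directory from the index and the working tree.
        ["git", "rm", "-r", "--quiet", path],
        # Register the submodule at the same path (this also writes .gitmodules).
        ["git", "submodule", "add", url, path],
        # Record both changes in a single commit.
        ["git", "commit", "-m", f"Replace {path} with a submodule"],
    ]


if __name__ == "__main__":
    # Placeholder URL; substitute the real repository address.
    for cmd in replace_dir_with_submodule("scripts/i18n",
                                          "https://example.org/pywikibot/i18n.git"):
        print(" ".join(cmd))
```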


 * Ok, I think we can do a couple of things here. For submodules of things that are also in Gerrit (eg: i18n), Gerrit has support for auto-updating submodules. This will ease the maintenance burden for these repositories and make them behave more like SVN externals (this is what we do for the mediawiki/extensions meta repository, fwiw). For third party projects that are also in Git elsewhere (eg: github), I think we can just use normal submodules. Yes this requires updating manually when an upstream library changes, but this isn't difficult and is generally a good idea (helps track down when upstream broke something). For upstream projects that aren't in Git (eg: SVN or $somethingElse), I think we'll just have to copy the upstream code in manually when it updates. Hopefully this won't be much at all. ^demon[omg plz] 14:47, 17 June 2013 (UTC)

Submodule fast-export format:

blob
mark :3
data 86
[submodule "svn2git"]
	path = svn2git
	url = https://github.com/pywikibot/svn2git.git

commit refs/heads/master
mark :4
author Merlijn van Deen  1371755491 +0200
committer Merlijn van Deen  1371755491 +0200
data 11
+submodule
from :2
M 100644 :3 .gitmodules
M 160000 a3bc2923a139645ada307fd4b53d94dee74da0c1 svn2git
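A minimal sketch of how the submodule part of such a stream could be generated, assuming the .gitmodules file uses tab-indented `path` and `url` lines (which matches the `data 86` byte count in the example above). The function names are made up for this illustration and are not part of svn2git.

```python
def gitmodules_blob(name: str, path: str, url: str, mark: int) -> bytes:
    """Emit a fast-import `blob` command holding a one-entry .gitmodules file."""
    content = (
        f'[submodule "{name}"]\n'
        f"\tpath = {path}\n"
        f"\turl = {url}\n"
    ).encode("utf-8")
    # `data` must state the exact byte length of the content that follows.
    header = f"blob\nmark :{mark}\ndata {len(content)}\n".encode("ascii")
    return header + content


def gitlink_line(commit_sha: str, path: str) -> bytes:
    """Emit the filemodify line that records a submodule (mode 160000)."""
    return f"M 160000 {commit_sha} {path}\n".encode("ascii")


if __name__ == "__main__":
    blob = gitmodules_blob("svn2git", "svn2git",
                           "https://github.com/pywikibot/svn2git.git", mark=3)
    print(blob.decode("utf-8"), end="")
    print(gitlink_line("a3bc2923a139645ada307fd4b53d94dee74da0c1",
                       "svn2git").decode("ascii"), end="")
```

Computing the length from the encoded bytes rather than from the string keeps the `data` count correct even when a path or URL contains non-ASCII characters.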

Groups

 * pywikipedia - self managing, allow them to add Gerrit users themselves
 * All users with commit access (Special:Code/pywikipedia/author) should probably get +2? Legoktm (talk) 17:06, 1 June 2013 (UTC)


 * Sounds reasonable as an initial list to populate the group with. Like I said, the group will be self-managing, so pywikipedia developers can add new users to it as they'd like. ^demon[omg plz] 12:16, 2 June 2013 (UTC)
 * As long as we adhere to a strict '+2 must come from someone other than the committer' rule, I'm fine with adding everyone. Otherwise, I'd rather see a smaller group, e.g. xqt, drtrigon and legoktm. The option 'everyone who has committed in 2013' might be even better. Valhallasw (talk) 20:29, 2 June 2013 (UTC)
 * Just adding people from 2013 sounds like a better idea. I don't know if we have enough active developers to be able to ensure that everything gets +2'd by someone else... Legoktm (talk) 20:22, 7 June 2013 (UTC)
 * 2013 sounds fine, that would end up making the following list (new section to ease formatting). ^demon[omg plz] 01:15, 8 June 2013 (UTC)

Initial members

 * alexsh - done
 * amir - done
 * My gerrit account is "ladsgroup". I would like to change it to amir, but it seems that is not possible. Ladsgroup (talk) 21:05, 28 June 2013 (UTC)
 * Added. Legoktm (talk) 21:08, 28 June 2013 (UTC)
 * binbot - done as binaris
 * btongminh - done
 * drtrigon - gerrit account "DrTrigon" --DrTrigon (talk) 14:21, 12 July 2013 (UTC)
 * done Ladsgroup (talk) 10:21, 13 July 2013 (UTC)
 * huji - done
 * jhsoby - no gerrit account?
 * I sent an e-mail asking them to create an account, but they are not interested. Ladsgroup (talk) 22:52, 18 July 2013 (UTC)
 * legoktm - done
 * malafaya - done
 * multichill - done
 * russblau - done
 * saper - done
 * siebrand - done
 * valhallasw - done
 * xqt - no gerrit account?
 * account created @ xqt 23:15, 24 July 2013 (UTC)
 * added Ladsgroup (talk) 11:05, 25 July 2013 (UTC)
 * yurik - done
 * l10n-bot -- This will be necessary for auto-committing of i18n updates

Current list:

MUST be sorted BEFORE migration

 * figure out submodule mess
 * gerrit-wm should report to #pywikipediabot ✅ with 70780 and confirmed working. Legoktm (talk) 20:43, 5 July 2013 (UTC)
 * gerrit patches (initial patchsets, maybe not each update, and actual commits) should be mailed to pywikipedia-svn
 * l10n-bot on gerrit
 * config is in gerrit, at https://phabricator.wikimedia.org/diffusion/GTWN/browse/master/bin/repocommit;849f362c24ac9aaebecb83870ed7eab5ffb80574 / https://phabricator.wikimedia.org/diffusion/GTWN/browse/master/bin/repocreate;849f362c24ac9aaebecb83870ed7eab5ffb80574 / https://phabricator.wikimedia.org/diffusion/GTWN/browse/master/bin/repoexport;849f362c24ac9aaebecb83870ed7eab5ffb80574 / https://phabricator.wikimedia.org/diffusion/GTWN/browse/master/bin/repoupdate;849f362c24ac9aaebecb83870ed7eab5ffb80574
 * bot needs self-merge rights -- added l10n-bot to pywikibot group, which has 'submit' rights on all pywikibot repositories. Valhallasw (talk) 11:09, 27 June 2013 (UTC)
 * automatic submodule update of core and compat
 * nightly generation (core, core w/ externals, compat w/ externals, spelling)
 * ✅ http://tools.wmflabs.org/pywikipedia/

OK after, but should be fixed

 * Create .gitreview files in all repos
 * Done for core and compat in 11726 and 11727. Legoktm (talk) 05:41, 8 July 2013 (UTC)
 * Github mirroring to pywikibot/pywikibot-core instead of wikimedia/pywikibot-core
 * Github pull requests
 * We can use User:Yuvipanda/G2G, would need to ask Yuvi to enable it for pywikibot/* repos. Legoktm (talk) 02:20, 27 June 2013 (UTC)
 * I broke gitblit while pushing a new conversion: https://phabricator.wikimedia.org/diffusion/PWBO/
 * ^demon can force a re-import. Best to do this when we are done with force pushing.

Lower priority

 * Automated unittests + stuff - would be nice if we had a way to run them on Windows too.
 * hashar started setting this up, see 50302 and 70845
 * pep8 and pyflakes linters are set up to run on every patchset.
 * Also nice would be https://github.com/liamcurry/py3kwarn
 * 50344 is for running  on patchsets
 * Gitblit r### links point to Special:CodeReview/MediaWiki, not pywikipedia.

Distribution
Currently, changes are distributed to end-users in two ways:
 * SVN updates
 * Nightlies

The nightlies can easily switch to a git-based system, but this is less the case for SVN-based users. There are several options we have for this:


 * Block SVN and force them to switch to git: add a note to wikipedia.py 'we have moved from SVN to git -- please read guide on how to switch!'. This is the method we used when we switched from CVS in 2007:
 * I'm going with this. I've explained how users can use SVN through github, and I think that's good enough. Legoktm (talk) 04:29, 22 July 2013 (UTC)
 * Use a new SVN repository to (essentially) share nightlies. We can switch people by making /trunk/pywikipedia an svn:external to another repository, or by making people use svn switch (not sure if that works)
 * Something in between: an 'upgrade to git' script that (windows) downloads a portable git or (linux) uses the system git to download the code from gerrit (could also be used as installer for new users?)
 * Something in between v2: create an update script that downloads & unpacks the latest nightly, and tell people to use that
 * I wrote a crappy shell script that clones the repo from github and the i18n, and can also update it. It should work fine for any unix user (I only tested on OSX though), we would still need a windows solution though. Legoktm (talk) 05:10, 8 July 2013 (UTC)
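A cross-platform variant of the clone-or-update idea could be sketched in Python, which every Pywikipedia user already has installed. This is not the shell script mentioned above. The function name, the target directories and the URLs in the usage part are placeholders chosen for illustration, and the sketch assumes a git new enough to support the `-C` option.

```python
import os
from typing import List


def sync_command(url: str, target: str) -> List[str]:
    """Return the git command that brings `target` up to date (hypothetical helper).

    Clone if there is no checkout yet, otherwise fast-forward the existing one.
    """
    if os.path.exists(os.path.join(target, ".git")):
        # Existing checkout: update it without creating merge commits.
        return ["git", "-C", target, "pull", "--ff-only"]
    # No checkout yet: create one.
    return ["git", "clone", url, target]


if __name__ == "__main__":
    # Placeholder URLs and directory names; substitute the real repositories.
    repos = {
        "pywikibot": "https://example.org/pywikibot/compat.git",
        os.path.join("pywikibot", "i18n"): "https://example.org/pywikibot/i18n.git",
    }
    for target, url in repos.items():
        # A real script would pass each command to subprocess.check_call().
        print(" ".join(sync_command(url, target)))
```

The sketch only prints the commands, so it can be tried without touching any repository.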