Git/Conversion
| Group: | Platform |
| Start: | 2011-10-01 |
| End: | 2013-07-30 |
| Management: | Chad Horohoe |
| Team: | Antoine Musso, Sumana Harihareswara |
This page discusses efforts to convert away from our current Subversion repository to Git. MediaWiki core and extensions used on Wikimedia sites have now switched to Git, but moving additional affected projects, and improving our new development infrastructure with tools integrating Gerrit and Git into our workflow, will continue to take engineering time until we completely switch our Subversion repository to read-only in the summer of 2013.
Status
- 2012-04-monthly: MediaWiki core and all extensions deployed on Wikimedia sites are getting into regular development cycles. The second group of extensions was migrated on April 27. We've also branched and released 1.20wmf1 to most Wikimedia sites. The priorities for Gerrit are currently fixing our UTF-8 problem, improving Gerrit integration with IRC and e-mail, making project/user information more discoverable, and upgrading Gerrit to 2.3 in the next week or so.
Rationale
Our current Subversion-based version control system has served us well, but we need one better suited to our development effort. Our community is highly distributed, runs many efforts in parallel, and needs to integrate many different feature branches. After long consideration, we've decided to move from Subversion to Git.
Some advantages of Git:
- "I love git just because it allows me to commit locally (and offline)." - Guillaume Paumier
- "[Y]ou can create commits locally and push them to the server later (great for working without wifi), you can tell it 'save my work so I can go do something else now' in one command, and it'll allow us to review changes before they go into "trunk" (master).... without human intervention in merging things into trunk. Gerrit automates this process." - Roan Kattouw
Affected development projects
MediaWiki core (/trunk/phase3/) and MediaWiki extensions that WMF deploys moved to Git in March 2012. Afterwards, any other extensions, tools, or projects that wish to move can do so. These might include operations, fundraising, pywikipediabot, etc.
We will leave some codebases in Subversion and not bother migrating them, because those extensions or tools have been abandoned. Some developers will choose to move their projects to GitHub or some other Git hosting site. We will also leave svn.wikimedia.org up for at least several years; the subdirectories holding projects that have moved to Git will be read-only.
The Git conversion team will publicize any changeover date with at least 2 weeks' notice. As of right now (1 Feb 2012) there are no specific cutover dates set.
Chad would like to gradually migrate all projects currently in Wikimedia's Subversion repository so that he can make all of svn.wikimedia.org read-only by the middle of 2013. Project owners can request a repository via Git/New repositories, and can move either to WMF's Git repo or to another host; Chad can help them decide and migrate.
- MediaWiki extensions (not used by WMF)
- cf. Git/Conversion/Extensions queue
- Starting in March or April 2012, Chad will move alphabetically through all extensions (that are not deployed on Wikimedia Foundation sites) and offer each of them choices as to when and whether to shift.
- Pywikipediabot
- Wikimedia Foundation fundraising
- Fundraising extensions (including DonationInterface, FundraiserLandingPage, ContributionReporting, et al) can move along with core
- Fundraising stuff in the Wikimedia repository can move on a timeline TBD
- Fundraising migration is under discussion.
- Wikimedia Foundation operations
- Ops is pretty much aware of this since they've already started the git move. Happening piecemeal by them as they're ready.
- Status: In progress
- Ariel Glenn's dumps infrastructure
- Just two paths, /trunk/backups and /branches/ariel/. Should be pretty trivial, history's not complicated.
- Can convert: as soon as we're ready, just give Ariel a day's notice or so.
- Status: Done Moved to operations/dumps.git on 15-Feb-2012. svn made r/o.
- Wikimedia Foundation data mining and analytics, including Community Department
- Status: unknown
- Toolserver internationalisation
- In active use but maintainers don't have time right now to deal with migration. Another time.
- Daniel Kinzler's WikiWord project
- Per IRC: No rush, will move casually after main migration. Not under active development right now.
- mwdumper
- Not being actively developed right now. Can move this whenever.
- Status: Done Has been moved to mediawiki/tools/mwdumper.git on 15-Feb-2012. svn made r/o.
- WM planet configuration
- Wikimedia Mobile
- Currently being done on GitHub. Moving will be easy; we just have to talk to the mobile team about adjusting their workflow.
- Continuous integration, for example TestSwarm
- Not yet migrated to Git
- Status: Done All the testswarm/jenkins stuff is ongoing in git. Nothing from SVN is being used anymore (still maybe need to make paths r/o?)
Timeline
December 2011
- Preliminary test conversions early in month. Done, December 5ish
- Git workflow architecture review. Done, December 19
- Agree on implementation strategies regarding remaining development process questions, e.g. how to handle multi-repo commits. Done as of 1 Feb
January 2012
- CI tests and linting get run when a developer chooses to push to the stage between their branch and the mainline branch (see Ideal Workflow Document and Git-review). Still in progress as of 9 March; Antoine is working on this. Right now, this only happens on master.
February 2012
- Finish code review on trunk (progress at Code Review stats and MediaWiki 1.19/Revision report). Code review backlog went back up at the end of the month.
- Cut 1.19 release branch. Done, branched at r110996 on 9 February 2012
- Ask people to stop creating new extensions in Subversion now. Done, 11 Feb
- Communicate to larger community about migration
- techblog post. Done, 15 February 2012
- Make Gerrit behave like we want it to: TODO [in progress, see March]
- Training documentation and interactive training
- Update Git/Workflow. Done
- Video training. Set up a test repo, tell 3 people to submit patches simultaneously, walk them through that, then walk them through reviewing it. Done
March 2012
(Chad Horohoe unavailable March 10th-19th; Antoine Musso is his backup.)
- Finish up specific Git management scripts / changes
- to support WMF workflow. Done
- i18n updates (Bug 34137). Done. Nearly complete as of March 30th: l10n updates can be pushed from translatewiki, finishing the auto-approval process.
- Automate the translators' process as much as possible. It currently takes Raymond about 20 minutes a day, which is a huge time sink. Chad to follow up. This has to be ready before ANY migration to Git starts.
- MediaWiki 1.19RC1 release from Subversion. [Week of 5 March] Done
- 2 weeks before migration of MediaWiki core, start communicating about cutover date -- give date & links to all the documentation with the 3 most frequently asked questions [Week of 5 March]
- wikitech-l, mediawiki-l [ongoing]
- add !gitconversion to mw-bot. Done
- Keep Extension:ExtensionDistributor working. See bugzilla:27812. [In progress - Sam]
- MediaWiki 1.19.0 release from Git. [Week of 12 March]
- Mass-create Gerrit accounts from the SVN users whose USERINFO has an email address in it, and tell them via wikitech-l that they should just go to the password reset page on labsconsole to start logging in. [Ryan, March 12th] Done
- Make Gerrit behave like we want it to: see the TODO at labsconsole:Gerrit bugs that matter (better TODO to come)
- Documentation and training
- Chad to update the code review guides, with help from Guillaume, before the migration of core. This will include:
- docs on how current SVN committers can link their LDAP accounts, get passwords put on them, and thus get gerrit accounts (currently on labs, need improvement)
- Create a page where people can request Gerrit accounts [Sumana, March 13th]. Done: labsconsole:Help talk:Access
- procedure for adding and removing people from Gerrit project owner groups, including for WMF deployment branch and WMF master. Done: When/how we'll add, remove people from Gerrit project owner groups; Git/Gerrit project ownership
- Git migration -- core & extensions [scheduled for Wednesday, March 21]. Done
- do deauth of SVN as a pre-commit hook to output an informative error message in case someone tries to commit to MW core -- "Subversion is dead, we have moved to git, read Git/Conversion"
- pre-commit hooks are in puppet, when someone does this (probably Chad)
- Right now it's a hard de-auth in authz
- Get native packages on Windows and Mac for git-review since it's so much easier and better than the manual commit messages hook process. See bugzilla:35145. Not done
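The pre-commit rejection described above could look roughly like the sketch below. It illustrates the idea and is not the hook that was actually deployed: the protected prefix list and the helper names are assumptions made for this example, and only the error text and the /trunk/phase3/ path come from this page. Subversion invokes a pre-commit hook with the repository path and the transaction name, and relays whatever the hook writes to stderr back to the committer when it exits non-zero.

```python
import subprocess
import sys

# Hypothetical list of SVN prefixes that have moved to Git.
MIGRATED_PREFIXES = ("trunk/phase3/",)

MESSAGE = "Subversion is dead, we have moved to git, read Git/Conversion"


def changed_paths(svnlook_output):
    """Extract paths from `svnlook changed` output (status columns, then path)."""
    return [line[4:].strip() for line in svnlook_output.splitlines() if line.strip()]


def rejected_paths(paths, prefixes=MIGRATED_PREFIXES):
    """Return the paths that fall under a migrated (now Git-hosted) prefix."""
    return [p for p in paths if p.startswith(prefixes)]


def main(argv):
    """Reject the transaction if it touches any migrated path."""
    repos, txn = argv[1], argv[2]
    out = subprocess.run(
        ["svnlook", "changed", "-t", txn, repos],
        check=True, capture_output=True, text=True,
    ).stdout
    if rejected_paths(changed_paths(out)):
        sys.stderr.write(MESSAGE + "\n")
        return 1  # non-zero exit aborts the commit
    return 0


# Only run as a hook when called with the two arguments Subversion passes.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

Compared with the hard de-auth in authz, a hook like this tells the committer where development has moved instead of returning a bare permission error.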
April 2012
- Git migration -- ANYTHING ELSE
- Make paths read-only on case-by-case basis.
- Ongoing, slow process.
- change links on mediawiki.org
- Developer Hub
- How to become a MediaWiki hacker. Done
- Commit access. Done
- Code review guide
- Download from SVN
- templates for extensions (which ones?)
- Writing an extension for deployment. Done
- Subversion
- Links to svn.wikimedia.org -- Not super high priority, none of this is disappearing anytime soon.
- and more!
- and moar!!
- and MOAR, interwiki links
- m:Interwiki map doesn't seem to have a usable way to link stuff on git/gerrit
- There's now gerrit: as an interwiki link, also git:. Might also want more to gitweb.
- More documentation updating. Ask Guillaume for help.
- Code review page is a high priority.
- get out-of-date git template & stick it on every page that mentions Subversion, including Subversion
- Possibly 1.20wmf1 (first mini deployment untethered to release schedule, first of many...1.20wmf2, 1.20wmf3, etc) from git. Done
- Move towards git-based development and release process. Done
- First deployment from git mainline development branch. Done
- Move towards continuous integration via git, goalpost: weekly deployment
- Jenkins (Testswarm/PHPUnit tests) on git branches
- bugzilla:34141
- MediaWiki 1.20, first deployment and release from git mainline development branch [Targeted for October currently]
- Unsorted "blockers"
June - July 2012
Evaluation of Git/Gerrit workflow and consideration of alternative tools/workflows.
Goal: Determine whether we need to undertake a significant round of Git/Gerrit improvements, or whether there are major deficiencies in the Git/Gerrit workflow that would justify switching to a different review tool/process.
Alternative open source review tools:
- ReviewBoard (example use by Khan Academy)
- Phabricator (developed initially by Facebook and now community-maintained, example use by Phabricator project ; see also Phabricator)
- Gitorious (developed by Gitorious AS and used to host Gitorious.org; has cross-repo merge request functionality similar to GitHub's pull request feature; example merge request list for StatusNet).
- GitLab (developed by ???, positions itself as open source alternative to GitHub and seems to target a similar feature set -- a bit more active than Gitorious, but does not seem to self-host yet; code on GitHub. Ruby on Rails with a dependency on Gitolite, which is written in Perl :-)
Things about Gerrit we like:
- We've got core unit tests running prior to any human review
- We have (imperfect) e-mail notification about changes
- We can push changes to a specific reviewer
- Patch sets allow gradual improvement of a change prior to merge
- Frequent releases of Gerrit itself and active dev community
Things about Gerrit we dislike:
Workflows to consider:
- Short-lived bug/feature branches (e.g. "bug/36987")
- Long-lived feature branches (e.g. Wikidata)
- Well-maintained, regularly pushed extensions (e.g. MobileFrontend)
- Cross-project maintenance and development (e.g. i18n/l10n updates and fixes)
- Security-sensitive or access-restricted code (e.g. ops production changes)
- Complex third party extension development without significant WMF implications (e.g. Semantic MediaWiki)
Split up and convert repositories
A naïve git-svn conversion of the entire repository (with branches) weighs in at around 7.8GB (November 2011). It makes no sense to create a single Git MediaWiki repository of that size, so it should be split up.
In Subversion, everything gets squashed into one giant repository. In Git, repositories are split at the boundaries that code does not cross.
Splitting
We have a test repository up, but in February 2012 will redo the split to create a permanent git repo.
- MediaWiki will go in mediawiki/core.git
- Extensions will go in mediawiki/extensions/foo.git
- There will be an extension "meta repository" at mediawiki/extensions.git which will contain all extensions as submodules.
- Other things across SVN need to find new homes in Git
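The naming scheme above can be sketched as a small path-mapping function. This is an illustration only: the function name is hypothetical, and the assumption that extensions live under /trunk/extensions/<Name>/ in Subversion is not stated on this page. Only the /trunk/phase3/ location of core and the target repository names come from the text.

```python
def git_repo_for(svn_path):
    """Map an SVN path to its Git repository name, or None if no rule applies."""
    parts = [p for p in svn_path.split("/") if p]
    # MediaWiki core lives in /trunk/phase3/ and goes to mediawiki/core.git.
    if parts[:2] == ["trunk", "phase3"]:
        return "mediawiki/core.git"
    # Assumption: extensions sit under trunk/extensions/<Name>/ in Subversion.
    if parts[:2] == ["trunk", "extensions"] and len(parts) >= 3:
        return "mediawiki/extensions/%s.git" % parts[2]
    # Everything else still needs a new home in Git.
    return None
```

For example, `git_repo_for("/trunk/extensions/Foo/Foo.php")` yields `mediawiki/extensions/Foo.git`, while a path such as `/trunk/backups/` falls through to `None` and has to be placed by hand.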
Converting
Done: Every commit needs to be rewritten to give name/email pairs to SVN users. We are using username@users.mediawiki.org as a unified e-mail address scheme for all old commits.
- Only for those without a known mail address, or for all?
- What about username@svn.wikimedia.org instead?
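One way to produce those name/email pairs is an authors file of the kind `git svn` accepts through its `--authors-file` option, with one `loginname = Full Name <email>` entry per line. The sketch below generates such lines using the username@users.mediawiki.org scheme. Whether the actual conversion used an authors file is an assumption, and the function name, the sample usernames, and the fallback of using the login as the display name are hypothetical.

```python
def authors_file_lines(usernames, real_names=None, domain="users.mediawiki.org"):
    """Build git-svn authors-file lines: 'login = Name <login@domain>'."""
    real_names = real_names or {}
    lines = []
    for user in sorted(usernames):
        # Fall back to the SVN login when no real name is known (assumption).
        name = real_names.get(user, user)
        lines.append("%s = %s <%s@%s>" % (user, name, user, domain))
    return lines


if __name__ == "__main__":
    # Hypothetical committers, for illustration only.
    for line in authors_file_lines(["bob", "alice"], {"alice": "Alice Example"}):
        print(line)
```

Swapping the `domain` argument is all it would take to try the username@svn.wikimedia.org alternative raised above.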
Unscheduled items
Other cool things to do (not blockers):
Not done Convert the Bugzilla code to recognize the new SHA-1 commits. Come up with a shorthand to autolink from BZ to gerrit changeset
- Would Mark H. mind looking into this, since Bugzilla is his baby?
- Filed as bugzilla:35144. Workaround: For now, when referring to a Git diff, please paste changeset "Change ID"s, or the changeset number in the Gerrit changeset URL, into the BZ comment. Both are globally unique.
Not done Create database of SVN revision ids -> Git SHA-1's for useful lookups.
- Info is included in Git commits, just need to make a DB mapping of them. Good weekend project after the conversion is complete if someone is feeling bored.
- Coren may work on this.
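A minimal sketch of building that lookup follows. It assumes the SVN revision is recorded in each commit message as a `git-svn-id: <url>@<revision> <uuid>` line, which is what git-svn writes by default; this page only says the info "is included in Git commits", so the exact form is an assumption. The function name and the sample data are hypothetical.

```python
import re

# Matches the trailer line git-svn appends, capturing the SVN revision number.
GIT_SVN_ID = re.compile(r"^git-svn-id: \S+@(\d+) \S+$", re.MULTILINE)


def svn_to_git_map(commits):
    """Build {svn_revision: git_sha1} from (sha1, commit_message) pairs."""
    mapping = {}
    for sha, message in commits:
        match = GIT_SVN_ID.search(message)
        if match:
            mapping[int(match.group(1))] = sha
    return mapping


if __name__ == "__main__":
    # Hypothetical commit; a real run would read pairs from `git log`.
    sample = [(
        "a" * 40,
        "Fix bug\n\ngit-svn-id: https://svn.example.org/repo/trunk@110996 "
        "00000000-0000-0000-0000-000000000000\n",
    )]
    print(svn_to_git_map(sample))
```

The resulting dictionary could then be loaded into whatever database backs the lookup; commits that never existed in SVN simply have no entry.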
Not done Rusty and Roan's effort to turn every Bugzilla patch into a git pull request.
Not done Change MediaWiki CR to read-only, with no new comments or status changes. This is for the very long run: how do we write redirects for viewvc -> gitweb? But that can wait till like 2013.
Not done Enforce commit message format http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
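As a rough idea of what enforcement could check, the sketch below tests a message against the conventions in the linked note: a summary line of about 50 characters or fewer, a blank second line, and body lines wrapped at about 72 characters. Where such a check would run (a local hook, Gerrit, or CI) is not decided here, and the function name and the exact limits used as defaults are assumptions for illustration.

```python
def commit_message_problems(message, subject_max=50, body_max=72):
    """Return a list of format problems found in a commit message."""
    lines = message.rstrip("\n").split("\n")
    problems = []
    if not lines[0].strip():
        problems.append("empty subject line")
    elif len(lines[0]) > subject_max:
        problems.append("subject longer than %d characters" % subject_max)
    # The subject must be separated from the body by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line is not blank")
    # Body lines start at line 3 of the message.
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > body_max:
            problems.append("line %d longer than %d characters" % (number, body_max))
    return problems
```

An empty list means the message passes; a hook could print the returned problems and refuse the commit otherwise.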
Done:
Done Change the configs of automatic bots -- when I commit, where will it spit out?
- Isn't this done? gerrit-wm now posts to #mediawiki by default.
Wontfix:
Not done Bump gerrit ids to some big number (eg. 200000) greater than the largest svn revision (so that they could be treated as 'logical' revisions, without conflicting the svn ones).
- Not easy -- we've already started using gerrit in a production setting and we've got over 2k changesets.
- We can live with those 2k changesets having the same numbers as MW ones, since they are in different repos. For the actual change, it seems as easy as running ALTER TABLE tbl_name AUTO_INCREMENT = N.
- WONTFIX
Ideal state
This is what we'd love to see: [Figure: ideal state diagram]
History
People
Working on the conversion
- User:^demon
- Antoine Musso
- Roan Kattouw
Documents
- User requirements:
- Specifications: labsconsole:Gerrit bugs that matter
- Software design document:
- Test plan:
- Documentation plan:
- User interface design docs: labsconsole:Gerrit bugs that matter
- Schedule: see Timeline above
- Task management: Task list from Bugzilla (bug 22596)
- Release management plan:
- Communications plan:
- Status updates
Communications
- draft plan
- announcement of test repository
- "git boot camp" from October 2011 NOLA hackathon etherpad:GitBootcamp
- https://blog.wikimedia.org/2012/02/15/wikimedia-engineering-moving-from-subversion-to-git/
- Git, Gerrit, and You! or, Gerrit training available starting Monday 27 February
- Postponing Git migration until March 21
- When/how we'll add, remove people from Gerrit project owner groups
- Git migration: documentation and short-term considerations
- MediaWiki core deployments starting in April, and how that might work