Interwiki cache/Setup for your own wiki

Setting up interwiki links on your own wiki
Since MediaWiki release 1.19, the Wikimedia projects do not use the interwiki table but rely instead on a cdb file which contains information about the way links to external projects work. This means that you can't just download the interwiki.sql.gz file from download.wikimedia.org for a given wiki and import it into your database to make interwiki links work.

For impatient readers
If you want links to external projects from your own wiki to work like they do on Wikimedia projects, download and run the script. It will retrieve the interwiki cdb file in use on the Wikimedia projects and update it for use with your wiki. It's alpha code, beware.

What's in the interwiki cdb file
A cdb file is a flat file database format containing key/value pairs.
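To make "flat file of key/value pairs" concrete, here is a minimal Python sketch that walks the records of a cdb file without doing any hash lookups. It assumes the classic cdb layout: a 2048-byte header of 256 little-endian (position, length) pairs, followed by records stored as key length, value length, key, value. The function name `iter_cdb_records` is made up for illustration.

```python
import struct

HEADER_SIZE = 2048  # 256 hash-table slots * (4-byte position + 4-byte length)


def iter_cdb_records(data: bytes):
    """Yield (key, value) byte pairs from the raw contents of a cdb file.

    Assumes the classic cdb layout: records start right after the header
    and end where the first hash table begins.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to be a cdb file")
    # The first header slot holds the position of the first hash table,
    # which is also where the record area ends.
    (end_of_records,) = struct.unpack_from("<I", data, 0)
    pos = HEADER_SIZE
    while pos < end_of_records:
        klen, dlen = struct.unpack_from("<II", data, pos)
        pos += 8
        key = data[pos:pos + klen]
        pos += klen
        value = data[pos:pos + dlen]
        pos += dlen
        yield key, value
```

To try it, read the file in binary mode (for example `open("interwiki.cdb", "rb").read()`, where the file name is only an example) and loop over the pairs the function yields.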

The interwiki cdb file has the following types of key/value pairs in order to handle various sorts of links:

 * __global:wikiabbrev
 * Some of these are used for 'absolute' interwiki links, where the wiki is available in only one language and site type, and the abbreviation points to the same web site every time. You can add as many arbitrary external sites as you like by adding entries like these to the cdb file.
 * Example entry from Wikimedia:

key   __global:devmo value 0 https://developer.mozilla.org/en/docs/$1
 * The rest are used for interwiki links where the wiki is available in multiple languages. We choose one language as the reference point (usually en) so that interwiki links of the form wikt:el from fr.wikipedia (for example) will work by bouncing the user first to en.wiktionary.org and from there to el.wiktionary.org. (CHECK ME is that really how the link forwarding works?)
 * If you have new multilingual site types, you should add a corresponding entry here. If there is some other language that each site type is guaranteed to have, rather than English, you need to change the url appropriately. And in any case, if you want the links to point to a wikifarm on your domain, you'll need to update the urls accordingly.
 * Example entry from Wikimedia:

key   __global:wiktionary value 1 //en.wiktionary.org/wiki/$1
 * _sitetype:langcode
 * These map site type and language code (or other prefix) to the corresponding url. If you add new languages or site types you'll want to add entries here.
 * Example entry from Wikimedia:

key   _wiktionary:aa value 1 http://aa.wiktionary.org/wiki/$1
 * __sites:fullprojectname
 * In these entries, 'fullprojectname' means the name of the wiki database, typically the langcode concatenated with the site type. These are used to map wiki database names to site types. If you add a new site you will need to have an entry here, mapping the wiki db name to the site type. Good defaults are wiki ('wikipedia' type), or wikimedia.
 * Example entries from Wikimedia:

key   __sites:guwikibooks value wikibooks
key   __sites:wikimaniateamwiki value wiki
 * Note that *all* Wikimedia project wiki db names end in the site type, with one exception, wikidata, and you don't want to look too closely at that. If you don't follow this model, other things may be more annoying for you and require workarounds.
 * fullprojectname:iwabbrev
 * These are used to map an abbreviation on a given project to a different site type in the same language. So for example the abbreviation 'q' when given on el.wikipedia should lead to el.wikiquote, which would be achieved by an appropriate entry here. If you add new site types you'll need entries here, and if you have a new fullprojectname, you need entries here for each known abbreviation. As of this writing, known Wikimedia abbreviations are w, wikt, q, b, n, s, v, chapter, voy.
 * Example entries from Wikimedia:

key   aawiki:n value 1 http://aa.wikinews.org/wiki/$1
key   liquidthreads_labswikimedia:q value 1 http://liquidthreads-labs.wikiquote.org/wiki/$1
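The URL-style values in the examples above all have the shape "digit, space, URL pattern containing $1". A minimal Python sketch of how such a value could be split and turned into a link follows. The text does not say what the leading 0 or 1 means; the sketch assumes it is a flag stored next to the URL (MediaWiki's interwiki table has a comparable iw_local column) and just carries it along. The function names are hypothetical, and the title encoding is a simplification, not MediaWiki's own.

```python
from urllib.parse import quote


def parse_interwiki_value(value: str):
    """Split a value such as '1 //en.wiktionary.org/wiki/$1' into (flag, url_pattern)."""
    flag, url_pattern = value.split(" ", 1)
    return int(flag), url_pattern


def expand_url(url_pattern: str, title: str) -> str:
    """Substitute a page title for $1 (simplified: spaces become underscores)."""
    encoded = quote(title.replace(" ", "_"), safe="/:")
    return url_pattern.replace("$1", encoded)
```

Note that entries under __sites hold a bare site type rather than a flag and URL, so they should not be passed to `parse_interwiki_value`.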

The keys that start with __list are used by getAllPrefixesCached, which at present is only used to retrieve the interwiki map when the MediaWiki API is queried for interwikimap site info. Example query: http://www.mediawiki.org/w/api.php?action=query&meta=siteinfo&siprop=interwikimap


 * __list:__global should contain all xxx for which there is an entry with key __global:xxx
 * __list:_wiktionary should contain all languagecodes for which there is an entry with key _wiktionary:langcode (and so on for the other site types)
 * __list:__sites should contain all fullprojectnames xxx for which there is an entry with key __sites:xxx
 * __list:fullprojectname should contain all abbreviations xxx for which there is an entry with key fullprojectname:xxx
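The four rules above amount to "group every key by the part before the colon". As a sketch, assuming the entries are held in a plain Python dict and that each __list value is a space-separated string (as in the hand-written __list example later on this page), the list entries could be derived like this. `build_list_entries` is a hypothetical helper, not part of MediaWiki.

```python
def build_list_entries(entries: dict) -> dict:
    """Derive the __list:* entries from all other keys in the mapping."""
    groups = {}
    for key in entries:
        if key.startswith("__list:"):
            continue  # existing list entries are rebuilt, not reused
        scope, sep, name = key.partition(":")
        if not sep:
            continue  # not a scope:name key, ignore it
        groups.setdefault(scope, []).append(name)
    return {f"__list:{scope}": " ".join(names) for scope, names in groups.items()}
```

Names appear in the order in which their keys occur in the input; sort them first if you want a stable file.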

Expanding an interwiki link
When we want to expand a piece of wikitext that might be an interwiki link, how does it work?

This depends on the value of the global $wgInterwikiScopes, which has a default value of 3 and can be overridden in your wiki's LocalSettings.php file.


 * $wgInterwikiScopes = 1:
 * There is no lookup in the interwiki cache cdb file at all.
 * $wgInterwikiScopes = 2:
 * Check for the key __global:wikiabbrev and if it exists, use the corresponding value
 * $wgInterwikiScopes = 3:
 * Check for the key __sites:fullprojectname in order to get the site type (is the current wiki a wikipedia, a wikiquote, etc). If that does not exist, we will use the value of $wgInterwikiFallbackSite, which by default is 'wiki', i.e. site type wikipedia.
 * Check for the entry _sitetype:langcode, where sitetype is the value we just retrieved. If there is no entry, we fall back to the behaviour for $wgInterwikiScopes = 2 and try that.
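The lookup order described above can be sketched in Python as follows. This follows the description on this page rather than MediaWiki's source code, so treat it as an illustration only; in particular it does not consult the fullprojectname:iwabbrev entries. The dict stands in for the cdb file, and `lookup_interwiki` and its parameter names are made up.

```python
def lookup_interwiki(cdb: dict, wiki_id: str, prefix: str,
                     scopes: int = 3, fallback_site: str = "wiki"):
    """Return the raw cdb value for an interwiki prefix, or None if not found.

    scopes and fallback_site play the role of $wgInterwikiScopes and
    $wgInterwikiFallbackSite.
    """
    if scopes >= 3:
        # Work out the site type of the current wiki, then try the
        # site-type-specific entry for this prefix.
        site = cdb.get(f"__sites:{wiki_id}", fallback_site)
        value = cdb.get(f"_{site}:{prefix}")
        if value is not None:
            return value
    if scopes >= 2:
        # Fall back to the global entries.
        return cdb.get(f"__global:{prefix}")
    # scopes == 1: no lookup in the cdb data at all.
    return None
```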

Adding entries to the cdb file by hand
If you are setting up a mirror of en wikipedia with wikidbname enwiki:
 * steal a copy of our interwikicache.cdb from [here]
 * copy it into cache/interwiki.cdb under the root of your MediaWiki installation
 * add $wgInterwikiCache = "$IP/cache/interwiki.cdb";  to your LocalSettings.php config file
 * You are done. All wikilinks will 'just work', as your wiki database name will be parsed into language code en and wiki type wiki (i.e. wikipedia), both of which are fully specified in the cdb file already. Since interwiki links only affect links leading off of your wiki, you need to change nothing.

If you are setting up a mirror of en wikipedia with wikidbname enwiki and table name prefix (for example) mw_:
 * steal a copy of our interwikicache.cdb from [here]
 * add entries:

key   enwiki-mw_:w value 1 http://en.wikipedia.org/wiki/$1
key   enwiki-mw_:wikt value 1 http://en.wiktionary.org/wiki/$1
key   enwiki-mw_:q value 1 http://en.wikiquote.org/wiki/$1
key   enwiki-mw_:b value 1 http://en.wikibooks.org/wiki/$1
key   enwiki-mw_:d value 1 http://en.wikidata.org/wiki/$1
key   enwiki-mw_:n value 1 http://en.wikinews.org/wiki/$1
key   enwiki-mw_:s value 1 http://en.wikisource.org/wiki/$1
key   enwiki-mw_:v value 1 http://en.wikiversity.org/wiki/$1
key   enwiki-mw_:voy value 1 http://en.wikivoyage.org/wiki/$1
key   enwiki-mw_:chapter value 1 http://en.wikimedia.org/wiki/$1

key   __sites:enwiki-mw value wiki

key   __list:enwiki-mw value b d chapter n q s v voy w wikt

 * update the entry for the key __list:__sites so that it includes enwiki-mw
 * copy it into cache/interwiki.cdb under the root of your MediaWiki installation
 * add $wgInterwikiCache = "$IP/cache/interwiki.cdb";  to your LocalSettings.php config file
 * You are done. All wikilinks will 'just work', as your wiki database name will be parsed into language code en and wiki type wiki (i.e. wikipedia), both of which are fully specified in the cdb file already. Since interwiki links only affect links leading off of your wiki, you need to change nothing.

If you are setting up a site which is not a wikipedia but you want to have interwiki links to all of the Wikimedia projects, follow the instructions above for enwiki with mw_ db table prefix, substituting in the name of your wiki db for enwiki-mw_ everywhere.

If you are setting up a site which is not a wikipedia and you have the db prefix xxxx_, follow the above instructions for enwiki with mw_ prefix but substitute in yourdbname-xxxx for enwiki-mw_ everywhere.
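Writing out one entry per abbreviation for each new db name gets tedious, so here is a small Python sketch that generates them. The abbreviation-to-host mapping is read off the enwiki-mw_ example entries above; `entries_for_wiki` and its parameters are hypothetical names, and the key is used exactly as you pass it in, so you decide how the db name and table prefix are combined. The __sites and __list entries still have to be added as described above.

```python
# Abbreviation -> project host name, taken from the example entries on this page.
SITE_HOSTS = {
    "w": "wikipedia",
    "wikt": "wiktionary",
    "q": "wikiquote",
    "b": "wikibooks",
    "d": "wikidata",
    "n": "wikinews",
    "s": "wikisource",
    "v": "wikiversity",
    "voy": "wikivoyage",
    "chapter": "wikimedia",
}


def entries_for_wiki(wiki_key: str, lang: str = "en") -> dict:
    """Build the wiki_key:abbrev entries that point at the Wikimedia projects."""
    return {
        f"{wiki_key}:{abbrev}": f"1 http://{lang}.{host}.org/wiki/$1"
        for abbrev, host in SITE_HOSTS.items()
    }
```

For example, `entries_for_wiki("enwiki-mw_")` reproduces the ten abbreviation entries listed for the enwiki mirror with the mw_ prefix.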

If you are setting up a mirror of several wikipedias with wikidbnames enwiki, frwiki, etc. (and no special db table prefix):
 * steal a copy of our interwikicache.cdb from [here]
 * change the following entry to point to your domain:

key   __global:wiki value 1 //en.wikipedia.org/wiki/$1
 * change the following entry to point to your domain, and do the same for each language you are setting up:

key   _wiki:en value 1 http://en.wikipedia.org/wiki/$1
 * copy the cdb file into cache/interwiki.cdb under the root of your MediaWiki installation
 * add $wgInterwikiCache = "$IP/cache/interwiki.cdb";  to your LocalSettings.php config file
 * You are done. You have modified interwiki links between the wikis in your farm to point to your domain, the rest will lead offsite as they should, and no other modifications are necessary.
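The two "change the entry" steps for a wiki farm can be sketched in Python as below. It assumes your farm uses the same lang.domain/wiki/ URL layout as Wikipedia, so that swapping the host suffix is enough; if your layout differs, build the URLs yourself. `point_farm_at_domain` is a hypothetical helper and the dict stands in for the dumped cdb entries.

```python
def point_farm_at_domain(entries: dict, languages, domain: str) -> dict:
    """Return a copy of entries with the farm's wikipedia links moved to domain.

    Only __global:wiki and the _wiki:<lang> entries for the given languages
    are touched; every other entry keeps pointing offsite.
    """
    updated = dict(entries)
    keys = ["__global:wiki"] + [f"_wiki:{lang}" for lang in languages]
    for key in keys:
        if key in updated:
            updated[key] = updated[key].replace("wikipedia.org", domain)
    return updated
```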

The easiest way to add entries to a cdb file is to use a set of command line cdb utilities (apt-get install freecdb for Ubuntu, yum install tinycdb for Fedora).

For freecdb, you can dump the existing cdb file to a flat text file using cdbdump, and add entries in the format +nn,mm:key->value where nn is the length of the key and mm is the length of the value in bytes. You can then convert the text file back to cdb using cdbmake, after which you can move the new cdb file into place and test it.

The commands for tinycdb are different but the procedure is the same: dump the cdb file into a flat text file, add entries of the above format and convert the text file back to cdb.
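Counting key and value lengths by hand is error-prone, so a few lines of Python can produce the text records for you. This sketch assumes the usual cdbmake input format: one record per line in the form +klen,dlen:key->data, with lengths counted in bytes and an empty line marking the end of the input. Check the documentation of your cdb tools before relying on it; the function names are made up.

```python
def cdbmake_record(key: str, value: str) -> str:
    """Format one key/value pair as a cdbmake-style text record."""
    klen = len(key.encode("utf-8"))   # lengths are in bytes, not characters
    dlen = len(value.encode("utf-8"))
    return f"+{klen},{dlen}:{key}->{value}\n"


def cdbmake_input(entries: dict) -> str:
    """Format all entries, followed by the blank line that ends the input."""
    return "".join(cdbmake_record(k, v) for k, v in entries.items()) + "\n"
```

Write the result to the text file in UTF-8 so that the byte counts match what ends up on disk.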

Use the script, Luke
There is a script that can make this easier (tested only on Linux). It's designed for altering the Wikimedia interwikicache.cdb file for a single wiki. You will need to specify the site type, the db name including table prefix if any, the language code if any, or alternatively the path to your wiki's LocalSettings.php file. The script will do the rest, writing out a new cdb file with the desired entries, as needed. See the README file or run the script with the --help option for more information. It's alpha code, beware.