User:Robchurch/Interwiki existence checks

Copied from IRC:

Hmm. One horrible problem with interwiki links is, as mentioned, the resolution problem. w:en:Foo and en:w:Foo refer to the same page but work differently. It would be rather nice to have a consistent mechanism to convert w:en:Foo into http://en.wikipedia.org/wiki/Foo in a single pass, without relying on interwiki redirection. :)

We could allow a specific format in interwiki.iw_url, e.g. http://$lang.wikipedia.org for 'w'. en:w:Foo gets split up; "en" would be detected as a language code and substituted into the URL. For something like Foo, since there's no language code, use the content language code. The biggest problem I envision with that is making sure it's backwards-compatible.

The actual existence checking is quite simple; we can have something like LinkCache, but for interwiki links. This could have a special batch job like LinkBatch, and we could introduce a simple API method to do batch existence lookups.

On wiki farms, this cache could be shared across the entire cluster. In that particular case, updating the cache is rather straightforward, since each wiki can maintain its own entries. For non-farm setups, what we could potentially do is introduce a special kind of link table which stores a URL as the "from" value - when doing a cache update, do some sort of XML callback-esque thing to the wiki that requested the existence state in the first place.

Later on, if we wanted, we could extend and override bits for a custom Wikimedia implementation that uses direct access to the databases to make it that bit faster.
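
To make the single-pass resolution idea concrete, here is a rough sketch in Python (not MediaWiki code). Everything in it is an assumption for illustration: the `INTERWIKI` dict stands in for the interwiki table, the `/wiki/$1` suffix on the URL pattern and the `LANGUAGE_CODES` set are made up for the example, and `resolve_interwiki` is a hypothetical helper name. It treats a prefixed link without a language code (w:Foo) as the case that falls back to the content language.

```python
from urllib.parse import quote

# Hypothetical stand-in for the interwiki table: prefix -> URL pattern.
# "$lang" is the proposed language placeholder; "$1" marks the page title.
INTERWIKI = {
    "w": "http://$lang.wikipedia.org/wiki/$1",
}

# Assumed set of known language codes (a real wiki would have many more).
LANGUAGE_CODES = {"en", "de", "fr"}


def resolve_interwiki(link, content_language="en"):
    """Resolve e.g. 'w:en:Foo' or 'en:w:Foo' to a URL in a single pass.

    Returns None when the link carries no known interwiki prefix.
    """
    parts = link.split(":")
    prefix = None
    lang = None

    # Consume leading segments in any order: at most one project prefix
    # and at most one language code. The final segment is always title.
    while len(parts) > 1:
        head = parts[0].strip().lower()
        if prefix is None and head in INTERWIKI:
            prefix = head
        elif lang is None and head in LANGUAGE_CODES:
            lang = head
        else:
            break  # anything else (e.g. a namespace) belongs to the title
        parts.pop(0)

    if prefix is None:
        return None  # not an interwiki link as far as this sketch knows

    title = ":".join(parts).strip().replace(" ", "_")
    url = INTERWIKI[prefix].replace("$lang", lang or content_language)
    return url.replace("$1", quote(title, safe=":/"))


if __name__ == "__main__":
    for example in ("w:en:Foo", "en:w:Foo", "w:Foo", "w:de:Help:Contents"):
        print(example, "->", resolve_interwiki(example))
```

Because both orderings are normalised before the pattern is filled in, w:en:Foo and en:w:Foo come out as the same URL. For backwards compatibility, a pattern without `$lang` passes through the first `replace` unchanged, so existing iw_url values would keep working in this scheme.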