Requests for comment/URL shortener

This is a request for comment about implementing a URL shortener service for use by Wikimedia projects (42085).

Background
A URL shortener is a service that takes a long URL and returns a shorter one, that is, a URL that needs fewer characters to reach the same destination.

There are generally two types of URL shorteners:


 * 1) the http://enwp.org/foo and http://youtu.be/foo kind, which expand the short URL directly into the long one; and
 * 2) the http://bit.ly/foo kind, which convert a hash or other short code into the longer form of the URL.

Both of these implementations generally use HTTP 301 server-side redirects.[citation needed?]
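
To make the distinction concrete, here is a minimal sketch of the two styles. It assumes the enwp.org-style prefix expands to https://en.wikipedia.org/wiki/, and the lookup table and function names are hypothetical, invented only for illustration; it is not how any of the services named above are actually implemented.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import quote

# Assumption: the fixed prefix a type 1 shortener expands to.
DIRECT_PREFIX = "https://en.wikipedia.org/wiki/"

# Hypothetical stored mapping for a type 2 shortener.
SHORT_CODES: Dict[str, str] = {
    "foo": "https://en.wikipedia.org/wiki/Barack_Obama",
}


def expand_direct(path: str) -> str:
    """Type 1: the short path already is the page title; just add the prefix."""
    return DIRECT_PREFIX + quote(path, safe="/:")


def expand_lookup(code: str) -> Optional[str]:
    """Type 2: the short code is opaque and must be looked up in a table."""
    return SHORT_CODES.get(code)


def redirect_response(target: Optional[str]) -> Tuple[int, Dict[str, str]]:
    """Build a (status, headers) pair: 301 with Location, or 404 if unknown."""
    if target is None:
        return 404, {}
    return 301, {"Location": target}
```

Note that only the second style needs stored state: if the table is lost, the short codes cannot be resolved, whereas the first style can always be expanded by anyone who knows the prefix.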

Traditionally these types of links have only been needed on external social media services that have arbitrary character limits, such as Twitter. However, the need for their use in other contexts has allegedly expanded (see below).

It's also important to note that many wikis block URL shorteners as they're a spam vector (this very page can't have links to youtu.be, for example).

Use-cases

 * Links in Echo notifications that are e-mailed or broadcast via XMPP
 * Neither e-mail nor XMPP have arbitrary character limitations, do they? I don't see the use-case for a shortened URL. --MZMcBride (talk) 20:59, 17 November 2012 (UTC)
 * Links to fundraiser landing pages that are posted to social media or sent via email
 * So Twitter and identi.ca? --MZMcBride (talk) 20:59, 17 November 2012 (UTC)
 * URL sharing via the Mobile App
 * File sharing from Commons
 * For use by the Wikimedia Foundation Communications Department
 * More specifically? Blogs don't have URL character limitations. What does the communications department use shortened URLs for beyond tweeting (discussed above)? --MZMcBride (talk) 20:59, 17 November 2012 (UTC)
 * Long Gerrit and Bugzilla URLs

Considerations
We have Extension:ShortUrl, which is a solid foundation for creating short hashes of titles. Because the hash is tied to the title, the link is a permanent link to that title, not to the page ID (whose title can change when the page is renamed) and not to the revision ID. However, without a dedicated domain to configure it with (more on that below), the resulting URLs will still be limited by the length of our primary domains.
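
As a rough illustration of how a title can be mapped to a short code, the sketch below assigns each title a sequential numeric ID and encodes that ID in base 36. This is an assumption made for the example, not a description of how Extension:ShortUrl works internally; the class and function names are hypothetical.

```python
import string

# 0-9 followed by a-z: 36 symbols.
ALPHABET = string.digits + string.ascii_lowercase


def encode_base36(n: int) -> str:
    """Encode a non-negative integer using the digits 0-9 and a-z."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_base36(code: str) -> int:
    """Inverse of encode_base36."""
    return int(code, 36)


class TitleShortener:
    """Hypothetical in-memory store mapping titles to short codes."""

    def __init__(self) -> None:
        self._ids = {}     # title -> numeric ID (starting at 1)
        self._titles = []  # index (ID - 1) -> title

    def shorten(self, title: str) -> str:
        # Reuse the existing ID so that a title always gets the same code.
        if title not in self._ids:
            self._titles.append(title)
            self._ids[title] = len(self._titles)
        return encode_base36(self._ids[title])

    def resolve(self, code: str) -> str:
        return self._titles[decode_base36(code) - 1]
```

Because the code is derived from the title rather than from a page ID or revision ID, the same title always yields the same code, which matches the "permanent link to that title" behaviour described above.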

Domain
The main thing needed is a short domain name. This would most likely have to be donated to us since short domain names aren't cheap.
 * The Wikimedia Foundation has a pretty big budget these days. If it really wanted a short domain, it could buy one. --MZMcBride (talk) 21:16, 17 November 2012 (UTC)

It would be best to use some sort of subdomain or path scheme to make it usable for all Wikimedia Foundation projects:

e.g. lang.abbrev-site. .org/ or abbrev-site. .org/lang/ or .org/abbrev-site/lang/


 * w.org (exists, for sale)
 * w.co (available)
 * w.ly (exists, unavailable)
 * wmf.org (exists, unavailable)
 * wmf.co (exists, for sale)
 * wmf.ly (available)
 * wi.ki (exists, for sale?)
 * w.mf (available)

And then of course there's the question of protocol: HTTP v. HTTPS.
 * How is that a question? The protocol and the domain are not related to each other, and our servers support HTTPS. They will both work. Krinkle (talk) 23:12, 17 November 2012 (UTC)
 * Question was poor wording on my part. I guess I was thinking about how a separate domain requires an additional SSL certificate. And I wasn't sure it was always a given that every Wikimedia service will support both protocols.
 * In this scheme, I guess http and https would redirect to their corresponding expanded forms. Perhaps I should have written "HTTPS support" and left it at that. --MZMcBride (talk) 07:00, 18 November 2012 (UTC)

Maintenance
Who's going to maintain this service for the indefinite future? Is the Wikimedia Foundation willing to maintain this service forever? If so, who within the Foundation will be in charge of maintenance?

The Wikimedia Foundation currently has a number of services (such as OTRS) that it has difficulty maintaining. An additional service may have real costs (adding features, fixing bugs, etc.). What are the actual costs here?

Obfuscation and mis-use
URL shorteners have a cost: they introduce a middle-man dependency. By including a shortened (hashed) URL, you obfuscate where the underlying content is. If the service is unreachable (offline, broken, down) and there's no dictionary to resolve the URL, the content can be lost or irretrievable.

URL shorteners can also be mis-used, such as being included in contexts where there is no legitimate reason to use a shortened URL (such as blog posts or in HTML). Nearly all URLs are clicked or copied and pasted.[citation needed]

Approach
Most URL shorteners look at domain hacks, but domain hacks are arguably just a fad. An alternative approach to domain hacks and hashing would be to push for the implementation of a new protocol such as wiki://. So you'd have something like:


 * wiki://w/en/Barack_Obama

The part following the protocol could follow our current interwiki syntax.

However, this would be a much longer process (convincing Web browsers and the world to adopt the protocol) and would still run into the issues discussed above with regard to youtu.be and enwp.org-type URL shorteners: namely that page titles can be quite long (up to 255 bytes), so you might not ultimately save many characters.
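
The point about savings can be checked with simple arithmetic. The sketch below compares the length of a full article URL with a direct-expansion form and the wiki:// form. The "{lang}wp.org" pattern is an assumption generalised from enwp.org, and the function names are hypothetical.

```python
from typing import Dict
from urllib.parse import quote


def candidates(lang: str, title: str) -> Dict[str, str]:
    """Return the full URL and two title-based short forms for one page."""
    t = quote(title.replace(" ", "_"), safe="/:")
    return {
        "full": f"https://{lang}.wikipedia.org/wiki/{t}",
        "direct": f"http://{lang}wp.org/{t}",   # assumed enwp.org-style pattern
        "protocol": f"wiki://w/{lang}/{t}",     # the wiki:// form proposed above
    }


def savings(lang: str, title: str) -> Dict[str, int]:
    """Characters saved by each form relative to the full URL."""
    urls = candidates(lang, title)
    full_length = len(urls["full"])
    return {name: full_length - len(url) for name, url in urls.items()}
```

Under these assumptions, `savings("en", "Barack Obama")` gives 14 characters saved for the direct form and 18 for the wiki:// form. The saving is a constant that does not depend on the title, so for a title near the 255-byte limit it is only a small fraction of the total length.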

Plan

 * 42085
 * Simply set up lilurl? Is more needed?