Requests for comment/Mobile domain sunsetting

From mediawiki.org
Request for comment (RFC)
Mobile domain sunsetting
Component General
Creation date
Author(s) Timo Tijhof
Document status accepted
Weekly status updates
See Phabricator.

The m-dot domains at WMF (e.g. "en.m.wikipedia.org") were introduced for reasons and benefits that no longer apply today. As such, their continued existence has been questioned and has been under technical re-evaluation since 2019 in RFC: T214998. In addition to the measurable costs and complexity that the mobile domain adds, there are 5 concrete problems (such as mobile site performance and technical debt) that we would naturally solve if we phase it out.

I recommend we phase out the mobile domain, in favour of providing the mobile experience directly on our standard URLs. See also #Recommendation.

Diagram of technical change.

The current effort started in December 2024, after discovering that Google no longer supports the m-dot concept.

Context and Problem Statement

The m-dot domains were introduced in 2008 as an alternative way to browse Wikipedia.[1] The separate domain allowed the proxy service to launch at relatively low risk, by being fully opt-in and separate from our standard "canonical" URLs. Adoption spread socially and externally via marketing, word of mouth, and Google. Some wikis deployed a JavaScript snippet that detected mobile user agents and would forcefully redirect visitors that way.

In 2011, the mobile gateway was shut down in favour of the MobileFrontend extension for MediaWiki. The m-dot domain was repurposed to serve traffic directly from MediaWiki, the same way our standard URLs do. Our CDN takes mobile traffic and adds an internal HTTP header ("X-Subdomain: m"), which instructs MediaWiki/MobileFrontend to apply the mobile layout.

Also in 2011, we made the mobile site the default for mobile visitors. The Wikimedia CDN started redirecting mobile user agents that visit our standard URLs, away to the m-dot version of that URL. This remains how it operates today, as of February 2025.
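The status quo described above can be sketched roughly as follows. This is a simplified model in Python, not the actual Varnish configuration: the `Request` type, the `is_mobile_ua` check and the function names are hypothetical, and the real detection logic is more involved. Only the "X-Subdomain: m" header and the direction of the redirect come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    host: str
    path: str
    user_agent: str
    headers: dict = field(default_factory=dict)


def is_mobile_ua(user_agent: str) -> bool:
    # Hypothetical stand-in for the CDN's real detection logic.
    return "Mobi" in user_agent


def to_mobile_host(host: str) -> str:
    # Simplified: only handles hosts shaped like "en.wikipedia.org".
    lang, rest = host.split(".", 1)
    return f"{lang}.m.{rest}"


def to_standard_host(host: str) -> str:
    # Simplified inverse of to_mobile_host().
    return host.replace(".m.", ".", 1)


def handle_at_cdn(req: Request):
    """Return ("redirect", url) or ("forward", request)."""
    if ".m." in req.host:
        # Mobile domain: forward to MediaWiki under the standard host, and
        # tell MobileFrontend to apply the mobile layout.
        forwarded = Request(
            host=to_standard_host(req.host),
            path=req.path,
            user_agent=req.user_agent,
            headers={**req.headers, "X-Subdomain": "m"},
        )
        return ("forward", forwarded)
    if is_mobile_ua(req.user_agent):
        # Standard domain, mobile device: send the browser away to m-dot.
        return ("redirect", f"https://{to_mobile_host(req.host)}{req.path}")
    # Standard domain, non-mobile device: forward unchanged.
    return ("forward", req)
```

Note that the redirect only ever goes from the standard domain to m-dot; a desktop browser that opens an m-dot link is forwarded with the mobile header set.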

Google used to maintain and associate both a "mobile" URL and canonical URL with each web page. When searching via the mobile version of Google Search, they would present this "mobile" URL instead of the canonical URL, thus directing visitors there without incurring our redirect.

In short, the following benefits existed in 2008:

  1. Low risk way to launch an experimental service.
  2. "m-dot" domains were an industry convention in the early 2000s that people were somewhat familiar with.
  3. Anyone can opt-in, anyone can opt-out (so long as you understand the m-dot URL concept, and are comfortable with modifying a URL in your address bar).
  4. Easy to separate HTML caches in the Wikimedia CDN by having two separate URLs.[2]
  5. Google natively supported the m-dot concept in their index.

Reflecting on this seventeen years later in 2025:

  1. It is no longer experimental. We are committed to having a mobile experience. Even if this were to change one day, it is the default today, and thus we'd at the very least need to support these URLs indefinitely through a redirect.
  2. It is no longer common for websites to have m-dot domains. Our use of it is surprising to our present day audience. It may also decrease the strength of domain brand awareness.
  3. We have an opt-in toggle on every page ("Mobile view" and "Desktop" in the footer) that is more accessible and discoverable than modifying URLs in the address bar. These toggles have the benefit of also being sticky (i.e. the browser remembers your choice).
  4. Today the Wikimedia CDN has efficient and well-tested support for variable responses under a single URL. This is something we actively use on high-traffic endpoints, and works via the HTTP Vary header.
  5. Google no longer supports the m-dot concept (see § Problem 3 below).
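To illustrate point 4, here is a minimal sketch of how a cache can hold several variants of one URL by keying on the request headers named in the response's Vary header. It is a toy model in Python, not the Wikimedia CDN's implementation; the class and method names are hypothetical. The "X-Subdomain" header is the internal header mentioned earlier in this document.

```python
class VaryCache:
    """Toy HTTP cache that stores one object per (URL, Vary values)."""

    def __init__(self):
        self._vary_by_url = {}  # url -> tuple of header names to vary on
        self._objects = {}      # (url, header values) -> body

    def _key(self, url, request_headers):
        names = self._vary_by_url.get(url, ())
        values = tuple(request_headers.get(name, "") for name in names)
        return (url, values)

    def store(self, url, request_headers, vary, body):
        # Remember which request headers the response declared in Vary.
        self._vary_by_url[url] = tuple(vary)
        self._objects[self._key(url, request_headers)] = body

    def lookup(self, url, request_headers):
        return self._objects.get(self._key(url, request_headers))


cache = VaryCache()
url = "https://en.wikipedia.org/wiki/Banana"
cache.store(url, {}, ["X-Subdomain"], "desktop HTML")
cache.store(url, {"X-Subdomain": "m"}, ["X-Subdomain"], "mobile HTML")
```

Both objects live under the same URL, so a second domain is not needed to keep the two HTML variants apart.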

Problem 1: Infra cost and limitations for CDN purges

After saving edits in MediaWiki, we send purges to our CDN ("Varnish" and "ATS") to clear the cache of articles for each URL variant. It has been a perennial concern from SRE teams since at least 2016, that our rate of purges is unsustainable.[3] Unifying domains would instantly cut MediaWiki's purge rate by 50% (20% overall, when including legacy RESTBase). While other ideas to reduce purge load do exist and are worth exploring, those tend to be smaller in impact, or increase complexity, or lower service quality.

Domain unification would make the biggest cut, and yet do so at relatively low cost, whilst on the whole reducing operational complexity, and without removing features or lowering quality of service.
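As a rough illustration of where the 50% comes from: if each edited page is purged once per domain variant, dropping the mobile variant halves the URLs that MediaWiki purges. The sketch below is a simplified Python model with hypothetical function names; it ignores any other URL variants. The 40% share in the last step is not stated in this document; it is merely what the "50%" and "20% overall" figures imply when taken together.

```python
def purge_urls(path, hosts):
    # One purge per host that serves the page.
    return [f"https://{host}{path}" for host in hosts]


today = purge_urls("/wiki/Banana", ["en.wikipedia.org", "en.m.wikipedia.org"])
unified = purge_urls("/wiki/Banana", ["en.wikipedia.org"])

# Fraction of MediaWiki purges removed by unifying the domains.
mediawiki_cut = 1 - len(unified) / len(today)

# If that cut equals a 20% cut overall, MediaWiki's implied share of all
# purges is overall_cut / mediawiki_cut.
overall_cut = 0.20
implied_mediawiki_share = overall_cut / mediawiki_cut
```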

Problem 2: Failed UX expectations

Today our redirect goes in one direction only. When people read Wikipedia on mobile, and share articles with friends over instant messaging, or with the public in chat rooms and social media, the mobile URL hardcodes the mobile experience for whoever receives that link. This includes, for example, journalists who link to Wikipedia in their articles. We fail these audiences today, by leading readers into an experience they do not expect or associate with Wikipedia on that device.

A knock-on effect is that, inevitably, a new Wikipedia editor or journalist or celebrity, will be "corrected" by someone who knows this quirk, informing them that they shared the "wrong" link. This wasted energy distracts from what could be a more productive conversation about Wikipedia. It is within our power to simply remove this speedbump.

Problem 3: Google SEO

Google used to support the m-dot concept, with their index having a special place for the "mobile" version of each URL, for websites that have one. When people query Google Search from a mobile device, they would present that link instead of the standard link.

For the ~60% of externally-referred pageviews that come from Google,[4] Google determined whether a visitor was on mobile and sent them directly to our mobile URL. This nicely avoided the performance penalty of our redirect delaying people's browsers on every mobile page load.

Since early 2024, Google no longer supports m-dot domains. Google now presents the same canonical link to everyone. This has a number of implications:

  1. Site speed. We are now in a situation where Google sends visitors to our standard domain, but our servers send mobile clients away to the mobile domain instead. This adds a considerable delay to mobile pageviews. See also: § Problem 4: Site speed.
  2. Mobile device detection. We've had our own detection logic since 2011. For Google referrals we did not rely on our detection logic, because Google sent traffic directly to our mobile domain. Now that Google sends traffic to our standard domain, we are fully reliant on our redirect and mobile detection. The good news is that our mobile detection is solid and we've had no reports for years about incorrect detection. Between 2011 and 2025, many new mobile device brands and hardware models have been released. But, software has gotten less diverse. I've analysed the effectiveness of our mobile detection at T214998#10551073. There is no reason for concern here.
  3. Redirect confusion. As of 2024 (T366790), we're seeing cases where Google is sending people to unusually hand-crafted redirect URLs for a subset of articles. I believe our mobile redirect may be in part to blame. We claim URLs like "en.wikipedia.org/wiki/Banana" as canonical, but when Google's mobile-first crawler visits this link, we send it away to an m-dot URL. This means for Google to "find" the canonical link, it has to overcome this redirect (which makes it seem non-canonical), and they have to pick something earlier in the redirect chain and/or from thousands of URL permutations found on third-party sites. While Google arguably should be able to figure this out (and usually, they do) my theory is that if we stop redirecting our canonical links, we naturally prevent this problem as Google will no longer have to backtrack from the canonical link to their (buggy) alternative selection logic.

Problem 4: Site speed

When analyzing the last two years of performance data (Navigation Timing), there is a clear regression starting in May 2024. Our redirect adds a full round-trip between client and server before a mobile device can start loading an article. Mobile pageviews have gotten slower around the world, with geographies furthest from our data centers affected the most. This is essentially a regressive tax. Solving this would bring us closer to equitable performance, as it will cut the most from those with the slowest experience.

I looked at two metrics:

  • "fetchStart" (Duration from when a user clicks a link, e.g. in search results, to when the browser starts loading the page. If there were no redirects, this would be ~0ms.)
  • "responseStart" (Duration from clicking a link, until the browser receives a response from our CDN. This is also known as TTFB or time-to-first-byte, and includes any redirects, DNS, and TCP connections).

Summary from data at P73601:

  • Worldwide average:
    • p75 fetchStart for mobile increased by 3X from 80ms to 240ms.
    • p75 responseStart for mobile increased by +10% from 0.6s to 0.7s.
    • p75 unchanged for desktop in the same period.
  • Indonesian Wikipedia (id.wikipedia):
    • p75 fetchStart for mobile increased by 3X from 100ms to 350ms.
    • p75 responseStart for mobile increased by +20% from 0.65s to 0.8s.
    • p75 unchanged for desktop in the same period.
  • German Wikipedia (de.wikipedia):
    • p75 fetchStart for mobile increased by 3X from 60ms to 180ms.
    • p75 responseStart for mobile increased by +20% from 0.35s to 0.45s.
    • p75 unchanged for desktop in the same period.
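The absolute increases follow directly from the figures above. The short Python snippet below only restates the mobile p75 numbers from the summary (converted to milliseconds) and subtracts them; no new data is introduced.

```python
# Mobile p75 values (before, after) in milliseconds, copied from the summary.
summary = {
    "worldwide": {"fetchStart": (80, 240), "responseStart": (600, 700)},
    "id.wikipedia": {"fetchStart": (100, 350), "responseStart": (650, 800)},
    "de.wikipedia": {"fetchStart": (60, 180), "responseStart": (350, 450)},
}


def added_ms(before_after):
    before, after = before_after
    return after - before


increase = {
    site: {metric: added_ms(pair) for metric, pair in metrics.items()}
    for site, metrics in summary.items()
}

for site, metrics in increase.items():
    print(site, metrics)
```

In these figures, the added fetchStart delay on Indonesian Wikipedia is roughly double that on German Wikipedia, which is consistent with the observation that the redirect costs more the further a reader is from our data centers.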

Problem 5: Tech debt widespread

The mobile domain concept is only recognized in our CDN and MobileFrontend. (The CDN generates a redirect, and converts mobile traffic back to a standard URL and forwards it to MediaWiki. MobileFrontend generates the "Mobile view" link, and decides whether to apply the mobile layout on a pageview.)

The 2011 development of MobileFrontend did not include adding support for "mobile URLs" in the MediaWiki platform, nor has this been developed since. We have a decade-long trickle of bugs where new products or platform capabilities are (after being deployed to production) found to be broken, to corrupt caches, or to send users to the wrong experience (e.g. CentralAuth SUL3, OAuth, SecurePoll, ResourceLoader). While these can sometimes be worked around by hardcoding references to MobileFrontend utilities to fix up the "wrong" URL, there remain long-standing bugs that have no solution or are unprioritized. Full list in the task description atop T214998.

Implementing native support for mobile URLs in MediaWiki, unfortunately, presents a paradox. It would be easy to build an option to return a "mobile" URL in various core APIs. The paradox is in knowing when to call upon that option, as there is no one-size-fits-all solution. There is an inherent ambiguity between the "current", "canonical", and "right" URL for any given consumer. No amount of abstraction or engineering solves this. At best we can create shortcuts that make the current workaround easier to invoke. But, so long as the mobile domain exists, I see no future in which we don't develop, deploy to prod, find breakage, apply workaround, repeat.

MediaWiki is a leader among CMSes in its long-standing adoption of relative URLs, like "/wiki/Banana". Our protocol- and domain-agnostic URLs leverage the browser to automatically resolve these in their current context. This design serves us well, e.g. handling the HTTP/HTTPS migration without a doubling in cache cost (back in 2015), and this is what has allowed the mobile site to work as well as it has with most products and components naturally doing the right thing. However, this doesn't change the fundamental problem when a full URL is needed, e.g. when interacting between two wikis (interwiki links), or between Wikipedia and a third-party site (e.g. OAuth signing an exact URL, or Google Search), as only the caller knows what it needs.
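The difference between relative and full URLs can be shown with standard URL resolution. The Python sketch below uses `urllib.parse.urljoin` to mimic what a browser does; the article names are only examples.

```python
from urllib.parse import urljoin

link = "/wiki/Banana"  # relative link, as emitted in page HTML

# The same link resolves within whichever domain the reader is on.
on_desktop = urljoin("https://en.wikipedia.org/wiki/Apple", link)
on_mobile = urljoin("https://en.m.wikipedia.org/wiki/Apple", link)

# A protocol-relative interwiki link keeps the current scheme, but the
# domain is fixed by whoever generated the link.
interwiki = urljoin(
    "https://en.m.wikipedia.org/wiki/Apple",
    "//de.wikipedia.org/wiki/Banane",
)

print(on_desktop)
print(on_mobile)
print(interwiki)
```

The relative link stays on the mobile domain without any mobile-specific code, whereas the interwiki link leaves it unless its generator knows to produce an m-dot URL.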

This is not limited to MediaWiki. It is even harder for standalone services and third parties to adopt.[5] For example, the recently viral WikiTok website initially launched with Wikipedia links that served the desktop experience to mobile users.

Recommendation

It is my recommendation that we phase out the mobile domain, in favour of providing the mobile experience directly on our standard URLs. A single standard URL (such as "en.wikipedia.org/wiki/Banana") will transparently and automatically serve the appropriate experience based on whether the user agent is a mobile device.

Engineering prep

  1. Fix cases where frontend JavaScript scrapes the URL to detect mobile mode. We have a standard mechanism (for 10+ years) to detect the mobile skin and mobile frontend (mw.config.get). The vast majority of our code bases use this. There are no circumstances in which an extension or gadget can't call this. There are 4 recently developed MediaWiki features that instead scrape the address bar, looking for "m" to toggle their mobile-specific behaviour (Codesearch). While unsupported, these can be fixed in a day to use mw.config. We should also ask gadget authors via Tech News to look for this anti-pattern and adjust if needed. I will make an example edit, and include a link to that in the Tech News message.
  2. Ensure we differentiate in Logstash between error events from mobile and desktop. The Web team currently uses a domain name regex in Logstash to find mobile errors. Jon recommends we add a "skin" or "isMobile" field to the event schema for client errors. This will cost ~ 1 week, as it will involve changes in multiple places, and coordinating between Data Engineering and Web team.
  3. Disable MobileFrontendHooks::onTitleSquidURLs in wmf-config, in favour of a local hook. This is to allow us to disable $wgMobileUrlCallback during the rollout (to promote the standard URL), whilst still keeping purges intact for old URLs during the transition. Otherwise, MobileFrontend would stop sending purges during the transition.
  4. Data Pipelines and Metrics. The webrequest and unique devices pipelines need to be updated to derive the access method without the mobile domain. Downstream datasets are unaffected as they consume webrequest.access_method ("mobile web", "desktop web") or access_site ("mobile-site", "desktop-site") which remains compatible.
  5. Add missing "X-Subdomain" to "Vary" header in MobileFrontend, as required by the HTTP/1.1 spec. This bug is masked today because we (also) vary the response by domain name (except on Wikitech, T383656).
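Items 1 and 2 share the same pitfall: inferring mobile mode from the domain name stops working once both experiences are served from one domain. The Python sketch below contrasts the two approaches for a client error event. The event shape and the `is_mobile` field are hypothetical (the field name loosely follows the "isMobile" suggestion above), and the regex is illustrative rather than the one used in Logstash today.

```python
import re

# Illustrative pattern: matches a standalone "m" label in a hostname.
M_DOT = re.compile(r"(^|\.)m\.")


def is_mobile_by_host(event):
    # Anti-pattern: depends on the mobile domain existing.
    return bool(M_DOT.search(event["host"]))


def is_mobile_by_field(event):
    # Preferred: the client reports its mode explicitly.
    return event["is_mobile"]


# The same mobile pageview, before and after domain unification.
before = {"host": "en.m.wikipedia.org", "is_mobile": True}
after = {"host": "en.wikipedia.org", "is_mobile": True}
```

With these two events, the hostname check classifies the second one as desktop, while the explicit field gives the same answer for both.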

Engineering rollout

Phase 1: Enable mobile frontend via standard URLs on pilot wikis

  • Change the Varnish configuration to remove the redirect and instead use the same detection logic to vary the canonical response between standard and mobile. This would involve SRE Traffic, and likely a week or two of iterating patches, code review, and testing. I recommend we limit this to a subset of domains at first, e.g. testwikis, officewiki, and wikis where the mobile experience is currently broken, such as Wikitech (T383656). At this point, mobile visitors of pilot wikis will use the new standard, including when arriving via Google. Any references to mobile URLs would continue to work as-is. The "Mobile view" footer link still promotes the mobile URL at this stage. MediaWiki will still send purges for both URL variants to the CDN.
  • Change the MediaWiki configuration for MobileFrontend on the pilot wikis, so that the mobile/desktop toggle sets URL-parameters/cookies, without changing the domain name. This involves disabling $wgMobileUrlCallback. At this point, the "Mobile view" footer link on pilot wikis will switch within the standard domain instead of between standard and mobile domain.
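A rough sketch of the intended Phase 1 behaviour on a pilot wiki: no redirect, the same detection logic, and the footer toggle stored as a preference rather than as a domain change. This is a Python model under stated assumptions: the cookie name `display_pref`, the `is_mobile_ua` check and the function names are hypothetical placeholders, not the names used by Varnish or MobileFrontend, and the rule that an explicit preference wins over detection is an assumption.

```python
def is_mobile_ua(user_agent):
    # Hypothetical stand-in for the CDN's real detection logic.
    return "Mobi" in user_agent


def choose_variant(user_agent, cookies):
    """Pick "mobile" or "desktop" for a request to the standard URL."""
    # Assumption: an explicit choice via the footer toggle wins over detection.
    pref = cookies.get("display_pref")  # hypothetical cookie name
    if pref in ("mobile", "desktop"):
        return pref
    return "mobile" if is_mobile_ua(user_agent) else "desktop"


def upstream_headers(user_agent, cookies):
    # The variant is signalled to MediaWiki internally; the URL is unchanged.
    if choose_variant(user_agent, cookies) == "mobile":
        return {"X-Subdomain": "m"}
    return {}
```

For example, `choose_variant("Mozilla/5.0 (iPhone) Mobile Safari", {"display_pref": "desktop"})` selects the desktop variant even though the user agent looks mobile.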

Phase 2: Redirect m-dot to canonical on pilot wikis

  • Change the Varnish configuration, for pilot domains, to redirect m-dot to canonical.
  • Change the MediaWiki configuration to stop sending purges to mobile URLs for pilot wikis.
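The Phase 2 redirect can be pictured as the reverse of today's redirect. The Python sketch below is deliberately simplified: it only handles hosts shaped like "en.m.wikipedia.org", the function names are hypothetical, and (per footnote 5) a mapping that works for all wikis needs more cases than this.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_host(host):
    # Drop the "m" label from hosts shaped like "en.m.wikipedia.org".
    labels = host.split(".")
    if len(labels) == 4 and labels[1] == "m":
        return ".".join([labels[0]] + labels[2:])
    return host


def redirect_target(url):
    """Return the canonical URL for an m-dot URL, or None if no redirect."""
    parts = urlsplit(url)
    host = canonical_host(parts.netloc)
    if host == parts.netloc:
        return None
    # Keep path, query and fragment so deep links keep working.
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```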

At this point we pause to test relevant functionality and infrastructure:

  • manual toggling between desktop and mobile via "Mobile view" / "Desktop" in both mobile and desktop browsers alike.
  • confirm there are no more canonical-to-mobile redirects seen in Hadoop for a pilot wiki.
  • confirm Varnish is not creating more cache objects than status quo.
  • confirm Varnish correctly varies caches between mobile and non-mobile.

Phase 3: Enable mobile frontend on standard URL for more/all wikis

Phase 4: Redirect m-dot to canonical for more/all wikis

  • Open question: Should we divide Phase 3-4 into smaller steps?
    • Consider: Varnish capacity to temporarily store three variants (canonical-desktop, canonical-mobile, m-dot).
    • Consider: MediaWiki capacity to backfill cache misses under the new URL.

Phase 5: Cleanup

  • Remove unused Varnish code.
  • Remove unused mobile-purge code.

Other Options

Status quo

Refer to Problems 1-5 for the status quo.

Alternative: CDN purges

The 20% cut in our CDN purge load from the recommended option can only be achieved by phasing out the mobile domain. Per § Problem 1, there are other more expensive options and/or options that make smaller cuts.

Phasing out the mobile domain — in the CDN layer — can be done in two ways. The recommended option phases it out from the public (thus also solving Problems 2-5: UX, Google, Site speed, and Tech debt).

If we wanted to only solve Problem 1 (CDN infra cost), there is one alternative option. We could emulate the proposed solution within our CDN layer, rather than actually doing it for the public.

We currently rewrite mobile requests to standard URLs after the cache lookup. If we were to perform this rewrite (and add a Vary-header) before the cache lookup in Varnish, that would similarly allow us to un-deploy purging of mobile URLs, as the standard URL purge would suffice to match and purge both variants in the cache.
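To make the ordering argument concrete, here is a toy Python model of this alternative. Because the mobile host is rewritten to the standard host before the cache lookup, both variants live under the standard URL and one purge removes both. The class, its methods and the host rewrite are hypothetical simplifications, not a description of how Varnish is configured.

```python
class RewriteFirstCache:
    """Toy cache that normalizes m-dot URLs before looking anything up."""

    def __init__(self):
        self._objects = {}  # standard URL -> {variant: body}

    @staticmethod
    def _normalize(url):
        # Rewrite m-dot to standard before lookup; remember the variant.
        if ".m." in url:
            return url.replace(".m.", ".", 1), "m"
        return url, ""

    def store(self, url, body):
        key, variant = self._normalize(url)
        self._objects.setdefault(key, {})[variant] = body

    def lookup(self, url):
        key, variant = self._normalize(url)
        return self._objects.get(key, {}).get(variant)

    def purge(self, standard_url):
        # A single purge drops every variant stored under the URL.
        self._objects.pop(standard_url, None)
```

Visitors would still see two domains in this model; only the cache treats them as one URL with two variants.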

Downside: This would expand the gap between how the CDN appears to work, and how it actually works. Such surprises and onboarding speedbumps add to what every SRE may eventually need to learn and remember. It also adds cognitive weight to a system that is already complex to reason about. Domain unification, by contrast, would make the system easier to understand.

Alternative: Mobile perf

I'm not aware of other removable redirects or roundtrips in our page load cycle.

I'm not aware of other software changes that could eliminate the current redirect.

One way to reduce latency to servers in general (thus masking the redirect cost, instead of removing it) would be to build more caching data centers in more countries.

Alternative: Tech-debt around mobile domain

I've described under § Problem 5: Tech debt widespread that this is a paradox inherent to having a mobile domain, and that there are marginal gains in how we do the workarounds, but I'm not aware of (other) ways to solve the inefficient develop-deploy-discover-workaround loop.

See also

Footnotes

  1. The 2008 mobile gateway was a separate service written in Ruby. Learn more at History of Skins#2008 and Mobile Gateway.
  2. The ease of HTML cache separation was first documented in the 2008 revision of the Mobile Gateway page. It has not been mentioned in documentation since 2010.
  3. Examples of incidents and problems around CDN purging include T124418, T249325, and T250205.
  4. We often hear "90% of Wikimedia pageviews" come from Google. This number misses important context. In analytics "Wikimedia pageviews" usually means pageviews from users, excluding known bots and spiders. Google refers ~30% of Wikimedia pageviews. The biggest referrer is Wikimedia itself. ~35% of pageviews have a Wikimedia referer (i.e. you read one article after another). Those are included in most reports, but when talking about Google, we exclude internal referrals since those are likely continuations of sessions that started thanks to the first referrer. Google refers ~60% of externally-referred pageviews, meaning of pageviews that started from a search engine, news site, blog, education material, email, messaging app, social media, browser home page, address bar, or bookmarks. It is only when we focus on search engines that we get the 89% or ~90% figure. Google refers ~90% of external-search-referred pageviews. This is a statement about search engine market share. Data at P73499.
  5. While the m-dot scheme may appear at a glance to be as simple as putting an "m" between "en" and "wikipedia.org", the full implementation that works for our wikis requires 30 lines of code (example 1A and 1B, example 2).