Requests for comment/Publishing the RecentChanges feed

From mediawiki.org
Request for comment (RFC)
Publishing the RecentChanges feed
Component General
Creation date
Author(s) Timo Tijhof, Ori Livneh, Kunal Mehta
Document status implemented
See Phabricator.

wikitech:RCStream

As of v1.22 (gerrit:52922), MediaWiki supports multiple types of recentchanges feeds (see Manual:$wgRCFeeds), including a machine-readable JSON feed (JSONRCFeedFormatter). Right now the only feed exposed for Wikimedia sites is the "IRCColourfulRCFeed" via irc.wikimedia.org. There are several options for broadcasting the new feed format, which are discussed below.

Proposals for endpoints

XMPP pub/sub

See http://xmpp.org/protocols/pubsub/

  • Well-known protocol, with plenty of client libraries already available
  • w:ejabberd is a scalable and well-tested XMPP application server (it was used by Facebook, powers jabber.ru, etc.) and is already packaged in Debian
  • Leaves it to third parties to rebroadcast the data in whatever format they want (e.g. WebSockets)
  • XMPP packets can embed arbitrary XML subdocuments, which could carry structured data directly instead of embedding a JSON blob in XML or something --brion (gerrit:105430)
  • See also: m:Recentchanges via XMPP, bugzilla:17450 - Make Recent Changes available via XMPP

WebSockets

  • Useful for browser-based tools.
  • node.js + nginx

IRC

  • Re-use irc.wikimedia.org, and create new channels like #en.wikipedia-json
  • Much easier for people who already consume the existing feed and just want to switch to machine-readable data

Proposals for internal traffic

How MediaWiki should send the data to the proposed endpoint.

UDP

Implemented via the UDPRCFeedEngine class
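
To illustrate what the receiving side of a UDP feed might look like, here is a minimal sketch in Python. It assumes each datagram carries exactly one UTF-8 encoded JSON document (as the JSON formatter would produce). The function names, the loopback address and the "title" field are hypothetical and chosen only for illustration; they are not part of MediaWiki.

```python
import json
import socket


def open_listener(host="127.0.0.1", port=0):
    """Bind a UDP socket; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_change(sock, bufsize=65535):
    """Block until one datagram arrives and decode it as JSON.

    Assumption: one datagram == one complete JSON-encoded change.
    """
    payload, _sender = sock.recvfrom(bufsize)
    return json.loads(payload.decode("utf-8"))


if __name__ == "__main__":
    # Loopback demonstration: send ourselves one fake change.
    listener = open_listener()
    listener.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    fake_change = {"title": "Main Page"}  # hypothetical field
    sender.sendto(json.dumps(fake_change).encode("utf-8"),
                  listener.getsockname())
    print(receive_change(listener))
    sender.close()
    listener.close()
```

Because UDP gives no delivery or ordering guarantees, a consumer written this way has to tolerate dropped changes, which is part of the motivation for the TCP-based alternatives below.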

ZeroMQ

Code is in gerrit:105117

  • "ZeroMQ is a high-performance asynchronous messaging library aimed at use in scalable distributed or concurrent applications." (w:ZeroMQ)
  • "Speaking as the person who introduced the current UDP solution: I don't know why you would want to continue using UDP. There's no reason to do that now that TCP queue daemons like zeromq exist. -- Tim Starling (talk) 06:17, 17 July 2013 (UTC)" [1]

Redis

MediaWiki version: ≥ 1.22 (Gerrit change 80958)

A publish/subscribe transport. This transport was implemented in core, in includes/rcfeed/RedisPubSubFeedEngine.php. The class handles the redis:// URI scheme. Recent changes are published to the channel 'rc'.
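
A consumer of this transport could look roughly like the following Python sketch. It assumes the third-party redis-py client is installed, that each published message is one JSON document, and that a path component in the URI (if any) names the channel, falling back to 'rc'. The helper names and the example URI are hypothetical, not part of MediaWiki.

```python
import json
from urllib.parse import urlparse


def parse_feed_uri(uri, default_channel="rc"):
    """Split a redis:// URI into (host, port, channel).

    Assumption: a path component, if present, names the channel;
    otherwise the default channel 'rc' is used.
    """
    parts = urlparse(uri)
    if parts.scheme != "redis":
        raise ValueError("expected a redis:// URI, got %r" % uri)
    channel = parts.path.lstrip("/") or default_channel
    return parts.hostname, parts.port or 6379, channel


def decode_message(message):
    """Turn one redis-py pub/sub message dict into a change, or None.

    redis-py also delivers subscribe confirmations; only messages
    of type 'message' carry published data.
    """
    if message.get("type") != "message":
        return None
    data = message["data"]
    if isinstance(data, bytes):
        data = data.decode("utf-8")
    return json.loads(data)


def subscribe(uri):
    """Yield decoded changes forever (needs a reachable Redis server)."""
    import redis  # third-party: pip install redis

    host, port, channel = parse_feed_uri(uri)
    pubsub = redis.Redis(host=host, port=port).pubsub()
    pubsub.subscribe(channel)
    for message in pubsub.listen():
        change = decode_message(message)
        if change is not None:
            yield change


if __name__ == "__main__":
    for change in subscribe("redis://127.0.0.1:6379"):
        print(change)
```

Keeping the URI parsing and message decoding separate from the network loop makes both easy to check without a running Redis server.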

Implementation

RCStream was developed in response to this RFC. It uses Socket.IO/WebSocket servers as backends behind an nginx endpoint (at stream.wikimedia.org for the Wikimedia production cluster). $wgRCFeeds is set on the wiki servers to publish a JSON-formatted RCFeed to Redis, and the RCStream servers subscribe to this.

See also