User:GWicke/tmp/services talk

Hear, hear! Yeah, we like this one... See my own comments/proposal about MediaWiki as a Service here: User:Owyn/Mediawiki_As_Service. I was going to submit that myself, but rather than cluttering up the RFC page with a second, very similar proposal, I think we could work on merging these together. I'd be happy to help contribute more to this proposal. Owyn (talk) 00:37, 19 December 2013 (UTC)


 * There is indeed a lot of overlap here. I think we are advocating something in between your API and functional decomposition alternatives. Each service should have a clearly defined API, but I don't see a need to have only a single API service.
 * Let's try to merge the two. -- Gabriel Wicke (GWicke) (talk) 00:19, 3 January 2014 (UTC)

Seems like what we're doing already
There isn't really much I can say about this. Yes, we have always used external services, and will continue to do so. The decision as to whether something should be a service or should be integrated will continue to be made on a case-by-case basis. An RFC is meant to be a change proposal, but the picture painted here seems to be pretty similar to what we're doing and what we've always done.

There's not much difference between an HTTP service which takes POSTed data in some application-specific format and an application-specific binary protocol like Memcached or MySQL. Both approaches are limited in the same sense. One has the advantage of a common toolset, the other has better efficiency. Wirth's Law states that we will eventually migrate to the solution which is less efficient but more abstract. This is fine.
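To make the comparison above concrete, here is a minimal sketch that encodes the same key lookup twice: once as an HTTP POST with a JSON body, and once as a compact binary frame. Everything in it is an assumption made for illustration: the `/v1/lookup` path, the `storage.example` host, the JSON field names, and the three-byte frame header are hypothetical and are not the actual Memcached, MySQL, or Swift wire formats.

```python
import json
import struct

OP_GET = 1  # hypothetical opcode for the illustrative binary framing


def encode_http_post(key: str) -> bytes:
    """Encode a lookup as an HTTP/1.1 POST with a JSON body (hypothetical endpoint)."""
    body = json.dumps({"op": "get", "key": key}).encode("utf-8")
    head = (
        "POST /v1/lookup HTTP/1.1\r\n"
        "Host: storage.example\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def decode_http_post(message: bytes) -> str:
    """Recover the key from the JSON body that follows the blank line."""
    _head, _sep, body = message.partition(b"\r\n\r\n")
    return json.loads(body)["key"]


def encode_binary(key: str) -> bytes:
    """Encode the same lookup as: 1-byte opcode, 2-byte big-endian length, key bytes."""
    raw = key.encode("utf-8")
    return struct.pack("!BH", OP_GET, len(raw)) + raw


def decode_binary(frame: bytes) -> str:
    """Recover the key from the illustrative binary frame."""
    opcode, length = struct.unpack("!BH", frame[:3])
    if opcode != OP_GET:
        raise ValueError(f"unexpected opcode {opcode}")
    return frame[3:3 + length].decode("utf-8")


if __name__ == "__main__":
    key = "page:Main_Page"
    http_message = encode_http_post(key)
    binary_frame = encode_binary(key)
    # Both messages carry the same request; only the envelope differs.
    print("HTTP bytes:  ", len(http_message))
    print("binary bytes:", len(binary_frame))
```

Both encodings are application-specific in the same sense: a client has to know the field names or the frame layout in advance. The HTTP variant can be inspected and routed with generic tooling, while the binary variant spends fewer bytes per request.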

The RFC states that the move from NFS to Swift was the first migration of an existing service of MediaWiki to an HTTP service. This is incorrect: the migration of search from MySQL to Lucene.NET in 2005 was the first. But it doesn't seem like an important landmark to me, since we had similar services before that; they just used protocols other than HTTP.

As for SOAP, WSDL, etc., well, we've never used those and nobody is saying we should start. We generally haven't used REST in the past because, strictly defined, its applications are very limited. Swift calls itself REST, but it is unclear to me whether it would qualify as REST under the HATEOAS constraint. Luckily the RFC adds "plain HTTP" as an implementation option in addition to REST, which I think covers everything we've done in the past and everything we're planning on doing in the future.


 * "Currently, each team needs to handle the full stack from the front-end through caching layers and Apaches to the database."

Well yes, on some level this is true, but the situation is mitigated by a great deal of existing modularity. It's not necessary for every team to be aware of all the details of how MySQL or Varnish work, and each team is not required to reimplement those components. And there are plenty of small teams working on internal services, with little need for awareness of the frontend.

There has been some debate as to whether we should have product teams made up of various kinds of specialists, or whether we should have specialist teams which collaborate across teams to produce products. There is something to be said for the product team approach, even if it does require a broad view of the system architecture. The workload for product teams would be reduced by having well-documented modules provided by internal service teams.

Regardless of the level of modularity, software developers will always be faced with the problem of understanding and integrating several different modules.


 * "This tends to promote tight coupling of storage and code using it, which makes independent optimizations of the backend layers difficult. It also often leads to conflicts over backend issues when deployment is getting closer."

This is a good argument to use for splitting out a particular service, but I don't see any general principle. Sometimes a prospective backend layer can be identified for abstraction, sometimes not. Sometimes it makes sense to split the backend layer out into a separate network service, sometimes it doesn't.

-- Tim Starling (talk) 06:03, 20 December 2013 (UTC)


 * Much of this RFC is about the interfaces a service defines. We argue for narrow and well-defined interfaces, while something like MySQL provides a very rich interface with the full power and problems of SQL. I also agree that we have been using other "external" and optional services like Lucene.NET in the WMF setup for a while. The RFC, however, proposes to gradually restructure the very core of MediaWiki in terms of narrow service interfaces.
 * "Both approaches are limited in the same sense. One has the advantage of a common toolset, the other has better efficiency. Wirth's Law states that we will eventually migrate to the solution which is less efficient but more abstract. This is fine."
 * I share your concern for efficiency, but I actually believe that raising the level of abstraction in interfaces will prove beneficial for performance by enabling important macro-optimizations. A higher-level interface typically means that far fewer requests are necessary to perform the same task. This makes per-request overheads much less critical. Additionally, at the micro-optimization level, SPDY and relatively efficient service platforms are bringing per-request overheads down to a point where the dichotomy between efficient low-level protocols and inefficient high-level protocols is largely disappearing.
 * "The workload for product teams would be reduced by having well-documented modules provided by internal service teams."
 * Indeed. The important part here is providing well-defined and narrow interfaces. I prefer service-style interfaces, as those can additionally support distribution and external use, and are easier to integrate into a common parallel IO abstraction.
 * Using narrow interfaces between different parts of a single product (say, front-end and storage back-end) is