Talk:Requests for comment/Content API

Documentation and specification format
Swagger has some useful code generation tools (e.g. swagger-codegen and swagger-js-codegen) that make it much easier to follow design by contract (in which the API is specified independently of the code that implements it). This helps with all sorts of things, including test automation (e.g. ensuring that implementations correctly follow the specification), modularization (e.g. allowing disparate teams to rely on the specification rather than on each other's implementations), and documentation (e.g. via tools like swagger-ui). James Earl Douglas (talk) 03:45, 13 October 2014 (UTC)


 * James, thanks for plugging Swagger. We decided to go with Swagger 2 for many of these reasons. The implementation is tracked in this bug. -- Gabriel Wicke (GWicke) (talk) 01:55, 14 October 2014 (UTC)

Background
I've been thinking about how to write an up-front specification for the API, and I have some questions about the design and layout of the URIs for bucket handlers.

By "up-front specification", I mean one that is designed ahead of and independently of a backend implementation, and is considered a contract to which a backend implementation must adhere to be considered correct.

Overview
The current URI design makes the specification ambiguous: different bucket handlers can implement similar paths (prefixed with /v1/{domain}/{bucket}/), and a request can only be resolved to an implementation at runtime, by inspecting the type of the bucket named in the {bucket} path parameter.

Consider, for example, the current kv and pagecontent bucket handlers in lib/filters/bucket.

The kv bucket handler implements the following paths:


 * /v1/{domain}/{bucket}
 * /v1/{domain}/{bucket}/
 * /v1/{domain}/{bucket}/{key}
 * /v1/{domain}/{bucket}/{key}/
 * /v1/{domain}/{bucket}/{key}/{revision}

The pagecontent bucket handler implements the following paths:


 * /v1/{domain}/{bucket}
 * /v1/{domain}/{bucket}/
 * /v1/{domain}/{bucket}/{key}
 * /v1/{domain}/{bucket}/{key}/
 * /v1/{domain}/{bucket}/{key}/redlinks
 * /v1/{domain}/{bucket}/{key}/{format}
 * /v1/{domain}/{bucket}/{key}/rev/
 * /v1/{domain}/{bucket}/{key}/{format}/
 * /v1/{domain}/{bucket}/{key}/rev/{revision}
 * /v1/{domain}/{bucket}/{key}/{format}/{revision}

These overlap in a few places, including:


 * /v1/{domain}/{bucket}/{key}
 * /v1/{domain}/{bucket}/{key}/
 * /v1/{domain}/{bucket}/{key}/{revision} vs. /v1/{domain}/{bucket}/{key}/{format}

A disadvantage of this URL overloading is that it prevents a specification (i.e. a Swagger document) from being written prior to runtime. This is because there isn't a way to describe the specific requirements of a request or the expectations of a response for a given overloaded path, since these details can vary from bucket handler to bucket handler. For example, we can't say for certain what query parameters or data model should appear in a request to /v1/{domain}/{bucket}/{key}/foo, because we can't know whether foo will refer to {revision} from kv or {format} from pagecontent.
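To make the ambiguity concrete, here is a minimal Python sketch (not the actual routing code) that matches one request path against both sets of path templates listed above. The helper names template_to_regex and match_all, and the example domain, bucket and key, are made up for illustration.

```python
import re

# Path templates copied from the two lists above.
KV_PATHS = [
    "/v1/{domain}/{bucket}",
    "/v1/{domain}/{bucket}/",
    "/v1/{domain}/{bucket}/{key}",
    "/v1/{domain}/{bucket}/{key}/",
    "/v1/{domain}/{bucket}/{key}/{revision}",
]
PAGECONTENT_PATHS = [
    "/v1/{domain}/{bucket}",
    "/v1/{domain}/{bucket}/",
    "/v1/{domain}/{bucket}/{key}",
    "/v1/{domain}/{bucket}/{key}/",
    "/v1/{domain}/{bucket}/{key}/redlinks",
    "/v1/{domain}/{bucket}/{key}/{format}",
    "/v1/{domain}/{bucket}/{key}/rev/",
    "/v1/{domain}/{bucket}/{key}/{format}/",
    "/v1/{domain}/{bucket}/{key}/rev/{revision}",
    "/v1/{domain}/{bucket}/{key}/{format}/{revision}",
]


def template_to_regex(template):
    """Compile a template so that each {name} matches exactly one path segment."""
    # re.split with a capturing group alternates: literal, name, literal, ...
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)
        else:
            pattern += "(?P<%s>[^/]+)" % part
    return re.compile("^" + pattern + "$")


def match_all(templates, path):
    """Return (template, parameters) for every template that matches the path."""
    results = []
    for template in templates:
        m = template_to_regex(template).match(path)
        if m:
            results.append((template, m.groupdict()))
    return results


# Hypothetical request: the domain, bucket and key are invented for this example.
path = "/v1/en.wikipedia.org/pages/Foo/html"
kv_matches = match_all(KV_PATHS, path)
pagecontent_matches = match_all(PAGECONTENT_PATHS, path)
```

Both handlers accept the path, but kv binds the last segment to {revision} while pagecontent binds it to {format}. Nothing in the URL itself says which reading is correct; only the bucket's type, known at runtime, does.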

An advantage of fixing the URI patterns with respect to bucket types is that an API specification could be written up front. Committing to an intentional, stable specification would be a boon to long-term stability.

Questions
What are the advantages of keeping the bucket type hidden from the routing like this?

What are the disadvantages of moving the bucket type into the URI (e.g. /v1/{domain}/{bucket-type}/{bucket-name}/...)?
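As a rough sketch of what that alternative might look like, the Python below puts the bucket type into each path as a literal segment and checks that a request then resolves to exactly one handler. It only uses the two overlapping templates from the overview; the names qualify and resolve, the {bucket} parameter standing in for {bucket-name}, and the example URLs are assumptions for illustration, not a proposal for the actual routing code.

```python
import re

# The two templates that collide in the current design, keyed by bucket type.
HANDLER_PATHS = {
    "kv": ["/v1/{domain}/{bucket}/{key}/{revision}"],
    "pagecontent": ["/v1/{domain}/{bucket}/{key}/{format}"],
}


def qualify(template, bucket_type):
    """Insert the bucket type as a literal segment before the {bucket} parameter."""
    return template.replace("/{bucket}", "/" + bucket_type + "/{bucket}", 1)


def template_to_regex(template):
    """Compile a template so that each {name} matches exactly one path segment."""
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)
        else:
            pattern += "(?P<%s>[^/]+)" % part
    return re.compile("^" + pattern + "$")


# Static routing table: it can be written down (e.g. in a Swagger document)
# without knowing which buckets exist at runtime.
ROUTES = []
for bucket_type, templates in HANDLER_PATHS.items():
    for template in templates:
        ROUTES.append((bucket_type, qualify(template, bucket_type)))


def resolve(path):
    """Return (bucket_type, template, parameters) for every matching route."""
    results = []
    for bucket_type, template in ROUTES:
        m = template_to_regex(template).match(path)
        if m:
            results.append((bucket_type, template, m.groupdict()))
    return results


# Hypothetical requests: domain, bucket names and key are invented.
kv_hit = resolve("/v1/en.wikipedia.org/kv/mybucket/Foo/html")
page_hit = resolve("/v1/en.wikipedia.org/pagecontent/pages/Foo/html")
```

With the type in the path, each request matches a single template, so the meaning of the last segment ({revision} or {format}) is fixed by the URL alone.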

What are the thoughts on maintaining an up-front API specification that depends on neither the implementation nor the runtime configuration of the API?

Jdouglas (WMF) (talk) 15:56, 24 November 2014 (UTC)