Talk:Requests for comment/Content API

Documentation and specification format
Swagger has some useful code generation tools (e.g. swagger-codegen and swagger-js-codegen) that make it much easier to follow design by contract (in which the API is specified independently of the code that implements it). This helps with all sorts of things, including test automation (e.g. ensuring that implementations correctly follow the specification), modularization (e.g. allowing disparate teams to rely on the specification rather than on each other's implementations), and documentation (e.g. via tools like swagger-ui). James Earl Douglas (talk) 03:45, 13 October 2014 (UTC)


 * James, thanks for plugging Swagger. We decided to go with Swagger 2 for many of these reasons. The implementation is tracked in this bug. -- Gabriel Wicke (GWicke) (talk) 01:55, 14 October 2014 (UTC)

Background
I've been thinking about how to write an up-front specification for the API, and I have some questions about the design and layout of the URIs for bucket handlers.

By "up-front specification", I mean one that is designed ahead of and independently of a backend implementation, and is considered a contract to which a backend implementation must adhere to be considered correct.

Overview
The current URI design allows for some specification ambiguity, as different bucket handlers can implement similar paths (prefixed with /v1/{domain}/{bucket}/) that can only be resolved to an implementation at runtime by inspecting the type of the bucket named in the {bucket} path parameter.

Consider, for example, the current kv and pagecontent bucket handlers in lib/filters/bucket.

The kv bucket handler implements the following paths:


 * /v1/{domain}/{bucket}
 * /v1/{domain}/{bucket}/
 * /v1/{domain}/{bucket}/{key}
 * /v1/{domain}/{bucket}/{key}/
 * /v1/{domain}/{bucket}/{key}/{revision}

The pagecontent bucket handler implements the following paths:


 * /v1/{domain}/{bucket}
 * /v1/{domain}/{bucket}/
 * /v1/{domain}/{bucket}/{key}
 * /v1/{domain}/{bucket}/{key}/
 * /v1/{domain}/{bucket}/{key}/redlinks
 * /v1/{domain}/{bucket}/{key}/{format}
 * /v1/{domain}/{bucket}/{key}/rev/
 * /v1/{domain}/{bucket}/{key}/{format}/
 * /v1/{domain}/{bucket}/{key}/rev/{revision}
 * /v1/{domain}/{bucket}/{key}/{format}/{revision}

These overlap in a few places, including:


 * /v1/{domain}/{bucket}/{key}
 * /v1/{domain}/{bucket}/{key}/
 * /v1/{domain}/{bucket}/{key}/{revision} vs. /v1/{domain}/{bucket}/{key}/{format}

A disadvantage of this URL overloading is that it prevents a specification (i.e. a Swagger document) from being written prior to runtime. This is because there isn't a way to describe the specific requirements of a request or the expectations of a response for a given overloaded path, since these details can vary from bucket handler to bucket handler. For example, we can't say for certain what query parameters or data model should appear in a request to /v1/{domain}/{bucket}/{key}/foo, because we can't know whether foo will refer to {revision} from kv or {format} from pagecontent.
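To make the ambiguity concrete, here is a minimal sketch. The two templates are taken from the lists above. The `match` helper and the example domain, bucket and key are hypothetical, made up for illustration, and are not how the actual router works.

```python
import re

# Path templates taken from the handler lists above (one per handler).
KV_TEMPLATE = "/v1/{domain}/{bucket}/{key}/{revision}"
PAGECONTENT_TEMPLATE = "/v1/{domain}/{bucket}/{key}/{format}"


def match(template, path):
    """Hypothetical helper: return the path parameters if `path` fits
    `template`, else None. Each {name} matches one non-empty segment."""
    pattern = re.sub(r"\{([^}]+)\}", r"(?P<\1>[^/]+)", template)
    m = re.fullmatch(pattern, path)
    return m.groupdict() if m else None


# Example request path; domain, bucket and key are made-up values.
path = "/v1/en.wikipedia.org/pages/Foo/foo"

kv_params = match(KV_TEMPLATE, path)
pagecontent_params = match(PAGECONTENT_TEMPLATE, path)

# Both templates accept the path, so the path alone cannot tell us
# whether the last segment is a revision or a format.
print(kv_params)
print(pagecontent_params)
```

Both calls return a parameter dictionary, one naming the last segment `revision` and the other naming it `format`, so a specification written against the path alone cannot say which applies.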

An advantage of fixing the URI patterns with respect to bucket types is that an up-front API specification could then be written. Sticking to such an intentional specification would be a boon to long-term stability.
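As a rough sketch of what fixing the patterns could look like, assume the bucket type appears as a literal path segment before the bucket name. This layout, the `match` and `resolve` helpers, and the example path are all assumptions for illustration, not an agreed design.

```python
import re

# Assumed layout: the bucket type is a literal segment before the bucket name.
TEMPLATES = {
    "kv": "/v1/{domain}/kv/{bucket}/{key}/{revision}",
    "pagecontent": "/v1/{domain}/pagecontent/{bucket}/{key}/{format}",
}


def match(template, path):
    """Hypothetical helper: return the path parameters if `path` fits
    `template`, else None. Each {name} matches one non-empty segment."""
    pattern = re.sub(r"\{([^}]+)\}", r"(?P<\1>[^/]+)", template)
    m = re.fullmatch(pattern, path)
    return m.groupdict() if m else None


def resolve(path):
    """Return a (bucket_type, params) pair for every template that matches."""
    hits = []
    for bucket_type, template in TEMPLATES.items():
        params = match(template, path)
        if params is not None:
            hits.append((bucket_type, params))
    return hits


# Example request path; domain, bucket and key are made-up values.
hits = resolve("/v1/en.wikipedia.org/kv/pages/Foo/foo")
print(hits)
```

With the type in the path, only the kv template matches, so the meaning of the last segment is known without inspecting the bucket at runtime, and each template can be described separately in a static specification.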

Questions
What are the advantages of keeping the bucket type hidden from the routing like this?

What are the disadvantages of moving the bucket type into the URI (e.g. /v1/{domain}/{bucket-type}/{bucket-name}/...)?

What are the thoughts on maintaining an up-front API specification that depends on neither the implementation nor the runtime configuration of the API?

Jdouglas (WMF) (talk) 15:56, 24 November 2014 (UTC)


 * What are the advantages of keeping the bucket type hidden from the routing like this?
 * What are the disadvantages of moving the bucket type into the URI (e.g. /v1/{domain}/{bucket-type}/{bucket-name}/...)?
 * Identifying a bucket type is not the only goal in the design of a REST API. Other goals include a logical grouping of functionality, sensible listings, and the ability to extend the API with new behavior. Adding new bucket types provides extensibility as well, but I see little difference between this and adding new bucket instances and documenting those, especially when details like content type / JSON schema are likely to differ for each bucket instance anyway.


 * What are the thoughts on maintaining an up-front API specification that depends on neither the implementation nor the runtime configuration of the API?
 * Planning ahead is great, but I also think that it's not realistic to plan an entire API up-front without the benefit of user feedback. There are many applications and use cases that we can't know about yet. We'll definitely have to iterate on the spec and implementation.
 * That said, it is normally possible to identify core parts of the API that will remain stable for a long time. Those can be clearly marked as stable, and should be thoroughly tested beyond what just a spec could provide. Any backwards-incompatible change in the stable portion of the API would increment the major API version, with backwards compatibility maintained for the old paths via a rewrite framework. Outside of the stable API, we can add entry points dynamically (marked as unstable where suitable) to accommodate new use cases. Since this won't affect the stable API parts, it won't break existing users of the stable API. The entire API including these more dynamic entry points needs to be accurately documented, and should be well discoverable via hypertext. -- Gabriel Wicke (GWicke) (talk) 04:43, 25 November 2014 (UTC)