User:DKinzler (WMF)/REST Entity Path Guidelines

This document describes norms for REST API resource paths.

The path is the component of the URL that follows the API root [TBD: reference API address guidelines]. The path specifies the endpoint [TBD: ref glossary] and the resource [TBD: ref glossary] to interact with.

Terminology

 * path element: ...
 * identifiers
 * must not be empty
 * resources vs entities: generally speaking, when a resource is requested, an entity is returned. A resource has a persistent identity and may change over time. An entity is identified by its content; it cannot change, it can only be replaced with another entity (compare the ETag spec).

Types of Resources
There are three basic types of resources:


 * singletons: a resource is a singleton if only a single instance of this type of resource exists. The path of a singleton consists of just the name of the singleton, as a singular noun. E.g..
 * collections: a collection represents the set of all instances of a given type of resource. The path of a collection is the name of the type, as a plural noun. E.g. . A trailing slash must be ignored or trigger a normalization redirect [TBD ref].
 * elements (of collections): an element is a single instance of a given type of resource. The path of a collection element is the path of the collection followed by the id of the element, e.g.

Resources can be nested by allowing singletons and elements (but not collections) to contain singletons or collections. For instance, a user's contributions could be accessible as a nested collection: ; and a city's coordinates could be accessible as a resource under   [TBD: not a good example!].
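The three resource types and the nesting rule can be sketched in code. The following is only an illustration, not part of the guideline: the class names and the example resource names (`status`, `users`, `alice`, `contributions`) are hypothetical, and identifiers are inserted into the path without any escaping.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Singleton:
    name: str  # singular noun, e.g. "status" (hypothetical)
    parent: Optional["Parent"] = None

    def path(self) -> str:
        return _join(self.parent, self.name)


@dataclass(frozen=True)
class Collection:
    name: str  # plural noun, e.g. "users" (hypothetical)
    parent: Optional["Parent"] = None

    def path(self) -> str:
        return _join(self.parent, self.name)

    def element(self, element_id: str) -> "Element":
        return Element(self, element_id)


@dataclass(frozen=True)
class Element:
    collection: Collection
    element_id: str

    def __post_init__(self) -> None:
        # Identifiers must not be empty.
        if not self.element_id:
            raise ValueError("identifiers must not be empty")

    def path(self) -> str:
        # Path of the collection followed by the id of the element.
        return f"{self.collection.path()}/{self.element_id}"


# Only singletons and elements may contain nested resources.
Parent = Union[Singleton, Element]


def _join(parent: Optional[object], name: str) -> str:
    if parent is None:
        return f"/{name}"
    if isinstance(parent, Collection):
        raise TypeError("a collection cannot directly contain a nested resource")
    return f"{parent.path()}/{name}"
```

With these hypothetical names, `Collection("contributions", parent=Collection("users").element("alice")).path()` yields `/users/alice/contributions`, while nesting a collection directly under another collection raises a `TypeError`.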

NOTE: Deep and complex nesting of resources should be avoided. Nesting must only be used if the sub-resource exists only as part of its parent resource. If the nested resource can also be referenced by itself, it must not be nested. [TBD: this could be clearer].

Naming Path Elements
Path elements that are not identifiers (path parameters) should:


 * be American English nouns.
 * use the plural if they name a collection [NEW!], and the singular if they name a singleton.
 * use "kebab-case" if they are multi-word phrases.
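The kebab-case rule is the only one of these that can be checked mechanically. A minimal sketch of such a check follows; the function name `is_kebab_case` is hypothetical, and the restriction to lowercase letters and digits is an assumption about what "kebab-case" means here, not something this document states.

```python
import re

# Assumption: kebab-case means lowercase ASCII words joined by single hyphens.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_kebab_case(path_element: str) -> bool:
    """Return True if a non-identifier path element is written in kebab-case."""
    return KEBAB_CASE.fullmatch(path_element) is not None
```

Under that assumption, `is_kebab_case("edit-requests")` is true, while `editRequests`, `edit_requests`, and `edit--requests` are rejected. Whether an element is an American English noun in the right grammatical number still needs human review.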

Encoding and Escaping
All parts of the path, along with query parameters and their values, MUST be represented as UTF-8. Path elements that are not parameter values MUST use only alphanumeric ASCII characters (the hyphen-minus and the underscore are also allowed).

The following characters MUST be escaped using percent-encoding when they occur in path parameters, since they hold special meaning per the URL spec:


 * the question mark (Unicode codepoint U+003F)
 * the percent sign (Unicode codepoint U+0025)
 * the hash sign (Unicode codepoint U+0023)

In addition, the following characters MUST also be escaped, because they carry meaning in the context of REST endpoint routing:


 * the slash (Unicode codepoint U+002F)
 * the pipe character (Unicode codepoint U+007C) (MAYBE, for custom verbs?) [TBD]
 * the ampersand character (Unicode codepoint U+0026) (MAYBE, for consistency?) [TBD]

The following characters SHOULD NOT be escaped:


 * the colon (Unicode codepoint U+003A) [TBD]
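The escaping rules above can be approximated with Python's standard `urllib.parse.quote`. This is a sketch under stated assumptions rather than a reference implementation: the helper names `escape_path_param` and `build_element_path` are hypothetical, the pipe and ampersand are treated as must-escape even though that is still marked TBD, and `quote` is stricter than the lists above because it also percent-encodes spaces, non-ASCII characters (as UTF-8 bytes), and most other punctuation.

```python
from urllib.parse import quote


def escape_path_param(value: str) -> str:
    """Percent-encode a path parameter value, leaving the colon unescaped."""
    if value == "":
        raise ValueError("path parameters must not be empty")
    # safe=":" keeps the colon literal; "?", "%", "#", "/", "|" and "&"
    # are all percent-encoded. quote() encodes text as UTF-8 by default.
    return quote(value, safe=":")


def build_element_path(collection: str, element_id: str) -> str:
    """Build the path of a collection element (hypothetical helper)."""
    return f"/{collection}/{escape_path_param(element_id)}"
```

For instance, `build_element_path("pages", "Talk:AC/DC")` gives `/pages/Talk:AC%2FDC`: the slash inside the identifier is escaped so it is not mistaken for a path separator, while the colon passes through. The collection name `pages` is made up for the example.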

NOTE: the set of characters that must be escaped in query parameter values and in segment IDs is slightly different! [TBD]