Requests for comment/Service-oriented architecture authentication

Problem statement
With many more entry points and the need for inter-service authentication, a service-oriented architecture requires a stronger authentication system.

Goals

 * Single sign-on support
 * Support relatively timely revocation of rights (within minutes)
 * Minimize the risk & impact of exploits:
 * Principle of least privilege: for example, most services should not have direct access to sensitive user information (password hashes, etc.)
 * Minimize the attack surface and the risk of confused deputy attacks by primarily trusting unforgeable user capabilities rather than services, and by always checking capabilities at the lowest possible layer (for example, the storage service)
 * Be efficient for high request volumes (APIs)
 * No synchronous checking with other services required for common requests (reads in particular)
 * Small token size to minimize network overhead, in particular for non-SPDY clients
 * Follow best security practices, use established standards & existing implementations
 * Support derivative and asynchronous requests like link updates
 * But don't allow clients to issue such requests directly

JWT Bearer tokens / OpenID Connect

 * All authenticated traffic uses TLS
 * The authentication service is the only service that has access to sensitive user information
 * Client authentication
 * Normal browser auth / our domains: Bearer token is set in HTTP-only cookie (instead of / in addition to session id)
 * Cross-domain: Client follows the normal OpenID Connect token request flow with the auth service
 * retrieves time-limited signed Bearer token
 * token encodes user id / name (in signed JWT); no group memberships beyond the implicit 'user', as those would not scale for single sign-on
 * Client sends token with each request (SPDY can make sure it only goes over the wire once)
 * Most backend services have no special rights; they merely forward the user-provided token to other services
 * Ultimate checking happens at the lowest possible layer to avoid multiple entry point issues. Example: storage service
 * Common requests like read only require a signature and timestamp check
 * Less common requests (edits, for example) require a call back into the auth service
 * Authentication service functionality:
 * Swap a JWT for a full list of group memberships & blocking status for the user & wiki
 * Provide & verify CSRF tokens for state-changing operations
 * Ideally sign responses or check TLS certs to ensure auth service requests can't be faked
 * Authenticate derivative requests (background jobs) when enqueuing them, based on per-queue ACLs. Once a job is authenticated & enqueued, the queue adds a signed assertion that removes the auth timeout, so that the job can complete in the background. Ideally such jobs are idempotent and depend on a primary authenticated request having gone through. This would make them relatively harmless & avoid the need for protection against direct requests beyond not exposing them publicly.
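
The job-enqueuing step can be sketched as follows. This is a minimal illustration, not a proposed implementation: the queue names, the ACL table, the `QUEUE_KEY` secret, and the `enqueue` / `run_job` functions are all hypothetical, and an HMAC stands in for whatever signature scheme the queue would really use. The caller is assumed to have validated the user's token already.

```python
import hashlib
import hmac
import json

# Hypothetical queue signing key and per-queue ACLs (queue name -> allowed groups).
QUEUE_KEY = b"hypothetical-queue-signing-key"
QUEUE_ACL = {"link-updates": {"user"}, "mass-delete": {"sysop"}}


def _assertion(payload: str) -> str:
    """HMAC over the serialized job; a stand-in for the queue's signature."""
    return hmac.new(QUEUE_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def enqueue(queue_name: str, job: dict, user_groups: set) -> dict:
    """Check the per-queue ACL, then attach the queue's own assertion."""
    if not QUEUE_ACL.get(queue_name, set()) & set(user_groups):
        raise PermissionError("user may not enqueue on " + queue_name)
    payload = json.dumps({"queue": queue_name, "job": job}, sort_keys=True)
    return {"payload": payload, "assertion": _assertion(payload)}


def run_job(envelope: dict) -> dict:
    """Worker side: trust the queue's assertion, which carries no expiry."""
    if not hmac.compare_digest(_assertion(envelope["payload"]), envelope["assertion"]):
        raise PermissionError("job was not enqueued through the queue")
    return json.loads(envelope["payload"])["job"]
```

Because the assertion has no timestamp, a job enqueued while the user's token was valid can still run after that token has timed out, which is the behaviour described in the last bullet above.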

Ecosystem & implementation

 * JWTs & OpenID Connect / OAuth2 are used fairly widely: PayPal, Microsoft, Salesforce, Google, Deutsche Telekom, mobile carriers (GSMA), etc.
 * Libraries are readily available

Implementation phases
A rough summary of what we (Chris & Gabriel) came up with.

Phase 1: Signed JWTs in cookie, testing only

 * add capability to set a cookie with a signed JWT on session start in MediaWiki core
 * Use a static key initially, with a key size large enough to make this secure.
 * Validate this in services on request (check signature using public key & timestamp using JWT library)
 * this means that read requests for content accessible to the 'user' group could now be authenticated & authorized without a session or database lookup
 * if timed out, call the authentication service API endpoint (likely part of core initially) with the old token in order to get a fresh token & refresh the cookie with a Set-Cookie header
 * don't use any of this in production yet

Phase 2: Key distribution from auth service

 * Distribute a set of rolling public keys to services, including MW core
 * in advance, by time range
 * signed by the authentication service using a long-lived service key
 * Widen testing, but not yet production
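
One way a service might pick the right key from a set distributed in advance by time range is sketched below. The `KeyWindow` and `KeySchedule` names and fields are hypothetical, the key material is an opaque placeholder, and the sketch assumes non-overlapping ranges. Verifying the key set's own signature against the long-lived service key is left out.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class KeyWindow:
    key_id: str
    valid_from: int   # inclusive, Unix seconds
    valid_until: int  # exclusive, Unix seconds
    public_key: str   # opaque placeholder for the real key material


class KeySchedule:
    """Keys received in advance, looked up by the time a token was issued."""

    def __init__(self, windows: Iterable[KeyWindow]):
        self._windows = sorted(windows, key=lambda w: w.valid_from)
        self._starts = [w.valid_from for w in self._windows]

    def key_for(self, timestamp: int) -> Optional[KeyWindow]:
        # Find the last window that starts at or before the timestamp.
        index = bisect_right(self._starts, timestamp) - 1
        if index < 0:
            return None
        window = self._windows[index]
        return window if timestamp < window.valid_until else None
```

A verifier would look up the key by the token's issue time (or a key id carried in the token header) and reject the token if no window matches.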

Phase 3: Support advanced operations in auth service

 * auth service provides endpoints and extension mechanisms for more complex authentication needs
 * Endpoint to swap a token for the list of rights per wiki
 * Per-page restrictions?
 * ideally using existing libraries -- e.g. passport
 * Start use in production
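
The split between cheap local checks and calls into the auth service could look roughly like this. Everything named here is an assumption for illustration: the in-memory table, the `rights_for` method, and the action-to-group mapping are hypothetical, and the user's token is assumed to have been validated (signature and timestamp) before `authorize` is called.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class UserRights:
    groups: FrozenSet[str]
    blocked: bool


class InMemoryAuthService:
    """Stand-in for the auth service endpoint that swaps a user for rights per wiki."""

    def __init__(self, table: Dict[Tuple[str, str], UserRights]):
        self._table = dict(table)
        self.calls = 0  # counts callbacks, to show that reads need none

    def rights_for(self, user_id: str, wiki: str) -> UserRights:
        self.calls += 1
        # Every authenticated user is at least in the implicit 'user' group.
        default = UserRights(frozenset({"user"}), blocked=False)
        return self._table.get((user_id, wiki), default)


# Hypothetical mapping from action to the group it requires.
REQUIRED_GROUP = {"edit": "user", "delete": "sysop"}


def authorize(action: str, user_id: str, wiki: str, auth: InMemoryAuthService) -> bool:
    if action == "read":
        # Content readable by the 'user' group: the validated token is enough.
        return True
    rights = auth.rights_for(user_id, wiki)
    required = REQUIRED_GROUP.get(action)
    return required is not None and not rights.blocked and required in rights.groups
```

Unknown actions are denied by default, and a blocked user fails every check that goes through the auth service.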

Phase 4: Use auth service exclusively

 * auth service becomes the only service with access to sensitive user data
 * The ability to execute SQL in arbitrary code no longer implies access to the user table
 * used by both MW core & services
 * remove session cookie

Status quo
The MediaWiki PHP codebase performs all authentication and authorization checks internally.

It provides:
 * Account creation and autocreation
 * Authentication (password, or plugin based)
 * Authorization / session management
 * Sets up the appropriate User object (e.g., $wgUser) for each request, before MediaWiki processes the request
 * MediaWiki uses the User object in most code that determines authorization
 * The User object tracks if the user has been blocked, and can validate anti-CSRF tokens for the user's session
 * Logout (destroy the session)

This RFC is about handling these tasks in a service with a well-defined interface.