Requests for comment/Service-oriented architecture authentication

Problem statement
With many more entry points and the need for inter-service authentication, a service-oriented architecture requires a stronger authentication system.

Goals

 * Single sign-on support
 * Support a relatively timely revocation of rights (minutes)
 * Minimize the risk & impact of exploits:
 * Principle of least privilege: for example, most services should not have direct access to sensitive user information (password hashes etc)
 * Minimize the risk of confused deputy attacks and attack surface by primarily trusting unforgeable user capabilities rather than services, and always checking capabilities at the lowest possible layer (example: storage service)
 * Be efficient for high request volumes (APIs)
 * No synchronous checking with other services required for common requests
 * Reads in particular: all reads on private wikis, and all private user-specific information (notifications, watchlist, deleted revisions etc) on public ones.
 * Small token size to minimize network overhead in particular for non-SPDY clients
 * Follow best security practices, use established standards & existing implementations
 * Support derivative and asynchronous requests like link updates
 * But don't allow such requests to be made directly

JWT Bearer tokens / OpenID Connect

 * All authenticated traffic uses TLS
 * The authentication service is the only service that has access to sensitive user information
 * Client authentication
 * Normal browser auth / our domains: Bearer token is set in HTTP-only cookie (instead of / in addition to session id)
 * Cross-domain: Client follows the normal OpenID Connect token request flow with the auth service
 * retrieves time-limited signed Bearer token
 * token encodes user id / name (in signed JWT); no group memberships beyond the implicit 'user', as those would not scale for single sign-on
 * Client sends token with each request (SPDY can make sure it only goes over the wire once)
 * Most backend services have no special rights; they merely forward the user-provided token to other services
 * Ultimate checking happens at the lowest possible layer to avoid multiple entry point issues. Example: storage service
 * Common requests like read only require a signature and timestamp check
 * Less common requests (edits for example) require calls back into auth service
 * Authentication service functionality:
 * Swap a JWT for a full list of group memberships & blocking status for the user & wiki
 * Provide & verify CSRF tokens for state-changing operations
 * Ideally sign responses or check TLS certs to ensure auth service requests can't be faked
 * Authenticate derivative requests (background jobs) when enqueuing them, based on per-queue ACLs. Once authenticated & enqueued, an additional signed assertion added by the queue removes the auth timeout, so that jobs will be able to complete in the background. Ideally such jobs are idempotent and depend on a primary authenticated request having gone through. This would make them relatively harmless & avoid the need for additional protection against making those requests directly beyond not exposing them publicly.
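As a rough illustration of the "signature and timestamp check" that common requests rely on, the following minimal sketch signs and verifies a compact JWT using only the Python standard library. It makes several assumptions that are not part of this proposal: it uses HS256 (a shared HMAC secret) as a stand-in for the public-key signatures described here, and `DEMO_KEY`, `sign_token`, `verify_token` and `needs_auth_service` are hypothetical names. A real service would use an established JWT library and the auth service's public key.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret, for illustration only. The proposal verifies
# with the auth service's public key rather than an HMAC secret.
DEMO_KEY = b"demo-secret-not-for-production"


def b64url_encode(data: bytes) -> str:
    """Base64url without padding, as used in compact JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, key: bytes) -> str:
    """Build a compact JWT (header.payload.signature) signed with HS256."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(key, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"


def verify_token(token: str, key: bytes, now: float) -> dict:
    """Check signature and expiry locally, without calling the auth service."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    signing_input = f"{header}.{payload}".encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    # Constant-time comparison of the expected and presented signatures.
    if not hmac.compare_digest(expected, b64url_decode(signature)):
        raise ValueError("bad signature")
    if json.loads(b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(b64url_decode(payload))
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        raise ValueError("token expired")
    return claims


def needs_auth_service(operation: str) -> bool:
    """Reads pass on the local check alone; other operations call back."""
    return operation != "read"
```

A storage service could call `verify_token(token, key, time.time())` on every request and consult the auth service only when `needs_auth_service` returns true, for example for edits.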

Ecosystem & implementation

 * JWTs & OpenID Connect / OAuth2 are used fairly widely: PayPal, Microsoft, Salesforce, Google, Deutsche Telekom, mobile carriers (GSMA) etc
 * Libraries are readily available

Implementation phases
A rough summary of what we (Chris & Gabriel) came up with.

We will start by phasing in the "Auth Service" as a backend for CentralAuth. We can migrate pieces of CentralAuth to either use the existing code and db backend, or use the service, based on a configuration flag. This lets us move the most sensitive pieces to the new service while allowing us to revert quickly if we encounter unexpected problems.
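The flag-based migration could look roughly like the sketch below. The flag names, the handler signature and `make_dispatcher` are all hypothetical; the only point is that each piece of functionality is routed to either the existing backend or the service by configuration, so a revert is a configuration change rather than a code change.

```python
from typing import Callable, Dict

# A handler takes a user name and returns some result; purely illustrative.
Handler = Callable[[str], dict]


def make_dispatcher(
    config: Dict[str, bool],
    legacy: Dict[str, Handler],
    service: Dict[str, Handler],
) -> Callable[[str, str], dict]:
    """Route each named function to the service only if its flag is set."""

    def dispatch(function_name: str, user: str) -> dict:
        # Functions without a flag default to the existing code and db backend.
        backend = service if config.get(function_name, False) else legacy
        return backend[function_name](user)

    return dispatch
```

Flipping a single entry in `config` back to false would return that function to the existing backend without touching the others.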

Initially (phases 1-3), we will focus on identifying the user in other services, then (phases 4 & 5) on moving the authentication (password and token verification) and session management into the service. We will use the MediaWiki plugins and hooks that CentralAuth already implements. This will hopefully ensure that changes to the authentication system in MediaWiki (i.e., Requests_for_comment/AuthStack) integrate with our work, and that other MediaWiki auth backends could potentially use this Auth Service too.

Longer term, we would like the Auth Service to also handle authorization for MediaWiki, but since this will require significant changes to core, we will defer specifying that for now and may make it a separate RFC.

Phase 1: Signed JWTs in cookie, testing only

 * add capability to add a cookie with a signed JWT as part of the CentralAuth session
 * Use a static key initially, with a key size large enough to make this secure.
 * Validate this in services on request (check signature using public key & timestamp using JWT library)
 * this means that read requests for content accessible to the 'user' group could now be authenticated & authorized without sessions or db
 * if timed out, call authentication service API end point (likely part of core initially) with old token in order to get a fresh token & refresh the cookie with set-cookie header
 * don't rely on any of this in production yet
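The refresh-on-timeout step in this phase could be sketched as follows. This is only an illustration under stated assumptions: `verify` stands for the local signature and timestamp check, `refresh` stands for the call to the authentication service end point, and `TokenExpired`, `AuthResult`, `authenticate` and the cookie name `authToken` are hypothetical names. The cookie attributes follow the HTTP-only and TLS-only requirements stated earlier.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class TokenExpired(Exception):
    """Raised by the verifier when a correctly signed token has timed out."""


@dataclass
class AuthResult:
    claims: dict
    # Set-Cookie header value, present only when the token was refreshed.
    set_cookie: Optional[str]


def authenticate(
    token: str,
    verify: Callable[[str], dict],
    refresh: Callable[[str], str],
    cookie_name: str = "authToken",  # hypothetical cookie name
) -> AuthResult:
    """Verify locally; on timeout, swap the old token for a fresh one."""
    try:
        return AuthResult(claims=verify(token), set_cookie=None)
    except TokenExpired:
        # Only this path contacts the authentication service.
        fresh = refresh(token)
        header = f"{cookie_name}={fresh}; Secure; HttpOnly; Path=/"
        # The fresh token is verified like any other before it is trusted.
        return AuthResult(claims=verify(fresh), set_cookie=header)
```

Any other verification failure, such as a bad signature, is deliberately not caught here, so a forged token never triggers a refresh.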

Phase 2: Key distribution from auth service

 * Distribute set of rolling public keys to services including MW core
 * in advance, by time range
 * signed by authentication service using long-lived service key
 * Widen testing, but not yet production

Phase 3: Support advanced operations in auth service

 * auth service provides end points and extension mechanisms for more complex authorization needs
 * End point to swap a token for the list of rights per wiki
 * ideally using existing libraries -- e.g. passport
 * Start use in production

Phase 4: Passwords and Token checking in service

 * Service will support setting and checking passwords, and CentralAuth will have the option of delegating those functions to the service.
 * Probably all of the global_user table data will be moved, so Locking/hiding, etc will be handled too?
 * Service will support generating and checking authentication tokens.
 * Possibly handling all session verification, and managing all session information?
 * Once this is reliably handling all authentication checks, we can remove passwords and tokens from the centralauth database
 * The Auth Service is the only code with access to CentralAuth's security-critical data

Phase 5: Groups and wiki associations moved to service

 * Global groups (and wikisets), and account attachments (if still relevant after SUL finalization?) are moved into the service.

Phase 6: Use auth service exclusively

 * auth service becomes the only service with access to sensitive user data
 * Ability to execute SQL in arbitrary code doesn't imply access to the user table any more
 * used by both MW core & services
 * remove session cookie

Status quo
The MediaWiki PHP codebase performs all authentication and authorization checks internally.

It provides:
 * Account creation and autocreation
 * Authentication (password, or plugin based)
 * Authorization / session management
 * Sets up the appropriate User object (e.g., $wgUser) for each request, before MediaWiki processes the request
 * MediaWiki uses the User object for most code that determines authorization
 * The User object tracks if the user has been blocked, and can validate anti-CSRF tokens for the user's session
 * Logout (destroy the session)

This RFC is about handling these tasks in a service with a well-defined interface.