Auth systems/OAuth/Design

Features

 * To simplify the design, and to work with the many existing OAuth client libraries that currently work with Twitter's OAuth 1.0 implementation, we will start by implementing OAuth 1.0a
 * We can move to one or more OAuth 2 flows next year if there is demand
 * This will allow users to use HTTPS or HTTP when using OAuth, since every call must be signed. To keep things simple, we may support only HMAC signatures to start?
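
As a rough illustration of what signing every call involves, here is a minimal Python sketch of an OAuth 1.0a HMAC-SHA1 signature as described in RFC 5849. The function names are hypothetical, and the sketch assumes the caller passes a URL that is already in base-string form (no query string, lowercase scheme and host) and a flat dictionary with one value per parameter. It is not the extension's actual code.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value: str) -> str:
    """Percent-encode a value; only unreserved characters stay literal."""
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: dict) -> str:
    """Build METHOD&url&params, with each part percent-encoded."""
    # Names and values are encoded first, then sorted by name and value.
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), pct(url), pct(normalized)])


def sign_hmac_sha1(method: str, url: str, params: dict,
                   consumer_secret: str, token_secret: str = "") -> str:
    """Return the base64-encoded HMAC-SHA1 signature for a request."""
    # The key is both secrets joined by "&"; the token secret is empty
    # while the consumer is still asking for a request token.
    key = f"{pct(consumer_secret)}&{pct(token_secret)}".encode("ascii")
    base = signature_base_string(method, url, params).encode("ascii")
    digest = hmac.new(key, base, hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_hmac_sha1(signature: str, method: str, url: str, params: dict,
                     consumer_secret: str, token_secret: str = "") -> bool:
    """Recompute the signature on the server and compare in constant time."""
    expected = sign_hmac_sha1(method, url, params, consumer_secret, token_secret)
    return hmac.compare_digest(expected, signature)
```

Because the server recomputes the signature from the request and its own copy of the secrets, a request altered in transit fails verification even over plain HTTP. The signature does not hide the request contents, though.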


 * There will be an Application Registration page, for registering applications
 * If this process is usable for power users, it may enable Desktop apps, as each user would generate their own appid


 * There will be an approval page, where a logged-in user will grant permissions to an application
 * Users should be warned of privacy implications
 * Should users reenter their password?
 * Allow applications to update their privileges? This is not part of the specification, but should be a simple addition.
 * What's the difference between an application that already has X updating to add Y, and the application just making a new request for X+Y?
 * The application would need to start using new client credentials (we don't want to track what users have accepted or not). So when the user is redirected to the WMF to authorize the new client, the difference is in the UI: it either says that an app wants to authorize with certain permissions, or that an existing application would like to add / remove these permissions.


 * There will be a page where Users can see and manage their approved applications
 * See grants
 * revoke if desired


 * The MW API accepts signed requests
 * $wgUser is set up for the user
 * Hook ApiCheckCanExecute to authorize?
 * What is the granularity of OAuth grants?
 * By module: For most action modules, it makes sense for OAuth to grant access to the module. But for action=query with its many submodules, not so much: many are basically public information where asking permission for each module would be too much, but then some like list=deletedrevs, list=checkuserlog, and list=checkuser do need special permission. And some modules have some restricted-use features, e.g. action=edit for user css/js or protected pages or MediaWiki-namespace pages, and prop=revisions might someday get a flag to request revdeled content.
 * By user right: Again, in some places these are too granular, and in others not enough. For example, editing a page requires 'edit' and 'writeapi' (and also 'read' unless you're blindly overwriting pages), and likely 'minoredit' and 'skipcaptcha' would also be wanted, and maybe also 'createpage', 'createtalk', 'autoreview', 'autopatrol', 'autoconfirmed', and 'bot'. And at the same time you can't authorize normal edits but not edits to your user CSS/JS.
 * By adding some sort of "API rights": This solves the granularity problem, but it's a major addition. The idea here is that in addition to whatever user rights are required, each module would also require some set of "API rights" (which might depend on the query string parameters); normal password-based login would supply all "API rights" and so would work as it does now, while an OAuth session would only grant the authorized subset.
 * Group modules and rights into permission categories: E.g., "Create and Edit Pages" gives access to the modules edit and query, and the rights read, edit, minoredit, and create.
 * Should we further restrict these methods by Namespace, or Content Type? Or by default, require an extra permission to work in the MediaWiki namespace?
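
To make the "permission categories" option more concrete, here is a minimal Python sketch. The category names, the `Grant` type, and `is_authorized` are all hypothetical; the "editpages" entry mirrors the "Create and Edit Pages" example above, and "basic" is made up for contrast. An OAuth session is allowed to call a module only if the categories the user approved cover both the module and every user right the call needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """A permission category: the API modules and user rights it unlocks."""
    modules: frozenset
    rights: frozenset


# Hypothetical categories for illustration only.
GRANTS = {
    "basic": Grant(frozenset({"query"}), frozenset({"read"})),
    "editpages": Grant(
        frozenset({"edit", "query"}),
        frozenset({"read", "edit", "minoredit", "create"}),
    ),
}


def is_authorized(granted, module, required_rights):
    """Check a request against the categories the user approved."""
    modules, rights = set(), set()
    for name in granted:
        modules |= GRANTS[name].modules
        rights |= GRANTS[name].rights
    # Both conditions must hold: the module is covered, and no required
    # right falls outside what the user approved.
    return module in modules and set(required_rights) <= rights
```

A normal password-based session would skip this check entirely, so existing behaviour stays unchanged. Restrictions by namespace or content type would need an extra argument here and are left out of the sketch.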

Logging

 * Many people suggested logging the appid, userid, and revisionid (or another identifying id for the action), to identify apps that are misbehaving.
 * Logs would only be written on edits, or API calls that make a change to the system (specifically, we will exclude any calls for searching or reading)
 * These logs would be viewable by a privileged user, and/or the user
 * Collection/Storage: It seems like we could use either the EventLogging system, or our own logging table, to collect and store the data
 * EL is asynchronous, which is nice. However, this would mean other MW users who wanted to use OAuth would either need to have EventLogging set up, or not have a log of the activity.
 * For the anticipated load, keeping the logs on extension1 would probably be fine
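
The "own logging table" option could look roughly like the following Python sketch, which uses an in-memory SQLite database as a stand-in. The table name, column names, and the set of read-only actions are assumptions made for illustration. Only calls that change something are recorded, along with the appid, userid, and an id for the affected object.

```python
import sqlite3
import time

# Hypothetical list of actions treated as read-only and never logged.
READ_ONLY_ACTIONS = {"query", "opensearch", "parse"}


def open_log(path=":memory:"):
    """Open (and create if needed) the hypothetical oauth_log table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS oauth_log ("
        "ts INTEGER, app_id TEXT, user_id INTEGER, "
        "action TEXT, target_id INTEGER)"
    )
    return db


def log_api_call(db, app_id, user_id, action, target_id):
    """Record a call only if it changes the system; return True if logged."""
    if action in READ_ONLY_ACTIONS:
        return False
    db.execute(
        "INSERT INTO oauth_log VALUES (?, ?, ?, ?, ?)",
        (int(time.time()), app_id, user_id, action, target_id),
    )
    db.commit()
    return True


def entries_for_app(db, app_id):
    """List what one application did, oldest first."""
    rows = db.execute(
        "SELECT user_id, action, target_id FROM oauth_log "
        "WHERE app_id = ? ORDER BY rowid",
        (app_id,),
    )
    return rows.fetchall()
```

A synchronous insert like this keeps the extension usable without EventLogging, at the cost of one extra write per logged action.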

Implementation Notes

 * While discussing implementation strategies (irc check-in 20130404)
 * OAuth should likely be implemented as an extension
 * The extension should be usable by MediaWiki installs that do not include CentralAuth
 * For the WMF deployment, some CentralAuth aspects are necessary. We will likely need hooks to allow CentralAuth to update pieces, or we may extend some classes to implement CentralAuth-specific functionality
 * The storage for OAuth data (Client registration, user permissions, logging) will be in a single database. User permissions will include a "wiki" parameter to allow giving a client permissions on only some wikis, or "*" for authorizing the client on all wikis.
 * We'll probably use some pieces from existing/reference libraries to handle the OAuth encoding and signatures (one library had a good implementation of both RSA and HMAC signatures in ~50 loc). For actual interactions, we'll probably write the integrations from scratch. We are not trying to develop our own library.
 * NB: we'll choose a couple of libraries soon to test as clients, to ensure our system integrates with them.
 * It seems like a good idea to require https for the WMF endpoints, to address
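
The "wiki" parameter on stored user permissions, described in the notes above, can be sketched as follows. This is a minimal Python illustration under assumed names: `UserGrant`, its fields, and the wiki ids are hypothetical. A stored grant applies either to one named wiki or, with "*", to every wiki sharing the central database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UserGrant:
    """One row of stored user permissions (field names are hypothetical)."""
    consumer_key: str
    user_id: int
    wiki: str  # a wiki id such as "enwiki", or "*" for all wikis
    categories: tuple


def covers_wiki(grant: UserGrant, current_wiki: str) -> bool:
    """A grant applies if it names this wiki or uses the "*" wildcard."""
    return grant.wiki == "*" or grant.wiki == current_wiki


def find_grant(grants, consumer_key, user_id, current_wiki) -> Optional[UserGrant]:
    """Return the first stored grant matching client, user, and wiki."""
    for grant in grants:
        if (grant.consumer_key == consumer_key
                and grant.user_id == user_id
                and covers_wiki(grant, current_wiki)):
            return grant
    return None
```

Whether a single access token can be reused across wikis is a separate question. This sketch only shows how the stored permission would be matched once the client and user are known.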

Open Questions

 * Does a Consumer need to get an AccessToken for each wiki, or will a single Token for each user authorization suffice?
 * Should/Can we enforce using Authorization headers only, as recommended in the standard? Adding the information to the body or query parameters is supported, but the header is preferred.
 * Granularity of OAuth grants? (see above)
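
To show what enforcing "Authorization headers only" might involve, here is a minimal Python sketch that parses an OAuth 1.0a `Authorization` header and rejects requests carrying `oauth_*` parameters in the query string. The function names and the choice to raise `ValueError` are assumptions for illustration, and the regular expression is a simplification of the header grammar in RFC 5849.

```python
import re
from urllib.parse import unquote

# Matches name="value" pairs; values are percent-encoded inside the quotes.
_PARAM = re.compile(r'([A-Za-z_]+)="([^"]*)"')


def parse_oauth_header(header: str):
    """Return the OAuth parameters from a header value, or None if absent."""
    scheme, _, rest = header.partition(" ")
    if scheme.lower() != "oauth":
        return None
    params = {unquote(k): unquote(v) for k, v in _PARAM.findall(rest)}
    # "realm" is not a protocol parameter and is not part of the signature.
    params.pop("realm", None)
    return params


def extract_oauth_params(headers: dict, query_params: dict):
    """Header-only policy: refuse OAuth parameters sent in the query string."""
    if any(name.startswith("oauth_") for name in query_params):
        raise ValueError("OAuth parameters must be sent in the Authorization header")
    return parse_oauth_header(headers.get("Authorization", ""))
```

A header-only policy keeps tokens and signatures out of URLs, where they could otherwise end up in logs. The trade-off is that clients unable to set request headers could not use the service.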

Basic Authorization Flow

 * Diagram of the basic authn flow