Talk:Requests for comment/Service-oriented architecture authentication

From mediawiki.org

Similarity to AuthStack

How different is this from AuthStack? This RfC seems like just a generalized version of the latter. Can they just be merged? Parent5446 (talk) 18:14, 9 June 2014 (UTC)

This RFC focuses on authentication in a SOA world, and formulates some architectural goals. One of those goals is a separation of concerns and isolation. Most code should not have access to sensitive user information, so that security issues in random features don't lead to an exposure of sensitive information. Another goal is to push authentication to the lowest layers (storage service) wherever possible to avoid the risk of a confused deputy & address the issues of different services collaborating to provide specific functionality.
The solution presented in the AuthStack RFC does not seem to address several of these goals. This leads me to believe that the goals of the two RFCs are actually different. -- Gabriel Wicke (GWicke) (talk) 18:32, 9 June 2014 (UTC)
I think "Authentication" is a bad name for this RFC, but can't think of a better one. Maybe "Inter-service user identification, authorization, and session management"?
While AuthStack deals (primarily) with Authentication in MediaWiki, this RFC is about MediaWiki acting as an Identity Provider for other services, and how to efficiently make those assertions. If we make a MediaWiki Authentication Service, then it would need to account for all of the stuff discussed in AuthStack, as well as how MediaWiki core would consume those. I don't think that discussion should happen until we have the inter-service session management pieces working in production. CSteipp (talk) 22:39, 12 June 2014 (UTC)

Tokens

gwicke and I had talked about using JWTs for identification. I did a quick test to see the size, and encoding basic information about the user, issuer, validity timestamps, and the list of user rights that user has generates a JWT that's about 4k. The RS256 signature was about 600 bytes larger than HS256, using a 4096-bit RSA key. CSteipp (talk) 22:40, 12 June 2014 (UTC)

I'd expect the size to be mostly determined by the key size. Do we need 4096 bits, especially with key rotation? We might also be able to gzip + base64 encode the value for the benefit of plain HTTP users, although this should not do much for the signature. -- Gabriel Wicke (GWicke) (talk) 22:41, 12 June 2014 (UTC)
2048 would probably be ok if we actually do key rotation. Key management is hard, but if someone is willing to stay on top of it, we can assume that. So that takes the signature down to about 300 bytes. Also, we would want to use RS512 to get equivalent security to HS256, so using my test user gives:
  • Uncompressed: RS512 JWT = 4054 B, HS256 JWT = 3755 B
  • Compressed: RS512 JWT = 2482 B, HS256 JWT = 2183 B
So ~2.5k overhead on every request. CSteipp (talk) 23:35, 12 June 2014 (UTC)
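For readers who want to reproduce this kind of size measurement, here is a minimal sketch. It builds an HS256-signed JWT using only the Python standard library and prints the raw and compressed lengths. The claim values, the made-up rights list, and the all-zero key are placeholders invented for illustration, not the test user measured above, and zlib stands in for gzip.

```python
import base64
import hashlib
import hmac
import json
import zlib


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWS compact form does."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, key: bytes) -> str:
    """Build a compact JWT signed with HMAC-SHA256 (HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
        + "."
        + b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8"))
    )
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(mac)


# Hypothetical claims; the rights list only mimics a large payload.
claims = {
    "iss": "https://auth.example.org",
    "sub": "12345",
    "iat": 1402612800,
    "exp": 1402613100,
    "groups": ["user", "autoconfirmed"],
    "rights": ["right-%d" % i for i in range(150)],
}

token = sign_hs256(claims, b"\x00" * 32)  # placeholder 256-bit shared key
compressed = b64url(zlib.compress(token.encode("ascii")))

# Sizes depend entirely on the payload, so no expected numbers are given here.
print(len(token), len(compressed))
```

An HS256 signature is always 32 bytes (43 base64url characters), so most of the token size in a test like this comes from the JSON payload rather than the MAC.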
My understanding is that RS256 is recommended (SHA-2, signed with 2048 bit RSA key [1]). Why do you feel that RS512 is necessary for 2048 bit RSA?
My understanding is also that HS256 is just a SHA-2 over the message & a shared secret: "The HMAC SHA-256 MAC is generated per RFC 2104, using SHA-256 as the hash algorithm "H", using the octets of the ASCII [USASCII] representation of the JWS Signing Input as the "text" value, and using the shared key." [2]
Based on the size, I'm guessing that HS256 in your data is RS256?
So assuming we go with the recommended RS256, it looks like we'd end up with 2183 bytes. This compares with 441 bytes worth of cookies in production, although some of those will still be needed.
Without SPDY / HTTP2 this would not be impossible, but also not ideal. By the current stats at http://caniuse.com/spdy this would affect about 30% of all HTTPS traffic, and all of HTTP traffic. SPDY support will further improve soon with Apple just announcing support and IE gaining it fairly recently.
We might still want to wait with using full tokens until SPDY support is more common, and we actually support it as well. Until then we can start using this for API requests. We could also consider storing the tokens in memcached based on the session id, and retrieving those for API requests with a session cookie only. -- Gabriel Wicke (GWicke) (talk) 04:20, 13 June 2014 (UTC)
Since the secret key mixed into the hash is unknown, the attacker essentially has to brute force the key that we use, which happens to be 256 bits when we use HS256 for OAuth right now. To spoof a signature, they "just" need to find a collision in the hash. SHA-256 takes a lot of work to find collisions (I think it's still well over 128 bits of work, which is virtually impossible), but it's less than 256 bits, so a larger hash ensures that the hash is not the weakest part of the signature. The 2048 bit key is approximately equivalent to 112 bits of brute forcing, so that becomes the weaker link. Again, not that any of those attacks are feasible right now, but in 5 years, it's anyone's guess. And no, I did mean HS256 in my test, not RS256.
Hmm, but isn't 2048 bit RSA then the weakest link even with SHA-2?
I'm surprised that the overall size is that large even with just a SHA-2 signature. It sounds like the JSON itself is fairly large. Could you paste the JSON somewhere? I could try to see if I can represent the user data a bit more compactly. -- Gabriel Wicke (GWicke) (talk) 05:24, 13 June 2014 (UTC)
Correct, the JSON is very large. Like I said, the signature is 300-600 bytes of the 4k. The biggest section is the array of user rights. Much smaller, but second largest, is the array of groups the user is a member of. Since groups have different rights per wiki, I think we want both, so that a service can know it will grant certain abilities to Stewards, or to users with the revisionsuppress right.
I was actually thinking about only encoding membership in the 'user' group in the JSON. That's sufficient for the bulk of all requests & actions, and can be represented in a single boolean (if it isn't already implicit in having a token). We could encode more group memberships in a bitmap, but at that point it's IMO fine to call back into the auth service to check whether the user has this rare right or that. As a side effect, this also lets us revoke more sensitive group memberships more quickly than the token validity period.
Regarding variance of rights associated with groups across wikis: In the longer term this can be stored per bucket in the storage service. In the shorter term, the storage service can fetch the right info per group from the auth service. -- Gabriel Wicke (GWicke) (talk) 20:51, 13 June 2014 (UTC)
I think it will give us much more flexibility long-term to have the rights explicitly in whatever we pass to the service. SAML and OpenID both behave this way. Kerberos, in its basic form, doesn't, although Microsoft extended Kerberos to have the central authority add an assertion about the user's group memberships into the protocol. So if history is an indication, I think we're going to want to have an assertion of the user's permissions in the token. And the basic unit of that is a user right.
If we want each service to register which rights it's interested in first, and the token contains 0/1 values for a specific array of user rights, we could do that. That obviously means if the permissions change, you have to wait for current authorizations to time out before the service can check for those authorizations, but it will make the token much smaller. CSteipp (talk) 20:36, 16 June 2014 (UTC)
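The 0/1 encoding mentioned here could look roughly like the sketch below, which packs a registered list of rights into a single integer bitmap. The list contents and the function names are hypothetical; a real deployment would need the issuer and the service to agree on the order of the registered rights.

```python
from typing import Iterable

# Hypothetical ordered list of rights a service registered interest in.
# The bit position of each right is its index in this list.
REGISTERED_RIGHTS = ["read", "edit", "delete", "revisionsuppress"]


def encode_rights(user_rights: Iterable[str]) -> int:
    """Pack the user's rights into an integer, one bit per registered right."""
    held = set(user_rights)
    bits = 0
    for position, right in enumerate(REGISTERED_RIGHTS):
        if right in held:
            bits |= 1 << position
    return bits


def has_right(bits: int, right: str) -> bool:
    """Check a single registered right in the bitmap."""
    position = REGISTERED_RIGHTS.index(right)
    return bool((bits >> position) & 1)


bitmap = encode_rights(["read", "edit", "some-unregistered-right"])
print(bitmap, has_right(bitmap, "edit"), has_right(bitmap, "delete"))
```

Rights that the service did not register are simply dropped, which is what keeps the token small; the trade-off noted above still applies, since a changed permission only shows up once a new token is issued.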
I'd like to cover the vast majority of requests (95+%) with the minimal overhead possible. This means that cookies / tokens should be fairly small, and that there should be no extra per-request network calls in services while processing such common requests.
To me it also seems that calling back into the auth service for rare and sensitive actions would provide us with *more*, not less, flexibility in how quickly we'd like to revoke such sensitive rights. Could you describe a case where the reverse would be true? -- Gabriel Wicke (GWicke) (talk) 20:46, 16 June 2014 (UTC)
As we discussed, at minimum, the service needs to be able to say if a user has a right in the context of a given title, so that services can do the equivalent of $title->userCan(), and checks by extensions implementing the userCan hook are consulted. Due to that, this service would potentially be impacted by Requests_for_comment/AuthStack, so we need to make sure that its implementation takes this use case into account. CSteipp (talk) 21:19, 17 June 2014 (UTC)
Arbitrary access right schemes will always require arbitrary code, which means that they'll involve a callback into the auth service. A goal of this RFC is to still support such requirements in the auth service while speeding up the typical case where users can read all articles in a wiki. -- Gabriel Wicke (GWicke) (talk) 21:46, 17 June 2014 (UTC)
Also, the services should authenticate to the authorization service. I'd prefer mutually authenticated TLS. CSteipp (talk) 21:19, 17 June 2014 (UTC)
Ideally we would not place special trust in ordinary services. There should be no way to retrieve non-public information about a user from the auth service without presenting a valid token provided by the user. This means that services can only act on a user's behalf. Mutual authentication of services can additionally help in a belt-and-suspenders kind of way, but IMHO it should probably not be the primary protection. -- Gabriel Wicke (GWicke) (talk) 21:46, 17 June 2014 (UTC)
Added a goal sub-bullet of small token sizes for non-SPDY clients in the RFC. -- Gabriel Wicke (GWicke) (talk) 21:20, 16 June 2014 (UTC)
Just to document, using ES512 will give us about 256 bits of security (same as our current edit tokens and session ids), and the signature portion is about 176 characters long when B64 encoded (JWT format). CSteipp (WMF) (talk) 18:28, 5 September 2014 (UTC)
That sounds like a very attractive alternative as it's quite a bit smaller. The disadvantage of EC is an order of magnitude more computation during signature verification (from openssl speed):
                           sign      verify     sign/s verify/s
256 bit ecdsa (nistp256)   0.0001s   0.0003s    8256.5 3257.5
521 bit ecdsa (nistp521)   0.0004s   0.0008s    2762.3 1245.8
rsa 2048 bits              0.001206s 0.000036s  829.2  27556.6
0.8ms is not insignificant especially if it's verified several times. For a service that currently does about 5k req/s per core that would cut the throughput to about 1000/sec, while 2048 bit RSA would only drop it to about 4237/s. The other disadvantage is the uncertainty around the NIST-selected ECC curves. -- Gabriel Wicke (GWicke) (talk) 19:09, 5 September 2014 (UTC)
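The throughput figures above follow from adding the per-signature verification time to the existing per-request time. A small sketch of that arithmetic, assuming one verification per request on a single core and taking the verify times from the openssl speed table (the function name is made up for illustration):

```python
def throughput_with_verify(base_rps: float, verify_seconds: float) -> float:
    """Requests per second per core after adding one signature verification."""
    per_request_seconds = 1.0 / base_rps + verify_seconds
    return 1.0 / per_request_seconds


# 5k req/s baseline means 0.2 ms per request before verification.
print(round(throughput_with_verify(5000, 0.0008)))    # 521 bit ECDSA verify: 1000
print(round(throughput_with_verify(5000, 0.000036)))  # 2048 bit RSA verify: 4237
```

If a token is verified several times while handling one request, the verification term is multiplied accordingly and the gap between the two algorithms widens.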
Gwicke, what is your vision for how these tokens would be issued? At login? By calling another service? CSteipp (talk) 21:19, 17 June 2014 (UTC)
As we discussed, there can be two flows: a) a cookie-based flow, and b) an OpenID Connect based non-browser flow. Let's focus on a) for now.
The cookie-based flow is basically identical to that of current session cookies, except that the additional signed information in the cookie lets services authenticate most requests without a need for additional backend / auth service requests. A cookie with a signed and time-limited token is issued on log-in. Timed-out tokens (validity elapsed) can be implicitly refreshed by asking the authentication service for a fresh token, and then sending the result in a set-cookie header. The refresh business can be handled generically by a front-end proxy (restface for example), so that back-end services don't need to deal with it. -- Gabriel Wicke (GWicke) (talk) 21:46, 17 June 2014 (UTC)
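A rough sketch of the refresh step a front-end proxy might perform is shown below. It assumes the token signature has already been verified, that the claims carry an "exp" timestamp, and that a hypothetical refresh callback asks the authentication service for a fresh token and returns None when it declines (for example because the user has logged out). None of these names come from an existing implementation.

```python
from typing import Callable, Optional, Tuple

Claims = dict


def handle_token(
    claims: Claims,
    now: int,
    refresh: Callable[[Claims], Optional[Claims]],
) -> Tuple[Optional[Claims], bool]:
    """Return (claims to use for this request, whether to send Set-Cookie).

    Still-valid tokens pass through without contacting the auth service.
    Expired tokens are exchanged for fresh ones via the refresh callback.
    """
    if now < claims["exp"]:
        return claims, False
    fresh = refresh(claims)  # hypothetical call to the authentication service
    if fresh is None:
        return None, False  # treat the request as unauthenticated
    return fresh, True
```

With this shape, back-end services only ever see claims that are currently valid, and the auth service is contacted at most once per validity period for a given user.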
This makes the validity time period of the token meaningless, unless there's some other authentication happening. If I steal a cookie from a user, I can keep using it forever as long as I periodically send it to the authz service and get a new one. At the very least, it needs to be tied to the user's session so a logout invalidates it. It would be better for the user's browser to do the refresh.
How would you propose to do a browser-initiated refresh? -- Gabriel Wicke (GWicke) (talk) 18:45, 18 June 2014 (UTC)
For your case b), I'm definitely not convinced. Can you come up with a use case for when a non-browser would use this, but wouldn't use OAuth? CSteipp (talk) 00:45, 18 June 2014 (UTC)
Case b) is actually OAuth2. But as I said, let's shelve that for now. -- Gabriel Wicke (GWicke) (talk) 18:45, 18 June 2014 (UTC)

Alternate proposal

I think we're narrowing in on a proposal that will work. I wanted to list what I think we're in pretty close agreement on, and we can work through further details.

  1. We use a JWT assertion of the user's basic identity details. We'll set it in a cookie when the user logs in, and periodically as core sees it hasn't been issued in a while and it's convenient to restore.
    1. To keep the json small, we'll store:
      1. the user's id (which shows they have an account)
      2. if they have read access on the wiki
      3. if they are blocked
      4. current username?
      5. The time the token was issued
    2. Each service can decide how old a token can be to still trust it. Maybe recommend 5 minutes for non-security-critical services?
    3. At any point, a service can exchange the user's session id (or login token?) for a current basic assertion JWT. The reference implementation will be a separate MediaWiki endpoint in PHP, but we'll agree on an SLA that it will serve 90% of requests in under XXms (we can work out an exact, reasonable number). That may require writing it in a faster language, or HHVM may be fast enough.
      1. If a service does this, they should return the new JWT in the user's secure cookie.
  2. We'll provide two other methods / services to allow you to:
    1. Exchange the basic JWT, or the user's session ID, for a full assertion of all rights on the wiki.
    2. Give an authorization determination for a (user + title + action) triple. This will interact pretty deeply with the AuthStack RFC I think.
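Point 1.2 of the list above, where each service decides how old a token may be, can be sketched as follows. The claim names in basic_claims are hypothetical stand-ins for the five fields listed, and the check assumes the JWT signature has already been verified.

```python
import time
from typing import Optional

# Hypothetical payload carrying the five fields proposed in the list above.
basic_claims = {
    "sub": 12345,            # user id
    "read": True,            # read access on the wiki
    "blocked": False,        # block status
    "name": "ExampleUser",   # current username
    "iat": 1409767200,       # time the token was issued (Unix seconds)
}


def is_fresh(claims: dict, max_age_seconds: int, now: Optional[float] = None) -> bool:
    """Accept a token only if it was issued within the service's own limit."""
    current = time.time() if now is None else now
    age = current - claims["iat"]
    # Reject tokens that claim to be issued in the future as well.
    return 0 <= age <= max_age_seconds


# A non-security-critical service using the suggested 5 minute limit.
print(is_fresh(basic_claims, 300, now=1409767200 + 120))
print(is_fresh(basic_claims, 300, now=1409767200 + 600))
```

A service that fails this check would fall back to exchanging the session id for a current assertion, as described in point 1.3.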

Open Questions:

  • How do we do key rotation and revocation for the JWT signing key? A service that returns the current key?
  • Key sizes

-- CSteipp (WMF)

There is indeed a lot of agreement. We are both shooting for roughly the same system, which allows most requests to be authenticated by checking signatures only. The remaining discussion is about details, which should all be pretty straightforward to work out:
  • Introducing per-wiki right information doesn't scale for SUL. For reads on most private wikis all we need to know is whether the user is authenticated, which is already vouched for by having a valid token in the first place. Additional group memberships can be checked by calling back into the authentication service. Same for blocks.
  • I think the authentication service should provide a way to check individual group membership / block assertions, and perhaps a way to retrieve group memberships / block status per wiki. The translation of group memberships to rights is ultimately specific to each service. As an example, read access for a bucket in a storage service might be restricted to a specific group within the domain.
  • I don't see a need to handle per-page restrictions in the authentication service. In the short and medium term those will continue to be handled by core. In the longer term they could potentially be handled by the storage service.
  • For the implementation, the main things I care about are:
    • Really good response times, so that writes and other less common actions remain snappy. Ballpark based on experience with similar services: < 5ms at the 95th percentile.
    • Isolation from app code, so that eventually the authentication service will be the only service with access to user data, and exploits in app code can't directly compromise this data.
-- Gabriel Wicke (GWicke) (talk) 18:33, 3 September 2014 (UTC)
Regarding the open question you listed:
  • How do we do key rotation and revocation for the JWT signing key? A service that returns the current key?
    • The authentication service should be able to provide this. Instead of a single key it can also return a set of public keys to be used in the future with their validity time ranges so that services don't need to poll this all the time to remain up to date. We already assume that the clocks are reasonably synchronized (and use NTP everywhere).
  • Key sizes
    • I'm inclined to follow the recommendations / best practices on this one, which from my reading of the RFCs is currently RS256.
-- Gabriel Wicke (GWicke) (talk) 19:40, 3 September 2014 (UTC)
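The idea of publishing a set of public keys with validity time ranges could be consumed by a service roughly as sketched below. The PublishedKey structure, its field names, and the key ids are assumptions made for illustration; the discussion does not specify a format.

```python
from typing import Iterable, NamedTuple, Optional


class PublishedKey(NamedTuple):
    key_id: str
    public_key_pem: str  # placeholder text, not a real key
    not_before: int      # Unix seconds, inclusive
    not_after: int       # Unix seconds, exclusive


def key_for(keys: Iterable[PublishedKey], key_id: str, at_time: int) -> Optional[PublishedKey]:
    """Pick the key named in a token, but only inside its validity range."""
    for key in keys:
        if key.key_id == key_id and key.not_before <= at_time < key.not_after:
            return key
    return None


# Hypothetical key set fetched once from the authentication service.
key_set = [
    PublishedKey("2014-09", "<PEM for September>", 1409529600, 1412121600),
    PublishedKey("2014-10", "<PEM for October>", 1412121600, 1414800000),
]

print(key_for(key_set, "2014-09", 1410000000))
print(key_for(key_set, "2014-09", 1413000000))  # expired key is rejected
```

Because future keys are published ahead of time, a service only needs to re-fetch the set occasionally, which relies on the reasonably synchronized clocks already assumed above.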

RFC meeting 2014-09-03

This RFC has been scheduled to be discussed in the Architecture RfC meeting today, 2014-09-03. Sorry for the late notice.--Qgil-WMF (talk) 15:49, 3 September 2014 (UTC)

21:01:37 <TimStarling> #startmeeting RFC meeting September 3
21:01:37 <wm-labs-meetbot`> Meeting started Wed Sep  3 21:01:37 2014 UTC and is due to finish in 60 minutes.  The chair is TimStarling. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:01:37 <wm-labs-meetbot`> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:01:37 <wm-labs-meetbot`> The meeting name has been set to 'rfc_meeting_september_3'
21:02:20 <TimStarling> #topic RFC meeting September 3 | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE).| https://meta.wikimedia.org/wiki/IRC_office_hours | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/
21:02:26 <marktraceur> Be afraid.
21:03:34 <TimStarling> csteipp and gwicke, are you at your respective keyboards?
21:04:27 <csteipp> TimStarling: gwicke is talking to me in person... he's headed back to his desk so we can record
21:04:30 <gwicke> yup, just returned from a chat with cscott
21:04:35 <gwicke> eh, csteipp
21:04:39 <gwicke> stupid completion
21:04:45 <csteipp> I get that a lot
21:05:01 <gwicke> yeah, sorry..
21:05:03 <TimStarling> tab completion should really know who you are talking about so that it can prioritise correctly
21:05:12 <csteipp> Anyone know the incantation to get the part to officially start the rfc talk?
21:05:12 <gwicke> definitely
21:05:25 <gwicke> can easily spare a core for that task
21:05:34 <TimStarling> #topic SOA Authentication | RFC meeting September 3 | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE).| https://meta.wikimedia.org/wiki/IRC_office_hours | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/
21:05:40 <TimStarling> #link https://www.mediawiki.org/wiki/Requests_for_comment/SOA_Authentication
21:06:15 <TimStarling> so, it looks useful to me
21:06:29 <csteipp> So as I mentioned to gwicke irl, I think he and I are a little further apart in what we envisioned this would look like, but I think we have enough to look into
21:06:39 <TimStarling> the idea of this is that it will replace CentralAuth following the SUL finalisation project?
21:06:45 <gwicke> I think it might make sense to walk through the goals first
21:07:10 <gwicke> TimStarling: it should be able to do that, yes
21:07:37 <csteipp> gwicke: You want to lead through the goals?
21:07:50 <gwicke> sure
21:07:53 <gwicke> https://www.mediawiki.org/wiki/Requests_for_comment/SOA_Authentication#Goals
21:08:11 <gwicke> we already talked about single sign-on
21:08:36 <csteipp> ^ on that.. So user signs into <something> and is their user across all wikis and services?
21:08:50 <gwicke> a big goal is to be able to authenticate the bulk of the requests by checking a signature only
21:09:12 <TimStarling> by the way, Brion and Mark give their apologies today, they are in a conflicting meeting until 45 minutes past the hour
21:09:47 <gwicke> csteipp: functionally it'd basically be the same as it's intended with SUL
21:09:58 <csteipp> gwicke: To be pedantic, do mean "authorize" requests based on a signature?
21:10:05 <TimStarling> gwicke: then how do you revoke permissions
21:10:16 <gwicke> TimStarling: lets first walk through the goals
21:10:34 <gwicke> the other part is minimizing the impact of exploits
21:10:55 <gwicke> by limiting the code that has access to sensitive user data
21:11:30 <gwicke> csteipp: authenticate to authorize
21:12:11 <csteipp> gwicke: So... Identify the user?
21:12:22 <gwicke> another part is limiting the trust we need to place into random services & entry points by pushing the checking into the lowest possible layer
21:12:52 <gwicke> csteipp: yes
21:13:41 <gwicke> the other goals are minor I think
21:14:32 <TimStarling> so how do you revoke permissions?
21:14:34 <legoktm> does this RfC also include re-writing the AuthPlugin things (basically https://www.mediawiki.org/wiki/Requests_for_comment/AuthStack) ?
21:14:48 <gwicke> so any comments on the goals?
21:15:24 <gwicke> TimStarling: tokens are time-limited, and certain permissions need to be checked with an authentication service every time they are required
21:15:48 <TimStarling> ah right, hence "revocation within minutes" on the RFC
21:16:01 <csteipp> So there are some distinct functions you've listed. I want to delve into authentication.
21:16:09 <gwicke> TimStarling: that's a possible solution
21:16:21 <csteipp> i.e., only having this service have access to credentials
21:17:53 <gwicke> yes?
21:18:12 <TimStarling> he's probably got some sort of hand injury ;)
21:18:15 <csteipp> It seems limiting to make this new service handle authentication, when i.e., authstack is aiming to privide multiple authn mechanisms.
21:18:24 * csteipp tries to phrase this all.
21:18:38 <csteipp> If we're just trying to protect hashes... let's use ldap auth
21:19:06 <csteipp> So it seems to tackle the authentication side, we need to build a plugable system, i.e., for 2fa
21:19:09 <gwicke> I haven't looked at authstack much
21:19:23 <gwicke> if it isolates the authentication service as well, then great
21:19:30 <TimStarling> you are thinking "log in with your facebook account" etc.?
21:20:13 <csteipp> Mostly 2factor is my concern, but yeah, facebook / openid / etc.
21:20:13 <gwicke> there are many ways users could authenticate
21:20:33 <csteipp> So in this RFC, are we handling those?
21:20:41 <TimStarling> 2FA could presumably be integrated with the proposed service
21:20:45 <gwicke> that's a matter for the authentication service
21:20:54 <gwicke> if that's running some authstack magic internally then great
21:21:27 <csteipp> I assumed not initially, which is why I put that part on the talk page about session management and identification... but if we're doing it, that increases the scope a lot-- I want to make sure we're clear on if we are or not.
21:21:30 <TimStarling> maybe 2FA/OpenID can be a second project?
21:21:33 <gwicke> there are a few solutions for stand-alone authentication services out there
21:21:40 <TimStarling> it seems to me that we need to have a narrow scope
21:21:54 <gwicke> yes, it's out of scope for this RFC
21:22:16 <TimStarling> presumably we can do simple feature parity first and add OpenID/2FA later?
21:22:25 <gwicke> as far as this RFC is concerned, magic happens inside the authentication service and a signed token plops out
21:23:09 <TimStarling> the authentication service in that case should be pretty simple
21:23:20 <TimStarling> the MW integration is probably a larger part of the project
21:23:36 <TimStarling> especially if we want to fix a few architectural issues while we are at it
21:23:49 <csteipp> "magic happens inside the authentication service and a signed token plops out" <- so the service does the authentication?
21:23:55 <gwicke> yes, especially if we want MW to use this service as well
21:24:09 <TimStarling> presumably MW would send the username and password to the auth service
21:24:21 <gwicke> yup
21:24:40 <gwicke> or the user's browser would even talk to the auth service directly
21:24:59 <gwicke> I'd think mw forwards though
21:25:10 <gwicke> csteipp: yes
21:25:19 <TimStarling> omitting 2FA/OpenID means we don't need frontend work, since the UI will be the same
21:25:50 <TimStarling> anyway, I am unconvinced on goal 1, authenticating requests by signature only
21:26:23 <gwicke> it's pretty common these days
21:26:47 <gwicke> TimStarling: which concerns do you have about it?
21:27:35 <TimStarling> seems like premature optimisation
21:27:52 <gwicke> heh
21:28:08 <TimStarling> remote checks can presumably be done in a few ms
21:28:24 <TimStarling> in exchange for a saving of a few ms, you are making auth cookies be short-lived
21:28:36 <TimStarling> which is going to be a hassle for all client implementors
21:28:42 <gwicke> not necessarily
21:28:59 <gwicke> in oauth that's all handled by the library anyway
21:29:01 <TimStarling> plus you introduce a delay before revocation takes effect
21:29:02 <gwicke> and standard practice
21:29:23 <TimStarling> which is sometimes relevant in our world -- sometimes you want to lock people out within seconds
21:29:24 <gwicke> with cookies we can handle it on the server by asking for a new token from the auth service, and returning a set-cookie header
21:30:08 <gwicke> so the validity of those tokens is typically on the order of a minute, at most a few minutes
21:30:13 <TimStarling> consider it this way: if you want people to be able to do zero requests after revocation, then the expiry time has to be shorter than the mean inter-request time
21:30:30 <TimStarling> which means that you need a new set-cookie header on average for every request
21:30:44 <TimStarling> seems like pointless additional complexity to me
21:30:55 <gwicke> it matters for apis
21:31:29 <gwicke> if you aim for a response time < 50ms total, you don't want to spend 10ms or so calling an auth service
21:31:52 <TimStarling> so make it faster than 10ms
21:32:00 <gwicke> for the common case, which would be read requests
21:32:10 <csteipp> So I want to call out a distinction. This ^ is session management. MediaWiki uses memcache for it, and it's a lot faster than 10ms.
21:32:51 <TimStarling> how would session management work?
21:33:11 <gwicke> just as it does right now?
21:33:35 <csteipp> Right now mediawiki exchanges the session cookie for a cached user object (in most cases)
21:33:43 <TimStarling> right now, CA does session management, but CA is going away
21:33:45 <gwicke> so direct access to memcached wouldn't work for many reasons
21:33:53 <gwicke> security being a big one
21:34:10 <gwicke> you don't want random services read & write to memcached
21:34:41 <gwicke> csteipp: that's fine, mw can continue doing that
21:34:52 <TimStarling> the revocation list will presumably be very short -- that is a rare but important event
21:35:16 <TimStarling> suppose you have a service which simply verifies the signature and checks the payload against a revocation list which it has in memory
21:35:30 <TimStarling> surely that could be done a lot quicker than 10ms
21:35:32 <gwicke> normally there is no revocation
21:35:47 <csteipp> Each service keep it's own revocation list? yikes.
21:35:54 <gwicke> if it's a sensitive & typically rare operation, just check with the auth service
21:36:00 <gwicke> which gives you instant revocation
21:36:09 <TimStarling> no, "a service" = the auth service
21:36:25 <csteipp> Oh, that's what gwicke is saying, right?
21:36:29 <gwicke> the auth service would just use the db, as it does right now
21:37:12 <gwicke> so, to clarify: we are mainly talking about reads being authenticated by signed token
21:37:15 <TimStarling> what I am asking about sessions is first -- how do you validate a session? what is in a session cookie?
21:37:18 <gwicke> for public wikis, even that can be skipped
21:37:35 <TimStarling> currently it is a token which is checked against the MW DB, presumably it will not be that anymore
21:37:36 <gwicke> writes would always be validated with the auth service
21:38:01 <csteipp> TimStarling: For that, some sort of signed token
21:38:16 <TimStarling> but I thought signed tokens have a lifetime of 2 seconds or something
21:38:17 <gwicke> TimStarling: it can be just that
21:39:01 <gwicke> we can just keep the session cookie if that works better than a signed user id inside a token
21:39:17 <gwicke> token lifetimes are longer than a typical request sequence
21:39:26 <gwicke> so on the order of single-digit minutes
21:40:06 <TimStarling> by signed token do you mean JWS?
21:40:14 <gwicke> yes
21:40:18 <gwicke> JWT
21:40:42 <gwicke> https://www.mediawiki.org/wiki/Requests_for_comment/SOA_Authentication#JWT_Bearer_tokens_.2F_OpenID_connect
21:40:55 <TimStarling> a JWS is a signed JWT
21:41:10 <gwicke> ah, didn't know that acronym yet
21:41:41 <TimStarling> so you have a JWS in a long-lived session, then the client does a request, gets back a second short-lived JWS?
21:41:47 <csteipp> TimStarling: Cite?
21:42:00 <gwicke> csteipp: http://self-issued.info/docs/draft-ietf-oauth-json-web-token.html
21:42:04 <TimStarling> https://tools.ietf.org/html/draft-ietf-jose-json-web-signature-31
21:42:39 <TimStarling> yeah, in the JWT spec it says "This example shows how a JWT can be used as the payload of a JWE or JWS to create a Nested JWT."
21:42:55 <TimStarling> then it has a link to the JWS spec where it discusses MAC serialization etc.
21:43:11 <gwicke> TimStarling: for cookies we'd refresh as necessary
21:43:17 <csteipp> Ah, I haven't kept up with jose. Yeah, looks like they're moving to that terminology now
21:43:20 <TimStarling> a JWE is an encrypted JWT
21:43:25 <gwicke> for oauth / openid connect it would follow the normal refresh flow
21:44:23 <TimStarling> it's getting late, we have to talk about tgr's RFC
21:45:02 <TimStarling> this session stuff is not in the RFC at the moment is it?
21:45:05 <gwicke> well, thanks for checking it out!
21:45:18 <gwicke> TimStarling: sessions are pretty orthogonal
21:45:18 <csteipp> It's a large part of the talk page
21:46:15 <gwicke> most services don't need sessions
21:46:24 <gwicke> and the ones that do can store them any way they like
21:46:49 <TimStarling> sounds like you're expanding scope
21:46:51 <gwicke> using the user id, name, or some session id communicated in another cookie if desired
21:47:06 <TimStarling> what I want is documentation of a CA replacement project
21:47:16 <TimStarling> and obviously sessions are fairly relevant to that
21:47:48 <TimStarling> if there is any really urgent service use case that needs to be part of the initial project, maybe that can be added, but it seems like a bit of a drift to me
21:48:28 <gwicke> well, that sounds like a different RFC to me then
21:48:38 <gwicke> sessions are a per-service concern to me
21:48:38 <TimStarling> it's too late for another RFC really, I have another meeting immediately after this one

OAuth

Just stating the obvious: OAuth is an authorization protocol, not authentication or session management. I think the scope of this RFC is broader than authorization, but let's make sure we get that defined.

It's about authentication and authorization, in that order. I consider session data storage to be outside this RFC's scope. -- Gabriel Wicke (GWicke) (talk) 22:39, 3 September 2014 (UTC)

Phases and relationship to CentralAuth

Tim put in the RFC discussion, "<TimStarling> what I want is documentation of a CA replacement project"

Can we make Implementation_phases specifically address that? There are a couple ways to do this, with drastically different amounts of work:

  1. CentralAuth uses the service for password checking and cookie generation / verification. This is by far the least amount of work.
  2. The service is called by a new Auth Plugin extension that does the same things CentralAuth does. This is a lot of work, and while it would give us the freedom to implement the functionality in any way we like, there are very good reasons for the way CentralAuth does them, so significant portions would duplicate what CentralAuth is already doing.
    • Authentication of passwords
    • Session management on the current wiki
    • Cross-wiki session management (setting up the session safely, handling auto login, auto create, single-password signon)
  3. MediaWiki core gets rid of the idea of Auth Plugins, and only uses a MediaWiki service for Auth. This would mean the service has to be extensible so that the WMF can do CentralAuth, wikitechwiki can do LDAP + 2FA, and all of the current authn/z plugins would have to be rewritten. This gives the auth service writer the most freedom, but is a lot more work for everyone.

Or am I missing a scenario that you're thinking of? I'd like to make sure this is clearly defined, and then the phases should indicate the scope of what will be done in each. CSteipp (WMF) (talk) 16:16, 9 October 2014 (UTC)

+1 on clarifying this right in the phases section. I would consider phase 1 a good starting point for gradually expanding the scope to cover more of the CA functionality. I think we should perhaps emphasize that step 2 should be an iterative process, and can involve CA (or a copy of it) speaking to the service.
We should also have a clear section on where this is headed in the longer term, so having that last bullet there makes sense. -- Gabriel Wicke (GWicke) (talk) 19:00, 29 October 2014 (UTC)

Token vs. session validation

Tim points out that there have been many cases of a vandal going on a vandalism spree and being blocked in less than a minute, to the point where checking blocks against the db slaves was deemed too slow, so we now do the checks against the master. So the number of actions that external services can handle without talking back to the Auth service is very limited. This makes me question whether the engineering effort for the tokens is worth it, compared with just providing a service that takes the session identifier (currently the centralauth_Session cookie) and returns current information about the user from Redis.

A service to do that, which could return a result in much less than 10 ms, would be a trivial PHP service. CSteipp (WMF) (talk) 21:34, 22 October 2014 (UTC)

For vandals we are mostly interested in write actions, which in the current design will be checked against the auth service in any case. The big gain from tokens comes from authenticating *read* requests, which are very common compared to writes. -- Gabriel Wicke (GWicke) (talk) 17:53, 29 October 2014 (UTC)
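A minimal sketch of the read path described above, assuming an HS256-signed JWT so that it runs on the Python standard library alone. The claim names ("exp", "rights"), the helper names and the shared key are all hypothetical, and nothing here is taken from an actual implementation. Write requests would still go to the auth service and are not shown.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # JWTs strip base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _mac(key: bytes, signing_input: str) -> bytes:
    return hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(claims: dict, key: bytes) -> str:
    """Mint an HS256 JWT. In this sketch only the auth service would do this."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url_encode(json.dumps(claims).encode("utf-8"))
    signing_input = f"{header}.{payload}"
    return f"{signing_input}.{_b64url_encode(_mac(key, signing_input))}"


def verify_token(token: str, key: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature and expiry check out, else None."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if header.get("alg") != "HS256":
            return None
        expected = _mac(key, f"{header_b64}.{payload_b64}")
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
        exp = claims.get("exp")
    except (ValueError, TypeError, AttributeError):
        return None
    if not isinstance(exp, (int, float)) or exp <= now:
        return None
    return claims


def may_read_locally(token: str, key: bytes, right: str, now: Optional[float] = None) -> bool:
    """Decide a read request from the token alone, without calling the auth service."""
    claims = verify_token(token, key, now)
    if claims is None:
        return False
    rights = claims.get("rights")
    return isinstance(rights, list) and right in rights
```

Note that HS256 requires every verifying service to hold the signing key. An asymmetric signature such as RS256 would let services verify with a public key only, at the cost of a larger token.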

Clearly defined use cases

Reading through this RFC and talk page shows me a lot of design decisions and discussion of the trade-offs, but I'm not finding clearly defined use cases for the actual service/API/whatever. I don't think this is a unique deficiency of this RFC, but I do think that it is difficult to reason about very complex and technical changes without a shared reference point of the desired outcome. @GWicke: could you provide the concrete use cases for the RESTBase system? I can kind of make up use cases from the information in the goals section, but what I make up and what the initial consumer applications actually want may or may not agree. --BDavis (WMF) (talk) 00:07, 11 December 2014 (UTC)

@BDavis (WMF): there are a lot of use cases since auth{entication,orization} is a cross-cutting concern. The typical use case from a service perspective is authenticating a user request in order to make authorization or customization decisions. As an example, RESTBase and MediaWiki need to figure out whether a user can access a bit of content or perform some action in a protected wiki. MediaWiki can currently do this using its private internal code and direct MySQL database access, while services can't make any of those decisions for lack of authentication and authorization information. Hope that is concrete enough. If not, let me know which area you are specifically interested in. -- Gabriel Wicke (GWicke) (talk) 22:54, 7 January 2015 (UTC)
Keeping things concrete, the places where I'm aware of different "services" doing/needing Authn/z are:
  • MediaWiki core - needs to do $this->getUser() type calls, probably support password-based authentication for small installs, permissions and groups.
  • CentralAuth - password-based authentication, session handling for local MediaWiki core, cross-domain/wiki identity management, groups.
  • Parsoid - forwards session cookie to private wikis for read (and write?) requests, but I think there were ideas to have Parsoid make calls as the user (OAuth style)
  • RESTBase - ?
  • We also have extensions like OAuth that manipulate identity and rights based on the request header
GWicke, are there any other services that we currently have running or are planned for the next 6-12 months that would use this? I don't think OCG, mathoid, or ContentTranslation would need this, from the small pieces I've seen. CSteipp (WMF) (talk) 00:39, 9 January 2015 (UTC)
RESTBase is definitely a big one, as it will mediate access to stored content and other services. I can't currently think of any other services that will directly need to use the auth service, but they might be forwarding user cookies to other services, which might in turn pass them to restbase or the PHP API. Examples for such pass-through services would be parsoid, mathoid, and ContentTranslation. -- Gabriel Wicke (GWicke) (talk) 01:03, 9 January 2015 (UTC)
GWicke, can you explain what Authn/z RESTBase does currently, and what you're planning to do with it? Or point to documentation? I haven't followed the details on that project. I'll also add AbuseFilter as something that would benefit from being rewritten as a microservice in the next 12 months, which would need strong identification of the submitting user and knowledge of that user's groups and rights. CSteipp (WMF) (talk) 18:35, 9 January 2015 (UTC)
RESTBase exposes a REST API, stores content and metadata, and performs requests to backend services like Parsoid in case something was not in storage or is not configured to be stored. Content access in particular needs to be protected. For public wikis, this mostly means keeping track of revision deletions / suppression & calling back into the auth service to determine whether an authenticated user still has access to the information. For private wikis with a standard configuration, most read requests additionally need to be authenticated. The understanding of groups necessary for this is limited to the list of permissions corresponding to anonymous & authenticated users per wiki. More complex decisions (like deleted revisions) can be handled by asking the auth service whether the session has the necessary permissions, so no understanding of groups is necessary there. Generally, RESTBase cares about permissions and not about complex groups.
You make a good point about AbuseFilter. It will definitely need to communicate with the auth service to figure out the block situation for the user. This check could just be implicit in the response to a check for 'edit' permission, so probably doesn't require special API support in the auth service. -- Gabriel Wicke (GWicke) (talk) 18:49, 9 January 2015 (UTC)
Another thing that services will need to generally check is CSRF tokens, so that's another thing to add to the auth service API. It might make sense to handle this generically for all POSTs coming into RESTBase, which would free services integrating with the REST API from doing so themselves & ensure that it happens consistently. -- Gabriel Wicke (GWicke) (talk) 19:00, 9 January 2015 (UTC)
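To illustrate handling CSRF generically for all incoming POSTs, here is a rough sketch, assuming a token that is simply an HMAC over the session identifier and an issue timestamp. The token format, the function names and the one-hour lifetime are assumptions made for illustration only. This is not how MediaWiki's edit tokens work, and not a proposed design.

```python
import hashlib
import hmac
import time
from typing import Optional


def _csrf_mac(session_id: str, issued: int, secret: bytes) -> str:
    message = f"{session_id}|{issued}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def mint_csrf_token(session_id: str, secret: bytes, now: Optional[float] = None) -> str:
    """Create a token bound to one session; the format is '<issued>.<hex mac>'."""
    issued = int(time.time() if now is None else now)
    return f"{issued}.{_csrf_mac(session_id, issued, secret)}"


def check_csrf_token(token: str, session_id: str, secret: bytes,
                     max_age: int = 3600, now: Optional[float] = None) -> bool:
    """Return True only for an unexpired token minted for this session."""
    now = time.time() if now is None else now
    issued_text, sep, mac = token.partition(".")
    if not sep or not (issued_text.isascii() and issued_text.isdigit()):
        return False
    issued = int(issued_text)
    if issued > now or now - issued > max_age:
        return False
    expected = _csrf_mac(session_id, issued, secret)
    # Compare as bytes so that non-ASCII input cannot raise.
    return hmac.compare_digest(expected.encode("utf-8"), mac.encode("utf-8", "replace"))


def accept_request(method: str, csrf_token: Optional[str], session_id: str,
                   secret: bytes, now: Optional[float] = None) -> bool:
    """Front-end gate: every POST needs a valid token, other methods pass through."""
    if method.upper() != "POST":
        return True
    return csrf_token is not None and check_csrf_token(
        csrf_token, session_id, secret, now=now)
```

With a gate like accept_request in front of them, backend services would not need their own CSRF code, which is the consistency benefit mentioned above.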

RESTBase use cases

We discussed the RESTBase use cases that are relevant to authorization and prioritized them as follows:

Current situation

  • no easy support for editing, storage on private wikis
  • API exposes only read-only operations; no PUT
  • no need for CSRF yet
  • can block access to deleted revisions in revision 0

Will need authorization for

  1. private wiki VE / section editing, storage and retrieval (our prio: high)
  2. Generic CSRF validation for backend services (our prio: medium)
  3. refined deleted revision access (our prio: low)
  4. PUT support for internal bulk uploads (analytics; prio: low as we can hack around)
  5. PUT support for external users (think cluebot; prio: low)

-- Gabriel Wicke (GWicke) (talk) 00:39, 24 January 2015 (UTC)

API strawman

RESTBase use cases:

  • trade a user-supplied token for an assertion on whether the request should be granted one of a list of rights (authorization)
    • examples of rights, v0: read access on private wiki, read access to a revision-deleted revision
    • examples of rights, longer term: PUT request on specific storage bucket used for external projects, PUT of a new revision (likely saved through the PHP API), read somebody's watchlist or notifications, right to perform a large number of requests (throttling), etc.
  • trade a CSRF token for a boolean assertion on whether the CSRF token is valid
  • high-performance minimal authorization: check a user-supplied token for validity using cryptographic means (JWT), and use a right assertion contained within it if valid

MW use cases:

  • authenticate a user based on user/pass or other schemes
  • potentially, later, authorize as above for RESTBase

Security use cases:

  • avoid leaking sensitive user info on random exploit

API strawman:

POST /authorize
 permissions: [array of permissions needed],
 token: <user-supplied token>

 returns: array of boolean assertions, or array of permissions actually granted

GET /csrf # get a CSRF token; access restricted
  
 returns: newly minted csrf token

POST /csrf
 token: <user-supplied token>

 returns: boolean validity

## Authentication tbd

-- Gabriel Wicke (GWicke) (talk) 03:25, 21 January 2015 (UTC)
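For illustration, the semantics of POST /authorize in the strawman could look roughly like the sketch below. The in-memory rights_by_token mapping stands in for whatever session store the service would really use. That mapping, the anonymous-rights fallback and all function names are hypothetical. Only /authorize is sketched, and both response shapes mentioned in the strawman are shown.

```python
from typing import Dict, FrozenSet, Iterable, List, Mapping, Optional


def authorize(permissions: Iterable[str],
              token: Optional[str],
              rights_by_token: Mapping[str, FrozenSet[str]],
              anonymous_rights: FrozenSet[str] = frozenset()) -> Dict[str, bool]:
    """Return one boolean assertion per requested permission.

    Assumption: a missing or unknown token is treated as an anonymous user.
    """
    if token is None:
        granted = anonymous_rights
    else:
        granted = rights_by_token.get(token, anonymous_rights)
    return {permission: permission in granted for permission in permissions}


def granted_permissions(assertions: Mapping[str, bool]) -> List[str]:
    """Alternative response shape: only the permissions actually granted."""
    return [permission for permission, ok in assertions.items() if ok]


# Hypothetical data for a private wiki where anonymous users have no rights.
RIGHTS_BY_TOKEN = {"token-alice": frozenset({"read", "edit"})}

result = authorize(["read", "deletedtext"], "token-alice", RIGHTS_BY_TOKEN)
```

Because the caller only receives assertions about the permissions it asked for, it never sees the user's groups or other sensitive data, which fits the security use case listed above.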

Wikia implementation: Helios

Wikia are working on an authentication service using Go, MySQL & Redis: https://github.com/Wikia/helios

They are interested in collaboration, so we should investigate how their work fits with our requirements. -- Gabriel Wicke (GWicke) (talk) 20:31, 25 January 2015 (UTC)