Auth systems/SUL2

Current System

 * When a global user logs in to a local wiki, CentralAuth will inject images on the result page to attempt to log the user into other WMF projects
 * Images for each wiki in $wgCentralAuthAutoLoginWikis are generated
 * The images call Special:AutoLogin with a token, which is used to set up the session associated with this user
 * The user gets a top-level-domain cookie for each wiki, which expires in 1 day
 * On logout, the session is deleted

Current Issues

 * In the existing scheme, many mobile browsers (70%) do not accept the cookie for the foreign wikis when the user hasn't visited that wiki directly
 * Firefox 22 will block third party cookies as well
 * At minimum, users need to be logged into commons.wikimedia.org and wikidata.org to take advantage of mobile and visualeditor features

Overview
To address the shortcomings of the current SUL system, the Platform team is making significant changes to the way that CentralAuth logs users into other wikis on login:
 * Users will be logged into a new, centralized domain when they log in to any public WMF wiki with a global account
 * Users with global accounts will no longer see the "Login Success" page after login. Instead, they will be redirected back to the article they came from. On the article page, we will transparently attempt to log the user in to all of the sister projects, instead of relying on the images on the login success page.
 * All of the public WMF wikis will use the central wiki to check an anonymous user's logged-in status, and transparently log the user in if they are centrally logged in.

Details

 * A central domain where global users would have a session / cookie
 * After logging in on the local wiki, CentralAuth would redirect the user to the central domain to set the cookie
 * On each wiki, anonymous users would have some JavaScript that contacts the central domain to determine if the user is logged in
 * If the user is logged in, update the UI or redirect the user to finish building their session
 * If the user is not logged in, set a cookie/local storage so the wiki doesn't attempt the check again
 * Since each user will call this service once per session, we can estimate a load of about 193 calls/second
 * Special:UserLogin will always check the central domain for a session, and show a successful login message if the user is logged in
 * Local wikis will also provide an API where a user can request a short-lived token, which can be used for authentication to another wiki's API. This will allow users to talk to the API of other wikis in the cluster as the global user.
 * To limit the potential for abuse, the token should not live more than a few seconds, will only be valid for a single target wiki, and will expire after use.
 * So this is pretty much like what Special:AutoLogin does? And I'd guess that the client would have to get another token for every request to the other wiki, since XMLHttpRequest's attempt to set cookies will probably be blocked too. How would that interact with CORS preflight OPTIONS requests, if bug 41731 ever gets fixed?
 * Pretty close to Special:AutoLogin, although I'd like the expiration to be a few seconds instead of minutes. We can make sure it's long enough that the client can do a preflight and the request before it times out. And yes, the client would need to request a new token per call.
 * If the token expires after a single use, won't the preflight "eat" the token so it won't be valid for the actual request?
 * I don't think the token will be sent in the OPTIONS call (assuming you're doing a POST, and that is why you need the preflight), so I don't think it would be consumed at that point. But that's just from reading the spec and doing a little playing around in Firefox. We could probably allow it to be used twice if it's an issue.
 * That depends if it's a GET or POST, of course, and note the API already requires the "origin" parameter be included in the query string even for a POST. I guess the question there is whether the preflight also would ever need to be authenticated.
 * The preflight in this case is really only answering whether it's OK for the UA to use the method between the origin and this domain. I really can't think of a case where we would want to change that based on whether or not the user is authenticated. Unless there is a reason to allow CORS for some users but not others?
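
The token properties discussed above (short lifetime, valid for one target wiki, gone after one use) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, method names, the in-memory dict standing in for the shared store, and the 10-second default lifetime are all hypothetical and do not come from CentralAuth.

```python
import secrets
import time


class CrossWikiTokenStore:
    """Hypothetical in-memory stand-in for a shared token store."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        self._ttl = ttl_seconds      # assumed lifetime; the page only says "a few seconds"
        self._clock = clock          # injectable so expiry can be tested
        self._tokens = {}            # token -> (user, target_wiki, expires_at)

    def issue(self, user, target_wiki):
        """Create a token that identifies `user` to `target_wiki` only."""
        token = secrets.token_hex(16)
        self._tokens[token] = (user, target_wiki, self._clock() + self._ttl)
        return token

    def redeem(self, token, wiki):
        """Return the user name, or None if the token cannot be used.

        The token is removed on every redemption attempt, so it is
        single use even when the attempt fails.
        """
        entry = self._tokens.pop(token, None)
        if entry is None:
            return None              # unknown or already used
        user, target_wiki, expires_at = entry
        if self._clock() >= expires_at:
            return None              # expired
        if wiki != target_wiki:
            return None              # issued for a different wiki
        return user
```

With this shape, a client would request a fresh token before each cross-wiki call, which matches the "new token per call" point made in the thread above.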

Design

 * WIP. These should be finished before coding.

Redirect on Login

 * User logs in to local wiki
 * During login, if CentralAuth detects that this is a global account
 * Setup/initialize the local wiki session
 * Generate and keep a secret, used to sign the redirect response
 * Generate a token (t1), which identifies the user to the central domain
 * Redirect user to https://centraldomain/Special:CentralLogin/login?token=
 * Special:CentralLogin will check for an existing session
 * If there is an existing session that does not match the one referenced by the token, show an error message, including a link back to the original site, and stop
 * If there is an existing session, and the names match, redirect back to original wiki and show success message (do not return a token, and do not update central session)
 * If there is no existing session
 * set a central session cookie, with a "pending_name"
 * delete the memcache object the token references
 * Give the user a page with a form that will post back to original wiki (Special:CentralLogin/result) with a token (t2) and a signature for that token
 * Page has javascript to submit the form automatically
 * On the local wiki, check the token+signature:
 * If the signature on the token is not verified, show an error message
 * Otherwise the local wiki completes the session setup for the central session referenced by t2 (at minimum, this includes 'user' and 'token')
 * (for now) Show the SUL icons to attempt the normal autologin process?
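
The central-domain decision and the token+signature check in the steps above could look roughly like this. It is a sketch, not the CentralAuth implementation: the function names and return values are hypothetical, HMAC-SHA256 is an assumed choice of signature scheme, and the code assumes the secret generated by the local wiki is somehow available to the central domain (for example via the memcached object that t1 references), which the steps do not spell out.

```python
import hashlib
import hmac
import secrets


def sign_token(secret: bytes, token: str) -> str:
    """Sign a token with a shared secret (HMAC-SHA256 is an assumption)."""
    return hmac.new(secret, token.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_token(secret: bytes, token: str, signature: str) -> bool:
    """Check the signature posted back to Special:CentralLogin/result."""
    expected = sign_token(secret, token)
    # Constant-time comparison; compare as bytes so any input string is accepted.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


def central_login(existing_session_name, token_name):
    """Decide what the central domain does for an incoming login token.

    `existing_session_name` is the user name of the current central
    session (None if there is none); `token_name` is the name that the
    token refers to.
    """
    if existing_session_name is None:
        # Set the central cookie with a pending name, then post t2 back.
        return "start_pending_session"
    if existing_session_name != token_name:
        return "error"
    # Names match: go back to the local wiki without issuing a token.
    return "redirect_success"


# Usage: the local wiki keeps the secret and verifies what comes back.
secret = secrets.token_bytes(32)
t2 = secrets.token_hex(16)
signature = sign_token(secret, t2)
signature_ok = verify_token(secret, t2, signature)
```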

Notes:
 * In the blog post by Jonathan Mayer, he recommends for setting 3rd party cookies: "The most transparent practice is for you to redirect the user through your origin. You could also use a non-cookie storage technology, though alternatives may be limited by this policy in future."
 * It appears that Safari will handle cookie sets on a redirect done in javascript
 * Some versions of Safari do not handle cookie setting on 301/2 redirects
 * The central domain should only be accessed over SSL
 * On the central domain, only very highly trusted users can modify any javascript. No user scripts should be allowed.
 * The central domain pages should not be iframe-able

What are other people doing:
 * Google does SSO with google.com and youtube.com with three 302 redirects on login (accounts.google.com -> accounts.google.com -> accounts.youtube.com -> www.google.com)
 * Yahoo/Flickr: all logins go through login.yahoo.com. Flickr does a verification/auth handshake (openid?) with yahoo.com to signin

Architecture Decisions:
 * In following Jonathan Mayer's advice, a full browser redirect through the central domain is the best way to set the cookie.
 * To prevent the Safari bug, and to ensure the token has minimal exposure to the browser, we'll use a javascript form POST to redirect back from the central domain
 * Threat Modeling

UX Notes (conversation with Jared Zimmerman, 2013-06-11):
 * On both mobile and desktop, Jared does not like the CentralAuth "Login Successful" page, and would like to see the user directly sent to the return-to page.
 * He feels like there is very little value to showing the icons
 * He was wondering if we could do the login part invisibly to the user. Maybe inject the iframes mechanism in a hidden div on the return-to page?
 * It would be better to get rid of the delay/spinner on loginwiki
 * If we are ok breaking SUL for buggy versions of Safari (which looks like it was fixed in OSX 10.5, so pretty old), and we do a redirect on the local wiki back to the return-to page, then we should be able to turn both redirects into 302 redirects instead of having them triggered from javascript.

Local wiki Javascript Auth check
Checks are done with a CORS AJAX POST if available, with server responses as JSON objects. If CORS is not available (e.g. in IE<10), the browser creates an iframe with a form that uses Javascript onload to POST itself, and the server response is an HTML page with inline scripts and possibly another auto-POSTing form.
 * https://gerrit.wikimedia.org/r/#/c/58924/


 * (C1) On each local wiki, if the user is not logged in, and does not have a local storage key or cookie specifying they are anonymous, check the authentication status on the central domain
 * Do the check, even if the anonymous cookie is set, if the user is on Special:UserLogin
 * The check should only be made from other WMF sites. It should not be possible to determine the user’s logged-in status from other websites. Easy enough with CORS, and iframes using POST should also be safe due to browser cross-origin restrictions.
 * Check returns the gu_id associated with the central wiki session, or 0 if not logged in.
 * If the user is not authenticated on the central domain, set a local storage key or cookie indicating the user is anonymous. Otherwise, clear both the local storage key and cookie.
 * Local storage would be nice, so we're not adding yet another cookie to send back and forth. However, IE6 and IE7 account for 2.5% of traffic and have no local storage
 * (L1) If the user is logged in, the browser posts the gu_id to the local wiki to start a session
 * If local wiki has a session, abort with Error
 * The local wiki will generate a token (T1), store it in the local session, and store the gu_id and local wiki identifier in memcached under T1. The local wiki returns T1 and the wiki identifier to the user. Server reply includes session_id cookie.
 * (C2) The Browser then contacts the central domain with T1 and the wiki identifier
 * Central domain references memcached using T1
 * If gu_id in memcached does not match gu_id of current user, fail with error
 * If wiki identifier in memcached does not match wiki identifier in the request, fail with error
 * Otherwise, store CentralAuth login session information in memcached under T1
 * Return to browser a success/error
 * (L2) Browser contacts local wiki
 * Retrieve T1 from the local session
 * If wiki identifier in memcached does not match gu_id of current user, fail with error
 * Otherwise, set CentralAuth session cookies using the information in memcached
 * If local wiki has a valid user session, refresh the session id (note: refreshing the session id prevents a requirement to refresh it on L1, but may be avoidable).
 * Browser reloads the page or takes other appropriate action.
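
As a rough illustration of the C1/L1/C2/L2 exchange listed above, the server-side checks could be modelled like this. Everything here is a simplified sketch: sessions are plain dicts, a module-level dict stands in for memcached, the function names are hypothetical, and the session payload ('user' and 'token') is an assumption about what gets copied.

```python
import secrets

memcached = {}  # stand-in for the store shared by local wiki and central domain


def c1_check(central_session):
    """(C1) Central domain: gu_id of the central session, or 0 if anonymous."""
    return central_session.get("gu_id", 0)


def l1_start(local_session, gu_id, wiki_id):
    """(L1) Local wiki: issue T1 and record what it was issued for."""
    if local_session.get("user"):
        raise RuntimeError("local wiki already has a session")
    t1 = secrets.token_hex(16)
    local_session["t1"] = t1
    memcached[t1] = {"gu_id": gu_id, "wiki": wiki_id}
    return t1


def c2_confirm(central_session, t1, wiki_id):
    """(C2) Central domain: validate T1, then attach the login session data."""
    entry = memcached.get(t1)
    current = central_session.get("gu_id", 0)
    if entry is None or current == 0:
        return False
    if entry["gu_id"] != current or entry["wiki"] != wiki_id:
        return False
    entry["session"] = {
        "user": central_session["user"],
        "token": central_session["token"],
    }
    return True


def l2_finish(local_session):
    """(L2) Local wiki: copy the confirmed session data into the local session."""
    t1 = local_session.pop("t1", None)
    entry = memcached.pop(t1, None)
    if entry is None or "session" not in entry:
        return False
    local_session.update(entry["session"])
    return True
```

In this model the browser only carries gu_id, T1 and the wiki identifier between the two sides; the session data itself travels through the shared store.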

Step   Direction                   Browser sends    Browser receives
(C1)   Browser → Central Domain    wiki             gu_id
(L1)   Browser → Local Wiki        gu_id            T1, wiki
(C2)   Browser → Central Domain    T1, wiki         Success/Error
(L2)   Browser → Local Wiki        (not listed)     (not listed)


 * After discussing this with Brad yesterday, it seems like the Sig / L2 is redundant in the attacks it's trying to prevent, and can be removed without weakening the exchange. The Central Domain will fill the user's session if the wiki and gu_id match. A call to the local wiki (similar to L2) may still be used to refresh the session information.

UX Notes (conversation with Jared Zimmerman, 2013-06-11):
 * Jared recommends replacing the entire p-personal section with the username/etc on login. He's not concerned that the user will see themselves logged in, but with potentially a different skin/actions/preferences from what they normally see logged in.

Local wiki token for cross-wiki api
(Brad wrote this)
 * Use the ApiTokensGetTokenTypes hook to add "centralauthtoken" as an option to action=tokens
 * Use the APIGetAllowedParams and APIGetParamDescription hooks to add "centralauthtoken" as an option to ApiMain.
 * Somehow, when "centralauthtoken" is given to the API, use the associated user data when executing the request. But where?
 * In CentralAuthHooks::onUserLoadFromSession when MW_API is defined? In this case we need to make sure not to invalidate the token for OPTIONS queries.
 * In some later hook, e.g. ApiCheckCanExecute? In this case, it seems possible that some things could be checking rights and such on the old, probably-anonymous user before it reaches the point of calling the API hook.
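
Whichever hook ends up hosting the logic, the requirement not to invalidate the token on OPTIONS queries could be handled along these lines. This is a language-neutral sketch in Python rather than MediaWiki hook code: `resolve_api_user`, its arguments and the `tokens` mapping are hypothetical, and returning None stands for "fall back to normal session handling".

```python
def resolve_api_user(method, params, tokens, wiki):
    """Pick the user an API request should run as.

    `tokens` maps a centralauthtoken value to (user, target_wiki) and is
    a hypothetical stand-in for the shared token store.
    """
    token = params.get("centralauthtoken")
    if token is None:
        return None  # no token given: use the normal session

    if method == "OPTIONS":
        return None  # CORS preflight: answer it without consuming the token

    entry = tokens.pop(token, None)  # single use: removed on first real request
    if entry is None:
        raise PermissionError("invalid or already used centralauthtoken")

    user, target_wiki = entry
    if target_wiki != wiki:
        raise PermissionError("centralauthtoken was issued for another wiki")
    return user
```

Resolving the user this early would mean later rights checks see the token's user rather than the anonymous one, which is the concern raised for the later-hook option above.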

Task List

 * Special:UserLogin redirect to central domain
 * https://gerrit.wikimedia.org/r/#/c/59211/
 * Javascript check on local wiki to get central domain's auth status
 * https://gerrit.wikimedia.org/r/#/c/58924/
 * Api for getting a token on local wiki (or central?) good for authenticated call to another wiki
 * https://gerrit.wikimedia.org/r/#/c/57662/
 * Integrate patches onto Labs
 * Testing plan; Test
 * Plan
 * Setup SSL on labs for testing
 * Ensure CentralAuth & SUL(v1) continue to work with $wgCentralAuthLoginWiki = false, deploy code with 1.22wmf3 (merge by April 29!)
 * Work with ops to setup central domain
 * Enable $wgCentralAuthLoginWiki for wikidatatestwiki and test2wiki
 * Deploy SUL2 to all wikis
 * Enable iframe-based autologin instead of images (https://gerrit.wikimedia.org/r/#/c/62194/)
 * Remove image-based autologin

BLOCKERS

 * 1) IE8 needs a P3P file to send cookies with a cross-site iframe (done by Brad)
 * 2) Performance of JS Auth Check. Needs input from UI people. Possible solutions:
 * Decided to handle this with:
 * iframe-ping on login, to attempt to login any users who allow 3rd-party cookies, or who have an existing cookie from those sites
 * https://gerrit.wikimedia.org/r/#/c/62194/ and https://gerrit.wikimedia.org/r/#/c/64253/
 * When an anonymous user is logged in with the javascript, use a small notification instead of page refresh
 * https://gerrit.wikimedia.org/r/#/c/58924 (PS9)
 * 3) Waiting on test.wikidata.org to be deployed to finish limited-deployment testing

Rollout Plan

 * ✅ Set up the central wiki
 * ✅ Modify CentralAuth's interaction with the login to redirect to the central wiki and set the central cookie; continue to use Special:AutoLogin images temporarily.
 * ✅ Deploy limited tests to testwiki (only using memcached in pmtpa), test2wiki, testwikidatawiki (done on July 1st)
 * Enable autologin javascript on all wikis (scheduled for Thurs July 11th)
 * Disable Special:AutoLogin