Extension:Campaigns/Engineering

Use cases

 * A. Why is someone creating an account?
 * There are lots of paths into creating an account, but we can't tell which are effective.


 * B. Invitation to participate in a project, e.g. PhilippinesOutreach
 * We have a promotion that goes to on-wiki page(s) (not necessarily the Create account form). We'd like to know how many people followed it, and associate any account creations with the campaign.


 * C. Present special material during and after account creation and/or login.
 * If we know someone created an account or logged in to participate in the PhilippinesOutreach, then we can do better than presenting them the default GettingStarted page: we can give them special messaging.

Status
implements this extension


 * PHP code sets the cookie. This may not work with caching.
 * need to move ServerSideAccountCreation logging out of Extension:EventLogging to this extension, and add this campaign cookie to the logged event.

Implementation decisions
The extension's page has the high-level view; read that first.

Campaign names in ?campaign=someName  might be apsiw, PhilippinesOutreach2013, fromredlink, fromtutorial, etc. Code will limit length to 50 chars.
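As an illustration of the length limit, here is a minimal sketch in Python (not the extension's PHP). The function name, the whitespace trimming, and the choice to truncate rather than reject over-long names are assumptions; the notes only say the code will limit length.

```python
from typing import Optional

# 50 comes from the note above; a later note on this page suggests 30 instead.
MAX_CAMPAIGN_LENGTH = 50


def normalize_campaign(raw: Optional[str],
                       max_length: int = MAX_CAMPAIGN_LENGTH) -> Optional[str]:
    """Return a usable campaign name, or None if there is nothing to record.

    Hypothetical helper: truncation (rather than rejection) is an assumption.
    """
    if raw is None:
        return None
    name = raw.strip()
    if not name:
        return None
    # Cap the length so arbitrary query strings cannot bloat the cookie.
    return name[:max_length]
```

For example, `normalize_campaign("fromredlink")` returns the name unchanged, while an 80-character value is cut to 50 characters.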

Setting a cookie from ?campaign is an utterly general facility. But for clarity and scope we'll only enable it on links to Create account. (S Page tried to draw diagrams contrasting links to any landing page on our wiki cloud vs. a link straight to Create account, but Inkscape crashed.)

Desirable features

 * do all this on the server in PHP
 * remove the ?campaign=someName parameter from the URL

Decided
Don't do anything if the user is logged in.
 * A logged-in user creating an account isn't the target for campaign tracking. Instead of later filtering them out (e.g. drop ServerSideAccountCreation events with isSelfCreated false), we ignore them.
 * This means no need to check the "Exclude me from experiments" preference.

Don't log a campaign event, just set the cookie.
 * The only thing that matters is successful account creation from the campaign.
 * We're not logging anything else of interest: we're not tracking page impressions and clicks on Account creation (as we did for ACUX).

Don't set the session ID token.
 * Again, we're not tracking user behavior before creating an account (in fact as of 2013-06 few schemas in use besides Mobile log a user token), so no need to relate activity before account creation.
 * usertagging can happen upon SSAccountCreation.

Attempt to rewrite browser history state to remove ?campaign=someName in the browser's location field?
 * + Stops users from unintentionally propagating campaigns by bookmarking or sharing the landing URL.
 * - maybe we want to know every time anyone goes to a ?campaign=someName URL on the wiki, regardless of how they got the link.

Names for the project, for the query string, for the cookie
 * For now, ?campaign, and $wgCookiePrefix.campaign. We considered and discarded ?c=foo, ?camp=foo and scamp™

Why always set a session cookie?
 * It's the reliable way to associate a campaign that lands users on some wiki page with later account creation, and this is the most common and important use case.
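The decisions so far (skip logged-in users, log nothing, just set a session cookie) can be sketched as follows. This is Python for illustration only, not the extension's code. The function name is hypothetical, and treating `$wgCookiePrefix.campaign` as plain concatenation of the prefix and the word `campaign` is an assumption.

```python
from http.cookies import SimpleCookie
from typing import Optional


def campaign_set_cookie(cookie_prefix: str,
                        campaign: Optional[str],
                        logged_in: bool) -> Optional[str]:
    """Build the value of a Set-Cookie header for the campaign, or None.

    Hypothetical helper. No Expires or Max-Age attribute is set, so the
    browser treats the result as a session cookie.
    """
    # Decided above: do nothing at all for logged-in users.
    if logged_in or not campaign:
        return None
    name = f"{cookie_prefix}campaign"  # assumed concatenation, e.g. "enwikicampaign"
    cookie = SimpleCookie()
    cookie[name] = campaign
    cookie[name]["path"] = "/"
    return cookie[name].OutputString()


header = campaign_set_cookie("enwiki", "PhilippinesOutreach2013", logged_in=False)
skipped = campaign_set_cookie("enwiki", "PhilippinesOutreach2013", logged_in=True)
```

Here `header` carries the campaign name under the prefixed cookie name, and `skipped` is `None` because the visitor is already logged in.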

Why only do this on account creation URLs? It's a general approach.
 * Narrows the focus. If and when we need to do more, we'll do more.

Why PHP?
 * JS: client-side already has code to set user session token. Also bots and people who don't want it to be known that they participated in a campaign have some overlap with people who disable JS.

2013-06-06: we decided to allow any campaign value, so the schema will not enumerate the currently valid campaign values. The pros and cons of enumerating them would have been:
 * + stops people fabricating Rickroll links that set campaign cookie to troll values
 * + forces people to think about the campaigns they want in advance
 * + lets us invalidate campaigns
 * + [View history] provides a history of campaigns
 * + Adds a campaign gatekeeper, WMF staff can't just add ?campaign=SFMeetupHackers to their URLs.
 * - a lot(?) of churn in the schema: there will be a new Schema table each time campaigns are added or dropped, and we will constantly be updating the extension.
 * - updating the schema rev in PHP will be constant busy-work

What about multiple campaigns?
 * If the user follows one link with ?campaign in it and then clicks another link with ?campaign, the second replaces the first in the session cookie; only the most recent is associated with the account creation. This is OK since the thing that encouraged account creation is the "closest" campaign, but it means we can't abuse ?campaign to track all the ways users arrive at the Create account form.


 * We could enhance the cookie to store multiple campaigns, but YAGNI.
 * If and when we roll this out more generally, wiki URLs into account creation could use a different query string parameter, e.g., ("create account source", also topLeftNav, WikiTutorial, HelpLoggingIn, etc. ).
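The "most recent campaign wins" behavior can be sketched like this, in Python for illustration. The function name and the dictionary standing in for the session cookie are hypothetical, and for brevity the sketch looks at every URL rather than only account creation URLs.

```python
from typing import Dict, Iterable, Optional
from urllib.parse import parse_qs, urlsplit


def campaign_after_visits(urls: Iterable[str]) -> Optional[str]:
    """Return the campaign left in the session after visiting urls in order."""
    session: Dict[str, str] = {}  # stand-in for the single-valued session cookie
    for url in urls:
        values = parse_qs(urlsplit(url).query).get("campaign")
        if values:
            # A later ?campaign simply overwrites the earlier one.
            session["campaign"] = values[-1]
    return session.get("campaign")


visits = [
    "https://example.org/wiki/Landing?campaign=Arch2013",
    "https://example.org/wiki/Help:Contents",
    "https://example.org/w/index.php?type=signup&campaign=fromredlink",
]
final = campaign_after_visits(visits)
```

After these three visits, `final` is `fromredlink`: the earlier `Arch2013` value is gone, which is the trade-off described above.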

Question: should we log campaign cookie contents in case there's already something in it and we overwrite?
 * No, nobody has time to puzzle out this crap!

Question: do we overload this to handle userbucketing for A/B tests?
 * No, we have mw.user.bucket for that.

Idea: put this in a general-purpose session cookie with other session info.
 * YAGNI for now.

Question: do we enable this for mobile?
 * Sure, why not?

Use cases vs. sketch
A. External calls to action that link directly to account creation work great, they just append ?campaign=fundraiserCTA5.

B. This doesn't address it. External calls to action that link to a landing page don't log anything, since we only look for ?campaign on account creation pages. But if the landing page has a "So create an account and get started!" CTA, it should put ?campaign in that link.

C. We can offer a custom on-wiki message and/or modify GettingStarted to present something different based on campaign session cookie. Steven Walling cautions that reshaping experience based on campaign by redirecting the user and manipulating pages has NOT worked well in the past (see Account Creation Improvement Project).

Do we want to enforce valid campaigns by enumerating them in the schema?

 * + simpler not to
 * + avoids deployment problems
 * +/- leads to free for all
 * - no registry of valid campaigns (but a registry wouldn't stop someone misusing or copy-pasting the URL of a valid campaign somewhere else)
 * - some risk of someone in a session clicking on an old link that has an invalid ?campaign in it that usurps a valid campaign

Decision: don't validate campaign; any string will log an event and set the cookie. S will probably enforce a maximum length of 30.

Do this everywhere or just on account creation?
Ori-l: why do this everywhere? Why not target just AccountCreation where we know what we're doing and we have clear use cases?
 * means we can't point Archaeologists to the Archaeology project landing page?campaign=Arch2013. We'd have to add ?campaign= to each of the links to create an account on the landing page.


 * S: Sure, it's doable, back to ACUX. It remains a general-purpose feature but we limit its scope by only checking for ?campaign=FINTO on account creation pages.
 * DarTar points out it's not the same as what we did for ACUX because we aren't remembering campaign permanently in userbuckets.

Separate parameter (?casrc) for account creation?
We're sticking with one campaign at a time. The most likely conflict will be a campaign to bring people to the site getting usurped by instrumentation tracking how people wind up creating an account. Why not plan for this from the start by separating "came into site, anywhere, with a ?campaign= URL" from "arrived at account creation with a ?casrc=redlink parameter"?
 * No, stick with ?campaign= only on account creation events. When and if we broaden then we can split them off.

Ori-l: don't set a campaign session cookie?

 * + If we only log campaigns on the Create account URL, then we could drop the campaign cookie and instead add a hidden form field to Create account.
 * - DarTar: no, we need a session cookie in case you leave account creation and come back.

Decision: DarTar's point seems dispositive, so we'll continue to set a DBName.campaign session cookie.

Remove the ?campaign from the URL

 * Ori says yes.
 * StevenW says sure, it reduces false positives where someone shares the URL with others.
 * Matt says it might be useful to know someone shared a campaign URL, but then a campaign event doesn't mean "User clicked on a URL with a campaign" but "User somehow got hold of a URL with a campaign in it."
 * DarTar: depending on legal, maybe we shouldn't remove it. For example, someone uses a URL shortener like bitly, which obscures ?campaign=FundRaiser in the URL; they come to our wiki, we remove ?campaign=FundRaiser, and the user never learns we're tracking.

JavaScript or PHP
Ori and Matt still in favor of PHP solution. We could still remove ?campaign from the URL by issuing a 301 redirect.
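A server-side removal of ?campaign via redirect might look roughly like the sketch below. It is written in Python for illustration, not in the extension's PHP. The function names, the example.org URLs, and the choice to leave `:` and `/` unescaped in the rebuilt query string are assumptions.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_campaign(url: str, param: str = "campaign") -> Tuple[str, Optional[str]]:
    """Return (url without the campaign parameter, campaign value or None)."""
    parts = urlsplit(url)
    campaign = None
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == param:
            campaign = value  # if repeated, the last occurrence wins
        else:
            kept.append((key, value))
    # safe=":/" keeps titles such as Special:UserLogin readable.
    clean_query = urlencode(kept, safe=":/")
    return urlunsplit(parts._replace(query=clean_query)), campaign


def redirect_for(url: str, status: int = 301) -> Optional[Tuple[int, str]]:
    """Return (status, Location) when the URL carries a campaign, else None."""
    clean_url, campaign = strip_campaign(url)
    if campaign is None:
        return None
    # The campaign cookie would be set on this same response before redirecting.
    return status, clean_url


signup = "https://example.org/w/index.php?title=Special:UserLogin&type=signup&campaign=Arch2013"
result = redirect_for(signup)
```

Here `result` is a 301 pointing at the same signup URL minus `campaign=Arch2013`; a URL without the parameter yields `None`, so no redirect is issued.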