Extension:CentralNotice/Notes/Campaign-associated mixins and banner history

From mediawiki.org

Here are some notes about planned changes in CentralNotice and Fundraising banner functionality. We'll improve CentralNotice logging performance, change how Fundraising banners choose to show or hide themselves, and provide more data on users' history of banner and page views to help Fundraising improve banner effectiveness.

These improvements will be rolled together because they have overlapping technical requirements.

Motivation

Data and logging

Fundamental unit for the history log: campaign selection event. This event occurs every time a user is included in a campaign, even if a banner is not actually shown.

Also, whenever banner history is logged, the following additional data would ideally be sent: project, device, sample rate, KV storage errors.

Proposal 1

This proposal would send back the full log content. Due to EventLogging limitations, it looks like we won't do this for now.

{
    log: [
        {
            language: "en",
            country: "LI",
            isAnon: false,
            banner: "FR_desktop_01",
            campaign: "FundraisingC1_2015",
            campaignCategory: "fundraising",
            bucket: 0,
            time: "1435795140",
            status: "banner_shown",
            bannerNotGuaranteedToDisplay: true
        },
        {
            language: "en",
            country: "LI",
            isAnon: false,
            campaign: "FundraisingC1_2015",
            campaignCategory: "fundraising",
            bucket: 0,
            time: "143589241",
            status: "banner_canceled",
            bannerNotGuaranteedToDisplay: true,
            bannerCanceledReason: "close" 
        }
    ],
    project: "wikipedia",
    device: "desktop",
    rate: 0.05,
    kvStorageErrorsLog: [
        {
            message: "LocalStorage not available.",
            time: "1435795233"
        }
    ]
}

To get around the limit on the size of EventLogging payloads, we could have three schemas: one for banner history log entries, another for global log data, and one more for KV store errors. So, the above log could be broken into four events, linked by a unique logId, as follows:

Banner history log entry

{
    logId: "43123123",
    language: "en",
    country: "LI",
    isAnon: false,
    banner: "FR_desktop_01",
    campaign: "FundraisingC1_2015",
    campaignCategory: "fundraising",
    bucket: 0,
    time: "1435795140",
    status: "banner_shown",
    bannerNotGuaranteedToDisplay: true
}

Banner history log entry

{
    logId: "43123123",
    language: "en",
    country: "LI",
    isAnon: false,
    campaign: "FundraisingC1_2015",
    campaignCategory: "fundraising",
    bucket: 0,
    time: "143589241",
    status: "banner_canceled",
    bannerNotGuaranteedToDisplay: true,
    bannerCanceledReason: "close" 
}

Global log data

{
    logId: "43123123",
    project: "wikipedia",
    device: "desktop",
    rate: 0.05
}

KV store errors

{
    logId: "43123123",
    message: "LocalStorage not available.",
    time: "1435795233"
}
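
As a rough illustration of the split described above, here is a minimal sketch that turns a full log (in the Proposal 1 shape) into one event per schema, all linked by the same logId. The function name and the schema labels are hypothetical placeholders, not real EventLogging schemas, and actually sending the events is left out.

```javascript
// Sketch only: splits a full banner history log into per-schema events that
// share a logId. The schema labels below are hypothetical.
function splitLogIntoEvents(fullLog, logId) {
    const events = [];

    // One event per banner history log entry.
    for (const entry of fullLog.log) {
        events.push({
            schema: 'BannerHistoryEntry',
            event: Object.assign({ logId: logId }, entry)
        });
    }

    // A single event for data that applies to the whole log.
    events.push({
        schema: 'BannerHistoryGlobal',
        event: {
            logId: logId,
            project: fullLog.project,
            device: fullLog.device,
            rate: fullLog.rate
        }
    });

    // One event per KV storage error, if any were recorded.
    for (const error of fullLog.kvStorageErrorsLog || []) {
        events.push({
            schema: 'BannerHistoryKVError',
            event: Object.assign({ logId: logId }, error)
        });
    }

    return events;
}
```

For the example log above, with two history entries and one KV storage error, this would produce four events.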

It would also be possible to denormalize most of this and consolidate it into a single EL schema.

Proposal 2

This proposal reduces log data and property names to a bare minimum to get around the current EventLogging payload limit.

{
    l: [
        {
            b: "FR_desktop_01",
            t: "1435795140",
            s: 6
        },
        {
            c: "FundraisingC1_2015",
            t: "1435795140",
            s: 2.1
        }
    ],
    n: 2,
    r: 0.05,
    e: "1435795233"
}

Here, tentatively, l would be the log, b the banner, c the campaign and t the time. Campaign would only be included for log entries of events in which no banner was chosen. s would be a code based on mw.centralNotice.internal.state.STATUSES and the possible reasons for a status that we know may occur (to distinguish close button hides, donate cookie hides, etc.). n would be the total number of entries in the log, r the sample rate and e the time of the most recent KV storage error (if any). At least for fundraising banners, campaign, device, country, project, language, category, bucket and isAnon could probably be inferred from the banner name and time.

It seems that this approach would probably allow 8-10 log entries to be sent back, maybe more.
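
To make the size trade-off concrete, here is a minimal sketch of building the compact payload and dropping the oldest entries until it fits a size budget. It assumes entries are ordered oldest first and already carry a numeric statusCode; the function names, the statusCode field and the maxEncodedLength parameter are all hypothetical, and the real EventLogging limit is not specified here.

```javascript
// Sketch only: compacts log entries to the short property names of this
// proposal. Field names on the input entries are assumptions.
function compactEntry(entry) {
    // Campaign is only included when no banner was chosen.
    const compact = entry.banner ? { b: entry.banner } : { c: entry.campaign };
    compact.t = entry.time;
    compact.s = entry.statusCode;
    return compact;
}

// Builds the payload, dropping the oldest entries until the URL-encoded JSON
// is no longer than maxEncodedLength (a hypothetical budget).
function buildCompactPayload(entries, rate, lastKvErrorTime, maxEncodedLength) {
    const total = entries.length; // n counts all entries, not only those sent
    let kept = entries.map(compactEntry);

    for (;;) {
        const payload = { l: kept, n: total, r: rate };
        if (lastKvErrorTime) {
            payload.e = lastKvErrorTime;
        }
        const size = encodeURIComponent(JSON.stringify(payload)).length;
        if (size <= maxEncodedLength || kept.length === 0) {
            return payload;
        }
        kept = kept.slice(1); // drop the oldest entry and try again
    }
}
```

Because n always reports the full log length, the receiving side could tell how many entries were dropped to fit the budget.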

Implementation details

S:RI = Special:RecordImpression, S:BL = Special:BannerLoader

Campaign-associated Mixins

Implementation:

  • Campaign-associated Mixins will be RL modules
  • Instead of carrying JS, choiceData will just have the names of any campaign-associated mixin modules needed and the parameters to be used
  • CNBannerChoiceDataResourceLoaderModule will set the mixin modules as dependencies as needed. Note that RL modules are cached client-side, so any given module will only need to be sent from the server the first time a user is targeted by a campaign that needs it. On subsequent page views, the module should be retrieved from the browser's localStorage. Either way, no extra requests.
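
To illustrate the points above, here is a minimal sketch of what choiceData might look like once it carries only mixin module names and parameters, plus a helper that collects the module names that would need to be declared as dependencies. The data shape, module names, parameter names and helper are all hypothetical, and the real dependency resolution would happen server-side in CNBannerChoiceDataResourceLoaderModule rather than in JavaScript.

```javascript
// Sketch only: a hypothetical choiceData shape in which each campaign lists
// its mixin modules by name, with the parameters to use for each.
const exampleChoiceData = [
    {
        name: 'FundraisingC1_2015',
        mixins: {
            'ext.centralNotice.exampleMixinA': { rate: 0.05 },
            'ext.centralNotice.exampleMixinB': { maxImpressions: 3 }
        }
    },
    {
        name: 'ExampleCampaign2',
        mixins: {
            'ext.centralNotice.exampleMixinA': { rate: 0.01 }
        }
    }
];

// Collects the distinct mixin module names used by any campaign, so that each
// module is listed as a dependency only once.
function collectMixinModules(choiceData) {
    const modules = new Set();
    for (const campaign of choiceData) {
        for (const moduleName of Object.keys(campaign.mixins || {})) {
            modules.add(moduleName);
        }
    }
    return Array.from(modules).sort();
}
```

With this shape, changing a parameter value only changes choiceData, while changing a module's JS still requires a deploy.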

Non-blocking issues and considerations:

  • Some refactoring of the JS currently in-banner will be needed.
  • We'll try to consolidate bits of JS that are currently used together. It doesn't make sense to have very small modules or to spam the RL module registry.
  • We should only mixinize bits of JS that aren't expected to change too frequently, and make sure that the Mixins' parameters can handle most foreseeable tuning needs. Changes in the actual JS will require a deploy.

The other option considered was to add commonly used functions to bannerController.lib and include small JS snippets in choiceData. However, it looks like the above approach will be more performant overall.

We'll need new UI elements for campaign-associated Mixins on Meta, the infrastructure wiki. Here's a mock-up of a possible, simple initial version:

In this layout, controls for campaign-associated Mixins would have their own area just above controls for banners. The available parameters for each Mixin would show or hide dynamically when the Mixin is selected or unselected.

In-browser data use cases

Here are three contexts we may need for in-browser data, and some possible use cases:

  1. Campaign context
    • Counter for displaying a banner after a certain number of page views
    • Counter for how many times a banner has been shown for a given campaign
    • User bucket
    • Location in treatment workflow
  2. Category context
    • (Fundraising) Flag for whether the user has seen a full-screen banner
    • (Fundraising) Date when the user last donated
    • Flag for whether the user clicked the close button
  3. Global context
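
One way to keep these contexts apart in a key-value store is to build the context into each key. Here is a minimal sketch under that assumption; the key format, the "CN" prefix and the function names are hypothetical. The storage argument is any object with getItem and setItem, such as the browser's localStorage.

```javascript
// Sketch only: context-scoped keys on top of a localStorage-like object.
const CONTEXTS = [ 'campaign', 'category', 'global' ];

// Builds a key such as "CN|campaign|FundraisingC1_2015|pageViews".
// For the global context, contextName is ignored.
function makeKey(context, contextName, key) {
    if (!CONTEXTS.includes(context)) {
        throw new Error('Unknown context: ' + context);
    }
    const parts = context === 'global' ?
        [ context, key ] :
        [ context, contextName, key ];
    return 'CN|' + parts.join('|');
}

function createKvStore(storage) {
    return {
        // Returns true on success, false if the underlying storage failed
        // (for example, when it is unavailable or full).
        set: function (context, contextName, key, value) {
            const fullKey = makeKey(context, contextName, key);
            try {
                storage.setItem(fullKey, JSON.stringify(value));
                return true;
            } catch (e) {
                return false;
            }
        },
        // Returns the stored value, or null if nothing is stored.
        get: function (context, contextName, key) {
            const raw = storage.getItem(makeKey(context, contextName, key));
            return raw === null ? null : JSON.parse(raw);
        }
    };
}
```

A page-view counter for one campaign and a close-button flag for the fundraising category would then live under separate keys and could not overwrite each other.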

Client-side error logging

Here are some conditions we might log:

  • If no choiceData is received.
  • If choiceData contains out-of-date campaigns.
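
The two checks above could be sketched as follows. This assumes choiceData is an array of campaigns and that each campaign carries a name and an end time as a Unix timestamp in seconds; those field names, the error codes and the function name are hypothetical.

```javascript
// Sketch only: returns a list of error descriptions for the given choiceData.
// An empty list means no problems were detected.
function findChoiceDataErrors(choiceData, nowSeconds) {
    // No choiceData was received at all.
    if (!Array.isArray(choiceData)) {
        return [ { code: 'no_choice_data' } ];
    }

    // Campaigns whose (assumed) end time has already passed are out of date.
    return choiceData
        .filter(function (campaign) {
            return campaign.end < nowSeconds;
        })
        .map(function (campaign) {
            return { code: 'out_of_date_campaign', campaign: campaign.name };
        });
}
```

An empty array is treated as valid here, since a user may simply not be targeted by any campaign.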

Technical roadmap

  1. Implement campaign-associated Mixins
  2. Implement category-associated Key-Value storage—a more organized way to keep FR data on the client. First attempt will be to use local storage or something similar, instead of piles of cookies. Try to strike a balance between relative robustness and performance.
  3. Determine details of the data pipeline to log from the client. Maybe the first option will be EventLogging to Kafka. Here we might just defer to whatever Analytics and Operations recommends.
  4. Refactor bannerController API as needed.
  5. Improvements in administration UI for campaign details.
  6. Implement the specific Mixins for collecting the data and sending it back, for Banner History. (Provides feature 1.)
  7. Re-implement or move the code for banner show/hide and which-banner-to-show logic to campaign-associated Mixins. (Provides features 2 and 4.)
  8. Turn off S:RI and set everything up so impressions can be obtained via S:BL. (Provides feature 3.)
  9. Client-side error logging to be sure things are working as expected.

Features roadmap

  1. Banner history via a new logging mechanism.
  2. Banner count via S:BL logs: how many times a given banner has been shown (slow to get).
  3. Fast to get (sampled?) banner impressions. Turn off S:RI. 
  4. Don't lose impression after full screen.

See also