Article feedback/Version 5/Technical Design

Work-in-progress.

Overview
The AFTv5 plugin will re-use parts of the existing ArticleFeedback extension, including some of the database tables, JavaScript, and PHP. The plugin will still be built as an API, with two major calls - one to fetch the ratings data and load the form, and a second to save the data. Both will submit via asynchronous XHR GET request, and pass data in the JSON format. The feedback review page will continue to be a special page, with permissions-restricted moderation tools. All of the configuration options are maintained, with the exception of wgArticleFeedbackRatingTypes, which has been superceded by relational storage in the database, and configurable questions/answers.

Major components

 * 1) Work will occur in a fork of ArticleFeedback extension, ArticleFeedbackv5.
 * 2) The "leave feedback" widget:
 * 3) Various form elements (text areas, rating widgets, feedback type radios, etc) are included based on which "bucket" a user is placed in.
 * 4) This is based on IP address (modulo of the last two digits of the user's IP address and the number of buckets).
 * 5) Determining which "bucket" to use is made in the PHP of the API call, but the actual HTML of the feedback form, and the JS to control it's behavior, are defined in Javascript.
 * 6) A PHP backend to save the data from the feedback form.
 * 7) Calls To Action - markup may well end up hard-coded in the JS, along with the feedback form options. Current requirements logic is to select one CTA at random, and render it.
 * 8) The "view feedback" page:
 * 9) Moderation tools - hide, promote (to talk page), or flag various pieces of feedback.
 * 10) Sorting and filtering options - good comments first (how that's defined is unclear), newest first, etc. Basicallym tools are needed to manage or at least hide the inevitable flood of spam "feedback".
 * 11) A special page, on the level of the talk pages - add a link to this in the not-quite-top-level navigation on said talk page.
 * 12) Need to work out which bits of this page are must-have-now vs must-have-eventually.
 * 13) Phase 1 will have a limited version with:
 * 14) Filters: All, Visible only
 * 15) Moderation Tools: Hide this comment
 * 16) Sorting: By date only
 * 17) Pagination: Fixed at 50
 * 18) Database change overview:
 * 19) Prefix all table names with "aft_", per WMF.
 * 20) Alter table article_feedback to drop columns aa_rating_id and aa_rating_value, and add columns for created and modified timestamps, is_submitted (boolean)
 * 21) Create additional tables to store ratings and comments relationally, allowing for arbitrary fields in arbitrary "buckets" of feedback.
 * 22) Rename article_fedback_pages and _revisions to something more descriptive of them being rollups, and add page/revision rollup tables to track averages of the boolen or select inputs (IE, the percentage who "did you find what you were looking for", or a breakdown on feedback type - "suggestion", "problem", "praise"). Add bucket column to all tables, for A/B split evaluation.
 * 23) Change rollup logic to recalculate based on averages/sums, instead of adding/subtracting ratings totals on each submission.
 * 24) Remove any restrictions on the number of feedback records a given user may record, or any capacity to overwrite/edit feedback records, per WMF 11/7, and allow any number of records to be recorded. Note that the actual ratings calculations are to be based on only the most recent feedback ratings per user per article revision, per Fabrice's 11/9 email.
 * 25) Extension installer files will have to be created or updated as needed.

Database schema


When a feedback form is rendered, the hit on that bucket is logged to the rollup table: aft_article_hits (   aah_page_id     integer unsigned NOT NULL,    aah_revision_id integer unsigned NOT NULL,    aah_bucket_id   integer unsigned NOT NULL,    aah_hits        integer unsigned DEFAULT 0,    PRIMARY KEY (aah_page_id, aah_revision_id) )

And a record is created in the article_feedback table: aft_article_feedback ( aa_id               integer unsigned NOT NULL PRIMARY KEY AUTO_INCREMENT,  aa_page_id         integer unsigned NOT NULL,  aa_user_id         integer NOT NULL,  aa_user_text       varbinary(255) NOT NULL,  aa_user_anon_token varbinary(32) NOT NULL DEFAULT '',  aa_ip_address      varchar(32),  aa_is_submitted    boolean DEFAULT FALSE,  aa_revision        integer unsigned NOT NULL,  aa_design_bucket   int unsigned NOT NULL DEFAULT 0,  aa_created         timestamp NULL,  aa_modified        timestamp NULL )

The "aa_is_submitted" column will be false, which lets us track partial submissions, but on submission of a feedback form, the record will be updated, and also, depending on the feedback form, a number of rows may be added here, one row per recordable answer: aft_article_answer (   aaaa_feedback_id        integer unsigned NOT NULL,    aaaa_field_id           integer unsigned NOT NULL,    aaaa_response_rating    integer NULL,    aaaa_response_text      text NULL,    aaaa_response_bool      boolean NULL,    aaaa_response_option_id integer unsigned NULL,    PRIMARY KEY (aaaa_feedback_id, aaaa_field_id) )

aaaa_feedback_id is the aa_id in aft_article_feedback, and aaaa_field_id is from the below table: aft_article_field (   aaf_id        integer unsigned NOT NULL PRIMARY KEY AUTO_INCREMENT,    aaf_name      varchar(255) NOT NULL UNIQUE,    aaf_data_type ENUM('text', 'boolean', 'rating', 'select'),    aafg_group_id  integer unsigned NULL )

The value of aaf_data_type will determine which of the "aaaa_response_" columns is filled in - only one will be set for each answer row. Group ID is optional, and is used for grouping similar inputs on reports.

After that, there are four rollup tables that store reporting data and averages, which are updated on each for submission. There are two tables grouped by pageID, which count the sum of all revisions, back to the wgArticleFeedbackRatingLifetime, and two for revisions, which are grouped by revisionID: aft_article_feedback_ratings_rollup ( aap_page_id   integer unsigned NOT NULL,  aap_rating_id integer unsigned NOT NULL,  aap_total     integer unsigned NOT NULL,  aap_count     integer unsigned NOT NULL,  PRIMARY KEY (aap_page_id, aap_rating_id) )

aft_article_revision_feedback_ratings_rollup is identical, save for including an "afr_revision" column, and the prefix changing from "aap_" to "afr_"

These tables are used for aggregating numerical ratings (trustworthiness from 1-5, etc), and boolean inputs (did you find what you're looking for, with "yes" being a 4 and "no" being a 2, per WMF requirements).

There are also a second set of rollup tables for counting the option-based questions:

aft_article_feedback_select_rollup (   aafsr_page_id   integer unsigned NOT NULL,    aafsr_option_id integer unsigned NOT NULL,    aafsr_total     integer unsigned NOT NULL,    aafsr_count     integer unsigned NOT NULL,    PRIMARY KEY (aafsr_page_id, aafsr_option_id) )

As above, there exists an aft_article_revision_feedback_select_rollup table, which is identical save for the revision_id column, and with the "aarfsr_" prefix. This is used for determining, for example, how many feedback comments were marked as suggestions versus problems or praises.

Free-text comment fields are not rolled up.

Page Flow

 * 1) On article pageload, JS is to make AJAX call to articlefeedback query API.
 * 2) GET request. Parameters: pageid (required), revisionid (optional), subaction (optional, value of "showratings" will result in a return of JSON hash of the page's ratings averages)
 * 3) PHP code will determine which bucket to use, and log the hit of that bucket - useful for measuring response rates of each bucket, or for the proposed option 4, which has nothing to submit, and is only an "edit" CTA.
 * 4) Returns JSON object with "bucketId" field.
 * 5) JS will load feedback widget HTML and configure event listeners/binds, based on bucketID.
 * 6) On submit, the JS will pass the form variables, via GET/POST (TBD) method, to the articlefeedback API method.
 * 7) PHP backend will save the data, and update the rollup tables using for reporting. Will return either error message or (TBD - probably success = 1), via JSON.
 * 8) JS will determine which CTA to display - currently there's only one, but later phases will add more.

In other words (from js perspective)... User arrives at page -> build appropriate form -> (option 5 only) User clicks the view link -> replace form with average ratings display -> User submits form -> submit to API -> has errors -> show errors -> has no errors -> select random CTA and display

Bucketing Algorithm
See diagram



Tasks (to be split up)

 * SVN wrangling: get switched over to commit access instead of read-only (Greg, Reha locally)
 * Copy over new extension (ArcticleFeedbackV5) and commit (Greg)
 * Database changes (Greg)
 * Table alters (and inserts for article_fields and article_field options)
 * Update install script
 * Update queries
 * Update php (Greg, unless Reha finishes early)
 * Drop a line in article_feedback and pass to js
 * Add bucket decision-making
 * Clear/insert (feedback id + page id) on article_answers
 * Update rollup code to handle bools and selects
 * Revamp js (Reha)
 * Carry the feedback id and bucket id
 * Switch between forms based on bucket id
 * HTML into ajax response
 * Feedback dashboard page
 * Gut and replace with a super-simple list as above (use MoodBar as an example)