Article feedback/Public Policy Pilot/Technical

From mediawiki.org

Process

Page Load

This is the process that occurs when a user goes to a page initially.

  1. Page request is received.
  2. If pageID is within the pilot's purview, we continue:
    1. After full page load, the Article Assessment (AA) JavaScript fires.
      • If the browser has a cookie value amounting to "has ever given a rating", or the user is logged in, the request fired by the JavaScript also tries to fetch this user's previous rating values.
    2. The server catches this request and returns the correct data:
      • The values for the current aggregates for all four questions, as well as the number of ratings received.
      • If there are individual ratings to be returned (user-level):
        • The value given by the user for each question
        • Whether or not the values are "stale"
        • The number of revisions that have passed since the user gave these ratings.
    3. Upon receipt of the data from the server, the client-side JavaScript assembles the visuals as needed and injects the data into the page.
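
The request and response logic above can be sketched as follows. This is a minimal illustration, not the extension's actual code: the function names, the payload keys, and the use of revision positions in the page history are all hypothetical, and the rule that ratings become stale as soon as one newer revision exists is an assumption, since the page does not define staleness.

```python
from typing import Optional


def needs_user_ratings(has_rated_cookie: bool, logged_in: bool) -> bool:
    """Return True when the client should ask for the user's earlier ratings."""
    return has_rated_cookie or logged_in


def build_page_load_response(
    aggregates: dict,
    user_ratings: Optional[dict],
    rated_revision_index: Optional[int],
    current_revision_index: int,
) -> dict:
    """Assemble the payload the server returns on page load.

    aggregates maps rating id -> {"total": int, "count": int}.
    user_ratings maps rating id -> value, or is None when nothing is stored.
    Revision indexes are positions in the page history (hypothetical).
    """
    response = {"aggregates": aggregates}
    if user_ratings is not None and rated_revision_index is not None:
        revisions_since = current_revision_index - rated_revision_index
        response["user"] = {
            "ratings": user_ratings,
            # Assumption: ratings are stale once any newer revision exists.
            "stale": revisions_since > 0,
            "revisions_since": revisions_since,
        }
    return response
```

A user who rated the revision two steps behind the current one would receive `stale` set to true and `revisions_since` set to 2; a user with no stored ratings receives only the aggregates.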

Rating Submission

This is the process that occurs when a user submits ratings. It is effectively the same process whether the ratings are fresh, stale, or a re-rating of the same revision.

  1. User selects values for the questions.
    • User is not required to submit values for all four ratings.
  2. User clicks the submit button.
  3. If this is a new rating for the user on this revision, new rows are inserted into the database. If it is a "re-rating" of the revision, the existing rows are updated instead. In effect this is an upsert on the userid/pageid/revisionid key.
  4. The aggregate rating values are calculated at this point.
  5. Data is returned to the client in the same format as on page load, and the components are updated (not reloaded).
  6. A "this user has given a rating to something" cookie is set on the client.
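
The upsert in step 3 can be sketched with an in-memory SQLite table. This is only an illustration under stated assumptions: the real extension targets the MediaWiki database, the column list is trimmed (the timestamp is omitted), the primary key also includes rating_id because the Database section describes four rows per user per revision, and `open_db`, `submit_ratings` and `row_count` are hypothetical helper names.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory table keyed on page, user, revision and rating."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE article_assessment (
            page_id INTEGER, user TEXT, revision INTEGER,
            rating_id INTEGER, rating_value INTEGER,
            PRIMARY KEY (page_id, user, revision, rating_id)
        )
        """
    )
    return db


def submit_ratings(db, page_id, user, revision, values):
    """Store one value per rating id; a re-rating overwrites the old row."""
    for rating_id, value in values.items():
        # INSERT OR REPLACE acts as the upsert: the primary key decides
        # whether this is a new row or an overwrite of an existing one.
        db.execute(
            "INSERT OR REPLACE INTO article_assessment VALUES (?, ?, ?, ?, ?)",
            (page_id, user, revision, rating_id, value),
        )
    db.commit()


def row_count(db) -> int:
    """Return the number of stored rating rows."""
    return db.execute("SELECT COUNT(*) FROM article_assessment").fetchone()[0]
```

Submitting twice for the same user, page and revision leaves four rows in the table, while submitting for a new revision adds four more.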

API

list=articleassessment

action=query&list=articleassessment&format=json&aapageid=

where aapageid is the page ID of the desired page.

This will return a cached result in an object nested like so:

{
	"query": {
		"articleassessment": [
			{
				"pageid": "1",
				"ratings": [
					{
						"ratingid": "1",
						"ratingdesc": "articleassessment-rating-wellsourced",
						"total": "5",
						"count": "1"
					},
					{
						"ratingid": "2",
						"ratingdesc": "articleassessment-rating-neutrality",
						"total": "5",
						"count": "1"
					},
					{
						"ratingid": "3",
						"ratingdesc": "articleassessment-rating-completeness",
						"total": "5",
						"count": "1"
					},
					{
						"ratingid": "4",
						"ratingdesc": "articleassessment-rating-readability",
						"total": "5",
						"count": "1"
					}
				]
			}
		]
	}
}

Where the ratingid values 1-4 are the indices of the dimensions of the review, count is the number of ratings received, and total is the sum of the rating values.
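
Since the response carries only total and count, a client has to derive the mean itself. The following sketch shows one way to do so; the function name `average_ratings` is hypothetical, and it assumes the numbers arrive as strings, as in the sample above.

```python
import json


def average_ratings(response_text: str) -> dict:
    """Return {pageid: {ratingdesc: mean}}; the mean is None when count is 0."""
    data = json.loads(response_text)
    averages = {}
    for page in data["query"]["articleassessment"]:
        per_page = {}
        for rating in page["ratings"]:
            # The sample response encodes numbers as strings.
            total = int(rating["total"])
            count = int(rating["count"])
            per_page[rating["ratingdesc"]] = total / count if count else None
        averages[page["pageid"]] = per_page
    return averages
```

Returning None for a rating with no votes avoids a division by zero and lets the caller decide how to display an unrated dimension.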

action=articleassessment

action=articleassessment&pageid=&revid=&r1=&r2=&r3=&r4=
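
The page does not describe the parameters, but pageid and revid presumably identify the page and revision being rated, and r1 to r4 presumably carry the values for rating IDs 1 to 4. Under those assumptions, a query string could be built as in the sketch below. The helper name `build_submit_query` is hypothetical, and sending 0 for an unanswered rating is an assumption based on the Database section rather than documented API behaviour.

```python
from urllib.parse import urlencode


def build_submit_query(page_id: int, rev_id: int, ratings: dict) -> str:
    """Return the query string for action=articleassessment.

    ratings maps a rating index (1-4) to its value; indexes that are
    missing are sent as 0, meaning "no value given" (an assumption).
    """
    params = {"action": "articleassessment", "pageid": page_id, "revid": rev_id}
    for index in range(1, 5):
        params[f"r{index}"] = ratings.get(index, 0)
    return urlencode(params)
```

For example, `build_submit_query(1, 10, {1: 5, 3: 4})` yields `action=articleassessment&pageid=1&revid=10&r1=5&r2=0&r3=4&r4=0`.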

Database

The data will be stored in the following tables:

article_assessment_rating

Maps metrics to ids.

Column    Contents
id        unique rating id
rating    rating key (i18n key?)

article_assessment

Will hold four rows per user per revision (one per rating), with the following columns:

Column        Contents
page_id       page.page_id
user          username or IP
revision      revision.rev_id
timestamp     MW timestamp
rating_id     rating ID
rating_value  value

If a user does not provide a value for a rating (0 stars), a 0 is stored. Such rows will not be counted in the aggregate values (for now).
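
The exclusion of zero values from aggregation can be sketched as below. This is an illustrative helper with a hypothetical name, operating on plain (rating_id, rating_value) pairs rather than on the real table.

```python
def aggregate(rows):
    """Compute {rating_id: (total, count)} from (rating_id, rating_value) pairs.

    Rows with a value of 0 mean "no value given" and are skipped.
    """
    totals = {}
    for rating_id, value in rows:
        if value == 0:
            continue
        total, count = totals.get(rating_id, (0, 0))
        totals[rating_id] = (total + value, count + 1)
    return totals
```

A rating that has only zero rows does not appear in the result at all, so it contributes neither to total nor to count.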

article_assessment_pages

Will hold one entry per page per rating we're measuring, with the following columns:

Column     Type  Contents
page_id          page.page_id
rating_id  UINT  which rating
total      UINT  total rating values
count      UINT  total ratings

Where rating_id refers to the rating being measured, as defined in article_assessment_rating (in the API example above, 1 = well-sourced and 3 = completeness).
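
The page does not say how total and count are kept up to date. One possible approach, shown here purely as an assumption, is to adjust the stored pair incrementally whenever a rating row is inserted or updated, treating 0 as "not counted". The function name is hypothetical.

```python
def apply_rating_change(total: int, count: int, old_value: int, new_value: int):
    """Return the updated (total, count) for one page/rating pair.

    old_value is 0 when the user had no counted rating for this revision;
    new_value is 0 when the user leaves the rating empty.
    """
    if old_value:
        # Remove the contribution of the rating being replaced.
        total -= old_value
        count -= 1
    if new_value:
        # Add the contribution of the new rating.
        total += new_value
        count += 1
    return total, count
```

The alternative is to recompute both values from article_assessment on every submission, which is simpler but reads more rows.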

Assumptions

  • The page's pageid and current revid need to be exposed to JavaScript as variables.
  • Historical information will be retained per user per article. That is: if a user rates a given article 5 times on 5 different revisions, all 5 ratings will be stored to facilitate "over time" statistical analysis.
    • Note that if the user re-rates the same revision, the data will be an update and not an insert. So if the user rates a given article 6 times on 5 revisions, only 5 entries will be stored.

Limitations

  • Any anonymous user who uses a different browser or clears their cookies will be seen by the system as a different user.
  • The MediaWiki hooks need to be limited to a single call, so that code is injected only on pages that have AA enabled and the check for this is made only once.

Open Issues

  • Renamed user
    • The database proposal stores the username or IP address, which means a renamed user "has to" (or "can") assess all pages again. One option is to store the userid instead of the username and look the username up in the user table by userid when extracting data. Another is to store both the username text and the userid, as the recentchanges, archive and revision tables do; that, however, means the renameuser extension must also update the assessment tables when a user is renamed.
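
The first option above, storing the userid and resolving the name at read time, can be sketched with SQLite. The table and column names are simplified and hypothetical (`wiki_user` stands in for the real user table), and the sketch covers logged-in users only; anonymous users would still need their IP address stored.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE wiki_user (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE article_assessment (
        page_id INTEGER, user_id INTEGER, revision INTEGER,
        rating_id INTEGER, rating_value INTEGER
    );
    INSERT INTO wiki_user VALUES (7, 'OldName');
    INSERT INTO article_assessment VALUES (1, 7, 10, 1, 5);
    """
)


def ratings_with_names(db):
    """Resolve user names at read time instead of storing them per rating."""
    return db.execute(
        """
        SELECT u.user_name, a.page_id, a.rating_id, a.rating_value
        FROM article_assessment AS a
        JOIN wiki_user AS u ON u.user_id = a.user_id
        """
    ).fetchall()


# Renaming touches only the user table; the stored ratings are unchanged.
db.execute("UPDATE wiki_user SET user_name = 'NewName' WHERE user_id = 7")
```

After the rename, the same rating row is reported under the new name without any update to the assessment table.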