UploadWizard/Funnel analysis

From mediawiki.org

This is an explanatory page for the funnel analysis of UploadWizard. Funnel analysis involves tracking users as they go through the steps of a process to identify why they are unable to successfully finish it.

The graphs of this analysis illustrate the number and/or percentage of users who drop out at each step of the UploadWizard. We are tracking this as a means of understanding where failure is occurring, so that we can prioritize fixes that need to be made.

Ordered list of steps as experienced by end users

  1. tutorial - the Puzzly licensing tutorial
  2. file - the actual file upload step
  3. deeds - the selection of license information
  4. details - long form to input many details about each file uploaded
  5. thanks - success screen with help on how to add the file(s) to articles


File upload dropoff

Dropoff at the file stage is steep (71.2% as of 2014-09-18; i.e. about 7 in 10 users who arrive at the file upload page don't continue to the license selection page). Steps are underway both to understand this better (see card #862 and card #903) and to fix one problem that we believe could be a contributor (see Bug 46741).
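As a rough illustration of how a per-step dropoff figure like the one above can be derived, the sketch below computes, for each step, the fraction of users who reached it but did not reach the next one. The function name and the counts are hypothetical; the counts were chosen only so that the file step reproduces the 71.2% figure, and they are not real UploadWizard data.

```python
# Ordered UploadWizard steps, as listed in the section above.
STEPS = ["tutorial", "file", "deeds", "details", "thanks"]


def dropoff_by_step(counts):
    """Return {step: fraction of users who reached `step` but not the next one}."""
    rates = {}
    for current, following in zip(STEPS, STEPS[1:]):
        reached = counts[current]
        if reached == 0:
            continue  # nobody reached this step, so the rate is undefined
        rates[current] = 1 - counts[following] / reached
    return rates


# Hypothetical counts (not real data), picked to match the 71.2% file dropoff.
example_counts = {
    "tutorial": 1200,
    "file": 1000,
    "deeds": 288,
    "details": 250,
    "thanks": 200,
}

rates = dropoff_by_step(example_counts)
print(f"file dropoff: {rates['file']:.1%}")  # prints "file dropoff: 71.2%"
```

The last step ("thanks") has no dropoff rate because there is no following step to compare against.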

Sequential instead of parallel uploading

The "Upload another file" button at the last step of the process is heavily used; the average user goes through the steps of UploadWizard dozens of times instead of adding all the files at once and going through the steps a single time. This suggests that the ability to upload and describe multiple files at the same time is not widely known, or has some deficiency that causes users to avoid it.

Top errors

To identify causes of dropoff, we are logging API errors and form validation errors. While we don't have conclusive numbers yet, it seems that these errors are not the main source of dropoff, but do contribute to it significantly. The most frequent error types (for more details see here):

  • invalid file type errors - these might just mean the user wanted to do something that's not in scope for Commons, but could also mean that they need better information on e.g. how to convert files to free formats.
  • bad token errors (bug 69691)
  • session timeout errors causing files uploaded via stash or async API to be unavailable (bug 43967)
  • duplicate file errors which appear too late in the upload process (in the details step)
  • api-error-publishfailed
  • stash errors (bug 56302, bug 54028)

Issues with the logging system

As of 2014-09-25, all known issues have been fixed, but past data is still affected by them. (The graphs show data collected in the last 30 days, so artifacts will disappear by late October.)

  • Between 2014-09-12 and 2014-09-18, no data was collected because of a bug in the logging code caused by a configuration error.
  • For a long time, tutorial steps were underreported. (More precisely, the tutorial can be skipped; we log a "fake" tutorial event in such cases, but it was not done consistently for all workflows.) This is also the cause of the tutorial survival percentage exceeding 100% in the first graph.
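The effect of missing tutorial events on the survival percentage can be illustrated with a small sketch. It assumes, purely for illustration, that each flow is represented as the set of step names it logged; the helper names and the data are hypothetical. The backfill is done here after the fact on the analysis side, whereas the text above describes logging the "fake" tutorial event at the time the tutorial is skipped.

```python
def step_counts(flows):
    """Count how many flows logged each step; `flows` is a list of sets of step names."""
    counts = {}
    for steps_seen in flows:
        for step in steps_seen:
            counts[step] = counts.get(step, 0) + 1
    return counts


def backfill_tutorial(flows):
    """Add a synthetic 'tutorial' step to every flow that logged any step at all."""
    return [steps_seen | {"tutorial"} if steps_seen else steps_seen
            for steps_seen in flows]


# Hypothetical data: two flows saw the tutorial, two skipped it without
# a tutorial event being logged.
flows = [
    {"tutorial", "file"},
    {"tutorial"},
    {"file", "deeds"},
    {"file"},
]

raw = step_counts(flows)
fixed = step_counts(backfill_tutorial(flows))

# Without the synthetic event, more flows reach "file" than "tutorial".
raw_survival = raw["file"] / raw["tutorial"]        # 3 / 2 = 1.5, i.e. 150%
fixed_survival = fixed["file"] / fixed["tutorial"]  # 3 / 4 = 0.75, i.e. 75%
```

With the tutorial step underreported, the denominator is too small and the tutorial-to-file survival rate exceeds 100%; once every flow carries a tutorial event, the rate falls back into the expected range.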


See Multimedia/Media_Viewer/Metrics/Architecture for a general description of analytics architecture for multimedia features.

Funnel analysis is done by assigning a random identifier (flowId) to UploadWizard on page load, and separately logging every event, such as entering a step, clicking a button, or getting an error. These events can then be tied together via the flowId. No details identifying the user or the uploaded images are logged.
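The flowId mechanism can be sketched as follows. This is a minimal illustration and not the actual logging code (which is JavaScript): the record layout, the field names other than flowId, and the example event names are assumptions made for the example.

```python
import uuid
from collections import defaultdict


def new_flow_id():
    """Random identifier for one page load; it carries no user information."""
    return uuid.uuid4().hex


def make_event(flow_id, kind, name):
    """Hypothetical event record: only the flow id, an event kind and a name."""
    return {"flowId": flow_id, "kind": kind, "name": name}


def group_by_flow(events):
    """Tie separately logged events back together via their flowId."""
    flows = defaultdict(list)
    for event in events:
        flows[event["flowId"]].append((event["kind"], event["name"]))
    return dict(flows)


# Two hypothetical page loads whose events arrive interleaved in the log.
flow_a, flow_b = new_flow_id(), new_flow_id()
log = [
    make_event(flow_a, "step", "tutorial"),
    make_event(flow_b, "step", "tutorial"),
    make_event(flow_a, "step", "file"),
    make_event(flow_a, "error", "bad-token"),
    make_event(flow_b, "step", "file"),
    make_event(flow_b, "step", "deeds"),
]

flows = group_by_flow(log)
```

Because the identifier is generated fresh on each page load and nothing else is attached to the record, the grouped events describe one pass through the wizard without saying who the user was or which files were uploaded.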

Event logging schemas involved:

Logging code: uw.EventFlowLogger.js