User:Robchurch/Upload wizard

The current upload interface consists of a single form which prompts for:


 * file to upload
 * destination filename
 * description
 * (optional) licence template

Despite some AJAX gadgets, including a check for conflicting destination names and a licence preview, we receive frequent feedback that this interface is unhelpful for new users, who find it confusing and often unintuitive.

A perennial proposal is to introduce an interface that looks and behaves more like a "wizard", a pattern with which many of our users, including new ones, will be much more familiar. This kind of interface suits the process, since uploading a file breaks down into identifiable steps:


 * confirming the licence (and advising the user if it's not suitable)
 * uploading the file
 * choosing a destination filename (and dealing with conflicts)
 * adding additional description, categories, etc.

I therefore propose an implementation of the upload wizard concept, which incorporates these ideas, plus feedback from various users, some observations I've made in the past few months, and perhaps a few ideas from the old Summer of Code proposal.

Features
The main goal here is to separate the interface for uploading media files into recognisable steps in the process. Throughout an upload session, we will maintain a custom object, serialised into the session data to preserve it between pages.
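
As a rough illustration of that session object, here is a minimal sketch in Python (the real thing would live in the wiki's own code base). The class name, field names, and step names are all assumptions made for this example; nothing here is an existing API.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class UploadSession:
    """Hypothetical per-upload state carried between wizard pages."""
    licence_id: Optional[str] = None     # set by the licence step
    temp_path: Optional[str] = None      # set once the file has been stashed
    dest_filename: Optional[str] = None  # set after the conflict check passes
    log_comment: Optional[str] = None    # set alongside the filename

    def serialise(self) -> str:
        """Flatten the object so it can be stored in the session data."""
        return json.dumps(asdict(self))

    @classmethod
    def unserialise(cls, data: str) -> "UploadSession":
        """Rebuild the object on the next page load."""
        return cls(**json.loads(data))

    def next_step(self) -> str:
        """Name the first step that has not been completed yet."""
        if self.licence_id is None:
            return "licence"
        if self.temp_path is None:
            return "upload"
        if self.dest_filename is None:
            return "filename"
        return "describe"
```

Deriving the next step from the stored fields, rather than storing a step counter, means a user who reloads or navigates back lands on the right page without extra bookkeeping.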

Establishing the licence
Most projects will be interested in ensuring their media files are released under an appropriate licence, or else marked as being used under fair use or regional-equivalent provisions, or in need of some (legal) attention. This is particularly relevant to Wikimedia projects, and essential for Wikimedia Commons.

The user can select a licence via one of three main tracks:


 * origin of the image
 * common licences
 * select from all licences

Eligible licences for the first two options can be defined, with a licence name, template, and description associated with each. A list of all possible licences is also maintained for "power users"; each entry in that list has a licence name and a template.

The rationale here is that it's possible to establish the licence based on the media's origin. For example, if a user created an image or recording for that wiki's usage, they can indicate this, and can be prompted to select an appropriate (hopefully, free content) licence. In contrast, if a user found the file "on some web site", then it's likely that the image's licensing situation needs to be checked further, or perhaps the upload needs to be rejected outright.

If the origin-based approach is not appropriate, then a list of common licences with short descriptions might be useful; wikis might, for example, list two or three free content licences and some "pseudo-licences", e.g. public domain.

For power users, or in specialist cases, it's probably going to be most convenient to simply select the desired licence from a list of those available, so I propose to maintain the existing drop-down box as a third option. It might be a good idea to update this with selections from the other two methods, so that the user confirms a final licence selection.

Once a licence has been selected, the licence name or another identifier will be stored in the upload session object.
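
To make the three tracks a little more concrete, the following Python sketch shows one way the licence definitions might be laid out. Every key, licence name, and template name below is a placeholder invented for the example; a real wiki would define its own.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Licence:
    name: str
    template: str
    description: str = ""  # only needed for the first two tracks


# Track 3: the full list offered to power users (placeholder entries).
ALL_LICENCES: Dict[str, Licence] = {
    "free-a": Licence("Example free licence A", "{{Example-free-a}}",
                      "Reuse allowed with attribution."),
    "public-domain": Licence("Public domain", "{{Example-pd}}",
                             "No rights reserved."),
    "fair-use": Licence("Fair use", "{{Example-fair-use}}"),
}

# Track 1: origin of the file -> licences worth offering.
# An empty list means the upload needs checking rather than a licence choice.
LICENCES_BY_ORIGIN: Dict[str, List[str]] = {
    "own-work": ["free-a", "public-domain"],
    "found-on-web": [],
}

# Track 2: a short list of common licences.
COMMON_LICENCES: List[str] = ["free-a", "public-domain"]


def options_for_origin(origin: str) -> List[Licence]:
    """Licences to offer for a given origin (empty for unknown origins)."""
    return [ALL_LICENCES[key] for key in LICENCES_BY_ORIGIN.get(origin, [])]


def needs_review(origin: str) -> bool:
    """True when no licence can be suggested from the origin alone."""
    return not LICENCES_BY_ORIGIN.get(origin)
```

Whichever track the user takes, the result is a single key into the full list, which is what would be stored in the upload session object.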

Uploading the file
When uploading the file, the user will select the file using a standard file input element and submit the page in the normal manner. I hope to be able to use, where available, APC's upload progress hook for RFC 1867 (form-based file upload) requests, and later, other implementations, to provide some feedback to the user in the form of a nice, clear progress bar.

When available, we'll be able to provide information on the upload transfer rate, percentage of the file uploaded, etc. We might fall back to a faux-animated bar if the configuration or available modules are not in our favour.
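
The figures shown next to the bar are simple arithmetic on the byte counts. A small Python sketch follows, assuming the server can report bytes received, total bytes, and elapsed time; the function and key names are made up for the example.

```python
from typing import Dict, Optional


def upload_progress(bytes_received: int, bytes_total: int,
                    elapsed_seconds: float) -> Dict[str, Optional[float]]:
    """Derive the figures a progress bar would display.

    Returns None for any figure that cannot be computed, which is the
    cue for the interface to fall back to the faux-animated bar.
    """
    if bytes_total <= 0:
        return {"percent": None, "rate": None, "eta": None}

    percent = min(100.0, 100.0 * bytes_received / bytes_total)
    # Average transfer rate in bytes per second since the upload began.
    rate = bytes_received / elapsed_seconds if elapsed_seconds > 0 else None
    remaining = max(0, bytes_total - bytes_received)
    # Estimated seconds left; unknown while nothing has arrived yet.
    eta = remaining / rate if rate else None
    return {"percent": percent, "rate": rate, "eta": eta}
```

For example, 250 of 1000 bytes after 5 seconds gives 25 percent, 50 bytes per second, and an estimated 15 seconds remaining.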

Once the file has been uploaded, it will be moved to a temporary location (but less temporary than PHP's uploaded file store), and the path stored in the session object.
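
The move to a longer-lived temporary location might look something like the Python sketch below. The stash directory and function name are assumptions for illustration, and clean-up of abandoned uploads is deliberately left out.

```python
import os
import shutil
import tempfile
import uuid

# Hypothetical default location for files awaiting a final filename.
DEFAULT_STASH_DIR = os.path.join(tempfile.gettempdir(), "upload-wizard-stash")


def stash_upload(uploaded_path: str, stash_dir: str = DEFAULT_STASH_DIR) -> str:
    """Move a freshly uploaded file out of the short-lived upload store.

    Returns the new path, which is what would be saved in the session object.
    """
    os.makedirs(stash_dir, exist_ok=True)
    # A random name avoids collisions and does not depend on user input.
    stash_path = os.path.join(stash_dir, uuid.uuid4().hex)
    shutil.move(uploaded_path, stash_path)
    return stash_path
```

Using a random name rather than the client's filename sidesteps both collisions between concurrent uploads and any unsafe characters in what the user supplied.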

Setting the file name and log comment
We'll prompt the user for a filename, and can perform a slick little check, via AJAX, to see if their choice conflicts with an existing image (and if so, preview that image for them). We might as well also preview the uploaded file at this point; it will assist the user in picking a good filename, and if they've uploaded the wrong image, they can back out.

Once the filename has been submitted, provided it passes a second, server-side conflict check, we'll save the name in the session object and publish the file via the file repository. If the filename does conflict, then the user will be prompted to enter another.

At this point, the user will also be prompted to enter the comment for the upload log, which will also appear in the revision history, recent changes, etc.
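
The two conflict checks can share one piece of logic: the AJAX check only reports a clash, while the check on submission also commits the name. A Python sketch under simplifying assumptions follows; the title normalisation is a reduced version of what a wiki would do, the set of existing titles stands in for a repository lookup, and all function names are hypothetical.

```python
from typing import Optional, Set


def normalise_title(filename: str) -> str:
    """Simplified normalisation: whitespace to underscores, first letter upper."""
    name = "_".join(filename.split())
    return name[:1].upper() + name[1:]


def find_conflict(filename: str, existing: Set[str]) -> Optional[str]:
    """Return the clashing title, or None. Suitable for the AJAX preview."""
    title = normalise_title(filename)
    return title if title in existing else None


def submit_filename(session: dict, filename: str, existing: Set[str]) -> bool:
    """Second check on submission; store the name only if it is still free."""
    if find_conflict(filename, existing) is not None:
        return False  # caller re-prompts the user for another name
    session["dest_filename"] = normalise_title(filename)
    existing.add(session["dest_filename"])  # stand-in for publishing the file
    return True
```

Repeating the check on submission matters because another user may have taken the name between the AJAX check and the form being posted.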

Post-upload description and categorisation
Following the upload, it might be nice to give the user a chance to tweak the description page and add the image to one or more categories. I imagine it would be appropriate to skip this step when reuploading an existing image.

Features from the standard editor, including the editing toolbar, and a preview, should be available. We could provide a list of recommended categories, or even a partial category tree, perhaps incorporating the CategoryTree selector, if available.

Additional notes
At the moment, the progress bar will be confined to environments where APC is available, and where the apc.rfc1867 configuration variable is enabled. With that setting enabled, APC creates a user cache entry for each upload, named after the value of a special hidden form element; we can set that value to a salted token, or similar, and poll the entry via an AJAX request which calls apc_fetch on the server.
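
One possible shape for the salted token is sketched below in Python. This is only an illustration of the idea: the token format, the use of an HMAC, and the function names are my assumptions, not anything APC prescribes.

```python
import hashlib
import hmac
import secrets


def _sign(session_id: str, nonce: str, secret: bytes) -> str:
    """HMAC over the session id and nonce, keyed with a site-wide secret."""
    message = f"{session_id}:{nonce}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def make_progress_token(session_id: str, secret: bytes) -> str:
    """Value for the hidden form element: a random nonce plus its signature."""
    nonce = secrets.token_hex(8)
    return f"{nonce}.{_sign(session_id, nonce, secret)}"


def verify_progress_token(token: str, session_id: str, secret: bytes) -> bool:
    """Check, before looking up progress, that the token belongs to this session."""
    nonce, sep, mac = token.partition(".")
    if not sep:
        return False
    expected = _sign(session_id, nonce, secret)
    # Constant-time comparison on bytes avoids leaking how much matched.
    return hmac.compare_digest(mac.encode(), expected.encode())
```

Tying the token to the session means the progress endpoint can refuse to report on uploads that belong to someone else, even if a token value leaks.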

For this method to work, APC needs to have been compiled against PHP 5.2.0, which implies that we'd need to be running at least PHP 5.2.0, although this isn't clearly documented anywhere. Our current code base apparently works fine under 5.2.x and later, so this may not be a problem.

There are notes on the PHP web site which say that this feature is not thread-safe, although I'm not sure how up-to-date or accurate they are. It may nonetheless be the case that on large sites such as Wikimedia's, this won't work, or will be unreliable, and will need to be turned off.