User:MarkTraceur/UploadWizard/Book uploads/Initial thoughts

From mediawiki.org

As part of Google Summer of Code, we intend to integrate book uploading into UploadWizard. We're breaking the task down into the subtasks below.

Add book-upload metadata form(s)

Ready the interface

The UploadWizard interface currently only deals with image-type metadata. If we're going to deal with more than one type, we need to add a way to specify what type a given upload is. This could look like one of the following:

  • Radio buttons on a page just after the initial upload (or after the license page, depending on whether we want to customise license inputs based on type) that determine the form we put into the details page.
  • Separate buttons for each type of metadata, possibly organized by supertypes (for example, "image" might be a supertype of "picture" and "cartoon"), that add different subclasses of UploadWizardUpload objects to the list of uploads.
  • A dropdown list, either on a separate page that gates the type of form we use in the details page, or on the details page itself, which would require modifying the forms while on the details page.

The initial implementation of this interface should either only allow choosing 'image' (which is sort of annoying, but would be possible) or should allow configuration of different types based on some simple-ish configuration variables in LocalSettings.php for testing purposes. The patch (or patches) implementing this will not cause any user-facing changes.
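A minimal sketch of the second option, written in Python only to show the shape of the logic (the real thing would live in PHP configuration and the UploadWizard JavaScript). Every name here, including `extra_upload_types` and the form identifiers, is hypothetical and not an existing UploadWizard setting. The point is that with default configuration only 'image' exists, so no chooser is shown and nothing changes for users.

```python
# Hypothetical sketch: none of these names exist in UploadWizard.
DEFAULT_TYPES = {"image": {"label": "Image", "form": "image-details"}}


def enabled_upload_types(config):
    """Return the upload types a wiki offers, falling back to image only."""
    extra = config.get("extra_upload_types", {})
    # Defaults win on a name clash so existing image behaviour cannot change.
    return {**extra, **DEFAULT_TYPES}


def needs_type_chooser(config):
    """Show the chooser only when there is more than one type to pick from."""
    return len(enabled_upload_types(config)) > 1


def form_for(config, type_name):
    """Look up the details form for a type; unknown types fall back to image."""
    types = enabled_upload_types(config)
    return types.get(type_name, types["image"])["form"]
```

A test wiki would opt in by defining one extra type, for example a "book" entry pointing at a book details form; every other wiki keeps the current image-only flow.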

Make it possible to add more metadata types

This is the second half of the metadata work that will need to be done. We can take one of two approaches:

Build out configuration systems for metadata forms

We can build a complex-ish system for configuring metadata forms in LocalSettings.php. This is sort of how licenses already work, for example.

Integrate with TemplateData

Or, we can use the snazzy new TemplateData in combination with some configurable category names in LocalSettings.php in order to determine A) what is a valid metadata type (things that are in Category:Metadata schemas and are in the Template namespace, for example) and B) what those types expect in the way of form data types. This option seems way better, but maybe a little more complicated because we don't get to define all of the structures ourselves. I'm mostly okay with that.

See e.g. the example schema in the repository.
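To make point B concrete, here is a rough sketch of turning a template's TemplateData into form field descriptions. The "Book" template data below is made up for illustration, the widget names are assumptions, and the sketch assumes plain-string labels. It is written in Python for brevity and is not how UploadWizard does (or would necessarily do) this.

```python
import json

# Made-up TemplateData block for a hypothetical "Book" template.
BOOK_TEMPLATEDATA = json.loads("""
{
  "description": "Metadata for a book",
  "params": {
    "title":  {"label": "Title", "type": "string", "required": true},
    "author": {"label": "Author", "type": "string"},
    "pages":  {"label": "Page count", "type": "number"}
  }
}
""")

# Assumed mapping from TemplateData parameter types to form widgets.
WIDGET_FOR_TYPE = {"string": "text", "number": "number"}


def form_fields(templatedata):
    """Turn TemplateData params into a list of form field descriptions."""
    fields = []
    for name, spec in templatedata.get("params", {}).items():
        fields.append({
            "name": name,
            "label": spec.get("label", name),
            # Types we do not recognise fall back to a plain text input.
            "widget": WIDGET_FOR_TYPE.get(spec.get("type"), "text"),
            "required": bool(spec.get("required", False)),
        })
    return fields
```

The trade-off mentioned above shows up here: the form can only be as expressive as the parameter types TemplateData offers, since we don't control that structure.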

Import metadata from third party sources

The metadata that books (and other things) need is sometimes stored in XML files on a third party site. These XML files would be great to have as a source of metadata, so we don't need to enter it manually.

I don't think we actually need to get these files to the server. We should be able to just pick up the metadata from the file on the client. This will require some sort of handling of the XML formats, and I'm not sure how to make that work along with the "define whatever metadata formats you want" concept from the previous section. I guess it'll come down to us providing a potential shortcut for *some* metadata formats, but not all.
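One way the "shortcut for some formats" idea could work is a per-format mapping from XML element names to form field names, with anything unmapped simply ignored. The sketch below uses Python as a stand-in for the client-side code; the sample XML and the mapping are invented for illustration and do not describe any particular site's real format.

```python
import xml.etree.ElementTree as ET

# Made-up sample; real third-party files will differ.
SAMPLE_XML = """<metadata>
  <title>An Example Book</title>
  <creator>Jane Doe</creator>
  <date>1901</date>
  <scanner>ignored by the mapping below</scanner>
</metadata>"""

# Hypothetical mapping from XML element names to form field names.
ELEMENT_TO_FIELD = {"title": "title", "creator": "author", "date": "year"}


def prefill_from_xml(xml_text, mapping=ELEMENT_TO_FIELD):
    """Extract values for known elements; unknown elements are skipped."""
    root = ET.fromstring(xml_text)
    values = {}
    for child in root:
        field = mapping.get(child.tag)
        if field and child.text and child.text.strip():
            # Keep the first value if an element is repeated.
            values.setdefault(field, child.text.strip())
    return values
```

Fields the mapping doesn't cover would stay empty in the form for the uploader to fill in by hand.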

Import book files from third party sources

Why stop at metadata? Let's bring book files in from archive.org, Europeana, etc., just like we do with Flickr images. Probably this will involve refactoring bits of the Flickr code into generic-looking functions for importing third party stuff.

Add nifty integration features for third-party sources of files and data

One nice-to-have is support for a query string parameter (or multiple ones?) that uploads files and metadata automatically from a source. An example would be if you were browsing archive.org and wanted to share a file from it on Commons. We could do this for Flickr and Europeana, too. The basic workflow would just be to supply URLs for the actual work file and the metadata file, and then use those instead of asking when it comes time to get data. Should we allow uploading multiple files at once? It's possible, but I'm not yet sure if we should support it. If we do, we might want to consider making this a POST action to support big lists of files.
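As a rough illustration of the query-string workflow, the sketch below pulls file and metadata URLs out of a request URL. The parameter names `fileurl` and `metadataurl`, the example host, and the `max_pairs` cap are all hypothetical; nothing here reflects an existing UploadWizard parameter.

```python
from urllib.parse import parse_qs, urlsplit


def import_requests(url, max_pairs=10):
    """Return (file_url, metadata_url) pairs found in the query string."""
    query = parse_qs(urlsplit(url).query)
    files = query.get("fileurl", [])
    metas = query.get("metadataurl", [])
    pairs = []
    for index, file_url in enumerate(files[:max_pairs]):
        # Pair by position; a file without a matching metadata URL gets None.
        meta_url = metas[index] if index < len(metas) else None
        pairs.append((file_url, meta_url))
    return pairs
```

Pairing repeated parameters by position is fragile once some files lack metadata, which is one more argument for a structured POST body if multiple uploads are supported.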