Extension:GWToolset/Technical Design

Abstract
This section of the document answers questions about the project. What is the project? What is its purpose? What are the requirements?

GWToolset (or GLAMWikiToolset) is a Special Page extension. Its main goal is to let GLAMs (galleries, libraries, archives, and museums) mass upload content (pictures, videos, and sounds) to Wikimedia Commons based on the accompanying metadata (XML); the intent is to support a wide variety of XML schemas. The extension presents the user with several steps, each represented by an HTML form, to set up a batch upload process that uploads content and metadata to the wiki and creates an individual mediafile page for each item uploaded.

The project, co-funded by Europeana and a few Wikimedia chapters, is under heavy development.

Further information can be found on the project page. Feedback and questions are welcome; feel free to contact us.

Rationale
This section explains the value of the project, why we think it is of value, and how it fits into the bigger picture.

Process
Often cut into multiple sections, this describes how the feature is intended to work.

The current steps within the upload process are:
 * 1) Metadata detection
 * 2) Metadata mapping
 * 3) Batch preview
 * 4) Batch job creation

Metadata detection

 * 1) indicate which element within the metadata file represents a mediafile record.
   * a mediafile record contains metadata about the digital item such as author, date created, and a URL to the mediafile.
 * 2) select a MediaWiki template that will display the mediafile metadata on the mediafile page.
 * 3) optionally select a previously saved metadata mapping that maps the metadata fields within the metadata file to the fields in the MediaWiki template.
 * 4) select the metadata file stored on your local hard drive.
 * 5) upload the metadata file.
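
As an illustration of what picking the record element involves, here is a minimal Python sketch. It is not the extension's own code; the sample XML, the element names, and the helper names `count_elements` and `extract_records` are all assumptions made up for this example. It counts how often each element name occurs, which hints at which element repeats once per record, and then reads each record into a flat dictionary.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical metadata file; real GLAM schemas will differ.
SAMPLE_XML = """<records>
  <record>
    <author>Jane Doe</author>
    <date>1890</date>
    <url>http://example.org/a.jpg</url>
  </record>
  <record>
    <author>John Roe</author>
    <date>1902</date>
    <url>http://example.org/b.jpg</url>
  </record>
</records>"""


def count_elements(xml_text):
    """Count how often each element name occurs in the document."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())


def extract_records(xml_text, record_tag):
    """Return one dict per record element, mapping child tag to its text."""
    root = ET.fromstring(xml_text)
    records = []
    for node in root.iter(record_tag):
        records.append({child.tag: (child.text or "").strip() for child in node})
    return records


counts = count_elements(SAMPLE_XML)
records = extract_records(SAMPLE_XML, "record")
```

In this sample the element `record` occurs twice and wraps the author, date, and URL, so it is the one a user would indicate as the mediafile record.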

Metadata mapping

Summary

 * a summary of the information provided in the Metadata detection step.
 * a listing of all of the MediaWiki fields in the template selected in the Metadata detection step.
 * drop-down menus next to those fields that contain all of the metadata elements found in the metadata file.
 * a sample mediafile record with corresponding metadata information about the mediafile record.

Create a mapping

 * 1) create a mapping of the MediaWiki template fields to the metadata record elements by selecting the corresponding metadata record element from the drop-down next to the appropriate MediaWiki template field.
   * more than one metadata record element can be related to a MediaWiki template field.
   * a metadata record element can be related to many MediaWiki template fields.
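
The many-to-many relationship described above can be sketched as a dictionary from template field to a list of metadata elements. This is only an illustration in Python, not the extension's data model: the field names, the template name, the `"; "` separator used to join several values, and the helpers `apply_mapping` and `render_template` are assumptions.

```python
def apply_mapping(mapping, record, separator="; "):
    """Fill template fields from one record.

    mapping: template field -> list of metadata element names.
    Several elements mapped to one field are joined with `separator`
    (an assumed convention for this sketch).
    """
    filled = {}
    for field, elements in mapping.items():
        values = [record[name] for name in elements if record.get(name)]
        filled[field] = separator.join(values)
    return filled


def render_template(template_name, filled):
    """Render the filled fields as MediaWiki template wikitext."""
    lines = ["{{" + template_name]
    for field, value in filled.items():
        lines.append("|" + field + "=" + value)
    lines.append("}}")
    return "\n".join(lines)


# Hypothetical mapping: two elements feed "author", and the "title"
# element feeds two different template fields.
mapping = {
    "author": ["creator", "contributor"],
    "title": ["title"],
    "description": ["title"],
}
record = {"creator": "Jane Doe", "contributor": "John Roe", "title": "Harbour view"}

filled = apply_mapping(mapping, record)
wikitext = render_template("Artwork", filled)
```

Keeping the mapping as plain data also makes it straightforward to save and reuse for a later upload.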

Global categories

 * 1) optionally add global categories to the upload.
   * global categories are applied to all mediafile records in the metadata file.
   * more than one global category can be applied.

Item specific categories

 * 1) optionally add item specific categories to the upload.
   * these are applied to each mediafile record, but use item specific information.
   * for example, if the drop-down contains a mediafile field called author, the value for each individual record will be used.
   * the phrase lets you prefix the mediafile metadata field value with something like “created by”, which could pair with the drop-down field author.
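
A small Python sketch of how global and item specific categories could combine for one record follows. It is a hedged illustration rather than the extension's behaviour: the function name `build_categories`, the `(phrase, field)` rule format, and the choice to skip a rule when the record lacks the field are assumptions.

```python
def build_categories(record, global_categories, item_rules):
    """Return category wikitext for one mediafile record.

    global_categories: names applied to every record.
    item_rules: list of (phrase, field) pairs; the phrase prefixes the
    record's value for that field.
    """
    names = list(global_categories)
    for phrase, field in item_rules:
        value = record.get(field, "").strip()
        if not value:
            continue  # assumed: no category when the record has no value
        names.append((phrase + " " + value).strip())
    return ["[[Category:" + name + "]]" for name in names]


categories = build_categories(
    {"author": "Jane Doe"},
    ["Maritime museum uploads"],
    [("created by", "author")],
)
```

With the phrase “created by” and the field author, a record whose author is Jane Doe yields the category "created by Jane Doe" in addition to the global one.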

Summary

 * 1) optionally provide a summary message that gives an overview of why you are uploading this metadata file and all of its records.

Batch preview
This step uploads and creates the first 3 mediafile pages based on the records found in the metadata file.
 * 1) you can preview the results of the mapping.
 * 2) you can go back to the mapping step and make any necessary changes.

Batch job creation
If the Batch preview looks good, go ahead and create the batch job process. This step will create an initial UploadMetadataBatchJob that will cycle through all of the records found in the metadata file and create individual UploadMediaFileJobs that contain the mapping and specific record information as well as any categories that may have been added in the Mapping step.

 * Uploads are placed into the stash.
 * Processing continues by looking at the referenced data from the stash.
 * Read N files from the stashed payload and remember where you stopped, to be able to continue with the next segment.

 * 1) The user uploads their metadata file into the stash using the Special page.
   * The stash creates a reference that can be used to get the data from the stash in later steps.
   * The file is scanned node-wise for content, to display how the mappings will look.
 * 2) The user selects the mapping and template they want to apply to the metadata.
   * A preview helps users see how things will work out.
 * 3) The user selects global tags to apply to pages.
 * 4) The user clicks "upload batch".
 * 5) A metadata job is created that contains all of the information collected via the wizard.
 * 6) The job starts processing and makes N (currently 10) child jobs to process each file described in the metadata.
   * After the limit is hit, it makes a new metadata job that knows how to start from the next record following those that have been processed.
 * 7) Each child job downloads the associated media, creates the File page, and fills in the template:
   * Create a File: page for each file.
   * Fill in the user-selected template with data from the upload file, based on the mapping.
   * Add the mapping JSON and the record from the upload to the page as HTML comments.
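
The chaining of metadata jobs and child jobs can be sketched with an in-memory queue. This Python example is only a simulation under stated assumptions: the tuple-based job format and the functions `run_metadata_job` and `process_all` are invented for illustration and do not reflect the MediaWiki job queue API or the extension's job classes.

```python
from collections import deque

BATCH_SIZE = 10  # the text gives N as currently 10


def run_metadata_job(records, start, queue, batch_size=BATCH_SIZE):
    """Queue up to batch_size child jobs, then a follow-up metadata job."""
    end = min(start + batch_size, len(records))
    for index in range(start, end):
        queue.append(("media", index))
    if end < len(records):
        # Remember where this segment stopped so the next job can resume.
        queue.append(("metadata", end))


def process_all(records):
    """Drain the queue; return handled records and metadata job count."""
    queue = deque([("metadata", 0)])
    handled = []
    metadata_jobs = 0
    while queue:
        kind, position = queue.popleft()
        if kind == "metadata":
            metadata_jobs += 1
            run_metadata_job(records, position, queue)
        else:
            # A real child job would download the media and create the page.
            handled.append(records[position])
    return handled, metadata_jobs


sample_records = list(range(25))
handled, metadata_jobs = process_all(sample_records)
```

For 25 records and a batch size of 10, the simulation runs three metadata jobs (starting at records 0, 10, and 20) and handles every record exactly once, in order. Each metadata job stays small because it only carries the position to resume from.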

Gallery and Assets
These are images that are essential to understand the project. Mockups, screenshots, and icons fall into this category.
 * GWToolset demonstration screencast