Amsterdam Hackathon 2014/Topics/Artworks import from Wikimedia Commons


The goal is to import artworks into Wikidata from Wikimedia Commons files that use the Artwork template.

Wikimedia Commons extraction

A Wikimedian extracted the Wikimedia Commons files that use the Artwork template in September 2014.

The file:

Result: 190,000 files with 105 different properties

Table of the 105 properties with their number of occurrences:

Preparation for Wikidata

For a Wikidata import:

  • remove useless properties (detail, review, information about the file...)
  • merge redundant properties (artist/creator...)

Result: new version with 27 properties.
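
The two clean-up steps above could be sketched as follows. This is only an illustration: the property names in `DROP` and `MERGE` are hypothetical placeholders, and the real lists would have to come from the table of 105 properties. It also assumes each extracted file is available as a dictionary of property name to raw string value.

```python
# Hypothetical property names; the real ones come from the table of 105 properties.
DROP = {"detail", "review", "permission"}   # properties considered useless for Wikidata
MERGE = {"creator": "artist"}               # redundant alias -> property to keep


def clean_record(record):
    """Return a copy of one file's properties with useless ones removed
    and redundant ones merged under a single name."""
    cleaned = {}
    for key, value in record.items():
        name = key.strip().lower()
        value = value.strip()
        if name in DROP or not value:
            continue  # skip useless properties and empty values
        name = MERGE.get(name, name)
        # If two merged properties both carry a value, keep the first one seen.
        cleaned.setdefault(name, value)
    return cleaned


example = {
    "Artist": "",
    "Creator": "Rembrandt",
    "Title": "The Night Watch",
    "Detail": "crop",
}
cleaned = clean_record(example)
print(cleaned)
```

Keeping the first non-empty value when two merged properties collide is an arbitrary choice made for this sketch; a real import would probably need to log such conflicts for review.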

The file:

Table of the 27 properties:


  • Field values are heterogeneous
  • Some artworks already exist in Wikidata, and others occur two or more times in the table.
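
Artworks that occur several times in the table could be spotted with a simple count, as in the sketch below. It assumes, hypothetically, that an artwork can be identified by its `institution` and `accession number` properties; the actual table may need a different key. Checking whether an artwork already exists in Wikidata is not covered here.

```python
from collections import Counter


def find_repeated(records, key_fields=("institution", "accession number")):
    """Return {key: count} for artworks that occur two or more times.

    key_fields is a hypothetical choice of identifying properties.
    Records missing any key field are ignored, since they cannot be compared.
    """
    counts = Counter(
        tuple(r.get(f, "").strip().lower() for f in key_fields) for r in records
    )
    return {key: n for key, n in counts.items() if n >= 2 and all(key)}


records = [
    {"institution": "Rijksmuseum", "accession number": "SK-C-5"},
    {"institution": "rijksmuseum", "accession number": "SK-C-5 "},
    {"institution": "Rijksmuseum", "accession number": "SK-A-1"},
    {"institution": "", "accession number": ""},
    {"institution": "", "accession number": ""},
]
repeated = find_repeated(records)
print(repeated)
```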


At this point, there are many options. Some proposals:

  1. Global processing of fields (for example, dates)
  2. Division into batches (by institution?)
  3. Option 1 first for some properties, then option 2.
  4. Extract files with well-formed metadata first
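
As an illustration of proposals 1 and 4 applied to the date field, the sketch below recognises a few simple date forms and sets aside everything else. The accepted patterns, the `date` property name and the output dictionary layout are assumptions made for this example, not a description of the actual data or of the Wikidata date format.

```python
import re

# Assumed "well-formed" shapes: a plain year, a year with a circa marker,
# or a YYYY-MM-DD date. Month and day ranges are not validated here.
YEAR = re.compile(r"^(?:(c\.|ca\.|circa)\s*)?(\d{4})$", re.IGNORECASE)
ISO = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")


def parse_date(raw):
    """Return a normalised date dictionary, or None if the value is not recognised."""
    text = raw.strip()
    if ISO.match(text):
        return {"value": text, "precision": "day", "circa": False}
    m = YEAR.match(text)
    if m:
        return {"value": m.group(2), "precision": "year", "circa": m.group(1) is not None}
    return None


def split_by_date(records):
    """Separate records whose date parses cleanly from those needing manual work."""
    well_formed, remaining = [], []
    for record in records:
        parsed = parse_date(record.get("date", ""))
        if parsed is not None:
            well_formed.append({**record, "date": parsed})
        else:
            remaining.append(record)
    return well_formed, remaining


records = [
    {"title": "A", "date": "1642"},
    {"title": "B", "date": "circa 1660"},
    {"title": "C", "date": "17th century"},
    {"title": "D", "date": "1889-06-01"},
]
well_formed, remaining = split_by_date(records)
print(len(well_formed), len(remaining))
```

The records left in `remaining` could then be divided into batches (proposal 2) for case-by-case handling.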