Wiki Techstorm/Programme/Wikidata and OpenRefine
Let's get going! OpenRefine is a multi-faceted data wrangling and cleaning tool with which you can tidy up, enrich and reconcile big batches of data. In this hands-on workshop, you will learn how to use it to prepare batches of (meta)data and upload them to Wikidata.
Please install OpenRefine 3.3 beta, ideally before the workshop:
If you are already at the workshop and have not installed OpenRefine yet, ask for one of the USB keys to avoid downloading it on the venue's wifi.
In the first part of the workshop, we will demonstrate the workflow on a simple example. We will import the list of members of the Welsh Museum Federation. The dataset we are going to use can be downloaded in TSV format.
Projects to practice
For the second part of the workshop, we have prepared a few datasets that you can try importing into Wikidata with OpenRefine. These are good exercises: you should be able to import parts of your chosen dataset by the end of the TechStorm. Comment on the Phabricator task associated with the import to let others know that you are working on it. You can also bring your own dataset, of course!
- Museum Data Files, about museums in the US;
- Büchereiverband Österreichs, about public libraries in Austria;
- UK Lakes Portal, about, guess what, lakes in the UK;
- ISO 10383 Codes, about financial markets;
- CA PROP 65, about chemicals which cause cancer;
- Publons, about academic journals and publishers;
- Endangered alphabets, about writing systems and scripts.
Ask the mentors during the workshop! They are here to help you.
Here are a few links you might also find useful:
- GREL variables
- Reconciliation documentation
- A few recipes
- Wikidata reconciliation (with list of tutorials and videos)