Wikibase/Indexing/RDF Notes

RDF is a W3C spec for describing knowledge in triples that look like: wd:Q23 wd:P509 wd:Q356405. In English that is "George Washington's (Q23) cause of death (P509) is bloodletting (Q356405)".

Technically, RDF just specifies that knowledge is described as triples. It is usually written in a format called Turtle, which looks like the example above. Turtle has some shorthand: wd:Q23 wd:P509 wd:Q356405 ; wd:P106 wd:Q82955 , wd:Q189290 , wd:Q131512 , wd:Q1734662.

In English that is "George Washington's (Q23) cause of death (P509) is bloodletting (Q356405) and he held the occupations (P106) of politician (Q82955), officer (Q189290), farmer (Q131512) and cartographer (Q1734662)." The ; means "another triple about the same subject" and , means "another triple about the same subject with the same predicate". This syntax is reasonably terse while still being obvious to a computer and even mostly human readable. Turtle is pretty good.
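To make the shorthand concrete, here is a rough Python sketch that expands one such statement back into full triples. The function name expand is my own invention, and it only copes with whitespace-separated prefixed names (no string literals, no blank nodes), so treat it as an illustration of the ; and , rules rather than a Turtle parser.

```python
from typing import List, Tuple


def expand(statement: str) -> List[Tuple[str, str, str]]:
    """Expand one Turtle statement that uses ';' and ',' into full triples.

    Assumption: terms are plain prefixed names, so splitting on ';' and ','
    is safe. Real Turtle literals can contain those characters.
    """
    body = statement.strip().rstrip(".").strip()
    subject, rest = body.split(None, 1)
    triples = []
    # ';' starts a new predicate for the same subject.
    for group in rest.split(";"):
        group = group.strip()
        if not group:
            continue
        predicate, objects = group.split(None, 1)
        # ',' starts a new object for the same subject and predicate.
        for obj in objects.split(","):
            triples.append((subject, predicate, obj.strip()))
    return triples


example = ("wd:Q23 wd:P509 wd:Q356405 ; "
           "wd:P106 wd:Q82955 , wd:Q189290 , wd:Q131512 , wd:Q1734662.")
for triple in expand(example):
    print(" ".join(triple) + " .")
```

The one statement comes out as five triples, all with wd:Q23 as the subject: one for P509 and four for P106.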

You've probably noticed all those wd:s. RDF feels like it borrows XML's namespace concept but, at least in Turtle, the syntax isn't so hideous. Here is what that first example would look like with the namespace properly specified: PREFIX wd: <http://www.wikidata.org/entity/> wd:Q23 wd:P509 wd:Q356405.
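As far as I can tell a prefix is just string substitution: the part before the colon is swapped for the namespace IRI. A minimal Python sketch, assuming wd: maps to http://www.wikidata.org/entity/ (that mapping and the helper name expand_name are assumptions made for illustration):

```python
# Assumed prefix table; the wd: namespace IRI is an assumption for this sketch.
PREFIXES = {"wd": "http://www.wikidata.org/entity/"}


def expand_name(name: str, prefixes: dict = PREFIXES) -> str:
    """Turn a prefixed name such as 'wd:Q23' into a full IRI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError("unknown or missing prefix in %r" % name)
    # The namespace IRI and the local part are simply concatenated.
    return prefixes[prefix] + local


for term in ("wd:Q23", "wd:P509", "wd:Q356405"):
    print("<" + expand_name(term) + ">")
```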

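Here is an example straight from the RDF primer:
 BASE   <http://example.org/>
 PREFIX foaf: <http://xmlns.com/foaf/0.1/>
 PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
 PREFIX schema: <http://schema.org/>
 PREFIX dcterms: <http://purl.org/dc/terms/>
 PREFIX wd: <http://www.wikidata.org/entity/>
 <bob#me>
     a foaf:Person ;
     foaf:knows <alice#me> ;
     schema:birthDate "1990-07-04"^^xsd:date ;
     foaf:topic_interest wd:Q12418 .
 wd:Q12418
     dcterms:title "Mona Lisa" ;
     dcterms:creator <http://dbpedia.org/resource/Leonardo_da_Vinci> .
 <http://data.europeana.eu/item/04802/243FA8618938F4117025F17A8B813C5F9AA4D619>
     dcterms:subject wd:Q12418 .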

There is a whole bunch of stuff to notice here:
 * The primer references Wikidata. Neat.
 * It's perfectly ok to mix and match prefixes.
 * Some of these prefixes are actually pretty standard. foaf, xsd, dcterms are things I've seen a whole bunch. schema.org is new to me but seems pretty sane.
 * They use a different indentation style than I was using. I believe I got my indentation style from other places on the web. Meh, I dunno what is canonical yet.
 * The <bob#me> style syntax - that one in particular means <http://example.org/bob#me>. It's how the BASE works. I'm not sure why you'd want this syntax over a prefix.
 * Note that when you use the <http://dbpedia.org/resource/Leonardo_da_Vinci> syntax, the BASE isn't prepended because the URL/URI/IRI is not relative.
 * The a syntax is syntactic sugar for <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, which I believe to be the same as rdf:type in meaning.
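The BASE and a points above can be sketched in Python with the standard library's urljoin, which resolves a relative reference against a base and leaves an absolute one untouched. The base IRI http://example.org/ and the helper name resolve are assumptions for illustration. This is my reading of how it works, not a full Turtle implementation.

```python
from urllib.parse import urljoin

# Full IRI that the Turtle keyword 'a' stands for.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumed base IRI, standing in for a BASE declaration.
BASE = "http://example.org/"


def resolve(term: str, base: str = BASE) -> str:
    """Resolve '<...>' IRI references and the 'a' keyword to full IRIs."""
    if term == "a":
        return RDF_TYPE
    if term.startswith("<") and term.endswith(">"):
        # Relative references get the base applied; absolute IRIs pass through.
        return urljoin(base, term[1:-1])
    raise ValueError("not an IRI reference or 'a': %r" % term)


print(resolve("<bob#me>"))
print(resolve("<http://dbpedia.org/resource/Leonardo_da_Vinci>"))
print(resolve("a"))
```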

Observation: we'll probably be able to build some kind of rewrite rules from the standard prefixes (foaf, etc.) to wd prefixes. Or infer them somehow if the relationship isn't 100% the same.

How we represent Wikidata
There exists a paper about representing Wikidata in RDF form, but the representation can get pretty cumbersome. I'm going to provide some examples rewritten using some common Turtle PREFIXes: PREFIX wd:  PREFIX wdo:  PREFIX schema:  PREFIX sco:

Site links, descriptions, and labels are pretty good:

wd:Q80 a wdo:Item ; wdo:label "Tim Berners-Lee"@en ; schema:description "izumitelj World Wide Weba"@hr ; sco:altLabel "TimBL"@pt-br. <http://es.wikipedia.org/wiki/Tim_Berners-Lee> a wdo:Article ; schema:about wd:Q80 ; schema:inLanguage "es".

This seems quite nice and sane and good. It uses, from what I can tell, the standard prefixes and language-tagged strings, which seems cool.
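For what it's worth, a language-tagged string is just a quoted string followed by @ and a language tag. Here is a small Python sketch that splits one apart. The name parse_lang_literal is hypothetical, and the pattern deliberately ignores escape sequences and long-quoted strings, so it only covers simple cases like the ones above.

```python
import re

# Simplified pattern: a double-quoted string without escapes, then '@' and a
# language tag made of letters, with optional '-subtag' parts.
LANG_LITERAL = re.compile(
    r'^"(?P<text>[^"\\]*)"@(?P<lang>[A-Za-z]+(?:-[A-Za-z0-9]+)*)$'
)


def parse_lang_literal(token: str):
    """Split a token like '"TimBL"@pt-br' into (text, language tag)."""
    match = LANG_LITERAL.match(token)
    if match is None:
        raise ValueError("not a simple language-tagged literal: %r" % token)
    return match.group("text"), match.group("lang")


for token in ('"Tim Berners-Lee"@en', '"TimBL"@pt-br'):
    print(parse_lang_literal(token))
```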