Talk:Wikibase/Indexing/RDF Dump Format


Fuzheado (talkcontribs)

The docs here say "Full statement is represented as separate node, with prefix wds:"; however, a TTL dump shows no "wds:", only "s:". Should this documentation be changed? Thanks.

TomT0m (talkcontribs)
Reply to "Is wds: now just s:?"

exact meaning of wikibase:geoPrecision?

Vladimir Alexiev (talkcontribs)

What does the value of wikibase:geoPrecision mean exactly?

A couple years ago I did some research and came up with this:

  • "measured in 1/111000 meters (eg street address or building -> 30.8/111000)"

But I can't remember how I came to that conclusion. Is it correct?

cc @Smalyshev (WMF)

Lydia Pintscher (WMDE) (talkcontribs)
Vladimir Alexiev (talkcontribs)
  • Google "A degree of longitude is about 111 kilometers (69 miles) at its widest. The widest areas of longitude are near the Equator, where the Earth bulges out. Because of the Earth's curvature, the actual distance of a degree depends on its distance from the Equator."

Using degrees is not the best way to measure precision since it depends on location and direction. But it is what we got.

I'll add this to Wikibase/Indexing/RDF Dump Format
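As a rough illustration of the reading discussed above (precision expressed in degrees, with one degree taken as about 111 km), the conversion could be sketched like this. The constant and function names are made up for the example, and the 111,000 m figure only holds approximately (for longitude, only near the Equator).

```python
# Approximate length of one degree in meters, taken from the quote above.
# This is an assumption: for longitude it only holds near the Equator.
METERS_PER_DEGREE = 111_000


def meters_to_geo_precision(meters: float) -> float:
    """Convert a precision in meters to a precision in degrees."""
    return meters / METERS_PER_DEGREE


def geo_precision_to_meters(degrees: float) -> float:
    """Convert a precision in degrees back to approximate meters."""
    return degrees * METERS_PER_DEGREE


# Example: a street address or building, about 30.8 m
building_precision = meters_to_geo_precision(30.8)
```

Under this reading, a geoPrecision of 1.0 would correspond to roughly 111 km.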

Reply to "exact meaning of wikibase:geoPrecision?"

wrong Wikidata element in the example

Giacomo Lanza (talkcontribs)

Hello,

the second example box is wrong - Q3 is "life", while the "Universe" is Q1.

I would edit it myself, but I have no knowledge of the long identifiers starting with `wbs:`

Mbch331 (talkcontribs)

For an example, the data doesn't have to be correct, just correctly formatted. Besides, if we were to correct it, more things would need to be fixed. It still contains a reference to P7, a property that no longer exists.

Reply to "wrong Wikidata element in the example"

Mapping Images from Wikidata to Commons with federated SPARQL

Mfchris84 (talkcontribs)

Based on the existing RDF Dump Format model, it isn't possible to match the Media-Info RDF dump against a Wikidata SPARQL query for images using ?item wdt:P18 ?img. The reason is that the Commons media file value is stored in the Wikidata Wikibase in the form http://commons.wikimedia.org/wiki/Special:FilePath/%FileName%, but this value isn't represented in the Media-Info RDF dump. The dump contains only the fully qualified upload URL, which is hard to guess without any other information.
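For illustration only, here is a sketch of how the two URL forms might be related, assuming the upload URL follows MediaWiki's usual hashed directory layout (the first one and two hex digits of the MD5 of the underscore-normalized file name). The layout, the base URL and the function name are assumptions for this example, not something stated in the RDF dump documentation.

```python
import hashlib
from urllib.parse import quote, unquote

FILEPATH_PREFIX = "http://commons.wikimedia.org/wiki/Special:FilePath/"
# Assumed base of the upload URLs; check against real dump data before relying on it.
UPLOAD_BASE = "https://upload.wikimedia.org/wikipedia/commons"


def filepath_to_upload_url(filepath_url: str) -> str:
    """Guess the upload URL for a Special:FilePath URL (hypothetical helper)."""
    if not filepath_url.startswith(FILEPATH_PREFIX):
        raise ValueError("not a Special:FilePath URL")
    # Decode the file name and normalize spaces to underscores.
    name = unquote(filepath_url[len(FILEPATH_PREFIX):]).replace(" ", "_")
    # Hashed directory layout: /<first hex digit>/<first two hex digits>/<name>
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return f"{UPLOAD_BASE}/{digest[0]}/{digest[:2]}/{quote(name)}"
```

Even if the assumption holds, this only helps outside SPARQL; inside a federated query the mismatch described above remains.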

Reply to "Mapping Images from Wikidata to Commons with federated SPARQL"
88.156.143.3 (talkcontribs)

What do osmm:, osmt: and the other osm_: prefixes stand for?

These prefixes can be found everywhere in examples, but they are not listed here.

Frog23 (talkcontribs)

They are the prefixes from the OpenStreetMap (OSM) data model, which are necessary for querying the OSM SPARQL endpoint at https://sophox.org

 "OSM Data":{
      "osmnode":"https://www.openstreetmap.org/node/",
      "osmway":"https://www.openstreetmap.org/way/",
      "osmrel":"https://www.openstreetmap.org/relation/",
      "osmt":"https://wiki.openstreetmap.org/wiki/Key:",
      "osmm":"https://www.openstreetmap.org/meta/",
      "pageviews":"https://dumps.wikimedia.org/other/pageviews/"
 }

All other osm* prefixes are the Sophox prefixes for its own Wikibase instance. They correspond to the Wikidata prefixes that start with 'w', e.g.

PREFIX osmd: <http://wiki.openstreetmap.org/entity/>

is analogous to

PREFIX wd: <http://www.wikidata.org/entity/>

Same for all of the other prefixes like

PREFIX osmdt: <http://wiki.openstreetmap.org/prop/direct/>

etc. I hope this helps.

Cheers Frog23
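To make the prefix list above concrete, here is a small sketch that expands a prefixed name into a full IRI. The dictionary only repeats the prefixes quoted in this reply; the function name `expand` is made up for the example.

```python
# Prefixes copied from the reply above; nothing here is queried from Sophox.
OSM_PREFIXES = {
    "osmnode": "https://www.openstreetmap.org/node/",
    "osmway": "https://www.openstreetmap.org/way/",
    "osmrel": "https://www.openstreetmap.org/relation/",
    "osmt": "https://wiki.openstreetmap.org/wiki/Key:",
    "osmm": "https://www.openstreetmap.org/meta/",
    "pageviews": "https://dumps.wikimedia.org/other/pageviews/",
    "osmd": "http://wiki.openstreetmap.org/entity/",
    "osmdt": "http://wiki.openstreetmap.org/prop/direct/",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'osmt:highway' into a full IRI."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in OSM_PREFIXES:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return OSM_PREFIXES[prefix] + local
```

For example, `expand("osmt:highway")` yields the wiki page IRI for the `highway` key.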

Reply to "osmm osmt prefixes"
MichaelSchoenitzer (talkcontribs)

According to the page, this query should return true:

ASK { wd:Q2 a wikibase:Item }

But it returns false. Is this a bug? Is there any other way to check whether an entity is an item?

Matěj Suchánek (talkcontribs)
MichaelSchoenitzer (talkcontribs)

Thanks, I overlooked this section. Is there any other way to check in a query whether an entity is an item (not a property, lemma, etc.)?

Matěj Suchánek (talkcontribs)

It may be unreliable, but I would do ASK { wd:Q2 wikibase:sitelinks [] }.
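For anyone who wants to try this check from a script, here is a minimal sketch that only builds the request URL for such an ASK query. The endpoint URL and the `format=json` parameter are assumptions about the public query service, and `ask_url` is a made-up helper; no request is actually sent.

```python
from urllib.parse import urlencode

# Assumed public SPARQL endpoint; adjust for other Wikibase installations.
ENDPOINT = "https://query.wikidata.org/sparql"


def ask_url(pattern: str) -> str:
    """Build a GET URL for 'ASK { <pattern> }' (hypothetical helper)."""
    query = f"ASK {{ {pattern} }}"
    return f"{ENDPOINT}?{urlencode({'query': query, 'format': 'json'})}"


# The sitelinks-based check suggested above
url = ask_url("wd:Q2 wikibase:sitelinks []")
```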

Reply to "a wikibase:Item doesn't work"
Jc3s5h (talkcontribs)

Disputed interpretation

In an edit at wikidata:Help:Dates Jarekt changed a statement to read "Wikibase software interprets years 1801-1900 with precision 7 as 19th century" and "Wikibase software interprets years 1001-2000 with precision 6 as second millenium". There is discussion on the associated talk page.

I believe Jarekt is referring to the interactive user interface, but I consider it wrong to refer to that interface as "Wikibase software". I believe the RDF API is just as much a part of Wikibase software as the interactive interface. The RDF data model documentation indicates that RDF follows the ISO 8601 and XSD 1.1 standards. Both of those indicate precision by truncating the unneeded information. So, for example, for a precision of 100 years and a year of 1900, the value would be truncated to 19 and understood to include any year from 1900 to and including 1999. Jc3s5h (talk) 22:30, 17 July 2018 (UTC)
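To make the two readings concrete, here is a small sketch comparing them for positive (AD) years only. The function names are invented for this example; the first follows the "1801-1900 is the 19th century" wording quoted above, the second follows the truncation reading.

```python
def ordinal_century_range(year: int) -> tuple[int, int]:
    """Years covered when 1801-1900 counts as the 19th century (AD years only)."""
    century = (year - 1) // 100 + 1          # 1900 -> 19, 1901 -> 20
    return (100 * (century - 1) + 1, 100 * century)


def truncated_century_range(year: int) -> tuple[int, int]:
    """Years covered when the last two digits are simply truncated (AD years only)."""
    truncated = year // 100                  # 1900 -> 19, 1999 -> 19
    return (100 * truncated, 100 * truncated + 99)


# The same stored year, 1900, covers different ranges under the two readings.
print(ordinal_century_range(1900))    # (1801, 1900)
print(truncated_century_range(1900))  # (1900, 1999)
```

The two ranges overlap only in the year 1900 itself, which is why the choice of interpretation matters for data consumers.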

Jarekt (talkcontribs)

RDF Dump Format deals with how data is stored, not what it means. The current Wikidata standard of interpreting the 1st century as years 1-100, the second century as years 101-200, etc., and the 1st century BC as years 100 BC-1 BC, is perfectly consistent with the international understanding of those terms for hundreds of years. See 1st century, etc. If you would like to redefine those terms, then a discussion on the RDF Dump Format talk page is not the right place. Wikidata is not consistent with the ISO 8601 and XSD 1.1 standards, which is unfortunate, but as wikidata:Help:Dates mentions, Wikidata dates are "resembling ISO 8601" but do not follow it. Another difference is how we store BC dates; section d:Help:Dates#Years_BC explains the conversions which are done from the format used by Wikidata to the RDF Dump format.

I do not understand your point about the Wikidata GUI not being part of "Wikibase software". I never tried this, but I believe that if you create a different instance of Wikibase using the Wikibase software, then it comes with the GUI. So I am lost...

Jc3s5h (talkcontribs)

When I wrote 'I consider it wrong to refer to that interface as "Wikibase software"' I meant that the interactive user interface is not the only Wikibase-provided method to read and write data, so writing as if the meanings implicit in that interface were followed by all interfaces is incorrect. I completely reject the notion that "RDF Dump Format deals with how data is stored not what it means." Explaining what the RDF Dump Format means is the purpose of this page.

Reply to "Disputed interpretation"
Jc3s5h (talkcontribs)

My first question is how to update this document. A "sister" document, Wikibase/DataModel/JSON, says that the document should not be edited in the normal way, but rather "NOTE: The canonical copy of this document can be found in the Wikibase source code and should be edited there. Changes can be requested by filing a ticket on Phabricator"

Does a similar process apply to this document, or is this document edited directly?

The change I think should be made is as follows, with bold showing material to add:

The full value includes the simple value above under wikibase:timeValue, precision and timezone as integers and calendar model as IRI. The timezone parameter has never been implemented and should be ignored; all times in the database are local times; that is, the timezone is not recorded as part of the time and must be deduced from other clues, such as the place an event occurred.

I consider this important because editors may be unwilling to contribute to Wikidata if it forces them to make false statements, such as the time zone being UT when it is really United States Eastern Daylight Time.

I will be making a parallel request for revision of Wikibase/DataModel/JSON.

[Text above added September 2016. Text below added 31 December 2016.]

At wikibase:Project chat#Adding a source @Pasleim: stated

If a date is given with day precision, one has to ignore all information which make more precise claims including the time zone. The time zone parameter is only needed for dates/times with at least hour precision. So Wikidata doesn't say anything if a date is in UTC or in local time. Basically a specific day is a time period of 50 hours, from 12:00a.m. in UTC+14:00 (Q7130) to 11:59p.m. in UTC−12:00:
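The 50-hour figure in the quote can be checked with a short calculation. This is only a sketch of the arithmetic, using the two extreme offsets named in the quote (UTC+14:00 and UTC-12:00); the function name is made up.

```python
from datetime import datetime, timedelta, timezone

EARLIEST_ZONE = timezone(timedelta(hours=14))   # UTC+14:00
LATEST_ZONE = timezone(timedelta(hours=-12))    # UTC-12:00


def calendar_day_span_hours(year: int, month: int, day: int) -> float:
    """Hours between the first start and the last end of a calendar day worldwide."""
    # The day starts first where the offset is UTC+14:00 ...
    first_start = datetime(year, month, day, tzinfo=EARLIEST_ZONE)
    # ... and ends last (midnight of the next day) where the offset is UTC-12:00.
    last_end = datetime(year, month, day, tzinfo=LATEST_ZONE) + timedelta(days=1)
    return (last_end - first_start) / timedelta(hours=1)


print(calendar_day_span_hours(2016, 12, 31))  # 50.0
```

The result is 24 hours for the day itself plus 14 and 12 hours for the two offsets.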

While this may have been a defensible interpretation of earlier versions of this document, and maybe even a defensible interpretation of the JSON dump spec, this spec says

The simple value of the time value is either datetime value of type xsd:dateTime, if the value can be converted to Gregorian date in ISO format, or a string as represented in the database, if not. The xsd:dateTime dates follow XSD 1.1 standard...

Considerable effort has recently been expended to respect the XSD standard's requirement to always use the Gregorian calendar, by creating code to convert Julian dates to Gregorian dates. If this effort has been put into respecting the Gregorian calendar aspect of the XSD spec, I infer that the meaning of the "Z" at the end of the representations, which means Universal Time, would be equally respected. Jc3s5h (talk) 15:58, 31 December 2016 (UTC)

Lydia Pintscher (WMDE) (talkcontribs)

No, this one is maintained here :)

Jc3s5h (talkcontribs)

In view of a Wikidata ambiguity about the date of Isaac Newton's death, I believe the section should also be revised to state that the year is always deemed to begin on January 1, even though historically some countries have observed other dates to increment year numbers.

Reply to "Time revision"
Yurik (talkcontribs)

When storing sitelinks, shouldn't the URL be normalized ('_' instead of spaces)? Otherwise the sitelinks differ from the canonical wiki representation. CC: @smalyshev (WMF):
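A minimal sketch of the normalization meant here, assuming the usual https://<host>/wiki/<title> URL pattern. The function name and the choice of which characters to leave unescaped are assumptions for the example, not the behavior of the dump code.

```python
from urllib.parse import quote


def normalized_sitelink_url(host: str, title: str) -> str:
    """Build a page URL with '_' instead of spaces (hypothetical helper)."""
    # Replace spaces with underscores first, then percent-encode the rest.
    # '/' and ':' are left as-is here; that choice is an assumption.
    path_title = quote(title.replace(" ", "_"), safe="/:")
    return f"https://{host}/wiki/{path_title}"


print(normalized_sitelink_url("en.wikipedia.org", "New York City"))
# https://en.wikipedia.org/wiki/New_York_City
```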

Smalyshev (WMF) (talkcontribs)
Bobdc (talkcontribs)

A query for the schema:about value of <https://en.wikipedia.org/wiki/Duck> shows that it's wd:Q3736439, but the Sitelinks section of this page says that it's wd:Q3. Am I misunderstanding something or does this example need to be corrected?

Mbch331 (talkcontribs)

The example doesn't have correct values. It just has values to show how the data is formatted. Examples usually have arbitrary values, as this one does.

Reply to "wrong example for Duck?"