Talk:Wikidata Bridge

About this board

This is the talk page about the Wikidata Bridge project. Feel free to give feedback or ask questions. Some threads are dedicated to specific questions or feedback loops.

This is the discussion page for the Wikidata Bridge project (linking with Wikidata). You are welcome to give feedback or ask questions. Some topics are dedicated to specific questions or feedback loops.

Link to a new version of the prototype

Lea Lacroix (WMDE) (talkcontribs)

Hello all,

I just updated the link to the prototype to a new version (v 3.0). Thanks to the feedback we received during and after Wikimania, we've been able to understand better what people need and improve some features.

Please note that this is still a "click dummy", with only certain reactive paths and clickable areas, not a fully developed test system. If you get stuck, you can press "R" or click "Restart" in the bottom right corner.

What changed:

  • the reference section was updated to be more visible
  • we added a new type of screen, "data type not supported", to warn users when a data type is (not yet) supported by the Bridge. You can access it by clicking on the coordinates.
  • another new type of screen is the permission screen, which you can test by clicking on the map. This kind of screen will appear only when the user tries to edit something without having permission. The text and display are still to be improved; feel free to give us feedback.

What we're still working on:

  • cancel and save buttons for references
  • path to Wikidata not included yet
  • path to history not included yet
  • license notice not included yet



Feel free to try this updated prototype and give us feedback!

For your information, we're running a bit late on the timeline presented on the overview page. I'll give you a more detailed update at the end of November.

Thanks, Léa

Bouzinac (talkcontribs)

When someone types in "new" values (with or without a ref, I don't care; I don't understand why there is such a debate), will the new value be "upranked" and the other pre-existing value(s) be "downranked"/deprecated?

Lea Lacroix (WMDE) (talkcontribs)

Thanks for your feedback. Editing ranks will not be part of the first version of the feature, but that's an idea we keep for later.

Bouzinac (talkcontribs)

So there will be multiple values each time someone edits a property?

Lea Lacroix (WMDE) (talkcontribs)

If the user states that the value is wrong, the value will be changed in Wikidata - no new value will be added.

If the value is outdated, then yes the outdated value will be kept in Wikidata and a new one will be added.

Ayack (talkcontribs)

But if the outdated value had a preferred rank, what will happen? The new value will be created with a normal rank and so won't be displayed in the infobox? Or the old value will be downranked and the new upranked? Or created with a normal rank?

Bouzinac (talkcontribs)

So the outdated value should automatically be downranked... let's hope it will finally be implemented.

Ayack (talkcontribs)

No, the outdated value should not be downranked (it was true at a point in time); it's the new value that should be upranked.

Charlie Kritschmar (WMDE) (talkcontribs)

Hello @Ayack and @Bouzinac, the idea is that we will downrank the previously preferred value to normal and set the newly added value to preferred instead. We will not be deprecating values from the Bridge. If the former value was already at normal rank, then the new value will be set to preferred.


There is currently no option to just add an additional value of the same rank via the bridge.
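The rank handling described above can be sketched roughly as follows. This is only an illustration in Python, not the Bridge's actual code: the `Statement` class and the `add_updated_value` function are hypothetical names, and a statement is reduced to a value and a rank.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    value: str
    rank: str = "normal"  # one of "preferred", "normal", "deprecated"


def add_updated_value(statements, new_value):
    """Return a new list in which new_value is the only preferred statement.

    Hypothetical helper: it mirrors the behaviour described in the thread,
    not any real Wikibase or Bridge API.
    """
    updated = []
    for s in statements:
        # A previously preferred value is moved down to normal, never deprecated.
        rank = "normal" if s.rank == "preferred" else s.rank
        updated.append(Statement(s.value, rank))
    # The newly added value always becomes the preferred one.
    updated.append(Statement(new_value, "preferred"))
    return updated


existing = [Statement("old value A", "preferred"), Statement("old value B", "normal")]
result = add_updated_value(existing, "new value")
print([(s.value, s.rank) for s in result])
```

Under these assumptions the outdated value is kept at normal rank, so it stays available as historical data while the new value is the one an infobox would pick up.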

Jsamwrites (talkcontribs)

References are now well highlighted. Is it possible to add multiple references from multiple sources?

Charlie Kritschmar (WMDE) (talkcontribs)

Hi @Jsamwrites, It will be possible to add multiple references from multiple sources. The current prototype doesn't represent this functionality well yet.

Alsee (talkcontribs)

I see your UX Research page already has a link to EnWiki 2018 Infobox RfC. However I don't think you fully appreciate what that RFC says and what it means.

A very oversimplified summary would be that about half the community support Wikidata in infoboxes, and about half the community oppose use of Wikidata.

A somewhat more nuanced explanation would be that 1/3 of the community support Wikidata in infoboxes, 1/3 want it GONE, and 1/3 find the current usage of Wikidata problematical but they would be willing to support Wikidata in infoboxes if their concerns can be met. Some people would argue that it is impossible for Wikidata to meet those conditions. It could be argued that the 1/3 in the middle haven't followed the issue closely enough yet, and that they haven't yet figured out that Wikidata can't satisfy those requirements.

There have been a number of other Wikidata-related RFCs since then, denying or removing Wikidata from other locations.

  • There is a substantial likelihood that your project will provoke a new Wikidata RFC on EnWiki.
  • There is a very real possibility that your project will effectively trigger a ban of Wikidata on EnWiki. I would hesitate to claim any particular percentage chance on that outcome. The issue is precariously undecided among the community, and my impression is that it may be tilting against Wikidata.

I anticipate little chance this post is going to affect the course of your project. However I did want to alert you of the situation and the potential effect.

And as someone else noted, this tool would be literally unusable for unsourced items on EnWiki. There is an overwhelming rejection of unsourced items. The range of debate on EnWiki is whether sourced items are permissible.

Wargo (talkcontribs)

This tool's purpose is to edit Wikidata. Infoboxes could still use values set locally (in the usual ways), and I think this tool could work on infoboxes and their properties without enabling them to show values from Wikidata.

Jc86035 (talkcontribs)

Was it really necessary to immediately assume that the team had no idea what was said in the 2018 RfC? Many of the WMDE staff have worked on Wikidata for nearly six years, so I doubt that they would be unaware of Wikidata's reception on the English Wikipedia.

I doubt that this will result in an outright ban on Wikidata transclusion. Although most properties are rarely used in English Wikipedia articles, consider the authority control template – it would take almost a million edits for the Wikidata values to be replaced, and it would be needlessly difficult (if not completely pointless) to replicate Wikidata's semi-automatic identifier matching tools. (Right now, 60.6% of English Wikipedia articles use at least some Wikidata data; this is the 44th-highest percentage out of all Wikipedia editions' usage percentages.) Even if you assume that Wikidata data will never be usable on the English Wikipedia, I think it would probably require a significant amount of work to completely remove all Wikidata data at this point, and it could be potentially difficult to find consensus for this.

Another RfC could be a useful opportunity; a possible outcome could be to hide the pencil icons for (e.g.) unregistered users by using CSS classes. And by that time, I think Wikidata's ratio of sourced to unsourced statements would have improved since 2018, especially considering the DBpedia data-sync project and the other ongoing initiatives.

Furthermore, the WikidataIB module can ignore Wikidata values if they aren't sourced properly (and the pencil icon can also be trivially enabled or disabled, since it's just an image with a link), so I'm not sure why it's necessary to focus on unsourced data so much. Given that the draft technical documentation shows that the Wikidata-editing pop-up will be enabled through the addition of a single HTML attribute to a wrapper around the pencil icon, this would only be an issue where some values for a particular property on an item are sourced and some of them aren't (all the values would be shown in the pop-up).

In any case, it doesn't necessarily even matter if it's not going to be usable on the English Wikipedia for some years. The English Wikipedia is not the only wiki that matters, and the software will still be ready for whenever Wikidata gets those mass imports of sources.

Alsee (talkcontribs)

"I'm not sure why it's necessary to focus on unsourced data so much."

It was barely mentioned. And it was mentioned because the documentation says they are building something with no support for sources at all. That means it won't even function on EnWiki unless/until they do add support for sourcing.

Devs: I just realized something. When you say you're building something with no support for sources, does that mean the software will update the value and:

  1. Delete any source information that was attached? Or...
  2. Not-touch any preexisting source that was attached?

Neither option is particularly good, but I really hope you already considered this issue and already realized why one of those options would be extremely bad. Angry mob bad.

"Another RfC could be a useful opportunity; a possible outcome could be to hide the pencil icons for (e.g.) unregistered users by using CSS classes."

No, that's not credibly possible as an outcome. The debate was whether to use Wikidata in infoboxes at all, and there were a lot of complaints that the RFC was overly complicated. We're not going to include trivial details like the pencil icon in the same RFC debating whether to fully deploy or fully rollback use of Wikidata in infoboxes. Small details like the icon would have to be addressed separately. The icon is irrelevant if the community decides Wikidata doesn't belong in infoboxes at all.

I expect the next RFC will follow up directly from the result of the last RFC. The result was that Wikidata might be acceptable for use if the relevant concerns are satisfied. I expect the RFC will ask whether Wikidata content is a sufficiently reliable source in particular, and sufficiently compliant with Wikipedia policies and guidelines in general, for automated import of Wikidata content into Wikipedia.

"Wikidata's ratio of sourced to unsourced statements would have improved since 2018"

The ratio of unsourced statements is irrelevant because the unsourced statements are irrelevant. They are already blocked. The only problem is the sourced statements.

"WikidataIB module can ignore Wikidata values if they aren't sourced properly"

No it can't. I discussed this with the module developer. The module makes a simplistic attempt, and it does filter (most) unsourced items. However, it can't even reliably filter out items circularly sourced to Wikipedia. (It pattern-matches for "Wikipedia" in the source field, but millions of items are sourced to Wikipedia without having that text string anywhere in the source field.) The WikidataIB maintainer also refused to even attempt to filter out items which are circularly sourced to Wikidata. (Those items may be unsourced, but pass through the filter because there is a circular source claim attached.) The filter can't reasonably detect which items are circularly sourced to Wikidata; any such filter would have to overblock a massive percentage of all sourced items on Wikidata. And that's not even considering the universe of general bad sources, and other issues.

Snipre (talkcontribs)

I have a problem with your statement "No it can't." It is possible to filter out values with no source or sourced only to a Wikimedia project (using property P143, "imported from Wikimedia project"). I know this because the French WP has developed a Lua module for that (see https://fr.wikipedia.org/wiki/Module:Wikidata, parameters withsource and sourceproperty).

Then there are still some wrongly formatted references that use P248 ("stated in") to link to a Wikimedia project. But this can be handled by curating the data in WD with a bot.

Now, if you want to filter values according to a limited set of references you consider reliable, this can be done: you just need to create a filter which analyzes the references linked to a value and matches them against the references you listed.
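As a rough illustration of the two kinds of filter described here, the following Python sketch drops statements that are unsourced or only "imported from Wikimedia project" (P143), and optionally keeps only statements that cite a source from an allow-list via P248. It is not the French module or WikidataIB: the function names are hypothetical, the statements use a simplified dictionary layout loosely modelled on Wikibase JSON, and the item IDs are placeholders, not real Wikidata items.

```python
IMPORTED_FROM = "P143"  # "imported from Wikimedia project"
STATED_IN = "P248"      # "stated in"


def reference_items(reference, prop):
    """Item IDs that a single reference gives for one property."""
    snaks = reference.get("snaks", {}).get(prop, [])
    return [snak["datavalue"]["value"]["id"] for snak in snaks]


def is_acceptable(reference, reliable_sources=None):
    """True if the reference is not a Wikimedia import and, when an
    allow-list is given, cites at least one source on that list."""
    if IMPORTED_FROM in reference.get("snaks", {}):
        return False
    if reliable_sources is None:
        return True
    return any(item in reliable_sources
               for item in reference_items(reference, STATED_IN))


def filter_statements(statements, reliable_sources=None):
    """Keep statements that have at least one acceptable reference."""
    return [
        s for s in statements
        if any(is_acceptable(r, reliable_sources) for r in s.get("references", []))
    ]


def snak(item_id):
    """Build a minimal item-valued snak (simplified, for the example only)."""
    return {"datavalue": {"value": {"id": item_id}}}


# Placeholder IDs such as "Q_CENSUS" are made up for illustration.
statements = [
    {"value": "unsourced", "references": []},
    {"value": "imported", "references": [{"snaks": {IMPORTED_FROM: [snak("Q_WIKIPEDIA_EDITION")]}}]},
    {"value": "census", "references": [{"snaks": {STATED_IN: [snak("Q_CENSUS")]}}]},
    {"value": "blog", "references": [{"snaks": {STATED_IN: [snak("Q_BLOG")]}}]},
]

print([s["value"] for s in filter_statements(statements)])
print([s["value"] for s in filter_statements(statements, {"Q_CENSUS"})])
```

The sketch does not catch the wrongly formatted case where P248 itself points to a Wikimedia project; that would need either data curation or an additional block-list of such items.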

Finally, I don't understand your statement "WikidataIB maintainer also refused to even attempt to filter out items which are circularly sourced to Wikidata." Since references are themselves defined as items, any reference link will point to a WD item. You should explain what your problem is with values circularly sourced to Wikidata.

The main solution to your problem is on the WP side: develop the filter which will extract the data you want. WD has a model for references (see ); some contributors still don't respect that model in WD, but the structured data of WD allows you to eliminate everything you don't want. Just be aware that the more you increase your constraints, the more complex the filter becomes and the less data will meet your requirements. That's all.


Nikkimaria (talkcontribs)

It's very easy to filter values that use P143 "imported from Wikimedia project". It's also very easy to filter values that only use a very defined set of reliable sources. It is not easy, however, to get anywhere in between, because you'd need either a comprehensive set of all reliable sources or a comprehensive set of all unreliable sources, and such a thing simply doesn't exist.

Jeblad (talkcontribs)

I have experimented with reusing sources from Wikidata, and it is quite easy to make a minimum implementation. It is, though, a bit hard to make something that truly looks like the current cite templates. It is not hard at all to filter out statements that use P143, but it is strangely difficult to explain to users that this is easy and in fact was done in the experiment.

Snipre (talkcontribs)

Thank you for the comment. This is my point too: we can filter out everything which doesn't comply with the English WP rules concerning missing sources or circular WP references. But since English WP is not able to provide a list of reliable sources for the whole of WP, there is no reason to ask WD to do the same task. If someone wants to extract only values from reliable sources, then they have to provide the list of reliable sources. This can then be added to the infobox code (if the infobox is coded in Lua, of course).

Jc86035 (talkcontribs)

Infoboxes are mostly irrelevant on the English Wikipedia for this use case, at least for now. Most of the Wikidata usage on the English Wikipedia still comes from the authority control, official website and Wikimedia Commons templates (as well as some other external identifier templates). It's still exceptionally rare to see infoboxes that use any substantial amounts of data from Wikidata, and I don't expect this to suddenly change.

Conveniently, if this remains the case until Wikidata's sourcing situation generally improves, it means that sourcing is also almost entirely irrelevant for the English Wikipedia, because external links and Commons categories don't require sources. I do also think it would ultimately be necessary for the software to show sources if it's ultimately intended for everything in infoboxes to be based on Wikidata data, but it might take a lot of user testing to make it more usable than the figure-all-of-it-out-yourself style of the current wikidata.org/Wikibase interface. I don't think this sort of user testing would be appropriate for the minimum viable product, though, especially if it'd be initially implemented as a beta feature (i.e. lower risk of vandalism and errors).

It could be possible to make the case that because the English Wikipedia convention is to omit sources from infoboxes altogether, it wouldn't make sense for sources to be shown in the pop-up anyway (though in my view it would probably make more sense to require sources for all infobox data).

I also take issue with your suggestion that Wikidata would have to be considered as a whole. As an example, if census data were imported regularly for a particular country (making it acceptable to use that data in infoboxes), it wouldn't make sense to require a complete evaluation of Wikidata just to use that census data.

I'm still not sure why you're implying that this software being unable to handle references would cause a huge fuss, given that it would be the conscious and deliberate decision of template authors to enable the software for the appropriate parameters; presumably, if it's considered absolutely necessary for the end user to view the sources, then the pop-up would just be disabled for the relevant data. I'm sure it's perfectly possible for this to become contentious if it's not handled properly, but it doesn't seem likely to me because it's (presumably) not going to be forced on anyone and editors are (presumably) to be completely in control of where and when this gets enabled.

Jc86035 (talkcontribs)

Of course, Wikidata data is actually used in infoboxes in a significant number of Wikipedia editions, so (as Snipre notes below) this would not be applicable for some of the larger wikis like the French Wikipedia, and in most situations it would be necessary to be able to view, add and modify references through the pop-up.

Lydia Pintscher (WMDE) (talkcontribs)

Hey :)

Yes. It'll be up to the template editor to decide where to use the Wikidata Bridge. Also our designer is currently figuring out how to handle references. So we'll have that as well.

ChristianKl (talkcontribs)

That sounds like a state of affairs where you would get a template editor to decide to use the Wikidata Bridge on EnWiki and then we get drama and an RfC which might put us in a worse position than we are in currently.

Jc86035 (talkcontribs)

Wikidata usage itself is already limited to non-infobox data that doesn't require references on almost all English Wikipedia articles. Only a few specialized infoboxes like w:en:Template:Infobox telescope use Wikidata extensively (though w:en:Template:Infobox person/Wikidata does have more than 1,800 transclusions). Given that it wouldn't make sense to use Wikidata Bridge without also using the relevant Wikidata data, I don't think this would in and of itself cause any drama; it's already discouraged to add Wikidata-transcluded data to begin with, so I doubt that anyone is going to be running around adding a beta version of Wikidata Bridge to high-use infoboxes.

Logically, there are only four scenarios for Wikidata/Wikidata Bridge usage for infobox data:

  • The status quo, at least on the English Wikipedia, is that infobox parameters generally have neither Wikidata data nor the Bridge pop-up.
  • If one were to add only the Bridge code but not the data from Wikidata, the pop-up would probably be removed (or no one would notice) since it wouldn't be possible to use it to modify anything displayed in the article. This would probably cause disruption to the Wikidata data, but wouldn't in itself cause disruption to the Wikipedia article.
  • The situation for adding data transcluded from Wikidata without the Bridge pop-up would presumably remain the same.
  • If both data from Wikidata and the Bridge pop-up were added, would this be worse than if only the Wikidata data were added? The references not being shown, I think, would only be detrimental to Wikidata and not to the English Wikipedia, because references aren't normally shown in English Wikipedia infoboxes, and it's still not possible for w:en:Module:WikidataIB to display references in the first place.
RexxS (talkcontribs)

It's not difficult to import the references, if desired, along with the values they support. The problem lies with how to format those references so that they match the style of the references used in the rest of the article. Unfortunately, the English Wikipedia allows an editor to make up whatever style of referencing they choose and once that becomes the established variant for the article, you need lengthy discussion to be able to change it. It's therefore a near impossible task to hold a list of all possible styles of reference formatting that imported references would need to match. The outcry from editors that "these references have appeared and I can't change them to fit my scheme" would set back the adoption of Wikidata on the English Wikipedia by years.

Jc86035 (talkcontribs)

Maybe I'm stating something that's already obvious or I've asked you this question already in some other discussion, but could you not have a parameter like |ref-style=cs1 and disable citations/Wikidata on pages without the parameter? Would there be a drawback in doing so, other than necessitating the use of the parameter?

RexxS (talkcontribs)

The disadvantage is that editors will copy others who use the parameter, even on pages that don't use CS1. That's when the outcry starts and the blame goes to "Wikidata" or whoever implements the code, rather than the editor who made the mistake. Nevertheless, I suppose that we'll have to have something like it eventually, so I'll sandbox an implementation when I get back from this weekend's meeting.

Snipre (talkcontribs)

I share some of Alsee's fears about the risk that this tool will bring more criticism of WD. The main criticisms concern the quality of the data and the integrity of the data. The first draft of the tool didn't take these two aspects into account, so it is legitimate to ask whether the development team is aware of the opposition to the use of WD.

About the quality of data, the main element is sourcing, and some WPs have strong policies about that. Proposing a tool which is not able to deal with that principle is just a casus belli, because it encourages bad behaviour in those WPs and increases the risk that such data is later displayed by the WP if the infoboxes are not properly coded to filter the data.

Then, about data integrity: proposing to correct "wrong data" reflects a misunderstanding of the sourcing principle. If someone made a mistake when adding a sourced value with the property "retrieved" (P813), then changing the value of the statement without also changing the value of that property in the reference creates complete chaos in terms of chronology.

And finally, no answer to the vandalism problem has been proposed since the last RfC: no stronger policy on who may contribute (like excluding IPs), no stronger policy on sourcing (like an obligation to add a source), and no increased protection of existing data (it is still possible to change the value of a sourced statement without any blocking or deletion of the reference part).

Just increasing access to WD will only increase the possibilities for vandalism, and WPs with strong concerns about that problem won't be interested in implementing such a tool, or in using data from WD any further.


Jc86035 (talkcontribs)

Wouldn't sources presumably be made mandatory on a per-property basis using constraint violations (which Wikidata Bridge would have to be able to handle at least partially anyway)? I don't think it would be appropriate to implement such a requirement specifically for this tool, given that many properties (a majority if we include identifiers) shouldn't need sources in the first place.

I agree that it would be essential to manage vandalism, although if all the edits from the tool are tagged then it shouldn't be technically difficult to assess whether most edits are constructive and whether allowing unregistered users to access the tool is beneficial. As noted, the tool would also be enabled on a case-by-case basis (and it would be possible to mix non-interactive and interactive pencil icons in the same template), so if a template is disproportionately enabling vandalism then the pencil icon could be removed for that template.

Snipre (talkcontribs)

Jc86035: "given that many properties (a majority if we include identifiers) shouldn't need sources in the first place". On what basis can you assert that the majority of properties don't require references? It is the inverse: the majority of properties require a reference in order to manage the chronology of information (even identifiers can change following merges in the original databases) and differences between several points of view.

In my opinion, only instance of/subclass of and properties used to describe a reference shouldn't have a source.

People think that identifiers don't require references, but this is wrong. Some databases are not open, so we can know those identifiers only through other sources. Identifiers can also change over time due to merges in the original databases, so it is necessary to have at least a retrieved date with an identifier in order to understand why two references can have different values for the same identifier (one from before the change and the other from after). Some data can differ yet all be correct: there can be differences in precision, in method, ...

And finally, to come back to your comment: you explain that tagging will be used to assess the vandalism risk, but this assumes post-processing of the edits (who will do that? Is it a permanent action or only a temporary analysis?) and it relies on our being able to tell vandals from contributors easily. So again, the risk is not reduced at the origin but through permanent actions, which will increase the workload for "good" contributors and will depend on the effort of these contributors. Wikipedians don't like this kind of argument; they don't want to know what we will do to improve WD, but what is currently in place to maintain quality.

So again, with the announcement of the Wikidata Bridge, we have a process similar to the one for WD in the past: we are trying to sell a product with identified drawbacks on the promise that the tool will be improved in the future to take account of the customer's needs. And the customer's answer is "come back later when your tool does what we need".

Jc86035 (talkcontribs)

Even assuming that it would be good practice for users to add metadata for identifier statements, given that "our designer is currently figuring out how to handle references", it would presumably be technically possible to make it easier to add retrieved (P813) and the UTC date as a reference for identifiers using a checkbox (of course, this is hypothetical, and it would be better if this and similar actions were possible in the Wikibase interface as well). Even if this isn't possible, it wouldn't be difficult to check an item's revision history if a conflict arises (and presumably this is usually necessary anyway, since most identifier statements don't have such references).
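To make the hypothetical checkbox concrete, here is a minimal Python sketch of the reference it might attach: a single "retrieved" (P813) snak carrying the current UTC date. The function names are made up for illustration, and the dictionary layout is an assumption modelled on the Wikibase JSON time value format (day precision, Gregorian calendar); it does not describe what Wikidata Bridge actually sends.

```python
from datetime import datetime, timezone

RETRIEVED = "P813"  # "retrieved"
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def retrieved_snak(now=None):
    """Build a 'retrieved' snak for the given moment (default: now), in UTC."""
    now = now or datetime.now(timezone.utc)
    # Only the day is kept, so the time part is fixed to midnight.
    day = now.astimezone(timezone.utc).strftime("+%Y-%m-%dT00:00:00Z")
    return {
        "snaktype": "value",
        "property": RETRIEVED,
        "datavalue": {
            "type": "time",
            "value": {
                "time": day,
                "timezone": 0,
                "before": 0,
                "after": 0,
                "precision": 11,  # 11 means day precision
                "calendarmodel": GREGORIAN,
            },
        },
    }


def retrieved_reference(now=None):
    """Wrap the snak in a reference structure keyed by property ID."""
    return {"snaks": {RETRIEVED: [retrieved_snak(now)]}}


example = retrieved_reference(datetime(2019, 11, 5, 23, 30, tzinfo=timezone.utc))
print(example["snaks"][RETRIEVED][0]["datavalue"]["value"]["time"])
```

Because the date is generated at edit time, a later correction of the value would produce a fresh retrieved date rather than silently reusing the old one, which is the chronology concern raised earlier in the thread.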

Although I agree that the prospect of increased vandalism is probably detrimental, it's possible that it would also be easier for readers who've never edited before to correct obvious vandalism (and easier for them to be introduced to editing Wikidata, and so on). Anecdotally, I've seen this happen on both Wikipedia and Wikidata. We would need to wait for the live trial to know if this actually has a measurable benefit, of course.

Snipre (talkcontribs)

I agree that every problem can be solved, and this discussion is not about pointing out things that are impossible to achieve. But the development team decided to find a solution to one particular problem (contributing to WD from WP) and forgot to propose a solution for all the other problems which are currently the sources of opposition to the use of WD in WP: poor-quality data (data without a source, data sourced from WP), the fight against vandalism (no possibility of using the WP article history to see modifications in WD affecting the WP article), protection of the integrity of data in WD, ...

We all know this is a question of priorities and resources; that's not the point. The problem is in the process used to propose a new tool which solves one problem (contributing to WD) without having, from the start, a solution for the other problems. We can't change the current WD framework, but any new modification should take account of as many of the previously mentioned criticisms as possible, in order to show that the problems were understood and that there is a will to solve them.


To summarize a little:

  • the first beta version has to include a feature to handle reference. There is no interest from the big WPs (or at least no majority) to assess a tool without that feature
  • no correction of an existing statement should be possible without an update of all relevant qualifiers or reference properties. The critical point is the retrieved property, which has to be updated in any case. And finally, a statement can't be judged correct or wrong without a look at the reference. Some data were correct once and not more currently. This doesn't mean that value was wrong all the time and perhaps in a certain period the data was correct. WD has to be able to take account of the fourth dimension (time). The same goes for data with different units or based on different determination methods. Unless the full data set including value, qualifiers and references can be displayed by the interface, there is only one acceptable action when contributing to WD from WP: adding a new statement.
  • finally, even if the interface tool allows new contributors from WP to edit WD, I can already predict that this source of contributions will be a problem for WD: the interface is not the main problem; the main problem is the addition of the whole data set linked to a value. Just look at the complaints of WD contributors: to add a reference, I need to create a new item with 2-5 properties, or questions about how data should be modeled (see the problem of date accuracy). How will the interface solve that? I hope that the interests of WD will be taken into account and that everything will be done to avoid a bunch of new statements without references and without mandatory qualifiers, because this will just transfer the problem from WP contributors to WD contributors, and I would not like to have to correct badly structured statements edited from WP.
Jeblad (talkcontribs)

I doubt this is correct: "the first beta version has to include a feature to handle reference. There is no interest from the big WPs (or at least no majority) to assess a tool without that feature"

Most users (and communities) are willing to explore any solution to editing statements from Wikipedia, with or without sources from Wikidata.

The second point would set the bar much higher for Wikidata than Wikipedia presently does, and I'm not even sure it is possible to enforce this at all. Not on Wikipedia, and not on Wikidata. It is possible to trigger warnings in some cases, like when a sourced document changes.

Snipre (talkcontribs)

Sorry but please add a reference to your affirmation "Most users (and communities) are willing to explore any solution to editing statements from Wikipedia, with or without sources from Wikidata". The three main WPs (en, fr, de) had RfC about WD use and put some constraints to the use of WD and the constraints were not related to the lack of possibility to edit WD from WP.

With Wikidata Bridge, there will be no change in the reasons which lead to limit the use of WD in WP. Some contributors are always ready to test new functionalties but if you don't convince the majority who emitted strong reserves in the RfC, then you will just see new constraints for the use of Wikidata Bridge. Is it what you want ? What are you proposing to avoid the criticismes already written in the previous discusisons ?

For the second point, there is no different treatments between WP and WD: WP always recommend to add source so WD Tools have to be able to edit references. You mix the possibility to add reference and obligation to add references. Wikidata Bridge doesn't need to force contributor to add sources but has to be able to provide a way to edit sources.

By delivering a beta version without the possibility to edit sources, Wikidata Bridge will not comply with WP's recommendation to add sources and will not take account of previous criticisms about the poor data quality in WD, which is mainly due to a lack of sources.

Jeblad (talkcontribs)

I have been involved in several discussions about Wikidata Bridge, and Wikidata in general, on several projects. (Betatesting at nowiki) The discussions go roughly like this: about ¼ is very vocal pro Wikidata, ¼ is very vocal against Wikidata, and ½ is pretty much indifferent but accepts use of Wikidata. The quarter of users against Wikidata usually claim they have way more support than they actually do, and more or less consistently claim their fringe problems are major problems. No, I don't buy it.

To recommend use of sources is one thing, but what you imply with “Some data were correct once and not more currently. This doesn't mean that value was wrong all the time and perhaps in a certain period the data was correct.” is way more than simply recommending use of sources. Okay, so you back down and say there is no obligation to add references, good. It is a beta and will provide some functionality. More features will come in later updates, and will be defined by the devs and the communities involved in the betatesting. If that does not include enwiki, so be it.

Jc86035 (talkcontribs)

While I generally agree with what you've said, these issues are shared by the vast majority of existing Wikidata editing interfaces. Harvest Templates and Mix-n-match, for example, also do not add the "retrieved" property, and presumably have a much higher editing volume than this software will; and the wikidata.org interface is only limited by the existing property constraints and edit filters. Perhaps it would be inappropriate to make these things absolute prerequisites for the software's deployment, given that this wouldn't actually resolve the issues and that all the work to prevent these things could have almost no effect (given the existing editing volume). I think it would be much more effective to resolve these issues through a separate project that doesn't just involve one interface, but clearly the issues haven't yet been prioritized enough across the board, not just within the development of this software.

Moreover, given that existing software already suffers from these issues and that the situation doesn't change in this regard, the Bridge software could be a net positive in terms of vandalism/bad edits, since users would become less likely to click through to wikidata.org due to most pencil icons no longer linking directly to Wikidata items.

(From the Wikipedia point of view, even if a user adds just a plain URL as a reference, it's better than nothing and can almost always be fixed later. I would think this is pretty much the same for Wikidata.)

Jc86035 (talkcontribs)

While vandalism is a relevant subject here, I think it would be appropriate to address label/description vandalism, especially since it seems to be more common than statement vandalism. Addressing it properly would probably save much more time for experienced Wikidata contributors than adding an elaborate and foolproof citation system to the Bridge software would, and would probably require less development time and less user testing.

Perhaps the easiest way to address this would be to enable the SHORTDESC magic word on all Wikipedias (if not all Wikimedia wikis) – it does bring a number of benefits, particularly that more detailed/helpful generic descriptions (e.g. for disambiguation pages, lists, categories) can be enabled with one edit to a high-use template, rather than a bot run over a million items every time a word should be changed. At least on the English Wikipedia, I think the template also makes it seem easier (even if it isn't actually easier) to make the descriptions for individual articles more detailed. Of course, it also pushes the liability for fixing a lot of description vandalism back to the Wikipedias, which is convenient in several ways.

Stjn (talkcontribs)

I hope there will never be a decision to turn bad hacks for individual wikis like SHORTDESC into fully supported features. We should work on better (and by default) tracking of changes from Wikidata and deprecating those hacks, not on setting them in stone.

Jc86035 (talkcontribs)

I do think it helps to some degree, even if only because most English Wikipedia editors would never have otherwise noticed the descriptions' existence (e.g. w:en:Talk:Witness (Katy Perry album)#"Best album"?, as well as #Brief article description vandalism on the same talk page – the vandalism edits actually stayed on the Wikidata item for almost half a year in total, which is concerning) because most English Wikipedia editors don't use the wikipedia.org portal or the mobile site and don't edit Wikidata. On the other hand, I wouldn't advocate for the use of SHORTDESC on every project, simply because there are only a few wikis where maintaining the descriptions could be plausibly doable.

Alsee (talkcontribs)

Remotely exporting Wikidata item labels as if they were article-descriptions was an extremely bad hack. The Foundation needs to ask the community before pulling stupid stunts like that.

Wikidata can happily go its own way as its own project, or it can be shoved down other wikis' throats and a mob will show up with torches and pitchforks wanting it burned to the ground.

Jeblad (talkcontribs)

Please moderate your post. Thank you.

Jeblad (talkcontribs)

While I understand some of the feature requests, a lot of the features described in the above thread have very little to do with Wikidata Bridge. They add a considerable feature creep which may make the project unfeasible.

Wikidata Bridge should allow editing of a statement's value in the first implementation. That is the core feature, and that is what should be the beta.

ChristianKl (talkcontribs)

Why is adding statements with sources a requirement that makes the project unfeasible? It makes the project more complicated but it actually means that it costs less political capital to deploy the feature. The whole Wikidata Bridge project is to make interaction with Wikipedia easy. It won't be if you burn political capital for the sake of making the project a bit easier on the technical level.

Jeblad (talkcontribs)

Note that I wrote “They add a considerable feature creep” in plural.

Snipre (talkcontribs)

"Wikidata Bridge should allow editing of a statement's value in the first implementation." References and qualifiers are part of the statement, so these fields have to be editable in the first version of Wikidata Bridge.

Jeblad (talkcontribs)

A statement's value (Wikibase/DataModel#Values), or more precisely the object in the w:semantic triple (subject–predicate–object). References and qualifiers are part of the statement, but not part of the value (object).
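
As an illustration of the layers described above, here is a minimal Python sketch. The `Statement` class, its field names and the `with_value` helper are assumptions made up for this example and are not part of Wikibase; the sample row is the spouse example from the nowiki infobox quoted elsewhere on this page. It only shows that changing the value (the object of the triple) can leave qualifiers and references untouched.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Statement:
    """Hypothetical, simplified view of one statement (not the Wikibase API)."""
    subject: str        # the item the statement is about
    predicate: str      # the property, e.g. "P26" (spouse)
    value: str          # the object of the triple
    # The two layers below belong to the statement, but not to the value.
    qualifiers: dict = field(default_factory=dict)
    references: tuple = ()


def with_value(statement: Statement, new_value: str) -> Statement:
    """Return a copy in which only the value (object) differs."""
    return replace(statement, value=new_value)


original = Statement(
    subject="Knut Hamsun",
    predicate="P26",
    value="Marie Hamsun",
    qualifiers={"period": "1908–1952"},   # illustrative qualifier key
    references=("example reference",),    # placeholder, not a real source
)
edited = with_value(original, "corrected value")
```

In this sketch, an editor that works on "the value" only ever calls something like `with_value`; whether qualifiers and references should also be editable is a separate decision.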

There are several layers in a statement, and most users do not have a clear understanding of how they relate. I should have been clearer.

Jc86035 (talkcontribs)

By "beta", are we referring to the first usable version (e.g. something that would presumably be enabled as a demonstration on one of the test Wikipedias), or the first version to be enabled on a real Wikimedia project through the Beta Features preferences tab (which would actually be able to modify Wikidata)?

Lydia Pintscher (WMDE) (talkcontribs)

We will have support for adding references in the first version that goes live on a Wikipedia. Before that there will be iterations on a test system, the first few of which will probably not have it. We will not make references mandatory in the first release because I think that adds a burden that we should not add unless absolutely necessary (and I can be swayed by feedback on the first releases). We will start with rollouts on small to medium sized wikis that ask for it. I believe these are the right ways to go forward because at the end of the day it is in the hands of the template editors on-wiki to decide where they enable the functionality and where not, so they are ultimately in control. I hope that helps clarify things.

ChristianKl (talkcontribs)

Why don't we start with design iterations, until we have a design that properly supports sources and thus doesn't come with the risk of alienating Wikipedians? WMF and EnWiki relations aren't at a particularly high point currently, and it's prudent to avoid actions that have the potential to inflame matters further.

Why code up a design that's not intended to go live on a Wikipedia?

Reply to "EnWiki reception of Wikidata"
GreenReaper (talkcontribs)

I wasn't aware of this when posting my feedback to Wikidata's federation input discussion, but if we are using data from (and potentially donating data to) Wikidata on federated projects, it stands to reason that we may want to use a bridge to edit Wikidata within our UI (as well as any data items in our own Wikibase instance), just as you foresee it being done on your own sister projects. While I appreciate this isn't the focus at this time, perhaps it could be a possibility that you consider for the future?

Lea Lacroix (WMDE) (talkcontribs)

Hello and thanks for your feedback!

Although we're developing the tool with the use case of editing Wikidata from Wikipedia for now, we have in mind that this extension could be used on any Wikibase instance in the future. This could be used to edit the content of a Wikibase instance from one of its client wikis.

The use case of being able to edit Wikidata's data from an external wiki (not part of the Wikimedia projects) is not part of our short-term roadmap, but it could definitely be considered in a more or less distant future :)

GreenReaper (talkcontribs)

Thank you, Lea. As outlined in our policies, we have no wish to duplicate Wikipedia or other Wikimedia projects; rather, our goal is to complement them, and such a feature may assist in that.

Reply to "Use on federated projects"

New prototype (integrating references)

5
Lea Lacroix (WMDE) (talkcontribs)

Hello all,

Based on your previous feedback, we built a new version of the prototype that includes showing and editing references. This v2 was tested by our UX researchers during Wikimania, with about a dozen people. We are already working on a v3 that will integrate the suggestions of the interviewees, but we still wanted to share the v2 with you, so you can keep track of the evolution. So here it is!

As usual, please keep in mind that this is only a "click-dummy", faking real interactions and with only one user path available.

If you have any remarks or questions, feel free to answer on that thread.

Jsamwrites (talkcontribs)

This new version (v2) of the prototype is clearer than v1. However, I am wondering whether the retrieved date is autofilled (on clicking, like in Wikidata) or whether editors have to fill it in. Maybe an example could be shown.

Charlie Kritschmar (WMDE) (talkcontribs)

Hi @Jsamwrites, thank you for your feedback. Currently retrieved is not automatically filled, but that is on our list of features we'd still like to implement.

Jeblad (talkcontribs)

Seems like the template type is part of the workflow; see image 8 and image 9. This must be stored at Wikidata as abstract types, as there are a lot of different ways to implement references. Local code must also be prepared to handle typeless references, as there will probably be a lot of references that lack a type for a long time.

I'm attempting to make code for figuring out which template is a likely match in w:no:Modul:Property map, but it isn't completed yet.

Note that some references have an implicit type through a lack of specific information, through slight differences in some information, or even because the information is lacking completely. A well-known example is newspapers that publish short versions of their articles on the net, and complete articles on paper. When users find the story in the paper they try to look the articles up on the web, and link to them, without realizing that the web version is shorter, and that specific information (the reason for using the article as a reference) is missing.

Note also that a lot of references at Wikidata do not have required pieces of information, like “title”. We could run a citoid-bot to fill in titles, and perhaps some other fields, but until that is done (not sure if that will ever be finished) a fallback for cite templates is necessary. This can be implemented quite easily in Lua, but it must be done.
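
A rough sketch of what such a fallback could look like, written in Python rather than Lua purely for illustration. The template names, the abstract type keys and the dictionary layout of a reference are all assumptions for this example; none of them are taken from an existing module.

```python
# Hypothetical mapping from an abstract reference type to a local template.
TEMPLATE_BY_TYPE = {"web": "cite web", "news": "cite news", "book": "cite book"}
FALLBACK_TEMPLATE = "cite generic"  # assumed catch-all template name


def pick_template(reference: dict) -> str:
    """Use the stored abstract type if it is known, else the fallback."""
    return TEMPLATE_BY_TYPE.get(reference.get("type"), FALLBACK_TEMPLATE)


def display_title(reference: dict) -> str:
    """Fall back from the title to the URL, then to a fixed placeholder."""
    title = reference.get("title")
    if title:
        return title
    url = reference.get("url")
    if url:
        return url
    return "(untitled reference)"


def render(reference: dict) -> str:
    """Build simplified template wikitext for one reference."""
    return "{{" + pick_template(reference) + " |title=" + display_title(reference) + "}}"
```

With this sketch, a typed and titled reference uses its own template, while a reference that only has a URL still renders through the catch-all template with the URL standing in for the title.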

Reply to "New prototype (integrating references)"
Mohammad (talkcontribs)

Hi, I'm working on creating a local page and a local announcement on Persian Wikipedia, explaining everything that is explained here, and I will keep that page updated accordingly. I'll also ask for discussion both locally and here. Since not all editors on Persian Wikipedia are fluent in English, would it be all right if I occasionally post translations of user discussions on this page and relay WMDE's answers locally?

Lea Lacroix (WMDE) (talkcontribs)

Hello Mohammad, and thanks for all the work you're doing! Of course, we'll be happy to answer the questions from the Persian Wikipedia community. Feel free to add a summary of the discussions here, and I'll answer any questions you have.

When the page is ready, you can also add it in the list on this page.

Jeblad (talkcontribs)

I wonder if there should be a page with links to local pages on the individual projects. That could make it easier to track down important implementations, and also differences.

Mohammad (talkcontribs)

Isn't there already one? @Jeblad

Jeblad (talkcontribs)

Not that I know of…

Lea Lacroix (WMDE) (talkcontribs)

I started a list on Wikidata Bridge/Updates, section "Pages about Wikidata Bridge on other wikis", is that what you were looking for?

Reply to "Feedbacks from a local Wiki"

Qualifiers as additional information

2
Jeblad (talkcontribs)

In some cases qualifiers are important. One notable use is in entries for spouse (d:Property:P26), or “ektefelle” at nowiki. See for example w:no:Knut Hamsun, where the row reads “Ektefelle Marie Hamsun (1908–1952)”. (Which comes from Wikidata in this case.)

Lea Lacroix (WMDE) (talkcontribs)

You're absolutely right, qualifiers often carry some meaningful information. We will investigate to find the best way to display them when it's necessary.

Reply to "Qualifiers as additional information"
Jeblad (talkcontribs)

During preparations for using references from Wikidata for dates, by extending w:no:Module:WikidataDato, we found that some items have a lot of references for birth dates. One of the “worst” is d:Q4295, which has 30 references. That is slightly over the top, and we (I) implemented a simple scoring mechanism for prioritizing a few (4) important references. If such a mechanism is in use in an infobox, then an edit to a reference might not be visible. It may also move a reference to a visible slot, or vacate a visible slot. If the new number of references is not above the maximum number of entries, then a change to a preexisting reference will not make it invisible. The problem arises when the number of entries goes above the maximum. I believe the solution is to score and limit the entries in well-defined functions instead of a project-specific module. That makes it possible to figure out whether a specific entry will be visible in the infobox after editing in the gadget.

Current code uses a best score of language of title, root domain of reference url, linked entity, and property use. Code without too much doc is at w:no:Module:References. The code will probably be extended with a few other functions.
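
To make the idea concrete, here is a small Python sketch of scoring and limiting references. It is not the code in w:no:Module:References: the weights, the field names of a reference and the way "property use" is represented are all invented for this example. Only the four feature names and the limit of 4 come from the description above.

```python
from urllib.parse import urlparse

MAX_ENTRIES = 4  # the post above mentions keeping 4 references

# All weights are made-up illustrations, not the values used at nowiki.
LANGUAGE_SCORES = {"nb": 3, "nn": 3, "en": 2}
DOMAIN_SCORES = {"example.org": 3, "example.com": 1}


def root_domain(url: str) -> str:
    """Crude root-domain guess: the last two labels of the host name."""
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def score(reference: dict) -> int:
    """The best single feature score wins."""
    features = [
        LANGUAGE_SCORES.get(reference.get("title_language"), 0),
        DOMAIN_SCORES.get(root_domain(reference.get("url", "")), 0),
        2 if reference.get("linked_entity") else 0,
        1 if reference.get("property") else 0,
    ]
    return max(features)


def visible_references(references: list, limit: int = MAX_ENTRIES) -> list:
    """Highest-scoring references first; ties keep their original order."""
    ranked = sorted(references, key=score, reverse=True)
    return ranked[:limit]


def is_visible(reference: dict, references: list) -> bool:
    """Would this reference get one of the visible slots?"""
    return reference in visible_references(references)
```

If a function like `visible_references` lived in a shared, well-defined place, an editing tool could call the same function before saving and tell the user whether the edited reference will show up in the infobox.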

Lea Lacroix (WMDE) (talkcontribs)

Thanks for letting us know, this scoring mechanism is indeed very interesting! I'm not sure at what point we can integrate it, but we'll keep in mind the need for sorting and selecting references.

Jeblad (talkcontribs)

About “I'm not sure at what point we can integrate it”, perhaps I don't understand you, but when the scoring and prioritizing is done in a module it is outside the gadget's knowledge, and the gadget can't predict how it will be done. If the same thing is done in the Wikibase Client lib for Lua, or in a parser function, then the gadget can predict how the infobox will look and act accordingly.

An alternative to automatic ordering could be to order the references manually, but that ordering would then have to be project-specific, which would create additional workload for the users.

Note that my scoring solution is not especially good, it is just one of many ways to do this. It can be viewed as a winner-takes-all on Manhattan metrics, pretty straightforward. If we had better statistics, a Bayesian estimator could give better results.

Reply to "Limiting use of references"
Jeblad (talkcontribs)

Seems like nowiki has passed a critical mass of users voting “for” in our poll for consensus. The poll is still ongoing, but I doubt the outcome will be much different. The group of usual naysayers at nowiki is too small to change the outcome substantially. The poll closes at 14:41 (UTC) on 1 July 2019.

At nowiki we have various solutions to the edit infobox problem, but among the more common ones is a link with a pencil symbol to the correct section in the Wikidata item. I guess it should be pretty easy to tweak.

Lea Lacroix (WMDE) (talkcontribs)

Thanks! Can you give a link to the poll?

Jeblad (talkcontribs)
Jeblad (talkcontribs)

Final outcome is 18 for joining the betatest, and none against. [Later increased to 20 for and none against.] Jeblad (talk) 15:52, 1 July 2019 (UTC)

Jeblad (talkcontribs)
Lea Lacroix (WMDE) (talkcontribs)

Thanks! We're currently working on a way to view and edit references, so it can be part of the first testable version.

Reply to "Betatesting at nowiki"

Some questions about the software

8
Jc86035 (talkcontribs)

I only found out about this today, so I may have missed something. I have tried to use the demo, or at least I think I clicked through to it.

Anyway, I do have some questions and concerns. (These may be relevant to both the minimum viable product and future versions of the software.)

  1. When will the software overwrite existing values – always, sometimes or never? (The reason I ask, for those who are unaware, is that it's actually quite rare for it to be good practice to delete a previously valid value. Often either the new value should be marked as preferred (e.g. things that normally change over time) or the old value should be marked as deprecated (e.g. things that are newly discovered to be incorrect). In some cases none of the values should be preferred or deprecated (e.g. child (P40)), and in some cases multiple values may be preferred (e.g. occupation (P106)).)
  2. Will there be per-property or per-template options on Wikidata to specify what should happen when a value is modified, or will this be left to template creators? Allowing this to be set for each property would presumably save time and prevent stuff from breaking. (I think this might actually be possible with a Wikidata property to be used on other Wikidata properties, but that route wouldn't work for infoboxes that use the plain parser functions, as you would need a helper Lua module to bring the values to the templates.)
  3. Will the software surface the options to change preferred and deprecated ranks?
  4. If there are things that won't be editable through this interface, will the software assist the user in finding help or in modifying the item through the actual Wikidata interface?
  5. How will the software handle properties for which multiple values are used (e.g. occupation) and for which all (or some) of the values should be shown in an infobox?
  6. How will the software handle statements with qualifiers and references? If a value is overwritten, will the qualifiers and references stay? (Obligatory note that qualifiers can significantly change the meaning of a statement, especially for properties for which qualifiers are mandatory.)
  7. How will the software handle properties for which multiple statements with the same value but different qualifiers may be used (e.g. political offices held more than once, both non-consecutively and consecutively)? (I ask mainly because QuickStatements cannot add or modify such statements correctly.)
  8. On Wikidata, statements with an obvious sorting order are usually left out of order because it doesn't matter. If these are presented as sorted by the template (e.g. if Lua is used to sort the values), will the software be able to handle this properly or tell users not to break the existing values?
  9. Some Wikidata templates/modules can take the first value for a property (regardless of whether any values are preferred or deprecated) and discard the others. (I have recently used this configuration in several external link templates.) Will Wikidata Bridge be able to handle this? Will it be able to tell the user about the other values, if there are any?
  10. Will the software tell users how to add local parameters (e.g. fair use images) or allow users to do it through the interface?
  11. How will the software handle wikilinks? Assuming that the user is supposed to enter a page name and not a QID, if a user enters the title of a redirect that is linked to a Wikidata item, how will the software react? Will it ask the user to differentiate between the redirect's item and the target's item?
  12. How will the software handle units? As an example, on the English Wikipedia, both kilometres and miles may be acceptable for the same field, so it could be necessary to allow this to be modifiable (I would note that the demo/prototype actually omits the unit, which could be detrimental).
  13. Which datatypes will the software be able to support upon initial wide release? Is the goal to support every datatype, or just a subset of them?
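
Questions 1, 3 and 9 all turn on how ranks are handled. Below is a minimal Python sketch of the two non-destructive edit flows described in question 1. `Claim`, `supersede`, `correct` and `best_claims` are hypothetical names for this illustration only; they do not describe what Wikidata Bridge will actually do. `best_claims` follows the usual "best rank" rule: preferred values if any exist, otherwise normal ones, never deprecated ones.

```python
from dataclasses import dataclass

PREFERRED, NORMAL, DEPRECATED = "preferred", "normal", "deprecated"


@dataclass
class Claim:
    value: str
    rank: str = NORMAL


def best_claims(claims: list) -> list:
    """Preferred claims if any exist, otherwise normal ones."""
    preferred = [c for c in claims if c.rank == PREFERRED]
    if preferred:
        return preferred
    return [c for c in claims if c.rank == NORMAL]


def supersede(claims: list, new_value: str) -> None:
    """The old value was valid but has changed over time:
    keep it at normal rank and add the new value as preferred."""
    for claim in claims:
        if claim.rank == PREFERRED:
            claim.rank = NORMAL
    claims.append(Claim(new_value, PREFERRED))


def correct(claims: list, wrong_value: str, new_value: str) -> None:
    """The old value turned out to be incorrect:
    mark it as deprecated instead of deleting it, then add the new value."""
    for claim in claims:
        if claim.value == wrong_value:
            claim.rank = DEPRECATED
    claims.append(Claim(new_value, NORMAL))
```

In both flows the previous value stays on the item, which is the practice question 1 describes; a template that simply takes the first value regardless of rank (question 9) would still show the old value, whereas one that uses something like `best_claims` would not.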

As an editor of both Wikipedia and Wikidata, I'm cautiously supportive of this, but there are a lot of edge cases that would have to be handled before allowing random users to set this up everywhere, since you could break a lot of items by making it easy to enter bad data that looks correct into an infobox. It would be a shame if this just ends up making it more difficult for Wikidata editors to keep items in order.

Lea Lacroix (WMDE) (talkcontribs)

Thanks for your feedback! Here are the answers from the development team.

  1. The interface will guide the editor through edit flows to make sure that the right action is made on Wikidata. Overwriting existing values can happen from time to time, when the existing value is wrong and needs to be fixed.
  2. It's going to be a per-template option that can be set in Lua, because some very generic properties are used in various places
  3. Ideally, the tool will edit ranks, but in a way that is transparent for the Wikipedia editors. Our goal is to allow Wikipedia editors to edit Wikidata's data without having to fully understand the data model.
  4. Yes, that's a good point, we'll have a link or icon "edit on Wikidata" somewhere
  5. We'll try to show them all in the interface (that will probably not happen in the first version, but later)
  6. We are not sure how it will work, we're still looking into this
  7. Hopefully this will be solved by showing all values (see 5.)
  8. I'm not sure how this would be a problem - can you describe a use case where the order could induce a mistake from the user?
  9. Yes, by showing all values
  10. Not in the first versions, possibly later
  11. In the first version, the tool won't be able to edit values that are links to items - so the question will be solved later
  12. Good point, the editor should be able to see and change the unit
  13. In the first version, we won't support values that are links to other entities (items, properties, etc.) We'll definitely support string and URL datatypes. We'll try to get as much as we can for everything in between.
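
Answer 13 can be read as a simple three-way check. The Python sketch below is only an illustration of that reading: the function name and the status strings are invented, and the sets merely restate the answer (string and URL supported, links to items and properties not supported in the first version, everything else open), using the Wikibase datatype identifiers `string`, `url`, `wikibase-item` and `wikibase-property`.

```python
# Restates answer 13; not an actual configuration of the tool.
SUPPORTED = {"string", "url"}
NOT_IN_FIRST_VERSION = {"wikibase-item", "wikibase-property"}


def support_status(datatype: str) -> str:
    """Classify a datatype as described in answer 13 (hypothetical helper)."""
    if datatype in SUPPORTED:
        return "supported"
    if datatype in NOT_IN_FIRST_VERSION:
        return "not supported yet"
    return "undecided"


def can_open_editor(datatype: str) -> bool:
    """An edit dialog would only open for supported datatypes; otherwise a
    "data type not supported" screen could be shown instead."""
    return support_status(datatype) == "supported"
```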
Jc86035 (talkcontribs)

As an aside to #12, right now each property has to specify its own valid units in its constraints. It would be really nice if these (and similar types of needlessly-duplicated constraints) could be unified across properties which are supposed to use the same units, particularly because a different structure could potentially allow for customizations within Wikidata Bridge like putting important units first and hiding joke units like smoot (Q2095762) and fortnight (Q2993680) or very specific units like Stardate unit (Q50277568).

This sort of work could also potentially be used to improve the actual Wikidata/Wikibase interface – I think a static dropdown would be a welcome improvement over the current find-the-item-yourself search box.
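
A sketch of the kind of shared structure I mean, in Python for illustration only. The group names, property keys and unit lists are placeholders and do not correspond to real Wikidata constraints; the point is that several properties share one unit list, that important units can be put first, and that joke units can be hidden.

```python
# Placeholder data: one unit list per group, shared by several properties.
UNIT_GROUPS = {
    "length": ["smoot", "mile", "metre", "kilometre"],
    "duration": ["fortnight", "second", "hour"],
}
PROPERTY_TO_GROUP = {
    "length-property-a": "length",
    "length-property-b": "length",
    "duration-property": "duration",
}
HIDDEN_UNITS = {"smoot", "fortnight"}  # joke or very specific units


def unit_choices(property_key: str, important: tuple = ()) -> list:
    """Units to offer in a dropdown: important ones first, hidden ones removed."""
    group = UNIT_GROUPS[PROPERTY_TO_GROUP[property_key]]
    units = [u for u in group if u not in HIDDEN_UNITS]
    first = [u for u in important if u in units]
    rest = [u for u in units if u not in first]
    return first + rest
```

A wiki where both kilometres and miles are acceptable could then call `unit_choices("length-property-a", important=("kilometre", "mile"))` to get those two at the top of a static dropdown.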

Jc86035 (talkcontribs)

For #8: there's a small chance that the values being shown out of order could nudge a good-faith user to "sort" the values by deleting all of the ones that are out of order, and that could result in errors being introduced (particularly if some parts of the data aren't shown).

Jc86035 (talkcontribs)

For #1: I think it's considered good practice to leave in certain deprecated values (e.g. see d:Q167#P1181), so perhaps it could be left up to other Wikidata editors to remove the deprecated values.

Jc86035 (talkcontribs)

For #2: How will this be "set in Lua"? Will the dialog boxes be enabled by using e.g. HTML attributes to activate JS (which would not require Lua), or will the module have to generate the dialog boxes? Is there a particular Lua function which will be necessary for this?

Jc86035 (talkcontribs)

I also have another question (thank you for your quick response, by the way). Some templates, such as w:en:Template:Authority control, have one pencil icon for all of the data. Will the template have to be changed so that each identifier has to have its own pencil?

Lea Lacroix (WMDE) (talkcontribs)

Yes, this is probably part of the small changes that the template maintainers will have to do, in order to have a Wikidata Bridge-compliant template.

Reply to "Some questions about the software"
Andrei Stroe (talkcontribs)

In the Romanian Wikipedia we already make heavy use of Wikidata in our infoboxes. We would like to participate in the beta testing session of this feature when the time comes.

Lea Lacroix (WMDE) (talkcontribs)

Hello, and thanks a lot for your suggestion! Is there already a discussion within the community? We don't ask for a proper consensus but at least that the people who maintain infoboxes are aware and enthusiastic about it :)

Andrei Stroe (talkcontribs)

The response to your announcement is that discussion. It's brief for now, as there aren't many of us who maintain infoboxes, but I'll ask again for an explicit agreement.

Lea Lacroix (WMDE) (talkcontribs)

Awesome, thanks! I'll keep your contact and let you know as soon as I have a clearer view of the next steps and how your community can get involved :)

Do you have a link to an example of an infobox that reuses Wikidata's data?

Andrei Stroe (talkcontribs)

There are several examples. w:ro:Modul:InfoboxSettlement (see w:ro:Novi Pazar, Serbia for a page where it takes all information from Wikidata) and w:ro:Modul:InfoboxTeamSportBio (supporting w:ro:Format:Infocaseta Fotbalist and w:ro:Format:Infocaseta Handbalist - see w:ro:Ianis Hagi for a use example) are infoboxes written in Lua to obtain data from Wikidata. Others just rely on a generic infobox template and take advantage of some other useful templates dedicated to pulling data from Wikidata. One of these is w:ro:Infocaseta Galaxie; see, as an example, w:ro:NGC 1097.

Reply to "Beta-testing on ro.wp"