Talk:Wikidata/Archive 1

XML
For many database projects, you will also want to define an XML format for the data. Such a definition could then allow both data import and data export, which would really open up the data content.

Many people currently outside Wiktionary would really welcome a better-structured dataset. There are many resources on the web that we could integrate with if we had a mechanism. A database (structuring the data) and mechanisms like XML are the ticket. :) GerardM 12:41, 17 Sep 2004 (UTC)
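As a rough illustration of the import/export idea above, the following Python sketch round-trips one dictionary-style record through XML. The record layout and the element names (`entry`, `headword`, `pos`, `definition`) are assumptions made up for this example; they are not a Wikidata or Wiktionary format.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the field names are illustrative only.
entry = {
    "headword": "water",
    "language": "en",
    "pos": "noun",
    "definitions": ["a clear liquid", "a body of water"],
}


def entry_to_xml(record: dict) -> str:
    """Export one record as an XML string."""
    root = ET.Element("entry", attrib={"lang": record["language"]})
    ET.SubElement(root, "headword").text = record["headword"]
    ET.SubElement(root, "pos").text = record["pos"]
    for definition in record["definitions"]:
        ET.SubElement(root, "definition").text = definition
    return ET.tostring(root, encoding="unicode")


def xml_to_entry(xml_text: str) -> dict:
    """Import a record from the XML produced by entry_to_xml."""
    root = ET.fromstring(xml_text)
    return {
        "headword": root.findtext("headword"),
        "language": root.get("lang"),
        "pos": root.findtext("pos"),
        "definitions": [d.text for d in root.findall("definition")],
    }


exported = entry_to_xml(entry)
imported = xml_to_entry(exported)
```

Because the same definition drives both directions, an exported record can be re-imported without loss, which is the property the comment above is asking for.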

Wow - this is a really great idea! --Daniel Mayer 17:36, 17 Sep 2004 (UTC)

I concur with GerardM. Wiktionary is a mess in large part because the necessarily and rightly loose structure of Wikipedia is incompatible with the way a dictionary works. With something like Wikidata, we could begin to do real lexicography. I have a half-baked structure to use, one flexible enough for minimalist entries and rich enough to do things no print dictionary does. Please go forward with this. I can't write PHP and I don't know databases very well, so I can't contribute much on the technical side, but I can contribute applications if the code is in place.

Diderot 13:52, 23 Sep 2004 (UTC)


 * Do you have a Wiktionary structure written down somewhere already? That would help in mapping out the requirements for Wikidata.--Eloquence


 * Sort of. I'm part of a team writing a commercial terminology application for a translation firm. What we've done is to adapt a structure that supports a much richer set of lexicographic needs while maintaining a lot more flexibility. Alas, that particular schema is not GFDL. However, I have an alternative, but similar, approach which has few IP encumbrances and some very different priorities. Ever since we started on this project, I have kept having the feeling that this could be implemented on Wiktionary. I'll see what I can do for you in the next couple of days. Failing that, I'm leaving on vacation on Sunday and will write something on the plane and post it from Canada.


 * In the meantime, and as an example of a quite extensive feature set for lexicography, take a look at TBX. This was our starting point. TBX is freely usable; it has no IP issues.


 * Diderot 18:55, 23 Sep 2004 (UTC)

Lisa
For use with Ultimate Wiktionary, the specifications from Lisa are relevant to dictionaries. As this is the standard used in the localisation industry it makes absolute sense to invest in it. TBX is an open XML-based standard format for terminological data. This standard provides a number of benefits so long as TBX files can be imported into and exported from most software packages that include a terminological database.

Maybe it is possible to support several standards, but I think it makes sense to use the standard that applies to a particular dataset. GerardM 21:59, 5 Apr 2005 (UTC)

Meta-object Facility (MOF)
The Object Management Group (OMG) has defined a specification called the Meta-Object Facility, a layered architecture for defining object metadata, which I think will be useful to follow when implementing Wikidata. Basically, a Wikidata user must first define the structure of the data he will be creating in his wiki; this is the model, or the metadata. MOF already has a specification for a meta-model (meta-metadata) language, which Wikidata can implement instead of having to create its own.

OMG also has an XML specification called XML Metadata Interchange (XMI) for exchanging metadata. I'm not sure if it has any use in plain data exchange; however, eventually supporting XMI will allow users of Wikidata to create their models using graphical UML tools.

BTW, it is indicated below that development has already begun on Wikidata. Is there a mailing list, IRC channel, etc. to coordinate help from volunteers? Jleybov 19:16, 5 Apr 2005 (UTC)
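The layering described in this comment (define the model first, then create data that conforms to it) can be sketched in a few lines of Python. This is only an illustration of the idea and not an implementation of MOF or XMI; the class names `FieldDef` and `Model` and the sample `Book` model are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDef:
    """One field of a user-defined model (the metadata layer)."""
    name: str
    type_: type
    required: bool = True


@dataclass(frozen=True)
class Model:
    """A user-defined structure that data records must conform to."""
    name: str
    fields: tuple

    def validate(self, record: dict) -> list:
        """Return a list of problems; an empty list means the record fits."""
        errors = []
        known = {f.name for f in self.fields}
        for f in self.fields:
            if f.name not in record:
                if f.required:
                    errors.append(f"missing field: {f.name}")
            elif not isinstance(record[f.name], f.type_):
                errors.append(f"wrong type for {f.name}")
        for key in record:
            if key not in known:
                errors.append(f"unknown field: {key}")
        return errors


# The model is defined once; records are then checked against it.
book = Model(
    "Book",
    (
        FieldDef("title", str),
        FieldDef("year", int),
        FieldDef("isbn", str, required=False),
    ),
)

good = book.validate({"title": "Encyclopedie", "year": 1751})
bad = book.validate({"title": "Encyclopedie", "year": "1751", "pages": 3})
```

Here `Model` and `FieldDef` play the role of the meta-model, `book` is a model, and the dictionaries passed to `validate` are the data.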

Cooperation vs. Integration
Seems very interesting, useful and complicated. I doubt a You-Can-Manage-Every-Data-With-MediaWiki-Software is the right choice for all. There are also free databases we can collaborate with. For instance, I'd like something like Wikibibliography or Wikicatalogue where I can correct bibliographical data of libraries, but I do not want to copy all the data of millions of books into MediaWiki. See Linking to databases for a simple strategy that could lead to more cooperation with already existing databases (more detailed in German here). Better to cooperate with an already existing database than to try to reinvent the wheel once more. -- Nichtich 09:49, 19 Sep 2004 (UTC)


 * The thing with Wikidata is that it allows for creating a database that integrates within the MediaWiki software. It will have a UI that will not require changing the skins every time.


 * The point is that this is technical functionality, not a question of what content it will be used for. Your point is correct though: Wikimedia will not have a database for everything, and I expect that it will not be for everyone to create a new MediaWiki data project. I expect that for each new project we will have prior discussions. GerardM 10:42, 19 Sep 2004 (UTC)


 * For integration I prefer not creating a database (there already are databases) but a simple protocol for integrating databases into MediaWiki. A database can be a wiki itself, but in many cases there are experts creating databases that cannot be created by just anyone. They have to make their data free so we can integrate the data, not the entire database. -- Nichtich 13:32, 20 Sep 2004 (UTC)


 * What are you trying to say? I do not understand. There are "experts" in our crowd. One of the problems with many databases is that they are fragmented, hard to reach, or in a proprietary format. With a Wikidata, we will be able to host databases. I do expect that we will not host everything, but will first define a need. When we cannot add value by hosting the data, I do not think we should. There is also a difference between defining the database and filling the database with content. The definitions will be done by the "experts", but filling in the content takes another kind of expertise. Not all the content/databases will be interesting to everyone.


 * I really do not see what your point is. GerardM 17:27, 20 Sep 2004 (UTC)

Semantic Web
Sounds like connecting Wikimedia to the Semantic Web (I hope so). The idea of the Semantic Web only works if everybody gives his information for free, anyway. -- Nichtich 09:49, 19 Sep 2004 (UTC)


 * I sure hope so. It would be very interesting for the Semantic Web community to leverage the community of Wikipedia to create ... wow, I'm at a loss for words on what we would create... --denny 11:53, 1 Dec 2004 (UTC)

Software changes
Brion wrote: ''Please note that the previous statement [that software changes are required] is completely false. This would work similarly to templates and plugins such as TeX math bits already; it's supplementary to the main text editing work.''


 * That depends on what exactly you are trying to accomplish. If you are talking about something like Magnus' Special:Data, with its own namespace, you are correct. However, I prefer a view where everything, including regular article pages, is wikidata and can be easily complemented with new fields, relations, etc. I also believe that wikidata fields must be easily indexable for performance reasons.--Eloquence


 * Everything is Wikidata... is that actually accomplishable with MediaWiki or would we need a new implementation? --denny 11:53, 1 Dec 2004 (UTC)


 * Mediawiki proper and Wikidata projects are apart. Yes, fields can be added to the Mediawiki data but this is already the case. Wikidata is to enable data that we want to host that cannot be served properly by Mediawiki alone. GerardM 18:14, 5 Apr 2005 (UTC)

Wikidatabase
Hi, it would be great if this Wikidata project had open access ports. All the information within Wikidata could be accessed by any user via open and defined MySQL queries. Thus, you could easily build Wikidata access into any software or website:

For example, imagine there were a database within Wikidata that contains yearly statistics about precipitation (that is, rain) for several regions of the world. This database is openly accessible. Thus, you can access this database either through the Wikidata website or with self-programmed software that queries the MySQL database directly. In this way, the data within Wikidata would not be "locked away" behind the web interface.

Please tell me what you think of my idea. Thanks, --Abdull 09:03, 24 Mar 2005 (UTC)


 * If this is going to happen, it will be VERY costly in hardware resources. At this moment I do not envision this in the foreseeable future. First we have to make it work. GerardM 07:19, 27 Mar 2005 (UTC)


 * This would be a nice feature but it requires careful analysis of its security implications. For example, allowing arbitrary SQL queries to be run against a dataset creates an easy way to launch denial-of-service attacks, say, by doing Cartesian products on the largest tables with no filter criteria. Until a way to manage this is better understood, it is probably best to just make uploading data dumps more convenient, so that people interested in data mining can do it on their own hardware. Jleybov 20:07, 27 Mar 2005 (UTC)
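To make the concern in this thread concrete, here is a minimal Python sketch of one possible mitigation: giving each query a fixed work budget and aborting it when the budget is exceeded. It uses SQLite from the Python standard library instead of MySQL so that it is self-contained; the `precipitation` table, the `run_limited` helper, the `QueryTooExpensive` exception and the budget values are all assumptions made up for this example. It only limits work per query and does nothing about write access or many concurrent clients.

```python
import sqlite3


class QueryTooExpensive(Exception):
    """Raised when a query exceeds its work budget."""


def run_limited(conn, sql, params=(), max_ticks=200, tick_every=1000):
    """Run a query, aborting it after roughly max_ticks * tick_every VM steps."""
    ticks = 0

    def on_tick():
        nonlocal ticks
        ticks += 1
        # A non-zero return value tells SQLite to interrupt the query.
        return 1 if ticks > max_ticks else 0

    conn.set_progress_handler(on_tick, tick_every)
    try:
        return conn.execute(sql, params).fetchall()
    except sqlite3.DatabaseError:
        if ticks > max_ticks:
            raise QueryTooExpensive(sql) from None
        raise  # some other database error, e.g. a syntax error
    finally:
        conn.set_progress_handler(None, tick_every)  # remove the handler


# Hypothetical sample data: one precipitation value per region.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE precipitation (region TEXT, year INTEGER, mm REAL)")
conn.executemany(
    "INSERT INTO precipitation VALUES (?, ?, ?)",
    [(f"region{i}", 2004, float(i)) for i in range(300)],
)

# A filtered lookup stays well inside the budget.
rows = run_limited(
    conn,
    "SELECT region, mm FROM precipitation WHERE region = ?",
    ("region7",),
)

# An unfiltered three-way Cartesian product is cut off.
try:
    run_limited(
        conn,
        "SELECT count(*) FROM precipitation a, precipitation b, precipitation c",
    )
    blocked = False
except QueryTooExpensive:
    blocked = True
```

A budget like this bounds the cost of a single query, which is why publishing dumps, as suggested above, remains the simpler option for heavy data mining.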

Timeline
Just a thought on a way to expand this proposal (which I like very much, BTW):
 * Create pages with content one would find on a timeline (e.g. George Washington marries Martha Washington) and tag it with the date it occurred as well as subjects it concerns
 * Create a software feature that could automatically create timelines from these pages, so that one could easily create a timeline of David Bowie's career or French history in the 1640s or whatever.

TUF-KAT 06:24, 27 Mar 2005 (UTC)
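The two steps proposed above (tag dated events with subjects, then generate timelines from them) could look roughly like the following Python sketch. The `Event` class and the `timeline` function are hypothetical names, and the three events are only sample data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """A timeline page: a dated statement tagged with its subjects."""
    when: date
    text: str
    subjects: frozenset


def timeline(events, subject, start=None, end=None):
    """Return the events about one subject, optionally in a date range, oldest first."""
    selected = [
        e
        for e in events
        if subject in e.subjects
        and (start is None or e.when >= start)
        and (end is None or e.when <= end)
    ]
    return sorted(selected, key=lambda e: e.when)


# Sample data, deliberately stored out of order.
events = [
    Event(
        date(1789, 4, 30),
        "George Washington is inaugurated as president",
        frozenset({"George Washington"}),
    ),
    Event(date(1947, 1, 8), "David Bowie is born", frozenset({"David Bowie"})),
    Event(
        date(1759, 1, 6),
        "George Washington marries Martha Washington",
        frozenset({"George Washington", "Martha Washington"}),
    ),
]

washington = [e.text for e in timeline(events, "George Washington")]
```

Because an event can carry several subjects, the same page shows up in the timeline of each person or topic it concerns.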


 * With all respect, I do not see what this has to do with embedding DATABASE functionality in Wikidata. Given also that each DATABASE project needs its own approval to start off with, I think this is not relevant to THIS subject. GerardM 07:17, 27 Mar 2005 (UTC)

Wiktionary and the status of Wikidata
Hello. The Spanish Wiktionary hasn't really taken off because there has been a year-long discussion about the format of entries. Even though so far there's no agreement on what new format to choose, most people there agree that the current structure used in other Wiktionaries has failed and doesn't meet the requirements for building good dictionary entries, so many (myself among them) don't see the point of adding thousands of entries in a format that sooner or later will have to be changed because it simply doesn't work well for a dictionary. I myself have tried to develop a new structure to better organize the information within the dictionary entries, but I clearly see that the root of the problem lies in the limitations of software designed for encyclopedia entries (consisting of rather lengthy free-form texts divided into thematic sections and subsections) rather than for dictionary entries (consisting of rather short strings of data that fit well into pre-defined fields with a complex set of interrelationships, i.e. a typical database structure). I saw this issue had already been raised in the early days of the English Wiktionary, but since nothing has been implemented there so far, I thought no one was really interested in developing the necessary software. Today I stumbled upon this Wikidata project and it's exactly the kind of thing that I believe Wiktionaries badly need. So I'd like to know whether it is merely another proposal that might never come true (so that we'll have no choice but to go on building our Wiktionary upon the poorly fitting free-form text format), or whether the new software is already on its way to becoming reality (so that we might start thinking of a new dictionary-friendly structure for Wiktionary based on typical database capabilities like fields and relationships between fields). Thanks. Uaxuctum 05:51, 4 Apr 2005 (UTC)


 * Uaxuctum, I'm a semi-professional lexicographer, and I haven't contributed much to Wiktionary precisely because I think its free-form format isn't viable, and I haven't really had the time to wade in over the last year and argue for a better one. Much of the problem is with MediaWiki itself and its free-form entry structure, and yes, Wikidata might fill the gap. But I don't know what the status of this project is.


 * But you've identified the right problem: Entries have a complex link structure which is not supported by Mediawiki, and require a quite fixed structure for much of the data in each entry.


 * Diderot 13:04, 5 Apr 2005 (UTC)
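The contrast drawn in this thread between free-form text and entries made of fixed fields with links between them can be illustrated with a small Python sketch. The `Entry` and `Sense` classes, their fields, and the `translate` helper are assumptions made up for this example; they are not a proposed Wiktionary or Wikidata schema.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One meaning of a word, with its links to other entries."""
    definition: str
    # Relationship: target language code -> headwords in that language.
    translations: dict = field(default_factory=dict)


@dataclass
class Entry:
    """A dictionary entry made of short, pre-defined fields."""
    headword: str
    language: str
    part_of_speech: str
    senses: list = field(default_factory=list)


def add_entry(index, entry):
    """Store an entry under its (language, headword) key."""
    index[(entry.language, entry.headword)] = entry


def translate(index, language, headword, target):
    """Follow translation links and return the linked entries that exist."""
    entry = index.get((language, headword))
    if entry is None:
        return []
    found = []
    for sense in entry.senses:
        for word in sense.translations.get(target, []):
            linked = index.get((target, word))
            if linked is not None and linked not in found:
                found.append(linked)
    return found


index = {}
add_entry(
    index,
    Entry("water", "en", "noun", [Sense("a clear liquid", {"es": ["agua"]})]),
)
add_entry(
    index,
    Entry("agua", "es", "noun", [Sense("liquido transparente", {"en": ["water"]})]),
)

spanish = [e.headword for e in translate(index, "en", "water", "es")]
```

With the links stored as data rather than as prose, a lookup such as `translate` can follow them mechanically, which free-form wiki text does not allow.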


 * The Wikidata project is a required building block that will enable the Ultimate Wiktionary. This will enable an any-to-any dictionary and it will create a fixed structure for Wiktionary content.


 * There is a budget for the programming of both Wikidata and Ultimate Wiktionary and the programming has started. GerardM 18:10, 5 Apr 2005 (UTC)


 * There are several competing ideas on what the UW database might look like. I have held back my ideas on META (ERD) as I really want to see what kind of thing people would like to see. GerardM 22:02, 5 Apr 2005 (UTC)