Extension:Cognate/es

Cognate creates a central store that holds the page titles of a set of sites. The extension can then generate interwiki links across wiki projects in cases where the titles are the same. It was developed to resolve the task "Centralize interlanguage links for Wiktionary".

"Cognate" is a linguistic term for words in different languages that share the same origin. Because this extension relies on a similar mechanism of translation, the name seemed short and apt, even though the concept itself is different.

Assumptions and Restrictions

 * Pages must be in one of the standard MediaWiki namespaces.
 * Page titles are the same across languages (with some simple normalization applied).
 * Sites should have the same interwiki structure for language links.
 * Pages should not contain interlanguage links in their wikitext, as these will override the link provided by Cognate.
 * Hash collisions are unlikely but could occur, and would result in incorrect language links.

Title Normalization
The extension applies only very simple title normalization, implemented in the StringNormalizer class.

The amount of normalization is initially very small. Requests to expand it can be made, and additional rules will be added on a case-by-case basis.
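As a rough illustration, normalization of this kind can be pictured as a small table of character replacements applied to a title before it is hashed. The sketch below is written in Python rather than the extension's PHP. The replacement table and the name `normalize_title` are assumptions chosen for illustration; the authoritative rules are those in the StringNormalizer class.

```python
# Hypothetical replacement table: these pairs are illustrative assumptions,
# not a copy of the rules in Cognate's StringNormalizer class.
REPLACEMENTS = {
    "\u2019": "'",    # typographic apostrophe -> ASCII apostrophe
    "\u2026": "...",  # single-character ellipsis -> three full stops
    "_": " ",         # underscore -> space
}


def normalize_title(title: str) -> str:
    """Apply each replacement in turn and return the normalized title."""
    for source, target in REPLACEMENTS.items():
        title = title.replace(source, target)
    return title


print(normalize_title("Foo\u2026"))  # Foo...
```

Under these assumed rules, a title typed with a single ellipsis character and one typed with three full stops end up as the same string, and therefore as the same hash.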

Title Hashing
Titles are hashed using SHA-256. This can be seen in the StringHasher class.

Part of the hash is then stored in the database in a BIGINT field for efficient lookups.

A 64-bit field allows 2^64 = 18,446,744,073,709,551,616 possible values.
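The idea of keeping only part of a SHA-256 digest as an integer key can be sketched as follows. This is a Python approximation, not the StringHasher code: the function name `hash_title`, the choice of the first 8 bytes, and the signed big-endian interpretation are all assumptions made for the example.

```python
import hashlib


def hash_title(normalized_title: str) -> int:
    """Return a signed 64-bit integer derived from the SHA-256 of the title."""
    digest = hashlib.sha256(normalized_title.encode("utf-8")).digest()
    # Assumption: keep the first 8 bytes (64 bits) of the 32-byte digest and
    # read them as a signed big-endian integer so the value fits a BIGINT.
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


key = hash_title("Foo...")
print(key)
```

Truncating the digest trades collision resistance for a compact, indexable column, which is why collisions are considered unlikely rather than impossible.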

Matching Hashes
Because titles that require links are assumed to be identical after normalization, they produce the same hash and thus the same integer stored in the database.

Some sample data might look as follows when loading the "Foo..." page on enwiktionary.
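The lookup can be sketched with a small in-memory stand-in for the database. The rows below are made up for illustration and are not real Cognate data, and the names `title_key`, `pages`, `index` and `linked_sites` are hypothetical. The key derivation (first 8 bytes of the SHA-256 digest, read as a signed integer) is likewise an assumption.

```python
import hashlib


def title_key(title: str) -> int:
    """Assumed key: the first 8 bytes of the SHA-256 digest as a signed integer."""
    digest = hashlib.sha256(title.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


# Hypothetical rows of (site, normalized title); not real Cognate data.
pages = [
    ("enwiktionary", "Foo..."),
    ("dewiktionary", "Foo..."),
    ("frwiktionary", "Foo..."),
    ("dewiktionary", "Bar"),
]

# Index the rows by key, as a database index on the integer column would.
index = {}
for site, title in pages:
    index.setdefault(title_key(title), []).append(site)


def linked_sites(site: str, title: str) -> list:
    """Return the other sites that hold a page with the same key."""
    return [other for other in index.get(title_key(title), []) if other != site]


print(linked_sites("enwiktionary", "Foo..."))  # ['dewiktionary', 'frwiktionary']
```

Loading "Foo..." on enwiktionary therefore yields language links to every other site that stores the same key, without comparing title strings at query time.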

Overriding
It is possible to override the automatic links provided by Cognate simply by adding one or more interwiki links to the page.

This also means that, for Cognate to work once the extension is deployed, pages should not contain interlanguage links in their wikitext.
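The precedence described here can be pictured as a merge in which links written in the wikitext win over the automatic ones. The following Python sketch assumes the override applies per language code. The function name `effective_language_links` and the sample titles are hypothetical, and this is not the extension's actual code path.

```python
def effective_language_links(cognate_links: dict, wikitext_links: dict) -> dict:
    """Merge two {language code: title} maps; wikitext entries win."""
    merged = dict(cognate_links)
    merged.update(wikitext_links)
    return merged


# Hypothetical links for one page; the titles are made up.
automatic = {"de": "Foo...", "fr": "Foo..."}
manual = {"fr": "Foo (mot)"}
print(effective_language_links(automatic, manual))
# {'de': 'Foo...', 'fr': 'Foo (mot)'}
```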

Testing

The extension can be tested on the beta Wiktionary sites:

 * https://en.wiktionary.beta.wmflabs.org/wiki/Wiktionary:Main_Page
 * https://de.wiktionary.beta.wmflabs.org/wiki/Hauptseite
 * https://he.wiktionary.beta.wmflabs.org/wiki/%D7%A2%D7%9E%D7%95%D7%93_%D7%A8%D7%90%D7%A9%D7%99

These sites are linked together using the Cognate extension with added interwiki sorting provided by the InterwikiSorting extension.

Installation
 * Populate the sites table by running the populateCognateSites.php maintenance script. Sites must already exist in the MediaWiki sites table with the correct groupings.
   php ./maintenance/populateCognateSites.php --site-group=wiktionary
 * Populate the page and title tables by running the populateCognatePages.php maintenance script.
   php ./maintenance/populateCognatePages.php