Extension:Svetovid

MediaWiki extensions manual
Svetovid
Release status: stable
Implementation: Special page
Description: An extension that provides grammar-aware link creation assistance
Author(s): Ostrzyciel
Latest version: 1.2.7
Compatibility policy: master
MediaWiki: 1.34+
PHP: 7.2+
Database changes: No
License: MIT License
Example: Nonsensopedia, requires at least rollback rights (go ahead and ask Ostrzyciel for a demo!)

Svetovid (Polish: Świętowit, a Slavic deity of war, fertility and abundance) is an extension that provides grammar-aware link creation assistance.

This extension was developed with the Polish language in mind, but it could potentially be used with other inflected languages. Slavic languages should be relatively easy to support, as they have a similar grammatical structure; other languages may require more changes. If you are interested in using Svetovid with a different language, please contact this extension's creator.

Installation

Now, this is a crazy extension and installing it is anything but easy. Venture further at your own risk to your mental health.

  • Install the CirrusSearch extension along with its required dependencies. Have fun!
  • Install the AdvancedBacklinks extension. It's crazy enough by itself and requires a patch to MediaWiki core. Yay.
  • Install Morfeusz 2, the Polish inflectional analyser and generator. Specifically, you need the dynamic library (libmorfeusz) and a dictionary (go with PoliMorf, it's more complete).
  • Install the Pistache library. Again, you need libpistache installed on your system somehow. You'll probably have to compile it yourself.
  • Install Morfoapi, a custom API daemon that exposes Morfeusz to MediaWiki over HTTP locally. You will have to compile it from source, but it shouldn't be too hard; you just need CMake for that.
  • Run Morfoapi, preferably as a daemon. Refer to your system's documentation for information on how to do that.
  • Download and place the file(s) in a directory called Svetovid in your extensions/ folder.
  • Add the following code at the bottom of your LocalSettings.php:
    wfLoadExtension( 'Svetovid' );
    
  • Configure as needed.
  • Done. Navigate to Special:Version on your wiki to verify that the extension is successfully installed.
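
The MediaWiki side of the steps above might end up looking roughly like the following in LocalSettings.php. This is only a sketch: it assumes AdvancedBacklinks is loaded with wfLoadExtension like most current extensions, and it leaves out the CirrusSearch server settings and the AdvancedBacklinks core patch, which are covered by those extensions' own documentation.

```php
// LocalSettings.php (sketch, adjust to your own setup)

// CirrusSearch depends on the Elastica extension and a running search backend.
wfLoadExtension( 'Elastica' );
wfLoadExtension( 'CirrusSearch' );
$wgSearchType = 'CirrusSearch';

// Assumption: AdvancedBacklinks uses standard extension registration.
wfLoadExtension( 'AdvancedBacklinks' );

// Load Svetovid last, once its dependencies are in place.
wfLoadExtension( 'Svetovid' );
```

Morfeusz 2, Pistache and Morfoapi are installed outside MediaWiki, so nothing in this fragment covers them.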

Configuration

$wgSvetovidMorfeuszURL
The URL to the Morfoapi morphological service. Default is http://localhost:8145/declension (the default port Morfoapi will listen on).
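
If Morfoapi is set up to listen somewhere other than the default, the URL could be overridden along these lines. The port 9000 is a made-up example, and keeping the /declension path from the default is an assumption.

```php
// Hypothetical: Morfoapi listening on port 9000 instead of the default 8145.
$wgSvetovidMorfeuszURL = 'http://localhost:9000/declension';
```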
$wgSvetovidSearchCacheExpiry
How long search results are cached, in seconds (i.e. how long a user can use retrieved results to edit pages). Default is 3600 (one hour).
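
As a small illustration, shortening the cache lifetime to 30 minutes could look like this. The value 1800 is only an example, not a recommendation.

```php
// Keep search results usable for 30 minutes instead of the default hour.
$wgSvetovidSearchCacheExpiry = 1800;
```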
$wgSvetovidSearchBlacklist
Titles that should not be returned in search results. For example, you can add your Main Page here. Default is an empty array.
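
A possible blacklist, following the Main Page example. The exact title format Svetovid expects (for instance, how namespace prefixes are written) is not described here, so treat the string below as an assumption.

```php
// Assumption: titles are given as plain page title strings.
$wgSvetovidSearchBlacklist = [ 'Main Page' ];
```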
$wgSvetovidDefaultNamespaces
Default namespaces that should be searched, depending on the namespace of the page the user wants to link to. For example, if you want to search only the main namespace (ID 0) for main namespace queries and the help (ID 12) and main namespace for help namespace queries, you would use something like this:
$wgSvetovidDefaultNamespaces = [
    0 => [ 0 ],
    12 => [ 0, 12 ]
];
By default this configuration variable is empty. If you don't specify default namespaces for some namespace, Svetovid will use the list from $wgContentNamespaces.
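
To illustrate the fallback, here is a sketch that sets defaults for the help namespace only. It uses MediaWiki's namespace constants (NS_MAIN is 0, NS_HELP is 12) and assumes a wiki that treats both as content namespaces.

```php
// Assumed wiki setup: main and help pages both count as content.
$wgContentNamespaces = [ NS_MAIN, NS_HELP ];

// Only help-namespace queries get an explicit search list.
$wgSvetovidDefaultNamespaces = [
    NS_HELP => [ NS_MAIN, NS_HELP ],
];

// There is no entry for NS_MAIN, so a main namespace query should fall
// back to the namespaces listed in $wgContentNamespaces.
```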
$wgSvetovidMaxSearchResults
The maximum number of search results returned by the search module. Setting this to a high value can hurt performance, as all pages are processed ahead of time, during the search operation (this is necessary to remove false positives). Default is 15.
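
On a wiki where searches feel slow, lowering the limit might help. The value 10 below is just an example.

```php
// Process fewer pages per search to reduce load (default is 15).
$wgSvetovidMaxSearchResults = 10;
```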

Usage

See below for screenshots of every step described here.

Let's say you decide the article about Abraham Lincoln does not have enough backlinks. To fix that, visit the special page Special:LinkCreator, or just click the Link creator link in your toolbox if you enabled this option in your preferences. After you select the page you want to work with (i.e. link to), the creator will try to interpret the text you entered in terms of grammar. This is necessary, as Polish grammar is not always unambiguous. In this case, Lincoln can be a surname or a car brand. The two decline differently (they have different grammatical genders), so we have to choose the right interpretation. The interpretation could also be extracted automatically from the context with a syntax analyzer. This is not implemented yet, but may be added in the future.

After you select interpretations for all words in the query text (which in this case is the same as the page title, but that can be changed), the creator automatically declines the text in all cases and both numbers. You can then correct the generated forms (there is usually no need to, as they are taken from a high-quality dictionary) and specify other search options, such as omitting pages that already link to the page you are working with. Only direct links count, though, so links from navboxes do not affect that. What's a direct link? You can read about that on the AdvancedBacklinks extension page.

The only thing left to do is hit the search button. You will then be presented with a list of pages that can link to the page you are working with, together with their incoming and outgoing direct link counts. When you click the edit button, a new browser tab will open with the standard wikitext editor. The links are added automatically, but you can always review the changes before saving. That's it!

How it works

For grammatical analysis and generation, Svetovid uses the Morfeusz 2 library provided by the Polish Academy of Sciences. This tool uses a compiled dictionary in the form of finite-state automata to perform very efficient morphological analysis and generation. The library is operated by a separate program, Morfoapi, which exposes some Morfeusz functionality over a REST API and does some additional processing on the queries. Morfoapi is very fast thanks to being written in C++, and communication between it and MediaWiki is limited to a single request per search to keep overhead as small as possible.

For searching, CirrusSearch is used, as it provides scalable and feature-rich search on a wiki. Using a simpler, built-in search engine may also be possible, but this has not been investigated yet.

AdvancedBacklinks provides additional information on direct backlinks, so one can avoid having duplicate links.

Why was this created and why is this on mediawiki.org?

This, along with the AdvancedBacklinks extension, was created specifically for Nonsensopedia. After examining statistics about direct links on the wiki, it turned out that a lot of content was hard to reach, as there were no or very few links to these pages. We also do not consider navboxes to be a serious remedy for that: they are huge, ugly and just don't make any kind of sense on mobile skins. The result of that discussion is these two extensions, which aim to provide ultra-powerful tools for solving these problems. Based on our initial evaluation, this tool does deliver that. We think other wikis may have problems similar to ours, even Wikipedia. Ask editors, do some statistics.

As for why this extension was put up here: partly because we want to invite others to join the effort of creating MediaWiki powertools, and partly for bragging. The project is far from perfect, but we will surely keep improving it. Do get in touch if you are interested :)