Manual:Pywikibot/Use on third-party wikis

The pywikipedia bot can be used for all kinds of tasks that are important for the maintenance of a MediaWiki project. When this software is used outside of the Wikimedia projects, some configuration is needed.

Some non-Wikimedia projects, or families, are already supported. Their family files can be found in the families folder of the download.

Using the existing files as examples, it should be easy to adapt the bot to your own project.

user-config.py file
Open a new file in a text editor (Notepad, for example).

Save the file as user-config.py in the main pywikipedia folder.

Add the three required settings (mylang, family, and usernames) to user-config.py, then save the file again.

Your user-config.py should look something like this:

    mylang = 'en'
    family = 'dead'
    usernames['dead']['en'] = u'bobtheperv'
    console_encoding = 'utf-8'

family.py file
Modify one of the existing files below, or create a new file, in a text editor such as Notepad.

Save the file in the pywikipedia/families folder, with a name such as sitename_family.py.

Follow the sections below in order (each section makes the following section easier).

You must have the correct details as described in the sections below.

Potential problems:
 * errors in the custom namespaces mean the family file won't work properly (though you may get a warning when you log in);
 * an error in specifying index.php will prevent you from logging in;
 * an error in specifying api.php is likely to lead to problems, since the wiki is accessed via the API.

The  file in the families folder offers considerable documentation of the options available to family files. In fact,  is a simple 'prototype' family file. You can just copy  to sitename_family.py, then open up the sitename_family.py file in a text editor, and tweak it per the instructions below.

index.php

 * Note, several family files do not have this in the file, for example, battlestarwiki_family.py

You must have a correct index.php path in family.py. To check if you have the correct path for your index.php, test which address redirects to your wiki's homepage - probably one of these:
 * wikidomainname.org/wiki/index.php
 * wikidomainname.org/w/index.php
 * wikidomainname.org/index.php

(You may be able to guess from the address of a regular wiki page, but you need to check it - e.g. a wiki page may be wikidomainname.org/randompage but the correct path may be 'w/index.php' - this occurs at fr.ekopedia.org for example.)

Near the end of your family.py file, you will find something similar to this. Modify or remove the "/w" as necessary:

    def path(self, code):
        # The path of index.php; look at your wiki address.
        # This line may need to be changed to /wiki/index.php or /index.php,
        # depending on the folder where your MediaWiki installation is located.
        return '/w/index.php'

api.php

 * Note, several family files in the pywikipedia family folder do not have this in the file, for example, battlestarwiki_family.py

You must have a correct api.php path in family.py.

The api.php file is found in the same place as index.php. Test this by navigating to, for example:
 * your.domain.org/wiki/api.php

or
 * your.domain.org/w/api.php

or
 * your.domain.org/api.php

The correct address will give you an auto-generated MediaWiki API documentation page (the important thing is that it confirms you have the correct address).
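The manual check above can also be scripted. The following is only a minimal sketch: the helper names `candidate_api_urls` and `find_api_url` are hypothetical, the `http://` scheme is an assumption, and the probe relies on the standard siteinfo query returning a `sitename` field.

```python
from urllib.request import urlopen
from urllib.error import URLError


def candidate_api_urls(host):
    """Return the three usual api.php locations for a host (assumes http)."""
    paths = ('/wiki/api.php', '/w/api.php', '/api.php')
    return ['http://%s%s' % (host, path) for path in paths]


def find_api_url(host, timeout=10):
    """Return the first candidate that answers like a MediaWiki API, or None."""
    query = '?action=query&meta=siteinfo&format=json'
    for url in candidate_api_urls(host):
        try:
            with urlopen(url + query, timeout=timeout) as response:
                body = response.read().decode('utf-8', 'replace')
        except (URLError, OSError):
            # Wrong path, HTTP error, or host unreachable: try the next one.
            continue
        if '"sitename"' in body:
            return url
    return None


if __name__ == '__main__':
    # 'your.domain.org' is the placeholder host used in this manual.
    for url in candidate_api_urls('your.domain.org'):
        print(url)
```

Calling `find_api_url('your.domain.org')` performs real network requests, so it is left out of the demonstration at the bottom.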

For the '/w/api.php' case, the API path is defined like this:

    def apipath(self, code):
        return '/w/api.php'  # The path of api.php
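To show how the two path methods fit together, here is a self-contained sketch. It is not a real family file: `FamilyStub` is a hypothetical stand-in for the base class that a real family file imports from the bot, and the family name and host are placeholders.

```python
class FamilyStub(object):
    """Hypothetical stand-in for the bot's own family base class."""

    def __init__(self):
        self.name = None
        self.langs = {}


class Family(FamilyStub):
    def __init__(self):
        FamilyStub.__init__(self)
        self.name = 'sitename'                  # placeholder family name
        self.langs = {'en': 'your.domain.org'}  # language code -> host name

    def hostname(self, code):
        return self.langs[code]

    def path(self, code):
        return '/w/index.php'  # the path of index.php

    def apipath(self, code):
        return '/w/api.php'    # the path of api.php


fam = Family()
index_url = 'http://' + fam.hostname('en') + fam.path('en')
api_url = 'http://' + fam.hostname('en') + fam.apipath('en')
print(index_url)
print(api_url)
```

Both printed addresses should open in a browser for your wiki. If either one does not, the corresponding method returns the wrong path.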

Custom Namespaces

 * Note, several family files in the pywikipedia family folder do not have this in the file, for example, battlestarwiki_family.py

Adding a custom namespace to a family file is not well documented in the  file. The Uncyclopedia example below has some examples using custom namespaces, but the addition of these namespaces to your family file requires knowing the numerical ID of each namespace.

Finding the details you need: Start with the api url (which you found above) and add this string to the end: ?action=query&meta=siteinfo&siprop=general|namespaces|namespacealiases|statistics

...so your address might look something like:


 * http://your.domain.org/wiki/api.php?action=query&meta=siteinfo&siprop=general|namespaces|namespacealiases|statistics

The resulting page will list the information about your namespaces that you need for the family file, including the numerical ID.
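The namespace IDs can also be pulled out of the API response with a short script. This sketch makes several assumptions: it adds `format=json` to the query, the helper names are hypothetical, the sample response is invented for illustration, and it treats IDs of 100 and above as custom namespaces (the usual MediaWiki convention).

```python
import json
from urllib.parse import urlencode


def siteinfo_url(api_url):
    """Build the siteinfo query described above, asking for JSON output."""
    params = {
        'action': 'query',
        'meta': 'siteinfo',
        'siprop': 'general|namespaces|namespacealiases|statistics',
        'format': 'json',
    }
    return api_url + '?' + urlencode(params)


def custom_namespaces(siteinfo_json):
    """Return {id: name} for namespaces with an ID of 100 or higher."""
    data = json.loads(siteinfo_json)
    result = {}
    for ns in data['query']['namespaces'].values():
        if ns['id'] >= 100:
            result[ns['id']] = ns['*']  # '*' holds the local namespace name
    return result


# Invented sample response, trimmed to the part that matters here.
SAMPLE = ('{"query": {"namespaces": {'
          '"0": {"id": 0, "*": ""}, '
          '"100": {"id": 100, "*": "News"}, '
          '"101": {"id": 101, "*": "News talk"}}}}')

print(siteinfo_url('http://your.domain.org/wiki/api.php'))
print(custom_namespaces(SAMPLE))
```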

explains two ways to list namespaces:

1. Changing a specific language's namespace:

2. Changing an entire batch of languages at once:

When adding custom namespaces you must use the second style. The first method only works because the entry being changed already exists. Entries for custom namespaces do not exist yet, so you must use the second style, including the curly braces, even if you only have one language. For example, suppose you have a News namespace, and your wiki only operates in English:
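The difference between the two styles can be sketched with an ordinary dictionary. This assumes the namespace table is a dictionary keyed by namespace ID, whose values are dictionaries keyed by language code. In a real family file the table would be `self.namespaces`, and the IDs 100 and 101 are only examples.

```python
# Stand-in for the namespace table; entry 4 already exists,
# as it would for a standard namespace.
namespaces = {
    4: {'en': u'Project'},
}

# First style: change one language of an existing entry. This works.
namespaces[4]['en'] = u'MyWiki'

# First style on a custom namespace: entry 100 does not exist yet,
# so the lookup fails before anything can be assigned.
try:
    namespaces[100]['en'] = u'News'
    first_style_failed = False
except KeyError:
    first_style_failed = True

# Second style: assign the whole entry, curly braces included.
namespaces[100] = {'en': u'News'}
namespaces[101] = {'en': u'News talk'}

print(first_style_failed)
print(namespaces[100])
```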

If you do not use this format, the family file will not work, and the resulting error message will not help: it will ask you to create a file, even if that file already exists and works correctly.

Custom User Groups & Permissions
PyWikipedia assumes that the wiki it is running on uses the standard, default user groups and that each group is assigned the usual permissions. This means that it may refuse to do things that its account is allowed to do, and it may attempt to do things that its account is not allowed to do (these attempts will fail, so this is not a security issue).

The fact that it will attempt things it is not allowed to do is minor, but the inability to perform certain actions (e.g. deleting pages) without being in the "Sysop" group can be a sizable limitation. If the bot account has such permissions without being in the "Sysop" group (being part of some other, non-standard group), the only known workaround is to edit Wikipedia.py itself. This file controls most bot actions, and is found in the main folder PyWikipedia has been installed to.

In the _getUserData function within Wikipedia.py, there is a pair of lines that checks for the "Sysop" group and then assigns the associated rights. In version 2008-10-29T19:21:05.438703Z 6043, these are lines 4705 and 4706, but the line numbers may differ considerably in other versions. Using the Find function of your text editor to search for "_getUserData" (to find the function definition) is recommended.

In order to get the rights listed in the second line, the first line must be changed to recognize your custom user group. Note: if your custom user group has some, but not all, of those rights, changing this will not give you those rights. It will simply let PyWikipedia attempt to exercise them if you direct it to. Doing so is strongly discouraged, as the program's behavior in such a case is undefined.

The exact name of the user group can be found by logging in manually as the bot and viewing the HTML source of any wiki page. Near the top, a line listing the groups that the bot is a member of should appear. Choose the one that matches the custom user group that gives you the rights that you are attempting to exercise, and replace Sysop in the code from Wikipedia.py with that name. This should activate those rights for PyWikipedia.
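As an alternative to reading the page source, the group names can usually be read from the MediaWiki API with a userinfo query, made while logged in as the bot. The sketch below only builds the query address and parses a response. The helper names are hypothetical, and the sample response (including the "janitor" group) is invented for illustration.

```python
import json
from urllib.parse import urlencode


def userinfo_url(api_url):
    """Build a query asking for the current user's groups and rights."""
    params = {
        'action': 'query',
        'meta': 'userinfo',
        'uiprop': 'groups|rights',
        'format': 'json',
    }
    return api_url + '?' + urlencode(params)


def groups_and_rights(userinfo_json):
    """Return (groups, rights) from a userinfo response."""
    info = json.loads(userinfo_json)['query']['userinfo']
    return info.get('groups', []), info.get('rights', [])


# Invented sample response for a bot in a custom "janitor" group.
SAMPLE = ('{"query": {"userinfo": {"id": 42, "name": "ExampleBot", '
          '"groups": ["*", "user", "janitor"], '
          '"rights": ["read", "edit", "delete"]}}}')

groups, rights = groups_and_rights(SAMPLE)
print(userinfo_url('http://your.domain.org/w/api.php'))
print(groups)
print('delete' in rights)
```

Open the printed address in a browser session that is logged in as the bot. Without a login, the response describes an anonymous user.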

Running the PyWikipedia Bot
Refer to "Using the python wikipediabot" for instructions on how to run the bot.

Example: Mozilla wiki
The Mozilla Foundation's wiki, wiki.mozilla.org, is a very simple example because it is only available in one language.

This is the contents of. Hints for you to write your own family specification are underlined.

Example: Starwars
This is the content of the Starwars wiki at wikia. The file is located at.

This example shows how to configure the Pywikipedia bot to work on this site.

Example: Memory Alpha
memoryalpha_family.py is the "family" definition of Memory Alpha, www.memory-alpha.org, a Star Trek wiki. This specification is a little bit more difficult because it has several languages.

Example: Uncyclopedia
The various Uncyclopedias are slightly more awkward as not all are hosted at the same domain or under the same name. Domain names and paths must be specified individually. Just over half are Wikia-hosted; exceptions include fi: hu: ja: ko: no: pt: sv: and zh-tw:. Many have their own registered domain names and many use custom namespaces.

The approaches that work for an Uncyclopædia or a Memory Alpha project can typically be adapted to other Wikia wikis.

''Note: There have been subsequent updates and changes, see botwiki:python:uncyclopedia_family.py or uncyclopedia:es:usuario:Chixpy/uncyclopedia_family.py for more current versions of the Uncyclopedia interwiki bot configuration. There are also unresolved issues in which some interwiki languages are not available from all Uncyclopedia projects or point to incorrect/inconsistent destinations; proceed with caution.''

Language
For a single-language site, the language specified does not matter as long as it is consistent between user-config.py and families/foo_family.py.

Login failed. Wrong password?
Pywikipedia does not report anything more useful than success, failure, or host connection failure. If possible, try accessing the web server logs (Apache uses access_log by default) and take a look at the URL strings.

You could also try running login.py in 'very verbose' mode. This will dump a lot of information, possibly including the HTML returned by the server, so you can see exactly what is going on. Be careful with this option, because the output may reveal security-sensitive information.

Make sure your scriptpath, the relative path to your api.php and index.php files, is defined appropriately for your wiki in your family file:

If this does not help, add a line like

to your user-config.py file.

See the mozilla configuration for clues.

Mismatched interwiki configuration
In some projects (such as Uncyclopedia), each language operates as an independent wiki. This may mean that interwiki tables differ from one individual wiki to another within the same project. Interwiki.py is built on the assumption that, if outbound interlanguage links are available at all from a language, the list of available link-destination languages and the destination URL for each will match perfectly across all wikis in the project.

This leads to some potential pitfalls:
 * If one language is missing outbound language interwiki support entirely, one must avoid giving pywikipediabot an account on that wiki (in user-config.py) in order to ensure that interwiki.py leaves that one language wiki untouched.
 * If one language is using a valid but incomplete interwiki table, running interwiki.py on that language wiki will create broken links. Unlike the case where one language is missing project-wide, there is no clean and easy workaround.
 * If a language in a project has been forked (not just mirrored), the interwiki for each individual language pair will point to only one of the multiple forks. Verify the wiki your bot is looking at is the same one that is being linked from the wiki you're editing - otherwise the bot will delete some valid links as "page does not exist".
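Before running interwiki.py across such a project, it can help to compare the interwiki tables of two wikis. The following sketch assumes each table was fetched with the siteinfo query using `siprop=interwikimap`, which returns a list of entries with `prefix` and `url` fields. The helper names and the sample entries are hypothetical.

```python
def index_by_prefix(interwikimap):
    """Turn a list of interwiki entries into a {prefix: url} dictionary."""
    return {entry['prefix']: entry['url'] for entry in interwikimap}


def compare_interwiki(map_a, map_b):
    """Return (only in A, only in B, in both but with different URLs)."""
    a = index_by_prefix(map_a)
    b = index_by_prefix(map_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_a, only_b, differing


# Invented sample tables for two language wikis of one project.
WIKI_EN = [
    {'prefix': 'fr', 'url': 'http://fr.example.org/wiki/$1'},
    {'prefix': 'ja', 'url': 'http://ja.example.org/wiki/$1'},
    {'prefix': 'pt', 'url': 'http://pt.example.org/wiki/$1'},
]
WIKI_FR = [
    {'prefix': 'en', 'url': 'http://en.example.org/wiki/$1'},
    {'prefix': 'ja', 'url': 'http://ja.example.org/wiki/$1'},
    {'prefix': 'pt', 'url': 'http://pt-fork.example.org/wiki/$1'},
]

only_en, only_fr, differing = compare_interwiki(WIKI_EN, WIKI_FR)
print(only_en, only_fr, differing)
```

A prefix in the `differing` list means the two wikis link that language to different destinations, which is the forked-language case described above. Each wiki's own language prefix will normally appear as missing from its own table.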

Customisation of namespaces
Some projects use non-standard extensions to provide Special:Interwiki and Special:Namespaces lists; where available, these lists should be checked against the configuration files to detect any additional namespace customisations.

Short URL rewrites
If your site uses short URL rewrites, you may have to add "/api.php" to the blacklists. Otherwise, your bot scripts will not be able to access api.php.

Check the rewrite conditions in your Apache configuration file, and make an appropriate addition.

Bot & private wikis
Some wikis require logging in to MediaWiki before any wiki page can be viewed. If you have such a site, add the following to your custom family file:

Fixing Permission Denied problems

    Creating page via API Unknown Error. API Error code:permissiondenied Information:Permission denied

Your wiki may require users to be part of a particular group in order to edit pages. If so, log in to your wiki as an administrator and use Special:UserRights to put your bot into the proper group(s) to avoid API permission problems.

Bot & HTTP auth
Some sites require password identification to access the HTML pages at the site. If you have such a site, add lines of the following form to your user-config.py: