Manual:Pywikibot

The Python Wikipediabot Framework (pywikipedia) is a collection of tools that automate work on Wikipedia or other MediaWiki sites. It is written in Python and requires a Python interpreter to run. This page gives general information for people who want to use the bot software.

Overview

 * 1) Install python
 * 2) Download pywikipedia (code repository - nightly built release)
 * 3) Run generate_user_files.py to generate user-config.py and user-fixes.py.
 * 4) Add your wiki family file to the families folder, if it is not already present. (README-family)
 * 5) Run login.py. That's it: you can now run category.py to manage categories, template.py to manage templates, add_text.py to add footers, replace.py to make fixes such as adding wikilinks, and a lot more.

In summary, users should read and configure these three files:
 * user-config.py
 * user-fixes.py
 * yourWiki_family.py
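
As a rough illustration of what the first of these files holds, the sketch below renders the text of a minimal user-config.py. The setting names (family, mylang, usernames), the helper render_user_config, and the account name ExampleBot are assumptions made for illustration; generate_user_files.py writes the real file, and its output should be treated as authoritative.

```python
def render_user_config(family, lang, username):
    """Return the text of a minimal user-config.py (illustrative only)."""
    lines = [
        "# -*- coding: utf-8 -*-",
        # Family of the wiki to work on, e.g. wikipedia or wiktionary.
        "family = %r" % family,
        # Language code of the wiki to work on.
        "mylang = %r" % lang,
        # Bot account to use for that family and language.
        "usernames[%r][%r] = u%r" % (family, lang, username),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_user_config("wikipedia", "en", "ExampleBot"))
```

The helper only builds a string, so it can be run without the framework installed.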

Initial setup
Requirement: To run PyWikipediaBot, Python v2.4 or higher is required, but Python v3.x isn't currently supported. Some of the code may work on Python version 2.3.


 * For Windows, download the latest Python v2.x here (not a 3.x version).
 * On Mac OS X and some Unix versions, Python is already installed (although it might be necessary or advisable to update it if you have a very old version).

Download
The easiest way to download PyWikipediaBot is to use the latest nightly release available at this site. All you have to do is download PyWikipedia to your computer and decompress the file; no further installation is required.

Download with SVN
You can use SVN (subversion.tigris.org) to retrieve an up-to-date version of PyWikipediaBot. If you use Windows, TortoiseSVN is advised. On a Mac, you can follow these instructions to install SVN.

To check out the source code using the command line SVN client use this command:



Or, without the spell-checking files (saves a while), add  :



With either of those commands the source code will be in a new directory inside your current working directory named. (the last argument is used as the destination directory)

For non command line tools, the only information needed is the repository path: http://svn.wikimedia.org/svnroot/pywikipedia/trunk/pywikipedia/

Configuration for non-wikimedia projects
Wikimedia wikis are not the only ones that use pywikipedia. Some other wikis, such as wikitravel, betawiki, and anarchopedia, have their configuration files included in the pywikipedia distribution. To check whether your wiki is directly supported by pywikipedia, check the contents of the families directory.

If your wiki is not listed in the families folder, you have to create your own family file. This is relatively easy to do, and documented on Pywikipedia bot on non-Wikimedia projects.

Permission on wiki projects
Make sure that your bot is approved by the wiki community where you are going to use it. How strict this requirement is differs greatly between projects: at some you need to announce the bot in advance and get approval before you start, at others you can do whatever you want.

Using your normal browser, create a login name and password for the bot. It is best to use a name that makes clear that it is a bot, and preferably also who is operating it. A common method is to use your own login name and add the word 'bot' to it, but several other forms also exist.

On the English Wikipedia, bots are only allowed to be used if they are approved at en:Wikipedia:Bots/Requests for approval.

Request a bot flag
If you heavily use a bot, it will clutter recent changes. To avoid that, you can get your bot registered as such. In that case it will not be shown on Recent changes unless a user specifically asks to get bots included.

This can be done by a Bureaucrat. You can put a request to get your or someone else's bot registered at Requests for bot status. You will probably be asked for some kind of evidence that your local community agrees with your bot. On the English language Wikipedia, requests should be made at Requests for approvals. It is probably good to get your bot registered whenever it will edit many pages in a single run.

Select and run a bot script
Now we are ready to really start using the bot. You need to open a command-line (text) interface to your operating system.

On Windows this is done by opening the Start menu and clicking on 'Run'. When you are asked for the name of a program, type "cmd.exe".
 * Change to the root of C: by typing chdir C:\
 * Type chdir followed by the name of the folder where pywikipedia has been downloaded (for example: chdir \pywikipedia if the folder is directly under C:).

On Mac, find Terminal.app in /Applications/Utilities.

On Linux or any other Unix, use any terminal application such as gnome-terminal, konsole, xterm, or simply the text-mode console.

Run the script login.py by typing "python login.py" (or just "login.py").

Python will then return:

 Password for user your_bot on your_site:en:

Use the password you chose for the bot's login name. The bot can't work anonymously. Unless you change your password, you normally need to run this program only once, because the bot usually does not get logged off.

The bots are in the main pywikipedia folder when downloaded. But if necessary, use the command cd to go to the directory where the bot files are saved.

Now run any of the bots here by typing "python botname.py" (If you are using Windows, you can leave out "python").

Scripts
Here is a list of the existing bots with links to their descriptions:

Command-line arguments
Although many bot scripts have their own command line arguments, which should be documented on their respective pages (or in their source code), all bots unless specifically stated to the contrary recognize the following command line arguments:


 * -help
 * Print a list of global bot arguments (this list), followed by bot-specific help if available.


 * -lang:xx
 * Set the language of the wiki you want to work on, where xx is the language code, overriding the configuration in user-config.py.


 * -family:xyz
 * Set the family of the wiki you want to work on, e.g., wikipedia, wiktionary, wikitravel, ... This will override the configuration in user-config.py.


 * -log
 * Enable the logfile. Logs will be stored in the logs subdirectory.


 * -log:xyz
 * Enable the logfile, using xyz as the filename.


 * -nolog
 * Disable the logfile (if it's enabled by default).


 * -putthrottle:nn
 * Set the minimum time (in seconds) the bot will wait between saving pages. The default value is 10.

For example, python scriptname.py -family:wiktionary will run the "scriptname" bot on wiktionary articles, overriding the default family setting in your user configuration.
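
To make the shape of these global arguments concrete, here is a minimal stand-alone sketch of how a script might separate them from its own bot-specific arguments. The function parse_global_args and its option dictionary are hypothetical; this is not the framework's own parser, only an illustration of the -name:value convention described above.

```python
def parse_global_args(argv):
    """Split argv into (global_options, remaining_args).

    Hypothetical helper written for illustration; it only mirrors the
    argument forms listed in this manual.
    """
    options = {
        "help": False,
        "lang": None,       # None means: keep the configured language
        "family": None,     # None means: keep the configured family
        "log": None,        # None means: keep the configured default
        "logfile": None,
        "putthrottle": 10,  # documented default, in seconds
    }
    remaining = []
    for arg in argv:
        name, sep, value = arg.partition(":")
        if arg == "-help":
            options["help"] = True
        elif name == "-lang" and sep:
            options["lang"] = value
        elif name == "-family" and sep:
            options["family"] = value
        elif arg == "-log":
            options["log"] = True
        elif name == "-log" and sep:
            options["log"] = True
            options["logfile"] = value
        elif arg == "-nolog":
            options["log"] = False
        elif name == "-putthrottle" and sep:
            options["putthrottle"] = int(value)
        else:
            # Anything else is left for the bot script itself.
            remaining.append(arg)
    return options, remaining


if __name__ == "__main__":
    print(parse_global_args(["-family:wiktionary", "-log", "-start:A"]))
```

Unrecognized arguments are passed through untouched, which matches the idea that each bot script may define additional arguments of its own.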

Bot mailing list
It can be useful to subscribe to the bot mailing list (see http://lists.wikimedia.org/mailman/listinfo/Pywikipedia-l). Every time a file of the bot software is changed, a mail is sent to the list, so you know when you need to update to the new version.

Update

 * If you installed using SVN, updating your working copy is easy. Change to your pywikipedia directory and type svn update. This updates the framework to include the latest changes. Read the svn manual for more details.
 * If you are using a nightly version, the process is a bit more complicated. You have to re-download a full copy from the same site. Before installing it, back up your configuration files and scripts (user-config.py, any family file, and any custom script that you might have created). Replace your pywikipedia directory with the new version you just downloaded, then restore your configuration files. If you're not sure what you're doing, do not erase your old pywikipedia directory but keep a complete backup of it, to avoid losing any important files.
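
The backup step for nightly releases can be sketched as a small stand-alone Python helper. The function name backup_user_files, the choice of files, and the assumption that family files live in a families subfolder and end in _family.py are illustrative; add any custom scripts of your own to the list.

```python
import glob
import os
import shutil


def backup_user_files(bot_dir, backup_dir):
    """Copy user-specific files from bot_dir into backup_dir.

    Returns the relative paths that were copied. Note that every family
    file is copied, including the ones shipped with the distribution.
    """
    patterns = [
        "user-config.py",
        "user-fixes.py",
        os.path.join("families", "*_family.py"),
    ]
    copied = []
    for pattern in patterns:
        for path in sorted(glob.glob(os.path.join(bot_dir, pattern))):
            rel = os.path.relpath(path, bot_dir)
            target = os.path.join(backup_dir, rel)
            target_dir = os.path.dirname(target)
            if not os.path.isdir(target_dir):
                os.makedirs(target_dir)
            shutil.copy2(path, target)  # copy2 also keeps timestamps
            copied.append(rel)
    return copied
```

Restoring is the same operation in the other direction: call the helper with the backup directory as the source and the freshly unpacked pywikipedia directory as the destination.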

How to report a bug
When you report a bug please try to include:


 * PyWikipediaBot version in use. It's recommended to test if the bug is still present in latest SVN revision available.
 * Python version (python -V) and operating system you use (e.g. Windows, Linux, MacOS...)
 * The version.py script is useful for collecting the information above.
 * A nice summary
 * Full description of the problem/report
 * Full information on how to reproduce the bug (script, command line, family, and language used)
 * The console output provided by the script (including the Python traceback if you are reporting a crash)
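
If version.py is not at hand, the Python version and operating system asked for above can also be collected with the standard library. This is only a stand-in sketch (the function name environment_summary is made up here) and it does not report the PyWikipediaBot revision.

```python
import platform
import sys


def environment_summary():
    """Return a one-line description of the Python version and OS."""
    return "Python %s on %s %s" % (
        platform.python_version(),  # e.g. the same number python -V prints
        platform.system(),          # e.g. Windows, Linux, Darwin
        platform.release(),
    )


if __name__ == "__main__":
    sys.stdout.write(environment_summary() + "\n")
```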

To submit a new bug visit the bug tracker provided by SourceForge.

Development
If you have a function you want to have a bot for that is not yet provided by one of the bots, you can ask one of the programmers to write it for you. Or even better, you can try to work on the bots yourself. Python is a nice language, and not hard to learn. We will welcome you.

Tips
Here and in wikipedia.py, there are some very basic tips for getting started writing your own bot:
 * be sure you've set up your user-config.py file (see above)
 * To gain access to the pywikipedia framework, use:
 * to retrieve a page, use the following, where pageName is, e.g., "Wikipedia:Bots" or "India":
 * to update a page, use:
 * look at some of the pywikipedia files for other ideas -- basic.py is relatively easy to read even if you're new to pywikipedia.
 * you can find all available Page methods in the wikipedia.py file.
 * basic.py gives you a setup that can be used for many different bots; all you have to do is define the string editing on the page text.
 * To iterate over a set of pages, see pagegenerators.py for some objects that return a set of pages. An example use of the CategoryPageGenerator that does something for each page in the Category:Living people category:
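
Separately from the generator example, the two ideas in these tips that need no wiki access, defining a string edit and applying it to every page in a set, can be sketched in plain Python. Everything here is hypothetical: add_footer and changed_pages are made-up names, and the (title, text) pairs merely stand in for pages that a real bot would fetch and save through the framework.

```python
def add_footer(text, footer):
    """Return text with footer appended once; unchanged if already present."""
    if footer in text:
        return text
    return text.rstrip("\n") + "\n\n" + footer + "\n"


def changed_pages(pages, edit):
    """Yield (title, new_text) for each page whose text the edit changes.

    pages is any iterable of (title, text) pairs, standing in for a
    page generator; no wiki is contacted.
    """
    for title, text in pages:
        new_text = edit(text)
        if new_text != text:
            yield title, new_text


if __name__ == "__main__":
    sample = [
        ("India", "Some text.\n"),
        ("Wikipedia:Bots", "Other text.\n\n[[Category:Example]]\n"),
    ]
    footer = "[[Category:Example]]"
    for title, new_text in changed_pages(sample, lambda t: add_footer(t, footer)):
        print("Would update: " + title)
```

Keeping the edit a pure function of the page text makes it easy to try out on sample strings before letting a bot save anything.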

Create a quick shortcut to run commands (Windows users)

If you install Pywikipediabot in a folder such as "My Documents", it can be tedious to use the cd command to change into that folder every time you want to run the bots. (If you are not sure what that means, this shortcut will help you a lot.)

On Windows you can create a shortcut which will open the command box you can use to run bots easily. Just follow these simple steps to create one:
 * 1) Open the folder in which pywikipedia is installed in a window.
 * 2) Under File > New select Shortcut.
 * 3) Type in "cmd.exe" and click Next.
 * 4) You can give the shortcut a name here; "Pywikipediabot" is a good choice.
 * 5) Copy the path shown in the address bar (the text bar above the file list that shows the current folder).
 * 6) Right-click the new shortcut, choose Properties, and paste the copied path into the "Start in" text field.
 * 7) Click OK. You now have a shortcut that opens a command line from which you can run bots.

Contributing changes
If you changed the bot and want to send a patch to the maintainer,
 * 1) Update to the current version (it will merge your changes with the improvements already committed to the SVN Repository),
 * 2) Resolve any conflicts caused by the update (grep for "=====" ;-) and
 * 3) Type:
 * $ svn diff > svn.diff

Review the diff to ensure it only includes the changes you want to contribute. The lines at the beginning starting with "?" should be removed.
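
Removing the leading "?" lines can be done by hand, or with a few lines of Python such as the sketch below. The function name strip_unversioned_lines is made up for illustration, and it only drops "?" lines at the very beginning of the file, as described above.

```python
def strip_unversioned_lines(diff_text):
    """Drop the leading lines that start with "?" from diff_text."""
    lines = diff_text.splitlines(True)  # True keeps the line endings
    start = 0
    while start < len(lines) and lines[start].startswith("?"):
        start += 1
    return "".join(lines[start:])


if __name__ == "__main__":
    with open("svn.diff") as handle:
        cleaned = strip_unversioned_lines(handle.read())
    with open("svn.diff", "w") as handle:
        handle.write(cleaned)
```

You should still read through the cleaned file before sending it, since the script cannot tell which of the remaining changes you actually want to contribute.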

If you are in direct contact with a Pywikipediabot developer, you can send them the file svn.diff, but it is preferable to attach the patch to a ticket in the Pywikipedia bug tracking system.

Multiple accounts
It is common to need to run pywikipedia under different accounts (a main account and/or multiple bot accounts). This can be done in two ways.

Separate pywikipedia distributions
One can install completely separate instances of pywikipedia in different directories (one for each account) and keep a different user-config.py file in each of them. However, when updating the installation via SVN, one needs to run svn update in each folder separately. Also, every installation takes some disk space, which might be a problem on accounts with limited quota.

One pywikipedia distribution with symbolic links
Let's assume user foo has a current SVN working copy of pywikipedia in ~/pywikipedia. For each of the accounts, foo creates a separate directory:

 foo@bar:~$ mkdir foobot
 foo@bar:~$ cd foobot

Pywikipedia then needs some symlinks to the main code tree, created in the working directory:

 foo@bar:~/foobot$ ln -s ~/pywikipedia/families
 foo@bar:~/foobot$ ln -s ~/pywikipedia/userinterfaces

Then, user-config.py for this account must be created as described in the configuration section above.

Finally, the bot must be logged in the usual way:

foo@bar:~/foobot$ python ~/pywikipedia/login.py

The working directory is ready. The scripts will, however, require a slight modification to run: the path to the pywikipedia tree must be added to Python's module search path.

 import os
 import sys
 # Add the shared pywikipedia tree to Python's module search path.
 sys.path.append(os.environ['HOME'] + '/pywikipedia')
 import wikipedia

That's all. Updating to the newest version of pywikipedia on all accounts at once is now a matter of running svn update only in the ~/pywikipedia directory.
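
When several accounts are needed, the mkdir and ln -s steps above can be scripted. The following is a minimal sketch using only the standard library; the function name make_account_dir is hypothetical, and it assumes a system where os.symlink is available (Unix or Mac OS X).

```python
import os


def make_account_dir(account_dir, main_tree, names=("families", "userinterfaces")):
    """Create account_dir and link the named subdirectories of main_tree into it.

    Returns the paths of the links. Existing links or directories are
    left untouched, so the function can be run again safely.
    """
    if not os.path.isdir(account_dir):
        os.makedirs(account_dir)
    links = []
    for name in names:
        link = os.path.join(account_dir, name)
        if not os.path.lexists(link):
            os.symlink(os.path.join(main_tree, name), link)
        links.append(link)
    return links


if __name__ == "__main__":
    home = os.path.expanduser("~")
    make_account_dir(os.path.join(home, "foobot"),
                     os.path.join(home, "pywikipedia"))
```

The account's own user-config.py still has to be created in the new directory, and the bot still has to log in from there, as described above.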

On Windows 2000+ with NTFS
A similar set-up can be created on Windows systems running Windows 2000 or later that use NTFS as their filesystem. This can be achieved by using the junction tool (available from Microsoft's website, part of the Sysinternals suite).

As above, create your new directory, but to create the symlinks, first create families and userinterfaces as directories inside of it. (NB: Creating the directories from cmd.exe does not appear to work reliably, so using the Right-click->New menu or File->New menu is suggested.)

 C:\...> junction families C:\pywikipedia\families
 C:\...> junction userinterfaces C:\pywikipedia\userinterfaces

The rest of the method is the same as above. Symlinks/junctions should be deleted using the junction program, as you may otherwise accidentally lose data in the original directory (see ).

Bot & Proxy
A draft workaround (not tested!) is described here.

Mailing lists

 * pywikipedia-l (archives, current month) : Human discussion on pywikipedia topics. This includes support, follow-ups to announcements, follow-ups to svn commits, and developer discussions. -- Moderate traffic (usually no more than a couple of mails a week on average)


 * pywikipedia-announce (archives, current month) : Important announcements, e.g. breaking changes. -- Minimal traffic (one mail a month, at most)


 * pywikipedia-bugs (archives, current month) : Automated mail sent by the bug trackers on each bug state change. -- High traffic


 * pywikipedia-svn (archives, current month) : Automated mail sent after each pywikipedia SVN commit. -- High traffic