Manual:Pywikibot/replace.py

Replace.py is part of the Pywikipedia bot framework.

This bot will make direct text replacements. It will retrieve information on which pages might need changes either from an XML dump or a text file, or only change a single page.

You can run the bot with the following command-line parameters:


 * -xml
 * Retrieve information from a local XML dump (pages_current; see http://download.wikimedia.org). The argument can also be given as "-xml:filename".


 * -file
 * Work on all pages listed in a local text file (the file should be UTF-8 encoded). The bot reads every wiki link in the file and works on the linked articles. The argument can also be given as "-file:filename".


 * -cat
 * Work on all pages in a specific category. The argument can also be given as "-cat:XYZ".


 * -page
 * Only edit a single page. The argument can also be given as "-page:pagetitle". You can give this parameter multiple times to edit multiple pages.


 * -ref
 * Work on all pages that link to a certain page. The argument can also be given as "-ref:XYZ".


 * -start
 * Work on all pages in the wiki, starting at a given page. Choose "-start:!" to start at the beginning. NOTE: You are advised to use -xml instead of this option; -start is meant for cases where there is no recent XML dump.


 * -regex
 * Make replacements using regular expressions. If this argument isn't given, the bot will make simple text replacements.


 * -except:XYZ
 * Ignore pages which contain XYZ. If the -regex argument is given, XYZ will be regarded as a regular expression.


 * -fix:XYZ
 * Perform one of the predefined replacement tasks, which are given in the dictionary 'fixes' defined inside this file. The -regex argument and any given replacements will be ignored if you use -fix.
 * Currently available predefined fixes are:
 * HTML - convert HTML tags to wiki syntax, and fix XHTML


 * -namespace:n
 * Number of the namespace to process.


 * -always
 * Don't prompt for each replacement.


 * other arguments
 * The first argument is the old text, the second argument is the new text. If the -regex argument is given, the first argument will be regarded as a regular expression, and the second argument may contain backreferences like \\1 or \g<number>.

NOTE: Only use one of -xml, -file or -page; don't mix them.

Examples
If you want to change templates from the old syntax to the new syntax, download an XML dump file (page table) from http://download.wikimedia.org, then use this command:

python replace.py -xml -regex "" ""

Note that you can match patterns across more than one line:

python replace.py -regex -start:! "First line\nSecond line" ""
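
As a rough illustration of why this works, the following Python sketch uses the standard re module directly; it is not taken from the script. The sample page text and the replacement "Merged line" are made up for the example.

```python
import re

page_text = "Intro\nFirst line\nSecond line\nOutro"

# "\n" in the pattern matches the line break between the two lines,
# so a single substitution can span both of them.
result = re.sub(r"First line\nSecond line", "Merged line", page_text)
print(result)
```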

If you have a dump called foobar.xml and want to fix typos, e.g. Errror -> Error, use this:

python replace.py -xml:foobar.xml "Errror" "Error"

If you have a page called 'John Doe' and want to convert HTML tags to wiki syntax, use:

python replace.py -page:John_Doe -fix:HTML
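
To give an idea of what a predefined fix amounts to, here is a minimal Python sketch. It assumes a fix is simply a named list of regular-expression replacements applied in order; the dictionary example_fixes, its layout and the two patterns are hypothetical and much smaller than whatever the real 'fixes' dictionary contains.

```python
import re

# Hypothetical, simplified stand-in for a predefined fix: a name mapped to
# a list of (pattern, replacement) pairs. The real 'fixes' dictionary may
# be structured differently and covers many more tags.
example_fixes = {
    "HTML": [
        (r"(?i)<b>(.*?)</b>", r"'''\1'''"),  # bold
        (r"(?i)<i>(.*?)</i>", r"''\1''"),    # italic
    ],
}


def apply_fix(text, name):
    """Apply every replacement of the named fix, in order."""
    for pattern, replacement in example_fixes[name]:
        text = re.sub(pattern, replacement, text)
    return text


print(apply_fix("<b>John Doe</b> was <I>here</I>.", "HTML"))
```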

If you run the bot without the old and new text arguments, you will be prompted repeatedly for replacements:

python replace.py -file:blah.txt

The script asks the user before modifying an article. It is recommended to double-check the result to be sure that the bot did not introduce errors (especially with misspelled words). It is possible to specify a set of articles with an external text file containing wiki links:

[[plane]] [[vehicle]] [[train]] [[car]]

The bot is then called using something like:

python replace.py [global-arguments] -file:articles_list.txt "errror" "error"
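
How page titles might be collected from such a file can be sketched in Python as below. This is an assumption-laden illustration rather than the script's own parser: the pattern LINK_PATTERN and the function titles_from_text are hypothetical, and the pattern only handles the plain [[Title]], [[Title|label]] and [[Title#section]] forms.

```python
import re

# Assumed link syntax: [[Title]], optionally followed by |label or #section.
LINK_PATTERN = re.compile(r"\[\[([^\]\|#]+)[^\]]*\]\]")


def titles_from_text(text):
    """Return the titles of all wiki links found in text, in order."""
    return [title.strip() for title in LINK_PATTERN.findall(text)]


sample = "[[plane]] [[vehicle|a vehicle]]\nsome other words [[train]] [[car]]"
print(titles_from_text(sample))
```

Text outside the double brackets is ignored, so the list file may contain other words or comments between the links.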