|IMPORTANT: The content of this page is outdated. If you have checked or updated this page and found the content to be suitable, please remove this notice.|
- If you need more help setting up your pywikipediabot, visit the IRC channel on the freenode server or the pywikipediabot mailing list.
This is a complete step-by-step guide on how to install and run your own pywikipedia bot on Wikimedia projects from a Mac.
All of the Bot software can be run on your local Mac against the particular Wikimedia server -- a "Shell Account" on that Wikimedia server is not necessary. However, a "Bot Account" is!
- You can also run against a Wikimedia installation running on your own Mac under OSX Server!
- (OSX Server includes a standard distribution WIKI. Mediawiki software needs to be installed separately.)
- While a number of the actions can be performed directly from the Finder, most require that you are familiar with the Terminal window. This provides what Unix and Linux users call "shell access." The OSX Terminal application defaults to bash (the Bourne-Again Shell).
- It is also recommended that you are conversant with a "Text Editor," not a "Word Processor."
- While "Text Edit" is an application supplied by Apple, it is NOT well suited to most of the work which needs to be done.
- One solution is to use one of the "traditional" command line editors supplied with OSX -- pico(nano) or emacs; another is to use a full GUI based text editor such as TextWrangler (free) or BBEdit (commercial) from BareBones Software.
- Experienced users can also use the facilities of Xcode.
- 1 Downloading
- 2 Before you begin
- 3 Configuring
- 4 Registering the username on the wanted wikis
- 5 Running your Bot
- 6 Maintenance
- 7 Generating a Family File
Downloading
- OSX 10.8.4 (Mountain Lion) ships with Apple-provided Python and Pythonw in both versions 2.6 and 2.7. Run "$ python --version" to see which one is the default, and "$ man python" for details on switching between the two if necessary.
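A script can also report the interpreter it is actually running under, which is handy when more than one Python is installed. The following is only a small sketch; the helper name describe_interpreter is an assumption made for illustration and is not part of pywikipedia.

```python
import sys


def describe_interpreter():
    """Return the running interpreter's version as a 'major.minor' string."""
    major, minor = sys.version_info[:2]
    return "%d.%d" % (major, minor)


if __name__ == "__main__":
    # Reports whichever interpreter the shell resolved "python" to.
    print("Running under Python " + describe_interpreter())
```

Saving this to a file and running it with "python" shows which of the installed versions the shell picked.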
SVN
- OSX 10.8.4 (Mountain Lion) also includes an Apple-provided SVN: the Subversion command-line client, version 1.6.18.
- Note that if you are familiar with Xcode, you can also use it for many of these actions. However, using Xcode is beyond the scope of this article.
Downloading the Pywikipediabot Scripts
- (In Lion (10.7) and Mountain Lion (10.8), Terminal is found under "/Applications/Utilities" in the Finder. It does not appear in Launchpad.)
Copy-paste the following into the Terminal window, and press enter/return:
$ svn checkout http://svn.wikimedia.org/svnroot/pywikipedia/trunk/pywikipedia/ pywikipedia
Using Apple's SVN under Mountain Lion 10.8.4, the client will report the following certificate validation error, because it does not trust the issuer of the server's certificate:
Error validating server certificate for 'https://svn.toolserver.org:443':
 - The certificate is not issued by a trusted authority. Use the fingerprint to validate the certificate manually!
Certificate information:
 - Hostname: *.toolserver.org
 - Valid: from Sun, 26 May 2013 09:34:24 GMT until Wed, 28 May 2014 17:08:01 GMT
 - Issuer: GeoTrust, Inc., US
 - Fingerprint: bf:35:7f:3e:62:4b:89:6c:bc:39:c9:c3:38:81:9e:53:26:43:be:f4
(R)eject, accept (t)emporarily or accept (p)ermanently?
Responding "p" or "t" may then result in:
svn: warning: Error handling externals definition for 'pywikipedia/externals/pycolorname':
svn: warning: OPTIONS of 'https://svn.toolserver.org/svnroot/drtrigon/externals/pycolorname': Could not read status line: connection was closed by server (https://svn.toolserver.org)
- ( Note: This is assumed to be a transient SVN error and will likely go away with the next SVN update.)
The length of time for the download will depend upon the speed of your Internet connection. It will take a few minutes.
A folder will be created in your “home” folder (look in the left column in the Finder for the house icon and your username), with the name “pywikipedia”, in which all the scripts will be when the downloading is finished. You can see when it’s finished by looking for the following in the last line in Terminal:
This text is equal to the one appearing when you open Terminal (it is called the command line or terminal prompt). $ will be used in the text below to indicate the prompt.
Before you begin
There are a number of files in the SVN download to which one needs to pay attention. While they can be read from the Finder, you cannot perform the necessary actions on them from there.
- Note that when viewed from the Finder, the directory (folder) is displayed in CASE INSENSITIVE alphabetical order (upper and lower case are intermixed), while when viewed from a "ls" in Terminal, they appear in CASE SENSITIVE order -- i.e. the Capitalized names will appear at the top.
- The top-level README file is a standard inclusion in such software distributions; in this case, however, its contents are simply cosmetic and provide no useful information.
- The CONTENTS file contains important information describing the individual files in the distribution.
The first step in installing pywikipedia is to run the Python script generate_user_files.py. This cannot be done from the Finder.
- Note that if you have Xcode installed, double clicking on this script from the Finder window will launch Xcode.
- In a Terminal window:
$ cd pywikipedia
$ python generate_user_files.py
This starts the following dialogue. The notes explain the responses marked in the transcript:
- Note 1: For your first attempt, generate both files.
- Note 2: The wikis listed are those advertised to Mediawiki.org; for a private wiki, select "27 - test".
- Note 3: The default language is English.
- Note 4: For "Username", enter the username associated with your bot; typically <yourWikiUserid-bot>.
- Note 5: Choose the "Small" (S) version of the config file for your first attempt. The extended version contains many options beyond the scope of this article.
1: Create user_config.py file
2: Create user_fixes.py file
3: The two files
What do you do? 3 <-- Note 1: generate both files
1: anarchopedia
2: battlestarwiki
3: botwiki
4: celtic
5: commons
6: fon
7: gentoo
8: i18n
9: incubator
10: krefeldwiki
11: lockwiki
12: loveto
13: lyricwiki
14: mediawiki
15: memoryalpha
16: meta
17: mozilla
18: oldwikivoyage
19: omegawiki
20: openttd
21: osm
22: piratenwiki
23: southernapproach
24: species
25: strategy
26: supertux
27: test
28: twcareer
29: ubuntutw
30: uncyclopedia
31: vikidia
32: wekey
33: wesolve
34: wikia
35: wikibond
36: wikibooks
37: wikidata
38: wikimediachapter
39: wikinews
40: wikipedia
41: wikiquote
42: wikisource
43: wikitech
44: wikitravel
45: wikitravel_shared
46: wikiversity
47: wikivoyage
48: wiktionary
49: wowwiki
Select family of sites we are working on (default: wikipedia): 27 <-- Note 2: use Test to start
The language code of the site we're working on (default: 'en'): <-- Note 3: (hit return to accept the default)
Username (en test): <wiki-bot username> <-- Note 4: your bot's username -- typically <yourWikiUserid-bot>
Which variant of user_config.py: [S]mall or [E]xtended (with further information)? S <-- Note 5: Choose the Small version
'user-config.py' written.
'user-fixes.py' written.
Now you can begin configuring your bot.
Configuring
The user-config.py file created by the "Small" option consists of 4 lines:
# -*- coding: utf-8 -*-
family = 'test'
mylang = 'en'
usernames['test']['en'] = u'username-bot'
You need to configure (customize) your bot before you can use it.
This can be done using the applications Text Edit or TextWrangler, or the command line tools in a Terminal window: emacs or pico (nano).
If you choose to use Text Edit or TextWrangler, simply use the Finder to navigate to user-config.py, in the pywikipedia folder, and drag and drop that file on top of the appropriate icon.
If your language uses non-ASCII characters, make certain to retain the first line in the file:
# -*- coding: utf-8 -*-
Edit or add the following lines to user-config.py as appropriate for your Bot:
family = 'wikipedia'
mylang = 'en'
The value of mylang ('en' here) is the main language you want your bot to work on, equivalent to the language code of the Wikimedia project in which the bot is going to run.
Family is the project name.
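We selected "27 - test" as the family of wikis when we ran the setup script. If the bot's main wiki belongs to another project, for example Wikibooks, edit this line accordingly:
family = 'wikibooks'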
Set the bot's username for each family and language it works on, using the general form:
usernames['FAMILY']['MYLANG'] = u'ExampleBot'
For example:
usernames['wikipedia']['en'] = u'ExampleBot'
If you are working on more than one Wikimedia project, you can also add several usernames:
usernames['wikipedia']['de'] = u'BeispielBot'
usernames['wikipedia']['en'] = u'ExampleBot'
usernames['wiktionary']['de'] = u'BeispielBot'
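The usernames lines work without any prior definition in user-config.py because the file is ordinary Python that the framework executes with some names already in place. The sketch below imitates that idea in isolation. It assumes that usernames behaves like a two-level mapping (a defaultdict stands in for whatever the framework really provides), and the helper username_for and the CONFIG_TEXT string are hypothetical names used only for illustration.

```python
from collections import defaultdict

# Assumption: the framework pre-defines `usernames` as a two-level mapping
# before it executes user-config.py; defaultdict(dict) stands in for that.
usernames = defaultdict(dict)

# Hypothetical config text, using the example accounts from this guide.
CONFIG_TEXT = """
family = 'wikipedia'
mylang = 'de'
usernames['wikipedia']['de'] = u'BeispielBot'
usernames['wikipedia']['en'] = u'ExampleBot'
usernames['wiktionary']['de'] = u'BeispielBot'
"""

settings = {"usernames": usernames}
exec(CONFIG_TEXT, settings)  # run the config text as plain Python


def username_for(family, lang):
    """Look up the bot account for one family/language pair, or None."""
    return settings["usernames"].get(family, {}).get(lang)


# The main wiki is the pair named by `family` and `mylang`.
print(username_for(settings["family"], settings["mylang"]))
```

A pair that was never configured, such as wikibooks and de in this sketch, simply yields None, which is the situation in which a bot would have no account to log in with.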
console_encoding = 'utf-8'
(Optional but recommended)
It is recommended that you change the console_encoding to UTF-8. Especially if you are editing a wiki with non-ASCII characters, this is important. If you don’t change the encoding, you’ll end up with “?”s all over the place. For example, the Spanish name María will render like Mar?a.
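The "Mar?a" effect is easy to reproduce in plain Python, independent of pywikipedia. This small sketch shows what happens when text is forced into an encoding that cannot represent one of its characters; it illustrates the symptom only and does not show how the bot itself writes to the console.

```python
# -*- coding: utf-8 -*-
name = u"María"

# Encoding to ASCII with errors="replace" substitutes "?" for every
# character the target encoding cannot represent.
as_ascii = name.encode("ascii", "replace").decode("ascii")

# UTF-8 can represent the character, so the round trip is lossless.
as_utf8 = name.encode("utf-8").decode("utf-8")

print(as_ascii)
print(as_utf8 == name)
```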
textfile_encoding = 'unicode_escape'
(Optional, rarely needed)
Set "textfile_encoding" only if this is the encoding used by your system.
sort_ignore_case = True
Some scripts may use this for sorting, e.g. solve_disambiguation.py. The default is False. Capitalized titles will precede uncapitalized ones if this key is False or omitted; if it is True, capitalization is disregarded when sorting.
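The difference between the two orderings can be seen with Python's built-in sorting. This is a sketch of the general behaviour, not the code that solve_disambiguation.py or any other script actually uses; the titles are made up for illustration.

```python
titles = ["apple", "Banana", "cherry", "Apple"]

# Default string comparison orders uppercase letters before lowercase ones,
# which corresponds to sort_ignore_case = False.
case_sensitive = sorted(titles)

# Comparing lowercased titles disregards capitalization, which corresponds
# to sort_ignore_case = True.
case_insensitive = sorted(titles, key=lambda title: title.lower())

print(case_sensitive)
print(case_insensitive)
```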
use_api_login = True
log = ['*']
Enable logging for all scripts. Logs will be stored in the logs folder.
- Remember to save your changes to
Notes
- If you want to work with more than one language, choose the most common one. You can override this on the command line by using
- Meta uses 'meta' for both language code and wiki family, Commons uses 'commons' for both, and Testwiki uses 'test' for both, the multilingual wikisource uses '-' for the language. You can override this on the command line by using
- The 'u' in front of the username stands for Unicode. The 'u' is required if your username contains non-ASCII characters.
- Note that on Linux/Unix hosts username capitalization matters! While logging in may not be an issue, testing the login or attempting to use a bot will not use the correct cookie file and may result in anonymous access to the API. This can cause problems for private wikis that do not allow anonymous access or that use third-party authentication. Default usernames for MediaWiki, and those pulled via LDAP or other third-party authentication schemes, will have an uppercase first letter; thus 'user' becomes 'User'.
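The capitalization rule in the last note can be sketched in a few lines. The helper normalize_username is a hypothetical name chosen for this illustration and is not a pywikipedia function. Note that Python's own str.capitalize() is not equivalent, because it also lowercases the rest of the name.

```python
def normalize_username(name):
    """Uppercase only the first character, leaving the rest untouched."""
    return name[:1].upper() + name[1:]


print(normalize_username("user"))
print(normalize_username("exampleBot"))
```

Writing the username in user-config.py in this already-capitalized form avoids the mismatch described above.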
|To get a copy of the EXTENDED version of
user-config.py examples
EksempelBot on no.wikipedia
mylang = 'no'
usernames['wikipedia']['no'] = u'EksempelBot'
console_encoding = 'utf-8'
use_api_login = True
ExampleBot on Commons
mylang = 'commons'
family = 'commons'
usernames['commons']['commons'] = u'ExampleBot'
console_encoding = 'utf-8'
use_api_login = True
BeispielBot on de.wikipedia and de.wikibooks, with de.wikipedia as main wiki
mylang = 'de'
usernames['wikipedia']['de'] = u'BeispielBot'
usernames['wikibooks']['de'] = u'BeispielBot'
console_encoding = 'utf-8'
use_api_login = True
Registering the username on the wanted wikis
It is recommended that you are registered on the wikis where you want to run your bot.
- Log into your own user account and enter “Special:Userlogin” in the search box.
- Then, choose to register an account.
- Now fill out the required fields on the page and continue. If the account creation is successful, you should now be able to run the bot.
- With SUL it’s easy to run bots on several wikis. If you’ve created the bot account recently, the account is already merged.
- If you created your account(s) some time ago and it is not merged, or if it was created some time ago on other wikis, you will have to merge it in Special:MergeAccount.
- Now, when logging in, the other accounts will automatically be registered as you enter the same password for all wikis.
- NOTE: If you are working with your own wiki or one independent of the Wikipedia "groups," you first need to create an appropriate "Family" file; see "Generating a Family File" below.
Bot flag and test edits
Don’t forget to apply for bot flag/permission to run the bot! On en.wikipedia and he.wikipedia (there may be more), you have to apply before you make test edits. On other wikis you can apply and make test edits right away. When you make test edits, use the argument -pt:15, which tells the bot to wait 15 seconds between each edit (4 edits/min), so the recent changes list doesn’t get flooded with the bot’s edits.
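The relationship between the throttle value and the edit rate is simple arithmetic: a 15 second pause allows at most 60 / 15 = 4 edits per minute. The sketch below shows that calculation together with a very small throttled loop. It is an illustration only: the function names are hypothetical, and the bot's real throttling is more involved than a fixed pause.

```python
import time


def edits_per_minute(put_throttle_seconds):
    """Upper bound on edits per minute for a given delay between saves."""
    return 60.0 / put_throttle_seconds


def run_throttled(actions, delay_seconds, sleep=time.sleep):
    """Call each action in turn, pausing delay_seconds between calls."""
    results = []
    for index, action in enumerate(actions):
        if index > 0:
            sleep(delay_seconds)  # wait before every save except the first
        results.append(action())
    return results


print(edits_per_minute(15))
```

The sleep parameter exists only so the pause can be replaced in tests; by default the loop really waits.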
Running your Bot
To run the scripts, you first need to navigate to the folder in which the scripts are located. To do this you need to use the Terminal window (application) - (Applications/Utilities/Terminal)
- A "fresh" Terminal window will normally open in your "Home" directory.
- “USER_NAME” is the text to the right of the house icon in Finder; the Home directory is also referred to by the character "~".
- Typing "cd" by itself will normally return you to your Home directory.
- Assuming you have used the default SVN install, enter one of the following in Terminal (the first form works only from your Home directory):
$ cd pywikipedia
$ cd /Users/USER_NAME/pywikipedia
$ cd ~/pywikipedia
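The "~" shorthand can be checked from Python, which expands it the same way the shell does. This sketch assumes only the default folder name "pywikipedia" used in this guide.

```python
import os.path

# "~" expands to the current user's home directory, so both spellings
# name the same folder.
short_form = os.path.expanduser("~/pywikipedia")
long_form = os.path.join(os.path.expanduser("~"), "pywikipedia")

print(short_form == long_form)
```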
Tip
- Typing the above each time you want to use the bot can be tedious. Terminal saves every command you use, and you can recall old commands by pressing the up arrow. One press shows the last command used; keep pressing until you see the one you want, then press enter/return. Now you can start running the bot. When you run one script after another, you don’t have to repeat the navigation command; that is only necessary if you close Terminal.
Logging in
Now, you need to log into the bot accounts on the wiki(s), by typing the following into Terminal:
$ python login.py
Terminal will ask for the password. Type the password you used when you registered the bot’s username.
If you have accounts on multiple wikis, use
$ python login.py -all
Type in the passwords, as above.
Logging in is only necessary once; the bot stays logged in.
Examples
$ python login.py
Password for user user-bot on mywiki:en: <-- password is not echoed to the terminal window
No handlers could be found for logger "pywiki" <-- no log file was configured; use "python login.py -log" to enable the logfile, using the default filename "login.log". Logs will be stored in the logs folder.
Logging in to mywiki:en as user-bot via API.
Should be logged in now
To verify your login status use the "-test" parameter
$ python login.py -test
No handlers could be found for logger "pywiki" <-- no log file was configured, see comment in previous example.
You are logged in on mywiki:en as user-bot.
Logging Out
To log your Bot out of your wikis use:
$ python login.py -clean
Running the scripts
After navigating to the right folder, you can now select scripts, by writing:
$ python SCRIPT_NAME.py
An overview of some scripts is available on their description pages. Most of these scripts are part of the standard SVN download.
Arguments
Arguments are written by using a dash: - and then the name of the argument, -pt:15 for instance. Many scripts require arguments, for example interwiki.py:
$ python interwiki.py -start:! -autonomous
- By default, the bot will run interwiki from the main language and project, and correct the links on all registered wikis. The -start and -autonomous arguments tell it to check all pages on the wiki and add or modify interwikis. It will not remove anything, and it will give up when it discovers conflicts; see interwiki.py.
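The argument syntax described above (a dash, a name, and optionally a colon followed by a value) can be taken apart with a few lines of Python. This is a sketch of the convention only; parse_argument is a hypothetical helper and not the parser pywikipedia itself uses.

```python
def parse_argument(arg):
    """Split '-name:value' into (name, value); value is None for bare flags."""
    if not arg.startswith("-"):
        raise ValueError("arguments start with a dash: %r" % arg)
    name, separator, value = arg[1:].partition(":")
    return name, (value if separator else None)


for example in ["-pt:15", "-start:!", "-autonomous"]:
    print(parse_argument(example))
```

Only the first colon separates name from value, so a value may itself contain colons.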
Global arguments
All scripts recognize the following arguments:
- Shows the script’s help text
- Selects the language you want to work on. Overrides the mylang option in user-config.
- Selects the family you want to work on. Overrides the family option in user-config.
- Enable the logfile. Logs will be stored in the logs subdirectory.
- Enable the logfile, using “xyz” as the filename.
- Disable the logfile (if it's enabled by default).
- -putthrottle:nn or short -pt:nn
- Set the minimum time (in seconds) the bot will wait between saving pages. The default value is zero.
Maintenance
Updating
To update your scripts, simply write the following into Terminal:
$ svn update pywikipedia
- When it’s done, it will say “At revision XXXX” (“XXXX” depending on the latest revision). If you have modified some of the scripts, don’t worry, your modifications will be merged with the changes made by updating.
- If you have experience using Automator, you can do the Shell process in a "workflow" and save it to your desktop or on your dock, so you can reach it more easily. It’s a good idea to right click on it, then, while pressing Alt/Option, hold the cursor over Always open in, then choose Automator runner.
Modifying
- You might want to read through a script before you run it, because for some languages the script may be bot specific. Right-click on it, hold the cursor on Open With, then choose Other, scroll down and double-click on TextEdit. This is neither mandatory nor recommended for beginners; if you do edit the scripts, be careful!
Reverting
If you want to revert your changes to a script, simply run the following:
$ svn revert NAME_OF_SCRIPT.py
If a script suddenly stops working and produces the output SyntaxError: invalid syntax, you have probably encountered an SVN conflict.
Revert the script as described above. Then try to run it. If it works, run the following in Terminal:
$ svn resolved NAME_OF_SCRIPT.py
You can now try to make the same edits again, if desired.
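A conflicted script fails with a SyntaxError because Subversion writes marker lines into the file that are not valid Python. The sketch below scans text for such lines. It assumes the usual marker shapes ("<<<<<<< ", "=======", ">>>>>>> "); the helper conflict_marker_lines is hypothetical, and a line of equals signs in an otherwise healthy file would also be reported, so treat a hit as a hint rather than proof.

```python
CONFLICT_PREFIXES = ("<<<<<<< ", "=======", ">>>>>>> ")


def conflict_marker_lines(source_text):
    """Return 1-based line numbers that look like SVN conflict markers."""
    return [
        number
        for number, line in enumerate(source_text.splitlines(), start=1)
        if line.startswith(CONFLICT_PREFIXES)
    ]


sample = "x = 1\n<<<<<<< .mine\ny = 2\n=======\ny = 3\n>>>>>>> .r1234\n"
print(conflict_marker_lines(sample))
```

To check a real script, read its contents with open(NAME_OF_SCRIPT).read() and pass the text to the function.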
Generating a Family File
|As of 2 July 2013, to generate a family file automatically using the script
Family is the project name.
- We selected "27 test" as the Family of wikis when we ran the setup script
However, if the bot’s main wiki is "mywiki", edit this line in user-config.py:
family = 'mywiki'
- To run a bot on your local wiki (mywiki) you will need to create a "family file", "families/xxx_family.py", where xxx is the name of your local wiki (for example "families/mywiki_family.py"). This is the same name as you would enter into the family variable, replacing "test". See:
Usage: generate_family_file.py <url> <short name>
Example: generate_family_file.py http://www.mywiki.bogus/wiki/Main_Page mywiki
This will create the file families/mywiki_family.py
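The naming convention links three things: the short name given to the script, the file it writes, and the value of family in user-config.py. A minimal sketch of that convention follows; family_file_path is a hypothetical helper for illustration and is not part of generate_family_file.py.

```python
def family_file_path(short_name, directory="families"):
    """Build the family file path that matches a wiki's short name."""
    return "%s/%s_family.py" % (directory, short_name)


print(family_file_path("mywiki"))
```

The same short name is then what goes into the family = '...' line of user-config.py.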
Example: lotro-wiki.com
> python generate_family_file.py http://lotro-wiki.com/ lotro-wiki
Generating family file from http://lotro-wiki.com/
==================================
api url: http://lotro-wiki.com/api.php
MediaWiki version: 1.20.6
==================================
Determining other languages... fr
There are 2 languages available.
Do you want to generate interwiki links? This might take a long time. ([y]es/[N]o/[e]dit)N
Loading wikis...
* en... in cache
Retrieving namespaces... en
Writing families/lotro-wiki_family.py...
This creates the basic file:
- If you did so, you can now edit user-config.py and make certain that the "family=" parameter is correct.
- Consulting README-family.txt again, you can edit the family file to include any necessary additions or corrections.