Manual:Pywikibot/Installation/Labs

If you need more help setting up Pywikibot, visit the IRC channel #pywikibot on the freenode server or use the Pywikibot mailing list.

Setup on Wikimedia Labs/Tool Labs server

To install your bot on the Wikimedia servers and run it from there, first become familiar with the Wikimedia Labs/Tool Labs environment.

Next, you have to request several accounts (for Labs, for the tools project, and for your tool), provide an SSH key, and so on. How to do this and how to proceed afterwards is described in full detail in Setup pywikibot on Labs.

The Pywikibot source repository has moved from SVN to Git; please consult Manual:Pywikipediabot/Gerrit first.

The bots project has become obsolete; tools are used instead. Follow Tools/Help to get an account, then create your tool (service group).

If you used the Toolserver in the past and know how everything worked there, see Migrating from toolserver for more information.

Now you are ready to start. Log in to the Labs tools project:

$ ssh USERNAME@login.tools.wmflabs.org

Switch to the tool account with

maintainer@tools-login:~$ become toolname
local-toolname@tools-login:~$

Now install/clone the pywikibot code to your tool account as described below.

Install the bot code

Similar to the instructions given in this mail, do:

$ git clone --recursive https://gerrit.wikimedia.org/r/pywikibot/core.git pywikibot-core
$ cd pywikibot-core

Now you have to set up Pywikibot. Choose any one of the following ways to configure your system:

  • Execute python generate_user_files.py.
  • Run your favorite bot script (e.g. python pwb.py clean_sandbox.py -simulate). Since you are doing this in a fresh clone, it will trigger a series of questions on how you want to configure your local copy; answer them carefully in order to proceed.
  • If you already have config files from a previous version, you can copy those existing files into the right places (e.g. pywikibot-compat/).
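
For orientation, the result of any of these routes is a small user-config.py in the checkout. The following is a minimal sketch of writing such a file from the shell. The function name and the values wikipedia, en and ExampleBot are placeholders chosen for illustration; the real generator asks more questions and may write further settings.

```shell
#!/bin/bash
# Sketch: write a minimal user-config.py into a given directory.
# Family, language and bot name are hypothetical example values.
write_user_config() {
    local target_dir="$1"
    local family="$2"
    local lang="$3"
    local bot_user="$4"
    # Refuse to overwrite an existing configuration.
    if [ -e "$target_dir/user-config.py" ]; then
        echo "user-config.py already exists in $target_dir" >&2
        return 1
    fi
    cat > "$target_dir/user-config.py" <<EOF
# -*- coding: utf-8 -*-
family = '$family'
mylang = '$lang'
usernames['$family']['$lang'] = '$bot_user'
EOF
}

# Example call (placeholder values):
# write_user_config "$HOME/pywikibot-core" wikipedia en ExampleBot
```

Refusing to overwrite keeps a configuration that was copied over from a previous version intact.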

Depending on which bot scripts you want to run, you might also have to set up all externals properly, which still has to be done manually in core:

$ cd externals
$ cat README

and follow the instructions there.

Eventually you will also have to enter the password for your bot.

You have now finished the configuration of core and can continue by setting up the jobs to execute.

Setup the webspace

By default, the directory listing on http://tools.wmflabs.org/TOOLNAME is disabled. If you want to enable it for all users, log in to your tool account (as described above) and run

$ cd ~/public_html
$ echo Options +Indexes >> .htaccess


If you run a bot with the -log option, you will find the log files in the logs/ directory. To make them accessible from the web, do

$ cd ~/public_html
$ mkdir logs
$ cd logs
$ ln -s ~/pywikibot-core/logs cor
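
The same steps can be wrapped so that they are safe to repeat. This is only a sketch: the function name and the link name argument are hypothetical, and the paths in the comments are the ones used in the commands above.

```shell
#!/bin/bash
# Sketch: expose a log directory below the web root through a symlink.
publish_logs() {
    local log_dir="$1"     # e.g. ~/pywikibot-core/logs
    local web_logs="$2"    # e.g. ~/public_html/logs
    local link_name="$3"   # hypothetical name of the link inside web_logs
    if [ ! -d "$log_dir" ]; then
        echo "no such log directory: $log_dir" >&2
        return 1
    fi
    mkdir -p "$web_logs" || return 1
    # -f replaces an existing link; -n prevents ln from following it
    # and creating a nested link inside the target directory.
    ln -sfn "$log_dir" "$web_logs/$link_name"
}
```

Calling the function a second time simply recreates the link instead of failing the way a plain ln -s would.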

If you want a specific file type to be handled differently by the browser, e.g. .log files treated as text files, use (confer this)

$ echo AddType text/plain .log >> .htaccess

and (don't forget to) clear your browser's cache afterwards.
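
Because >> appends unconditionally, running the echo commands twice leaves duplicate directives in .htaccess. A small sketch of an append that checks first (the function name is an illustrative assumption):

```shell
#!/bin/bash
# Sketch: append a directive to an .htaccess file only if it is missing.
add_directive() {
    local htaccess="$1"
    local directive="$2"
    touch "$htaccess" || return 1
    # -x matches whole lines only, -F treats the directive as a fixed string.
    if ! grep -qxF -e "$directive" "$htaccess"; then
        printf '%s\n' "$directive" >> "$htaccess"
    fi
}

# Example calls with the two directives used in this section:
# add_directive "$HOME/public_html/.htaccess" "Options +Indexes"
# add_directive "$HOME/public_html/logs/.htaccess" "AddType text/plain .log"
```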

Next you might want to consider your cgi-bin directory

$ cd ~/cgi-bin

Follow the hints given at wikitech:Nova Resource:Tools/Help#Logs exactly. For example, even though the two commands

$ /usr/bin/python      # valid
$ /usr/bin/env python  # invalid

work and do the same in a shell, only the first one is valid and works here; the second is invalid. Note also that PHP scripts go into public_html, not cgi-bin. Python scripts, on the other hand, can be placed in public_html or cgi-bin as you wish. It is recommended to use public_html for documents and keep it listable, whereas cgi-bin should be used for CGI scripts and be protected (not listable).
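
Assuming the two forms above refer to the interpreter line (shebang) at the top of a CGI script, a quick check of your scripts could look like the sketch below. The function name is hypothetical.

```shell
#!/bin/bash
# Sketch: succeed only if a script starts with the interpreter line
# that this section describes as valid.
has_valid_shebang() {
    local script="$1"
    local first_line
    first_line=$(head -n 1 "$script") || return 1
    [ "$first_line" = '#!/usr/bin/python' ]
}

# Example: list CGI scripts that would need fixing.
# for f in "$HOME"/cgi-bin/*.py; do
#     has_valid_shebang "$f" || echo "check interpreter line: $f"
# done
```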

Setup the job submission

To set up the submission of the jobs you want to execute on the grid engine, first read wikitech:Nova Resource:Tools/Help#Submitting, managing and scheduling jobs on the grid. If you are familiar with the Toolserver and its architecture, also consult Migrating from toolserver.

In general, Labs uses SGE and its commands such as qsub; this is explained in this document, which you should use to get an idea of which command and which parameters you want to use.

An infinitely running job (e.g. an IRC bot) like this one (cronie entry from the TS submit host):

 06 0 * * * qcronsub -l h_rt=INFINITY -l virtual_free=200M -l arch=lx -N script_wui $HOME/rewrite/pwb.py script_wui.py -log

becomes

 $ jsub -once -continuous -l h_vmem=256M -N script_wui python $HOME/pywikibot-core/pwb.py script_wui.py -log

or shorter

 $ jstart -l h_vmem=256M -N script_wui python $HOME/pywikibot-core/pwb.py script_wui.py -log

The first form is good for debugging. Memory values smaller than 256 MB do not seem to work here, since that is the minimum. If you experience problems with your jobs, such as

Fatal Python error: Couldn't create autoTLSkey mapping

you can try increasing the memory value. That is also needed here, because this script uses a second thread for timing and this thread needs memory too. Therefore, finally use

 $ jstart -l h_vmem=512M -N script_wui python $HOME/pywikibot-core/pwb.py script_wui.py -log
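
If you start several bots this way, the command line can be assembled by a small helper. The sketch below only prints the command so that it can be reviewed first. It uses no jstart options beyond those shown above, and the function name is an illustrative assumption.

```shell
#!/bin/bash
# Sketch: build the jstart command line used above from its parts.
# The command is printed, not executed.
jstart_command() {
    local job_name="$1"   # value for -N
    local memory="$2"     # value for -l h_vmem=, in megabytes, e.g. 512M
    local script="$3"     # bot script passed to pwb.py
    local mb="${memory%M}"
    case "$mb" in
        ''|*[!0-9]*) echo "memory must look like 512M" >&2; return 1 ;;
    esac
    # This section names 256M as the minimum that works.
    if [ "$mb" -lt 256 ]; then
        echo "memory below the 256M minimum: $memory" >&2
        return 1
    fi
    printf 'jstart -l h_vmem=%s -N %s python %s/pywikibot-core/pwb.py %s -log\n' \
        "$memory" "$job_name" "$HOME" "$script"
}

# Example: prints the final command from this section.
# jstart_command script_wui 512M script_wui.py
```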

Now, in order to create a crontab, follow Scheduling jobs at regular intervals with cron and set up your crontab file like this:

$ crontab -e

and enter

PATH=/usr/local/bin:/usr/bin:/bin

06 0 * * * jstart -l h_vmem=512M -N script_wui python $HOME/pywikibot-core/pwb.py script_wui.py -log


Additional configuration

Furthermore, additional tools to support you and your bot at work are available:

Automatic updating git on Wikimedia Labs

For automatic updating, you can create an update bash file, put it in the home directory of your service group, and fill it with these commands (for WMF Labs, in your service group):

#!/bin/bash
cd /data/project/yourservicegroup/pywikibot
git pull --all && git submodule update

Then run crontab -e and enter the following to run the update every day at 00:00 (midnight):

0 0 * * * bash /data/project/yourservicegroup/update >/dev/null 2>&1

Note: in this code, yourservicegroup is the name of your service group (without the "local-" prefix).
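
A slightly more defensive variant of the update script is sketched below. It runs the same two git commands but first checks that the directory really is a checkout. The function name is hypothetical, and yourservicegroup remains a placeholder.

```shell
#!/bin/bash
# Sketch: update a git checkout, stopping at the first failure.
update_checkout() {
    local repo_dir="$1"
    if [ ! -d "$repo_dir/.git" ]; then
        echo "not a git checkout: $repo_dir" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$repo_dir" && git pull --all && git submodule update )
}

# Example call (placeholder path):
# update_checkout /data/project/yourservicegroup/pywikibot
```

Because the cron entry above discards all output, a mistyped path would otherwise go unnoticed; the non-zero exit status at least lets a calling script react.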