Wikimedia Labs/Tool Labs/Help

Tool account
The primary concept of the Tool Labs organization is the tool account; at its core this is a Unix uid-gid pair named  which is intended to run the actual tool. Maintainers may have more than one tool account, and tool accounts may have more than one maintainer.

Right now, you have to request a tool account from one of the project administrators; the plan is to make an interface available on wikitech where tool maintainers will be able to create them as needed.

The unix group has as its members the tool account itself, as well as the user accounts of the maintainers of the tool. Every member of that group has the authorization to sudo to the tool account.

Along with the unix uid, the following resources are provided by default:
 * A home directory on shared storage:
 * A web URI mapped to its :
 * A mysql database for local use (the credentials to which are stored in )
 * Access to the continuous and task queues of the compute grid

Hint: As a convenience, tool maintainers can switch to the tool account with:

 maintainer@tools-login:~$ become toolname
 local-toolname@tools-login:~$

Grid engine
Every non-trivial task performed by tools should be dispatched by the grid engine so that a suitable place to run it, with sufficient resources, is found. Grid Engine is a highly flexible system for assigning resources to jobs, including parallel processing.

You can find documentation on the website; you may wish to pay particular attention to the,   and   commands which are most important to users.

Simple utilities
For most tasks, helper scripts are provided to abstract away some of the complexities of using the grid engine. Almost all use scenarios are covered with reasonable defaults by the  script:


 * Options include many (but not all) qsub options, along with:
 * : Send errors that occur during the submission to stderr rather than the error output file (errors that occur while running the script always go to the error file).
 * : Request an amount of memory for the job. (Where value is a number followed by 'k', 'm' or 'g')
 * : Only start one job with that name, fail if another is already started or queued.
 * : Start a self-restarting job on the continuous queue (default if invoked as jstart). Please see the section on continuous jobs below.
 * Some of the more useful  supported are:
 * ,, and  : Selects the file used for standard input, output and error of the job, respectively.  By default,   will append stdout and stderr to the files   and   in the tool account's home directory, and will not have standard input.
 * : send standard output and error together to the output file
 * : Normally,  queues up the job and returns immediately.  This allows you to wait for the job to complete instead.
 * : Start the script in the same directory you invoked  from.
 * : Pick a different job name (see below).

By default, jobs are allowed 256MB of memory; you can request more (or less) with the  option but keep in mind that a job that requests more resources may be penalized in its priority and may have to wait longer before being run.
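As a rough illustration of how such memory values compare with the default, here is a minimal Python sketch. It assumes the unit letter follows the number (as in `256m`) and that the units are binary multiples (1k = 1024 bytes); the source states neither, and the function names are hypothetical.

```python
import re

# Assumed multipliers: binary units. The grid engine may interpret them differently.
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_memory(value):
    """Convert a string such as '256m' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([kmg])", value.strip().lower())
    if match is None:
        raise ValueError("not a valid memory value: %r" % (value,))
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


# The default allowance stated above.
DEFAULT_MEMORY = parse_memory("256m")


def exceeds_default(value):
    """True when a request asks for more than the default allowance."""
    return parse_memory(value) > DEFAULT_MEMORY
```

A check like `exceeds_default("1g")` could be used in a launcher script to warn that the job may wait longer in the queue.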

Job names
By default, jobs have the same name as the program, minus extensions. (For instance, if you had a program named  which you started with , the job's name would be foobot). You can pick a different name for the job when starting it (with the  option of   and  ); this name identifies the job in status listings, and can also be used to control it.

It's important to note that you can have more than one job, running or queued, bearing the same name. Some of the tools that accept a job name may not behave as expected in those cases.
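The naming rule and the duplicate-name caveat can be illustrated with a short Python sketch. This is not the grid engine's own code: the function names, the file name `foobot.py`, and the reading of "minus extensions" as "everything after the first dot is dropped" are all assumptions made for illustration.

```python
import os


def default_job_name(program):
    """Derive a default job name from a program path.

    Assumption: "minus extensions" means dropping everything after
    the first dot in the file name.
    """
    base = os.path.basename(program)
    name, _, _ = base.partition(".")
    # A name such as ".hidden" has nothing before the dot; keep it whole.
    return name or base


def jobs_named(jobs, name):
    """Return the ids of all jobs carrying `name`.

    `jobs` is a hypothetical list of (job_id, job_name) pairs. The
    result is a list because several jobs may share one name.
    """
    return [job_id for job_id, job_name in jobs if job_name == name]
```

A script that looks a job up by name should therefore be prepared for `jobs_named` to return more than one id, or none.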

Continuous tasks (such as bots)
Continuous tasks have a dedicated queue, continuous, which has a slightly different setup:
 * Jobs started on that queue are automatically restarted if they, or the node they run on, crash
 * In case of outage or lack of resources, they will be stopped and restarted automatically on a working node
 * Only tool accounts can start continuous jobs

The queue will not restart jobs that exited normally (i.e., were not killed) unless they are wrapped in a script to do so; starting a job with the  option of   does so automatically until they exit normally with an exit value of zero, indicating completion.
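The wrapping behaviour described above amounts to a loop that reruns the program until it reports success. The following Python sketch shows the idea only; it is not the actual wrapper used on the grid, and `run_until_success`, its `delay` parameter, and its `run` parameter are hypothetical names.

```python
import subprocess
import time


def run_until_success(command, delay=0.0, run=subprocess.call):
    """Rerun `command` until it exits with status 0.

    `run` takes the command and returns its exit status; it defaults to
    subprocess.call and can be replaced for testing. Returns the number
    of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        status = run(command)
        if status == 0:
            # Exit value of zero: the job reports completion, so stop.
            return attempts
        # Any other status (error exit or death by signal) leads to a restart.
        time.sleep(delay)
```

A program that should run forever must therefore never exit with zero unless it really is finished; a non-zero `delay` avoids restarting a broken program in a tight loop.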

One would normally start continuous jobs with the  option as well so that they can be managed reliably with   and   utilities.

For convenience, there is a utility to start a continuous bot with reasonable default options: (which is equivalent to  and accepts the same options).  This would start the foobot program in continuous mode if it is not already running, making certain that it is kept running.

Job status
You can see the status of all your running and pending jobs with the  command. If you know that your job can only have one instance running (such as when you use the  option when starting it) you can also use the   command to get its job id (or a more verbose status with  ). The latter is particularly useful from scripts or web services.

Web services
Every tool account has a web interface made available (though, in cases of bots with no web interactivity, you may simply wish to have a static page that describes the tool, or a simple status report).

Logs
The access logs for your tool's web interface are placed in the tool account's, in common format. Please note that the web logs are anonymized such that the user's IP address appears to be that of the local host. In general, the privacy policy will not allow logging of personally identifiable information by tool maintainers (including IP addresses); special permission from Foundation legal counsel would be required to get that information.

Error logs, because of limitations of the Apache web server, are not made directly available to tool maintainers. There is a workaround in place for PHP, which allows per-user logging (PHP error logs are placed in the tool account's ), but until a newer version of Apache can be deployed it is recommended that you use your language's facilities to log errors to a file under the tool account's home.

In particular, this means that if you have a CGI which is unable to start, you will not be able to see the error preventing it without help from a Tool Labs admin. There are a few common errors you can check for, which cover most cases:
 * The CGI's file is not owned by the tool account
 * The CGI's file does not have its execute bits set
 * (You can use the  command to set the script as executable)
 * The CGI is a script and does not start with a Unix "shebang" invocation, or it points to the wrong path:
 * A unix "shebang" is the first line of a script that specifies the program meant to execute that script. It has the form
 * Where the path is, for instance,  for perl scripts.  You can check the path to a language interpreter by using the   command:
 * would output the path to the python interpreter.
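The checks in the list above can be run from a small script before asking an admin for help. The Python sketch below is one possible way to do that, assuming the CGI is a script rather than a compiled binary; `cgi_problems` and its messages are hypothetical, and `expected_uid` must be the numeric uid of the tool account.

```python
import os
import stat


def cgi_problems(path, expected_uid):
    """Return a list of likely reasons why the CGI at `path` cannot start."""
    problems = []
    info = os.stat(path)

    # Check 1: the file must be owned by the tool account.
    if info.st_uid != expected_uid:
        problems.append("file is not owned by the tool account")

    # Check 2: at least one execute bit must be set.
    if not info.st_mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        problems.append("execute bits are not set")

    # Check 3: the first line must be a shebang naming an existing interpreter.
    with open(path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", "replace").strip()
    if not first_line.startswith("#!"):
        problems.append("no shebang line")
    else:
        parts = first_line[2:].split()
        interpreter = parts[0] if parts else ""
        if not os.path.isfile(interpreter):
            problems.append("shebang points to a missing interpreter: " + interpreter)

    return problems
```

An empty list means none of these common problems was found; it does not prove that the CGI will start.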