User:Kmenger/ToolLabsGuide

What is Tool Labs
Tool Labs is a reliable, scalable hosting environment for community developers working on tools and bots that help users maintain and use wikis. The cloud-based infrastructure was developed by the Wikimedia Foundation and is supported by a dedicated group of Wikimedia Foundation staff and volunteers.

Tool Labs is a part of the Labs project, which is designed to make it easier for developers and system administrators to try out improvements to Wikimedia infrastructure, including MediaWiki, and to do analytics and bot work.

Rationale
Tool Labs was developed in response to the need to support external tools and their developers and maintainers. The system is designed to make it easy for maintainers to share responsibility for their tools and bots, which helps ensure that no useful tool gets ‘orphaned’ when one person needs a break. The system is designed to be reliable, scalable and simple to use, so that developers can hit the ground and start coding.

Features
In addition to providing a well supported hosting environment, Tool Labs provides:
 * support for Web services, continuous bots, and scheduled tasks
 * access to replicated production databases
 * easily shared management of tool accounts, where tools and bots are stored
 * a grid engine for dispatching jobs
 * support for mosh, SSH, SFTP without complicated proxy setup
 * time-travel backups for short-term data recovery
 * version control via Gerrit and Git
 * support for Redis

Architecture and terminology
Tool Labs has essentially four components: the bastion hosts, the grid, the web cluster, and the databases. Users access the system via one of two Tool Lab projects: ‘tools’ or ‘toolsbeta’. To request an account on the ‘tools’ project, where most tool and bot development is hosted and maintained, please see Tools Access Request.

Bastion hosts, grid, web cluster, databases
The four main components of Tool Labs, in a nutshell:

Bastion hosts

The bastion host is where users log in to Tool Labs. Currently, Tool Labs has two bastion hosts:


 * tools-login.wmflabs.org


 * tools-dev.wmflabs.org

The two hosts are functionally identical, but we request that heavy processing (compiles, etc) be done only on tools-dev.wmflabs.org to keep interactive performance on tools-login.wmflabs.org snappy.

The grid

The Tool Labs grid, implemented with Open Grid Engine (the open-source fork of Sun Grid Engine) permits users to submit jobs from either a log-in account on the bastion host or from a Web service. Submitted jobs are added to a work queue, and the system finds a host to execute them. Jobs can be scheduled synchronously or asynchronously, continuously, or simply executed once. If a continuous job fails, the grid will automatically restart the job so that it keeps going.For more information about the grid, please see.

The Web cluster

The Tool Labs Web cluster is fronted by a Web proxy, which supports SSL and is open to the Internet. Any of the servers in the cluster can serve any of the hosted Web tools as Tool Labs uses a shared storage system; the proxy distributes between the Web servers. The cluster uses suPHP to run scripts and CGI, and will soon support WSGI. Note that individual tool accounts have both a  ~/public_html/  and a ~/cgi-bin/ directory in the home directory for storing Web files. For more information, please see .

The databases 

Tool Labs supports two sets of databases: the production replicas and user-created databases, which are used by individual tools. The production replicas follow the same setup as production, and the information that can be accessed from them is the same as that which normal registered users (i.e.: not  +sysop or other types of advanced permissions) can access on-wiki or via the API. Note that some data has been removed from the replicas for privacy reasons.User-created databases can be created by either a user or a tool on the replica servers or on a local ‘tools’ project database.

Projects: Tools and Toolsbeta
Like the rest of Labs, Tool Labs is organized into ‘projects’. Currently, Tool Labs consists of two projects: ‘tools’ and ‘toolsbeta’, which are described in more detail here: The ‘tools’ project is where tools and bots are developed and maintained. ‘toolsbeta’ is used for experiments in the Tool Labs environment itself--things like new systems or experimental versions of system libraries that could affect other users. In general, every tool maintainer should work primarily on the "tools" project, only doing work on toolsbeta when changes to Tool Labs itself need to be tested to support their tool.
 * tools  project: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools
 * toolsbeta  project: https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta

Instances
Developers working in Tool Labs do not have to create or set up virtual machines (i.e., Lab ‘instances') as the Tool Lab project admins create and manage them. The term will come up in the Labs documentation; otherwise, don’t worry about it.

Tool Labs policies
All tools and bots developed and maintained on Tool Labs must adhere to the terms of use that will be available here when they are finalized:

|Tool Labs > Rules

Specifically, tools must be Private information must be handled carefully, if at all. Note that private user information has been redacted from the replicated databases provided by the system.
 * Open source
 * ·    Open data

As the Tool Labs environment is shared, we ask that you strive not to break things for others, and to be considerate when using system resources.

Individual wiki policies (these differ!)
When developing on Tool Labs, please adhere to the bot policies of the wikis your bot interacts with. Each wiki has its own guidelines and procedures for obtaining approval. The English Wikipedia, for example, requires that a bot be approved by the Bot Approvals Group before it is deployed, and that the bot account be marked with a ‘bot’ flag. See |Wikipedia:Bot policy for more information on the English Wikipedia.

For general information and guidelines, please see |Bot policy