Toolserver:Database access

One of the key features of a Toolserver account is access to a replicated copy of the Wikimedia databases. The databases are stored on two database servers and consist of three clusters: sql-s1, sql-s2, and sql-s3.

Overview
There are three clusters of databases, defined by the Wikimedia Foundation. The Toolserver replicates the clusters as they are. In addition, the server that holds the S1 cluster (rosemary) and the server that holds the S3 cluster (yarrow) also each hold a copy of the Commons database (commonswiki_p) for easier joining to other databases. However, Commons is properly a part of database server cluster 2 (S2), it is not part of S1 or S3.

The servers, for the most part, are real-time copies of the live databases. However, at times there may be replag.

Each Wikimedia project has a specific database name. The general pattern is langcodeproject name_p. The full list is available here (though obviously without the _p's).

By default, your database credentials are stored in $HOME/.my.cnf

A shell script ("sql") exists to make accessing the databases easier.

Command-line / script access
To connect to a specific database (for example, enwiki_p) via a command-line interface, use:

If you are connecting via a script or program, there are two methods of connecting to the appropriate database:
 * 1) Connect to the toolserver database (on any one of our database servers) and run a query like:   This will output an integer that corresponds to the appropriate server. For example, the query just mentioned would output 1 indicating that enwiki_p is located on sql-s1.
 * 2) You can also use the DNS aliases, for example: enwiki-p.db.toolserver.org These are configured to automatically resolve to the correct server for the specified database. Note that "_p" must be converted to "-p" in the DNS aliases.

Batch access
If you plan to run long queries on the database, e.g. for scheduled report jobs, we recommend using batch job scheduling to access the database. This will ensure that the database doesn't become overloaded.

Program access
In most programming languages, it will be sufficient to tell MySQL to use the database credentials found in $HOME/.my.cnf

Below are various examples in a few common programming languages.

phpMyAdmin
Previously, accessing phpMyAdmin on the Toolserver required setting up an SSH tunnel. However, this is no longer the case.

To access phpMyAdmin on the Toolserver, visit https://phpmyadmin.toolserver.org and use your Toolserver username and your LDAP password for the first dialog. When you see the phpmyadmin-login-page use your username and your mysql-password (can be found in the .my.cnf-file in your home).

Database schema
The current MediaWiki database schema is available in the mediawiki svn tree at /trunk/phase3/maintenance/tables.sql. A (possibly out of date) description of the schema is available at mw:Manual:Database layout.

Page text access
FIXME

General caveats
FIXME

Toolserver database
The toolserver (meta-)database is a database called "toolserver", containing information about other databases on the toolserver. It can be read by anyone and is available on all database servers. The host name of the primary copy is sql-toolserver</tt>.

The toolserver database is updated automatically on a daily basis. It is maintained by User:Duesentrieb, but please report any problems to JIRA.

Tables inthe toolserve Database:

wiki
Information about the wikis for which the toolserver has data.

So, for example, to get SQL server, database name and the API URLs of the 10 larges Wikipedias, you could do this:

namespace (deprecated)
This table contains the primary namespace names for the wikis in the wiki table. Use of this table is deprecated, use the namespacename table instead, which covers all valid names for each namespace in each wiki, not just the primary names.

namespacename
This table contains the all valid namespace names for the wikis in the wiki table. Use of this table is preferred over the namespace table, since it covers all valid names for each namespace in each wiki, not just the primary names.

This can be used to build wiki-syntax links from information from the database, e.g.:

language
This table contains language names, by code, in English and the respective language.

For example, to get the languages of the top 10 Wikipedias:

User databases
By default, once a user is created an empty  database is created as well. Users can create additional databases by following the  pattern. For example, if user dab wanted to create a table named, he would name it.

You can allow other users access to your databases by adding the  suffix to the end of the database names. For example, if you wanted to share the previously created image database, you could rename it:

Program access
Simply use  as the hostname rather than   as in the examples above.

Database status files
There are status-files about each cluster (and the user-dbs) at /var/www/status_sclusternumber or  http ://toolserver.org/status_sclusternumber</tt> (e.g., http://toolserver.org/status_s1). In both cases, clusternumber can be an integer from 1 to 6 inclusive.

These files are simple ASCII text files. Lines beginning with " " are comment lines and can safely be ignored. A non-comment-line " " indicates that everything is alright and no known maintenance or service interruption is scheduled.