MySQL queries

MySQL queries allow you to pull data from the Toolserver's replica databases.

Database layout
The database layout is available at the MediaWiki wiki: mw:Manual:Database layout.

There are also two commands you can use to view the layout. SHOW TABLES will show the available tables in a database. DESCRIBE table_name will show the available columns in a specific table.

Clusters
The various databases are stored in three different clusters: S1, S2, and S3. A table of this information is available here.
 * S1 &mdash; enwiki_p
 * S2 &mdash; bgwiki_p, bgwiktionary_p, commonswiki_p, cswiki_p, dewiki_p, enwikiquote_p, enwiktionary_p, eowiki_p, fiwiki_p, idwiki_p, itwiki_p, nlwiki_p, nowiki_p, plwiki_p, ptwiki_p, svwiki_p, thwiki_p, trwiki_p, and zhwiki_p
 * S3 &mdash; all other wikis

Data storage
There are a few tricks to how data is stored in the various tables.
 * 1) Page titles use underscores and never include the namespace prefix.
 * 2) User names use spaces, not underscores.
 * 3) Namespaces are integers. A key to the integers is available here. The actual names for the namespaces can be obtained from the Toolserver database (explained below).

Views
The Toolserver has exact replicas of Wikimedia's databases, however, certain information is restricted to Toolserver users using MySQL views.

For example, the user table does not show things like user_password or user_email to Toolserver users.

Accessing the data
There are a variety of ways to access the databases. From the command line, a shell script exists that automatically selects the correct cluster (S1, S2, or S3) for you. $ sql enwiki_p This command will connect to the appropriate cluster (S1) and give you a MySQL prompt where you can run queries. $ sql enwiki_p < test-query.sql > test-query.txt This command takes a .sql file that contains your query, selects the appropriate cluster, runs the query, and outputs to a text file. The advantage here is that it doesn't add lines around the data (making a table), it instead outputs the data in a tab-delimited format. $ sql enwiki_p < test-query.sql > test-query.txt; gzip test-query.txt This command does the same thing as the command above, but after outputting to a text file, it gzips the data. This can be very helpful if you're dealing with large amounts of data.

The sql shell script is a wrapper around the mysql command. Either method works, though the sql shell script selects the appropriate cluster for you.

If you wish to use the mysql command directly, a sample query would look like this: $ mysql -h sql-s1 enwiki_p -e "DESCRIBE user;" The -h option tells MySQL which host to access (in this case sql-s1). The -e option tells MySQL to echo the results of the query to the terminal. You can also have MySQL output the results to a file. $ mysql -h sql-s1 enwiki_p < test-query.sql > test-query.txt

Writing queries
Because the Toolserver database is read-only, nearly all of the queries you will want to run will be SELECT queries.

This query selects all columns from the user table. More information about MySQL queries are available below (in the example queries) and in the MySQL manual.

Queries end in a semi-colon. If you want to cancel the query, end it in \c. If you want to output in a non-table format, use \g.

Toolserver table
FIXME

Example queries
FIXME