Manual:Database layout/MySQL Optimization/Tutorial



Welcome to the tutorial on MySQL Optimization for MediaWiki, originally given at the Berlin Hackathon 2012. The live tutorial will cover:
 * Why Optimization Matters
 * Why Indices are most important
 * How to avoid Unindexed and Unlimited Queries
 * Using EXPLAIN

The slide deck for this tutorial has specific examples of optimized queries and simple practices that you can use to speed up your queries.

Simple prep you need in order to take this tutorial: For the practice exercise, you will access a database with sample data in Wikimedia Cloud Services. All you need to access it is a Wikimedia developer account and membership in the 'bastion' project (all users who are members of any project are also members of the 'bastion' project). You don't need to be a member of the tutorial project. Before the tutorial, we suggest that you be sure you can access this database. You need to ssh into bastion (ssh bastion.wmflabs.org) and, once you're in, run mysql -h tutorial-mysql -u tutorial commonswiki_partial

You may also want to suggest a query to be used in the demo, via the list below.

Introduction
To many MediaWiki developers, SQL query performance and optimization is shrouded in mystery. Most know that there are efficient and inefficient queries, and that if they write an inefficient query, it will either be noticed during code review, or it will be noticed because it takes down a wiki, which will prompt an ops person to fix the breakage and yell at the developer who caused it. But few people seem to really understand how query performance works.

How can you tell if a query is inefficient? How do you write efficient queries, and avoid inefficient ones? If so few people know this, it must be this difficult, mysterious thing, right? Fortunately, you don't have to be Domas or Tim to understand this. If you understand how a phone book works, you can learn this too.

This tutorial will cover the basics of how database engines in general, and MySQL specifically, execute different kinds of queries, and explain why certain queries are executed more efficiently than others and what role indexes play in this process. We will demonstrate better practices by writing efficient queries, and showing you how to use tables and indexes so they facilitate efficient queries, and discuss common pitfalls that result in inefficient queries and how to address them. We will also demonstrate how to obtain a query analysis from MySQL and how to make sense of it.

Suggest a Query to be Used in the Demo
Below, add a query you want to see optimized in the tutorial Demo (list query suggestions here): Extension:ContributionScores polls the wiki database to locate contributors with the highest contribution volume - this has NOT been tested on a high-volume wiki. The extension is intended for fledgling Wikis looking to add a fun metric for Contributors to see how much they are helping out.
 * 1) ContributionScores query

It is used at translatewiki.net and occasionally causes (very) slow queries there. Example query:

Uses: 2. Batch query vs. many queries Don't have an example off the top of my head, but this may be a less obvious optimisation with potentially big rewards.
 * Special page
 * Includable special page

Feedback and Discussion

 * Collect participants' feedback and questions
 * Reminder to document your discoveries, bugs and optimization tips