Database optimization

Database optimization is an important consideration in MediaWiki coding, particularly when the code is intended for deployment on Wikimedia projects.

General
ORDER BY expression --> filesort == bad. Use COUNT, SUM, GROUP BY, etc.; there is no limit on rows scanned. MAX/MIN of indexed field on entire table is okay. In some cases, denormalize for performance. Add information duplicated from other tables. Use summary tables, counter tables, cache tables, etc. Think about how the DB will run your query. Add indexes where needed. Batch queries (when it makes sense).

Indices
Weigh the advantages of an index in making queries more efficient against the efficiency losses of having to change the index when insertions, deletions, and updates are done. Generally, you shouldn't run unindexed queries. WHERE on rarely false conditions is usually okay. Unindexed ORDER BY (filesort) is never okay.

LIMIT
Use LIMIT to limit number of rows returned. When possible, also limit the number of rows scanned. Remember, LIMIT 10 OFFSET 5000 scans 5010 rows. If you can, use WHERE foo_id >= 12345 instead.

Communication
If you have questions on this topic that are not answered by the documentation, you may wish to go on #wikimedia-tech and talk to binasher or Preilly, both of whom are experts on Wikimedia and performance. If possible, be prepared to explain which queries are involved in the situation. Roan Kattouw, another MySQL optimization expert, answers questions on the wikitech-l mailing list and is sometimes available at #mediawiki.