Proposed Database Schema Changes/October 2004

'''Have now checked this into CVS HEAD for upcoming 1.5. --brion 08:08, 19 Dec 2004 (UTC)'''

The current cur/old table split is kind of ugly; it is hard to work with and slows down aggregate operations by putting large amounts of bulk text into a table that is frequently being looked over for the smaller metadata fields.

Here's a color-coded diagram of the proposed split and how the old fields (on the left) map to the new ones (on the right):
 * [[Image:Database-restructure.png]]

(page_latest points to the rev_id/old_id of the page's current revision.)

Advantages of new structure:
 * Renames don't require changing the revision or text table, only 'page' and link updates.
 * Aggregate operations don't need to slog through text-laden tables.
 * Don't need to merge cur and old manually in things like contribs
 * rev_id will be a unique id for both current and old revisions (and will retain existing old_id values after conversion)
 * and more!

Disadvantages of new structure:
 * Building and maintaining an index for fulltext search is now harder
 * Not particularly; it's been a separate table for a long time. Anyway the MySQL search is untenable and will be replaced at some point soon. --brion 21:08, 19 Mar 2005 (UTC)

Potential changes:
 * If text.old_id and rev_id are decoupled we could avoid storing a second copy of the text on reverts. Not sure if this is worth the trouble.
 * They are now officially decoupled but this isn't used yet.

'''Conversion code is in maintenance/updaters.inc. There's also a pure-SQL updater file in maintenance/archives somewhere, but which may not be safe in a replicated environment due to use of a variable.'''