MediaWiki r11508 - Code Review

Jump to: navigation, search
Repository:MediaWiki
Revision:r11507‎ | r11508 (on ViewVC)‎ | r11509 >
Date:01:45, 26 October 2005
Author:vibber
Status:old
Tags:
Comment:
* (bug 3786) Experimental support for MySQL 4.1/5.0 utf8 charset mode
NOTE: Enabling this may break existing wikis, and still doesn't
work for all Unicode characters due to MySQL limitations.

The $wgDBmysql5 option will run 'SET NAMES utf8' on connection to the
server so it knows we're speaking UTF-8 to it; switching it on/off
can cause interpretation of non-blob columns to change depending
on the charset options set on the server.
Modified paths:

Diff [purge]

Index: trunk/phase3/maintenance/mysql5/tables.sql
@@ -0,0 +1,848 @@
 2+-- Experimental table definitions for MySQL 4.1 and 5.0 with
 3+-- explicit character set support. Not fully tested, may have
 4+-- surprises!
 5+--
 6+-- TODO: Test various fields
 7+-- TODO: Anything else need to be moved to VARBINARY and BLOB?
 8+-- TODO: UCS-2 better than UTF-8?
 9+-- TODO: Find out how to get 4-byte UTF-8 chars into MySQL...
 10+-- An alternate UCS-2 that does UTF-16 conversion would work.
 11+-- TODO: Work on collation usage
 12+
 13+-- ------------------------------------------------------------
 14+
 15+-- SQL to create the initial tables for the MediaWiki database.
 16+-- This is read and executed by the install script; you should
 17+-- not have to run it by itself unless doing a manual install.
 18+
 19+--
 20+-- General notes:
 21+--
 22+-- If possible, create tables as InnoDB to benefit from the
 23+-- superior resiliency against crashes and ability to read
 24+-- during writes (and write during reads!)
 25+--
 26+-- Only the 'searchindex' table requires MyISAM due to the
 27+-- requirement for fulltext index support, which is missing
 28+-- from InnoDB.
 29+--
 30+--
 31+-- The MySQL table backend for MediaWiki currently uses
 32+-- 14-character CHAR or VARCHAR fields to store timestamps.
 33+-- The format is YYYYMMDDHHMMSS, which is derived from the
 34+-- text format of MySQL's TIMESTAMP fields.
 35+--
 36+-- Historically TIMESTAMP fields were used, but abandoned
 37+-- in early 2002 after a lot of trouble with the fields
 38+-- auto-updating.
 39+--
 40+-- The PostgreSQL backend uses DATETIME fields for timestamps,
 41+-- and we will migrate the MySQL definitions at some point as
 42+-- well.
 43+--
 44+--
 45+-- The /*$wgDBprefix*/ comments in this and other files are
 46+-- replaced with the defined table prefix by the installer
 47+-- and updater scripts. If you are installing or running
 48+-- updates manually, you will need to manually insert the
 49+-- table prefix if any when running these scripts.
 50+--
 51+
 52+
 53+--
 54+-- The user table contains basic account information,
 55+-- authentication keys, etc.
 56+--
 57+-- Some multi-wiki sites may share a single central user table
 58+-- between separate wikis using the $wgSharedDB setting.
 59+--
 60+-- Note that when a external authentication plugin is used,
 61+-- user table entries still need to be created to store
 62+-- preferences and to key tracking information in the other
 63+-- tables.
 64+--
 65+CREATE TABLE /*$wgDBprefix*/user (
 66+ user_id int(5) unsigned NOT NULL auto_increment,
 67+
 68+ -- Usernames must be unique, must not be in the form of
 69+ -- an IP address. _Shouldn't_ allow slashes or case
 70+ -- conflicts. Spaces are allowed, and are _not_ converted
 71+ -- to underscores like titles. See the User::newFromName() for
 72+ -- the specific tests that usernames have to pass.
 73+ user_name varchar(255) binary NOT NULL default '',
 74+
 75+ -- Optional 'real name' to be displayed in credit listings
 76+ user_real_name varchar(255) binary NOT NULL default '',
 77+
 78+ -- Password hashes, normally hashed like so:
 79+ -- MD5(CONCAT(user_id,'-',MD5(plaintext_password))), see
 80+ -- wfEncryptPassword() in GlobalFunctions.php
 81+ user_password tinyblob NOT NULL default '',
 82+
 83+ -- When using 'mail me a new password', a random
 84+ -- password is generated and the hash stored here.
 85+ -- The previous password is left in place until
 86+ -- someone actually logs in with the new password,
 87+ -- at which point the hash is moved to user_password
 88+ -- and the old password is invalidated.
 89+ user_newpassword tinyblob NOT NULL default '',
 90+
 91+ -- Note: email should be restricted, not public info.
 92+ -- Same with passwords.
 93+ user_email tinytext NOT NULL default '',
 94+
 95+ -- Newline-separated list of name=value defining the user
 96+ -- preferences
 97+ user_options blob NOT NULL default '',
 98+
 99+ -- This is a timestamp which is updated when a user
 100+ -- logs in, logs out, changes preferences, or performs
 101+ -- some other action requiring HTML cache invalidation
 102+ -- to ensure that the UI is updated.
 103+ user_touched char(14) binary NOT NULL default '',
 104+
 105+ -- A pseudorandomly generated value that is stored in
 106+ -- a cookie when the "remember password" feature is
 107+ -- used (previously, a hash of the password was used, but
 108+ -- this was vulnerable to cookie-stealing attacks)
 109+ user_token char(32) binary NOT NULL default '',
 110+
 111+ -- Initially NULL; when a user's e-mail address has been
 112+ -- validated by returning with a mailed token, this is
 113+ -- set to the current timestamp.
 114+ user_email_authenticated CHAR(14) BINARY,
 115+
 116+ -- Randomly generated token created when the e-mail address
 117+ -- is set and a confirmation test mail sent.
 118+ user_email_token CHAR(32) BINARY,
 119+
 120+ -- Expiration date for the user_email_token
 121+ user_email_token_expires CHAR(14) BINARY,
 122+
 123+ PRIMARY KEY user_id (user_id),
 124+ UNIQUE INDEX user_name (user_name),
 125+ INDEX (user_email_token)
 126+
 127+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 128+
 129+--
 130+-- User permissions have been broken out to a separate table;
 131+-- this allows sites with a shared user table to have different
 132+-- permissions assigned to a user in each project.
 133+--
 134+-- This table replaces the old user_rights field which used a
 135+-- comma-separated blob.
 136+--
 137+CREATE TABLE /*$wgDBprefix*/user_groups (
 138+ -- Key to user_id
 139+ ug_user int(5) unsigned NOT NULL default '0',
 140+
 141+ -- Group names are short symbolic string keys.
 142+ -- The set of group names is open-ended, though in practice
 143+ -- only some predefined ones are likely to be used.
 144+ --
 145+ -- At runtime $wgGroupPermissions will associate group keys
 146+ -- with particular permissions. A user will have the combined
 147+ -- permissions of any group they're explicitly in, plus
 148+ -- the implicit '*' and 'user' groups.
 149+ ug_group char(16) NOT NULL default '',
 150+
 151+ PRIMARY KEY (ug_user,ug_group),
 152+ KEY (ug_group)
 153+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 154+
 155+-- Stores notifications of user talk page changes, for the display
 156+-- of the "you have new messages" box
 157+CREATE TABLE /*$wgDBprefix*/user_newtalk (
 158+ -- Key to user.user_id
 159+ user_id int(5) NOT NULL default '0',
 160+ -- If the user is an anonymous user hir IP address is stored here
 161+ -- since the user_id of 0 is ambiguous
 162+ user_ip varchar(40) NOT NULL default '',
 163+ INDEX user_id (user_id),
 164+ INDEX user_ip (user_ip)
 165+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 166+
 167+
 168+--
 169+-- Core of the wiki: each page has an entry here which identifies
 170+-- it by title and contains some essential metadata.
 171+--
 172+CREATE TABLE /*$wgDBprefix*/page (
 173+ -- Unique identifier number. The page_id will be preserved across
 174+ -- edits and rename operations, but not deletions and recreations.
 175+ page_id int(8) unsigned NOT NULL auto_increment,
 176+
 177+ -- A page name is broken into a namespace and a title.
 178+ -- The namespace keys are UI-language-independent constants,
 179+ -- defined in includes/Defines.php
 180+ page_namespace int NOT NULL,
 181+
 182+ -- The rest of the title, as text.
 183+ -- Spaces are transformed into underscores in title storage.
 184+ page_title varchar(255) binary NOT NULL,
 185+
 186+ -- Comma-separated set of permission keys indicating who
 187+ -- can move or edit the page.
 188+ page_restrictions tinyblob NOT NULL default '',
 189+
 190+ -- Number of times this page has been viewed.
 191+ page_counter bigint(20) unsigned NOT NULL default '0',
 192+
 193+ -- 1 indicates the article is a redirect.
 194+ page_is_redirect tinyint(1) unsigned NOT NULL default '0',
 195+
 196+ -- 1 indicates this is a new entry, with only one edit.
 197+ -- Not all pages with one edit are new pages.
 198+ page_is_new tinyint(1) unsigned NOT NULL default '0',
 199+
 200+ -- Random value between 0 and 1, used for Special:Randompage
 201+ page_random real unsigned NOT NULL,
 202+
 203+ -- This timestamp is updated whenever the page changes in
 204+ -- a way requiring it to be re-rendered, invalidating caches.
 205+ -- Aside from editing this includes permission changes,
 206+ -- creation or deletion of linked pages, and alteration
 207+ -- of contained templates.
 208+ page_touched char(14) binary NOT NULL default '',
 209+
 210+ -- Handy key to revision.rev_id of the current revision.
 211+ -- This may be 0 during page creation, but that shouldn't
 212+ -- happen outside of a transaction... hopefully.
 213+ page_latest int(8) unsigned NOT NULL,
 214+
 215+ -- Uncompressed length in bytes of the page's current source text.
 216+ page_len int(8) unsigned NOT NULL,
 217+
 218+ PRIMARY KEY page_id (page_id),
 219+ UNIQUE INDEX name_title (page_namespace,page_title),
 220+
 221+ -- Special-purpose indexes
 222+ INDEX (page_random),
 223+ INDEX (page_len)
 224+
 225+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 226+
 227+--
 228+-- Every edit of a page creates also a revision row.
 229+-- This stores metadata about the revision, and a reference
 230+-- to the text storage backend.
 231+--
 232+CREATE TABLE /*$wgDBprefix*/revision (
 233+ rev_id int(8) unsigned NOT NULL auto_increment,
 234+
 235+ -- Key to page_id. This should _never_ be invalid.
 236+ rev_page int(8) unsigned NOT NULL,
 237+
 238+ -- Key to text.old_id, where the actual bulk text is stored.
 239+ -- It's possible for multiple revisions to use the same text,
 240+ -- for instance revisions where only metadata is altered
 241+ -- or a rollback to a previous version.
 242+ rev_text_id int(8) unsigned NOT NULL,
 243+
 244+ -- Text comment summarizing the change.
 245+ -- This text is shown in the history and other changes lists,
 246+ -- rendered in a subset of wiki markup by Linker::formatComment()
 247+ rev_comment tinyblob NOT NULL default '',
 248+
 249+ -- Key to user.user_id of the user who made this edit.
 250+ -- Stores 0 for anonymous edits and for some mass imports.
 251+ rev_user int(5) unsigned NOT NULL default '0',
 252+
 253+ -- Text username or IP address of the editor.
 254+ rev_user_text varchar(255) binary NOT NULL default '',
 255+
 256+ -- Timestamp
 257+ rev_timestamp char(14) binary NOT NULL default '',
 258+
 259+ -- Records whether the user marked the 'minor edit' checkbox.
 260+ -- Many automated edits are marked as minor.
 261+ rev_minor_edit tinyint(1) unsigned NOT NULL default '0',
 262+
 263+ -- Not yet used; reserved for future changes to the deletion system.
 264+ rev_deleted tinyint(1) unsigned NOT NULL default '0',
 265+
 266+ PRIMARY KEY rev_page_id (rev_page, rev_id),
 267+ UNIQUE INDEX rev_id (rev_id),
 268+ INDEX rev_timestamp (rev_timestamp),
 269+ INDEX page_timestamp (rev_page,rev_timestamp),
 270+ INDEX user_timestamp (rev_user,rev_timestamp),
 271+ INDEX usertext_timestamp (rev_user_text,rev_timestamp)
 272+
 273+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 274+
 275+
 276+--
 277+-- Holds text of individual page revisions.
 278+--
 279+-- Field names are a holdover from the 'old' revisions table in
 280+-- MediaWiki 1.4 and earlier: an upgrade will transform that
 281+-- table into the 'text' table to minimize unnecessary churning
 282+-- and downtime. If upgrading, the other fields will be left unused.
 283+--
 284+CREATE TABLE /*$wgDBprefix*/text (
 285+ -- Unique text storage key number.
 286+ -- Note that the 'oldid' parameter used in URLs does *not*
 287+ -- refer to this number anymore, but to rev_id.
 288+ --
 289+ -- revision.rev_text_id is a key to this column
 290+ old_id int(8) unsigned NOT NULL auto_increment,
 291+
 292+ -- Depending on the contents of the old_flags field, the text
 293+ -- may be convenient plain text, or it may be funkily encoded.
 294+ old_text mediumblob NOT NULL default '',
 295+
 296+ -- Comma-separated list of flags:
 297+ -- gzip: text is compressed with PHP's gzdeflate() function.
 298+ -- utf8: text was stored as UTF-8.
 299+ -- If $wgLegacyEncoding option is on, rows *without* this flag
 300+ -- will be converted to UTF-8 transparently at load time.
 301+ -- object: text field contained a serialized PHP object.
 302+ -- The object either contains multiple versions compressed
 303+ -- together to achieve a better compression ratio, or it refers
 304+ -- to another row where the text can be found.
 305+ old_flags tinyblob NOT NULL default '',
 306+
 307+ PRIMARY KEY old_id (old_id)
 308+
 309+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 310+
 311+--
 312+-- Holding area for deleted articles, which may be viewed
 313+-- or restored by admins through the Special:Undelete interface.
 314+-- The fields generally correspond to the page, revision, and text
 315+-- fields, with several caveats.
 316+--
 317+CREATE TABLE /*$wgDBprefix*/archive (
 318+ ar_namespace int NOT NULL default '0',
 319+ ar_title varchar(255) binary NOT NULL default '',
 320+
 321+ -- Newly deleted pages will not store text in this table,
 322+ -- but will reference the separately existing text rows.
 323+ -- This field is retained for backwards compatibility,
 324+ -- so old archived pages will remain accessible after
 325+ -- upgrading from 1.4 to 1.5.
 326+ -- Text may be gzipped or otherwise funky.
 327+ ar_text mediumblob NOT NULL default '',
 328+
 329+ -- Basic revision stuff...
 330+ ar_comment tinyblob NOT NULL default '',
 331+ ar_user int(5) unsigned NOT NULL default '0',
 332+ ar_user_text varchar(255) binary NOT NULL,
 333+ ar_timestamp char(14) binary NOT NULL default '',
 334+ ar_minor_edit tinyint(1) NOT NULL default '0',
 335+
 336+ -- See ar_text note.
 337+ ar_flags tinyblob NOT NULL default '',
 338+
 339+ -- When revisions are deleted, their unique rev_id is stored
 340+ -- here so it can be retained after undeletion. This is necessary
 341+ -- to retain permalinks to given revisions after accidental delete
 342+ -- cycles or messy operations like history merges.
 343+ --
 344+ -- Old entries from 1.4 will be NULL here, and a new rev_id will
 345+ -- be created on undeletion for those revisions.
 346+ ar_rev_id int(8) unsigned,
 347+
 348+ -- For newly deleted revisions, this is the text.old_id key to the
 349+ -- actual stored text. To avoid breaking the block-compression scheme
 350+ -- and otherwise making storage changes harder, the actual text is
 351+ -- *not* deleted from the text table, merely hidden by removal of the
 352+ -- page and revision entries.
 353+ --
 354+ -- Old entries deleted under 1.2-1.4 will have NULL here, and their
 355+ -- ar_text and ar_flags fields will be used to create a new text
 356+ -- row upon undeletion.
 357+ ar_text_id int(8) unsigned,
 358+
 359+ KEY name_title_timestamp (ar_namespace,ar_title,ar_timestamp)
 360+
 361+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 362+
 363+
 364+--
 365+-- Track page-to-page hyperlinks within the wiki.
 366+--
 367+CREATE TABLE /*$wgDBprefix*/pagelinks (
 368+ -- Key to the page_id of the page containing the link.
 369+ pl_from int(8) unsigned NOT NULL default '0',
 370+
 371+ -- Key to page_namespace/page_title of the target page.
 372+ -- The target page may or may not exist, and due to renames
 373+ -- and deletions may refer to different page records as time
 374+ -- goes by.
 375+ pl_namespace int NOT NULL default '0',
 376+ pl_title varchar(255) binary NOT NULL default '',
 377+
 378+ UNIQUE KEY pl_from(pl_from,pl_namespace,pl_title),
 379+ KEY (pl_namespace,pl_title)
 380+
 381+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 382+
 383+
 384+--
 385+-- Track links to images *used inline*
 386+-- We don't distinguish live from broken links here, so
 387+-- they do not need to be changed on upload/removal.
 388+--
 389+CREATE TABLE /*$wgDBprefix*/imagelinks (
 390+ -- Key to page_id of the page containing the image / media link.
 391+ il_from int(8) unsigned NOT NULL default '0',
 392+
 393+ -- Filename of target image.
 394+ -- This is also the page_title of the file's description page;
 395+ -- all such pages are in namespace 6 (NS_IMAGE).
 396+ il_to varchar(255) binary NOT NULL default '',
 397+
 398+ UNIQUE KEY il_from(il_from,il_to),
 399+ KEY (il_to)
 400+
 401+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 402+
 403+--
 404+-- Track category inclusions *used inline*
 405+-- This tracks a single level of category membership
 406+-- (folksonomic tagging, really).
 407+--
 408+CREATE TABLE /*$wgDBprefix*/categorylinks (
 409+ -- Key to page_id of the page defined as a category member.
 410+ cl_from int(8) unsigned NOT NULL default '0',
 411+
 412+ -- Name of the category.
 413+ -- This is also the page_title of the category's description page;
 414+ -- all such pages are in namespace 14 (NS_CATEGORY).
 415+ cl_to varchar(255) binary NOT NULL default '',
 416+
 417+ -- The title of the linking page, or an optional override
 418+ -- to determine sort order. Sorting is by binary order, which
 419+ -- isn't always ideal, but collations seem to be an exciting
 420+ -- and dangerous new world in MySQL...
 421+ --
 422+ -- For MySQL 4.1+ with charset set to utf8, the sort key *index*
 423+ -- needs cut to be smaller than 1024 bytes (at 3 bytes per char).
 424+ -- To sort properly on the shorter key, this field needs to be
 425+ -- the same shortness.
 426+ cl_sortkey varchar(86) binary NOT NULL default '',
 427+
 428+ -- This isn't really used at present. Provided for an optional
 429+ -- sorting method by approximate addition time.
 430+ cl_timestamp timestamp NOT NULL,
 431+
 432+ UNIQUE KEY cl_from(cl_from,cl_to),
 433+
 434+ -- We always sort within a given category...
 435+ KEY cl_sortkey(cl_to,cl_sortkey),
 436+
 437+ -- Not really used?
 438+ KEY cl_timestamp(cl_to,cl_timestamp)
 439+
 440+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 441+
 442+--
 443+-- Contains a single row with some aggregate info
 444+-- on the state of the site.
 445+--
 446+CREATE TABLE /*$wgDBprefix*/site_stats (
 447+ -- The single row should contain 1 here.
 448+ ss_row_id int(8) unsigned NOT NULL,
 449+
 450+ -- Total number of page views, if hit counters are enabled.
 451+ ss_total_views bigint(20) unsigned default '0',
 452+
 453+ -- Total number of edits performed.
 454+ ss_total_edits bigint(20) unsigned default '0',
 455+
 456+ -- An approximate count of pages matching the following criteria:
 457+ -- * in namespace 0
 458+ -- * not a redirect
 459+ -- * contains the text '[['
 460+ -- See Article::isCountable() in includes/Article.php
 461+ ss_good_articles bigint(20) unsigned default '0',
 462+
 463+ -- Total pages, theoretically equal to SELECT COUNT(*) FROM page; except faster
 464+ ss_total_pages bigint(20) default '-1',
 465+
 466+ -- Number of users, theoretically equal to SELECT COUNT(*) FROM user;
 467+ ss_users bigint(20) default '-1',
 468+
 469+ -- Deprecated, no longer updated as of 1.5
 470+ ss_admins int(10) default '-1',
 471+
 472+ UNIQUE KEY ss_row_id (ss_row_id)
 473+
 474+) TYPE=InnoDB;
 475+
 476+--
 477+-- Stores an ID for every time any article is visited;
 478+-- depending on $wgHitcounterUpdateFreq, it is
 479+-- periodically cleared and the page_counter column
 480+-- in the page table updated for the all articles
 481+-- that have been visited.)
 482+--
 483+CREATE TABLE /*$wgDBprefix*/hitcounter (
 484+ hc_id INTEGER UNSIGNED NOT NULL
 485+) TYPE=HEAP MAX_ROWS=25000;
 486+
 487+
 488+--
 489+-- The internet is full of jerks, alas. Sometimes it's handy
 490+-- to block a vandal or troll account.
 491+--
 492+CREATE TABLE /*$wgDBprefix*/ipblocks (
 493+ -- Primary key, introduced for privacy.
 494+ ipb_id int(8) NOT NULL auto_increment,
 495+
 496+ -- Blocked IP address in dotted-quad form or user name.
 497+ ipb_address varchar(40) binary NOT NULL default '',
 498+
 499+ -- Blocked user ID or 0 for IP blocks.
 500+ ipb_user int(8) unsigned NOT NULL default '0',
 501+
 502+ -- User ID who made the block.
 503+ ipb_by int(8) unsigned NOT NULL default '0',
 504+
 505+ -- Text comment made by blocker.
 506+ ipb_reason tinyblob NOT NULL default '',
 507+
 508+ -- Creation (or refresh) date in standard YMDHMS form.
 509+ -- IP blocks expire automatically.
 510+ ipb_timestamp char(14) binary NOT NULL default '',
 511+
 512+ -- Indicates that the IP address was banned because a banned
 513+ -- user accessed a page through it. If this is 1, ipb_address
 514+ -- will be hidden, and the block identified by block ID number.
 515+ ipb_auto tinyint(1) NOT NULL default '0',
 516+
 517+ -- Time at which the block will expire.
 518+ ipb_expiry char(14) binary NOT NULL default '',
 519+
 520+ PRIMARY KEY ipb_id (ipb_id),
 521+ INDEX ipb_address (ipb_address),
 522+ INDEX ipb_user (ipb_user)
 523+
 524+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 525+
 526+
 527+--
 528+-- Uploaded images and other files.
 529+--
 530+CREATE TABLE /*$wgDBprefix*/image (
 531+ -- Filename.
 532+ -- This is also the title of the associated description page,
 533+ -- which will be in namespace 6 (NS_IMAGE).
 534+ img_name varchar(255) binary NOT NULL default '',
 535+
 536+ -- File size in bytes.
 537+ img_size int(8) unsigned NOT NULL default '0',
 538+
 539+ -- For images, size in pixels.
 540+ img_width int(5) NOT NULL default '0',
 541+ img_height int(5) NOT NULL default '0',
 542+
 543+ -- Extracted EXIF metadata stored as a serialized PHP array.
 544+ img_metadata mediumblob NOT NULL,
 545+
 546+ -- For images, bits per pixel if known.
 547+ img_bits int(3) NOT NULL default '0',
 548+
 549+ -- Media type as defined by the MEDIATYPE_xxx constants
 550+ img_media_type ENUM("UNKNOWN", "BITMAP", "DRAWING", "AUDIO", "VIDEO", "MULTIMEDIA", "OFFICE", "TEXT", "EXECUTABLE", "ARCHIVE") default NULL,
 551+
 552+ -- major part of a MIME media type as defined by IANA
 553+ -- see http://www.iana.org/assignments/media-types/
 554+ img_major_mime ENUM("unknown", "application", "audio", "image", "text", "video", "message", "model", "multipart") NOT NULL default "unknown",
 555+
 556+ -- minor part of a MIME media type as defined by IANA
 557+ -- the minor parts are not required to adher to any standard
 558+ -- but should be consistent throughout the database
 559+ -- see http://www.iana.org/assignments/media-types/
 560+ img_minor_mime varchar(32) NOT NULL default "unknown",
 561+
 562+ -- Description field as entered by the uploader.
 563+ -- This is displayed in image upload history and logs.
 564+ img_description tinyblob NOT NULL default '',
 565+
 566+ -- user_id and user_name of uploader.
 567+ img_user int(5) unsigned NOT NULL default '0',
 568+ img_user_text varchar(255) binary NOT NULL default '',
 569+
 570+ -- Time of the upload.
 571+ img_timestamp char(14) binary NOT NULL default '',
 572+
 573+ PRIMARY KEY img_name (img_name),
 574+
 575+ -- Used by Special:Imagelist for sort-by-size
 576+ INDEX img_size (img_size),
 577+
 578+ -- Used by Special:Newimages and Special:Imagelist
 579+ INDEX img_timestamp (img_timestamp)
 580+
 581+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 582+
 583+--
 584+-- Previous revisions of uploaded files.
 585+-- Awkwardly, image rows have to be moved into
 586+-- this table at re-upload time.
 587+--
 588+CREATE TABLE /*$wgDBprefix*/oldimage (
 589+ -- Base filename: key to image.img_name
 590+ oi_name varchar(255) binary NOT NULL default '',
 591+
 592+ -- Filename of the archived file.
 593+ -- This is generally a timestamp and '!' prepended to the base name.
 594+ oi_archive_name varchar(255) binary NOT NULL default '',
 595+
 596+ -- Other fields as in image...
 597+ oi_size int(8) unsigned NOT NULL default 0,
 598+ oi_width int(5) NOT NULL default 0,
 599+ oi_height int(5) NOT NULL default 0,
 600+ oi_bits int(3) NOT NULL default 0,
 601+ oi_description tinyblob NOT NULL default '',
 602+ oi_user int(5) unsigned NOT NULL default '0',
 603+ oi_user_text varchar(255) binary NOT NULL default '',
 604+ oi_timestamp char(14) binary NOT NULL default '',
 605+
 606+ INDEX oi_name (oi_name(10))
 607+
 608+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 609+
 610+
 611+--
 612+-- Primarily a summary table for Special:Recentchanges,
 613+-- this table contains some additional info on edits from
 614+-- the last few days, see Article::editUpdates()
 615+--
 616+CREATE TABLE /*$wgDBprefix*/recentchanges (
 617+ rc_id int(8) NOT NULL auto_increment,
 618+ rc_timestamp varchar(14) binary NOT NULL default '',
 619+ rc_cur_time varchar(14) binary NOT NULL default '',
 620+
 621+ -- As in revision
 622+ rc_user int(10) unsigned NOT NULL default '0',
 623+ rc_user_text varchar(255) binary NOT NULL default '',
 624+
 625+ -- When pages are renamed, their RC entries do _not_ change.
 626+ rc_namespace int NOT NULL default '0',
 627+ rc_title varchar(255) binary NOT NULL default '',
 628+
 629+ -- as in revision...
 630+ rc_comment varchar(255) binary NOT NULL default '',
 631+ rc_minor tinyint(3) unsigned NOT NULL default '0',
 632+
 633+ -- Edits by user accounts with the 'bot' rights key are
 634+ -- marked with a 1 here, and will be hidden from the
 635+ -- default view.
 636+ rc_bot tinyint(3) unsigned NOT NULL default '0',
 637+
 638+ rc_new tinyint(3) unsigned NOT NULL default '0',
 639+
 640+ -- Key to page_id (was cur_id prior to 1.5).
 641+ -- This will keep links working after moves while
 642+ -- retaining the at-the-time name in the changes list.
 643+ rc_cur_id int(10) unsigned NOT NULL default '0',
 644+
 645+ -- rev_id of the given revision
 646+ rc_this_oldid int(10) unsigned NOT NULL default '0',
 647+
 648+ -- rev_id of the prior revision, for generating diff links.
 649+ rc_last_oldid int(10) unsigned NOT NULL default '0',
 650+
 651+ -- These may no longer be used, with the new move log.
 652+ rc_type tinyint(3) unsigned NOT NULL default '0',
 653+ rc_moved_to_ns tinyint(3) unsigned NOT NULL default '0',
 654+ rc_moved_to_title varchar(255) binary NOT NULL default '',
 655+
 656+ -- If the Recent Changes Patrol option is enabled,
 657+ -- users may mark edits as having been reviewed to
 658+ -- remove a warning flag on the RC list.
 659+ -- A value of 1 indicates the page has been reviewed.
 660+ rc_patrolled tinyint(3) unsigned NOT NULL default '0',
 661+
 662+ -- Recorded IP address the edit was made from, if the
 663+ -- $wgPutIPinRC option is enabled.
 664+ rc_ip char(15) NOT NULL default '',
 665+
 666+ PRIMARY KEY rc_id (rc_id),
 667+ INDEX rc_timestamp (rc_timestamp),
 668+ INDEX rc_namespace_title (rc_namespace, rc_title),
 669+ INDEX rc_cur_id (rc_cur_id),
 670+ INDEX new_name_timestamp(rc_new,rc_namespace,rc_timestamp),
 671+ INDEX rc_ip (rc_ip)
 672+
 673+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 674+
 675+CREATE TABLE /*$wgDBprefix*/watchlist (
 676+ -- Key to user.user_id
 677+ wl_user int(5) unsigned NOT NULL,
 678+
 679+ -- Key to page_namespace/page_title
 680+ -- Note that users may watch pages which do not exist yet,
 681+ -- or existed in the past but have been deleted.
 682+ wl_namespace int NOT NULL default '0',
 683+ wl_title varchar(255) binary NOT NULL default '',
 684+
 685+ -- Timestamp when user was last sent a notification e-mail;
 686+ -- cleared when the user visits the page.
 687+ wl_notificationtimestamp varchar(14) binary,
 688+
 689+ UNIQUE KEY (wl_user, wl_namespace, wl_title),
 690+ KEY namespace_title (wl_namespace,wl_title)
 691+
 692+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 693+
 694+
 695+--
 696+-- Used by the math module to keep track
 697+-- of previously-rendered items.
 698+--
 699+CREATE TABLE /*$wgDBprefix*/math (
 700+ -- Binary MD5 hash of the latex fragment, used as an identifier key.
 701+ math_inputhash varbinary(16) NOT NULL,
 702+
 703+ -- Not sure what this is, exactly...
 704+ math_outputhash varbinary(16) NOT NULL,
 705+
 706+ -- texvc reports how well it thinks the HTML conversion worked;
 707+ -- if it's a low level the PNG rendering may be preferred.
 708+ math_html_conservativeness tinyint(1) NOT NULL,
 709+
 710+ -- HTML output from texvc, if any
 711+ math_html text,
 712+
 713+ -- MathML output from texvc, if any
 714+ math_mathml text,
 715+
 716+ UNIQUE KEY math_inputhash (math_inputhash)
 717+
 718+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 719+
 720+--
 721+-- When using the default MySQL search backend, page titles
 722+-- and text are munged to strip markup, do Unicode case folding,
 723+-- and prepare the result for MySQL's fulltext index.
 724+--
 725+-- This table must be MyISAM; InnoDB does not support the needed
 726+-- fulltext index.
 727+--
 728+CREATE TABLE /*$wgDBprefix*/searchindex (
 729+ -- Key to page_id
 730+ si_page int(8) unsigned NOT NULL,
 731+
 732+ -- Munged version of title
 733+ si_title varchar(255) NOT NULL default '',
 734+
 735+ -- Munged version of body text
 736+ si_text mediumtext NOT NULL default '',
 737+
 738+ UNIQUE KEY (si_page),
 739+ FULLTEXT si_title (si_title),
 740+ FULLTEXT si_text (si_text)
 741+
 742+) TYPE=MyISAM, DEFAULT CHARSET=utf8;
 743+
 744+--
 745+-- Recognized interwiki link prefixes
 746+--
 747+CREATE TABLE /*$wgDBprefix*/interwiki (
 748+ -- The interwiki prefix, (e.g. "Meatball", or the language prefix "de")
 749+ iw_prefix char(32) NOT NULL,
 750+
 751+ -- The URL of the wiki, with "$1" as a placeholder for an article name.
 752+ -- Any spaces in the name will be transformed to underscores before
 753+ -- insertion.
 754+ iw_url char(127) NOT NULL,
 755+
 756+ -- A boolean value indicating whether the wiki is in this project
 757+ -- (used, for example, to detect redirect loops)
 758+ iw_local BOOL NOT NULL,
 759+
 760+ -- Boolean value indicating whether interwiki transclusions are allowed.
 761+ iw_trans TINYINT(1) NOT NULL DEFAULT 0,
 762+
 763+ UNIQUE KEY iw_prefix (iw_prefix)
 764+
 765+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 766+
 767+--
 768+-- Used for caching expensive grouped queries
 769+--
 770+CREATE TABLE /*$wgDBprefix*/querycache (
 771+ -- A key name, generally the base name of of the special page.
 772+ qc_type char(32) NOT NULL,
 773+
 774+ -- Some sort of stored value. Sizes, counts...
 775+ qc_value int(5) unsigned NOT NULL default '0',
 776+
 777+ -- Target namespace+title
 778+ qc_namespace int NOT NULL default '0',
 779+ qc_title char(255) binary NOT NULL default '',
 780+
 781+ KEY (qc_type,qc_value)
 782+
 783+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 784+
 785+--
 786+-- For a few generic cache operations if not using Memcached
 787+--
 788+CREATE TABLE /*$wgDBprefix*/objectcache (
 789+ keyname char(255) binary not null default '',
 790+ value mediumblob,
 791+ exptime datetime,
 792+ unique key (keyname),
 793+ key (exptime)
 794+
 795+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 796+
 797+-- For article validation
 798+CREATE TABLE /*$wgDBprefix*/validate (
 799+ val_user int(11) NOT NULL default '0',
 800+ val_page int(11) unsigned NOT NULL default '0',
 801+ val_revision int(11) unsigned NOT NULL default '0',
 802+ val_type int(11) unsigned NOT NULL default '0',
 803+ val_value int(11) default '0',
 804+ val_comment varchar(255) NOT NULL default '',
 805+ val_ip varchar(20) NOT NULL default '',
 806+ KEY val_user (val_user,val_revision)
 807+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 808+
 809+
 810+CREATE TABLE /*$wgDBprefix*/logging (
 811+ -- Symbolic keys for the general log type and the action type
 812+ -- within the log. The output format will be controlled by the
 813+ -- action field, but only the type controls categorization.
 814+ log_type char(10) NOT NULL default '',
 815+ log_action char(10) NOT NULL default '',
 816+
 817+ -- Timestamp. Duh.
 818+ log_timestamp char(14) NOT NULL default '19700101000000',
 819+
 820+ -- The user who performed this action; key to user_id
 821+ log_user int unsigned NOT NULL default 0,
 822+
 823+ -- Key to the page affected. Where a user is the target,
 824+ -- this will point to the user page.
 825+ log_namespace int NOT NULL default 0,
 826+ log_title varchar(255) binary NOT NULL default '',
 827+
 828+ -- Freeform text. Interpreted as edit history comments.
 829+ log_comment varchar(255) NOT NULL default '',
 830+
 831+ -- LF separated list of miscellaneous parameters
 832+ log_params blob NOT NULL default '',
 833+
 834+ KEY type_time (log_type, log_timestamp),
 835+ KEY user_time (log_user, log_timestamp),
 836+ KEY page_time (log_namespace, log_title, log_timestamp)
 837+
 838+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
 839+
 840+CREATE TABLE /*$wgDBprefix*/trackbacks (
 841+ tb_id integer AUTO_INCREMENT PRIMARY KEY,
 842+ tb_page integer REFERENCES page(page_id) ON DELETE CASCADE,
 843+ tb_title varchar(255) NOT NULL,
 844+ tb_url varchar(255) NOT NULL,
 845+ tb_ex text,
 846+ tb_name varchar(255),
 847+
 848+ INDEX (tb_page)
 849+) TYPE=InnoDB, DEFAULT CHARSET=utf8;
Property changes on: trunk/phase3/maintenance/mysql5/tables.sql
___________________________________________________________________
Name: svn:keywords
1850 + Author Date Id Revision
Name: svn:eol-style
2851 + native
Index: trunk/phase3/config/index.php
@@ -382,6 +382,7 @@
383383 $conf->DBpassword = importPost( "DBpassword" );
384384 $conf->DBpassword2 = importPost( "DBpassword2" );
385385 $conf->DBprefix = importPost( "DBprefix" );
 386+ $conf->DBmysql5 = (importPost( "DBmysql5" ) == "true") ? "true" : "false";
386387 $conf->RootPW = importPost( "RootPW" );
387388 $conf->LanguageCode = importPost( "LanguageCode", "en" );
388389 $conf->SysopName = importPost( "SysopName", "WikiSysop" );
@@ -570,6 +571,14 @@
571572 <a href='http://dev.mysql.com/doc/mysql/en/old-client.html'
572573 >http://dev.mysql.com/doc/mysql/en/old-client.html</a> for help.</b>";
573574 }
 575+ if( $wgDBmysql5 ) {
 576+ if( $mysqlNewAuth ) {
 577+ print "; enabling MySQL 4.1/5.0 charset mode";
 578+ } else {
 579+ print "; <b class='error'>MySQL 4.1/5.0 charset mode enabled,
 580+ but older version detected; will likely fail.</b>";
 581+ }
 582+ }
574583 print "</li>\n";
575584
576585 if ($conf->DBtype == 'mysql') {
@@ -627,7 +636,13 @@
628637 # FIXME: Check for errors
629638 print "<li>Creating tables...";
630639 if ($conf->DBtype == 'mysql') {
631 - dbsource( "../maintenance/tables.sql", $wgDatabase );
 640+ if( $wgDBmysql5 ) {
 641+ print " using MySQL 5 table defs...";
 642+ dbsource( "../maintenance/mysql5/tables.sql", $wgDatabase );
 643+ } else {
 644+ print " using MySQL 4 table defs...";
 645+ dbsource( "../maintenance/tables.sql", $wgDatabase );
 646+ }
632647 dbsource( "../maintenance/interwiki.sql", $wgDatabase );
633648 } else {
634649 dbsource( "../maintenance/oracle/tables.sql", $wgDatabase );
@@ -960,6 +975,20 @@
961976
962977 <p>Avoid exotic characters; something like <tt>mw_</tt> is good.</p>
963978 </div>
 979+
 980+ <div class="config-input"><label class="column">Database charset</label>
 981+ <div>Select one:</div>
 982+ <ul class="plain">
 983+ <li><?php aField( $conf, "DBmysql5", "Backwards-compatible UTF-8", "radio", "false" ); ?></li>
 984+ <li><?php aField( $conf, "DBmysql5", "Experimental MySQL 4.1/5.0 UTF-8", "radio", "true" ); ?></li>
 985+ </ul>
 986+ </div>
 987+ <p class="config-desc">
 988+ <b>EXPERIMENTAL:</b> You can enable explicit Unicode charset support
 989+ for MySQL 4.1 and 5.0 servers. This is not well tested and may
 990+ cause things to break. <b>If upgrading an older installation, leave
 991+ in backwards-compatible mode.</b>
 992+ </p>
964993
965994 <div class="config-input">
966995 <?php
@@ -1158,6 +1187,9 @@
11591188 \$wgDBprefix = \"{$slconf['DBprefix']}\";
11601189 \$wgDBtype = \"{$slconf['DBtype']}\";
11611190
 1191+# Experimental charset support for MySQL 4.1/5.0.
 1192+\$wgDBmysql5 = {$conf->DBmysql5};
 1193+
11621194 ## Shared memory settings
11631195 \$wgMainCacheType = $cacheType;
11641196 \$wgMemCachedServers = $mcservers;
Index: trunk/phase3/includes/DefaultSettings.php
@@ -473,6 +473,24 @@
474474 $wgDBmysql4 = true;
475475
476476 /**
 477+ * Set to true to engage MySQL 4.1/5.0 charset-related features;
 478+ * for now will just cause sending of 'SET NAMES=utf8' on connect.
 479+ *
 480+ * WARNING: THIS IS EXPERIMENTAL!
 481+ *
 482+ * May break if you're not using the table defs from mysql5/tables.sql.
 483+ * May break if you're upgrading an existing wiki if set differently.
 484+ * Broken symptoms likely to include incorrect behavior with page titles,
 485+ * usernames, comments etc containing non-ASCII characters.
 486+ * Might also cause failures on the object cache and other things.
 487+ *
 488+ * Even correct usage may cause failures with Unicode supplementary
 489+ * characters (those not in the Basic Multilingual Plane) unless MySQL
 490+ * has enhanced their Unicode support.
 491+ */
 492+$wgDBmysql5 = false;
 493+
 494+/**
477495 * Other wikis on this site, can be administered from a single developer
478496 * account.
479497 * Array, interwiki prefix => database name
Index: trunk/phase3/includes/Database.php
@@ -274,6 +274,14 @@
275275 if ( !$success ) {
276276 $this->reportConnectionError();
277277 }
 278+
 279+ global $wgDBmysql5;
 280+ if( $wgDBmysql5 ) {
 281+ // Tell the server we're communicating with it in UTF-8.
 282+ // This may engage various charset conversions.
 283+ $this->query( 'SET NAMES utf8' );
 284+ }
 285+
278286 $this->mOpened = $success;
279287 return $success;
280288 }
Index: trunk/phase3/RELEASE-NOTES
@@ -167,6 +167,9 @@
168168 * Added @import for [[MediaWiki:Common.css]] to all skins
169169 * Removed hardcoded Norwegian (no) project namespaces
170170 * (bug 3735) Fix to run under MySQL 5's strict mode
 171+* (bug 3786) Experimental support for MySQL 4.1/5.0 utf8 charset mode
 172+ NOTE: Enabling this may break existing wikis, and still doesn't
 173+ work for all Unicode characters due to MySQL limitations.
171174
172175
173176 === Caveats ===

Past revisions this follows-up on

Rev.Commit summaryAuthorDate
r11507* (bug 3786) Experimental support for MySQL 4.1/5.0 utf8 charset mode...vibber01:10, 26 October 2005

Status & tagging log

  • 01:58, 13 October 2010 ^demon (talk | contribs) changed the status of r11508 [removed: new added: old]